SageMaker has built-in algorithms
BlazingText algorithm
Provides highly optimized implementations for Word2vec or text classification algorithms.
Word2vec algorithm is useful for many downstream natural-language processing (NLP), tasks such as sentiment analysis and named entity recognition. It can also be used to translate.
Map words to high-quality distributed vectors whose representation is known as word embeddings
Word embeddings capture semantic relationships between words.
Application that perform web searches, information retrieval and ranking, as well as document classification, must use text classification.
Provides the Skip-gram and continuous Bag-of-Words (CBOW), training architecturesDeepAR Forecasting algorithm
This algorithm uses recurrent neural network (RNN) to forecast scalar (one dimension) time series. It is supervised.
Use the trained model to forecast new time series similar to those it has been trained on. Factorization machine
This algorithm is general-purpose supervised and can be used for both regression and classification tasks.
Extension of a linear model to capture interactions among features in high-dimensional sparse datasets economicallyImage classification algorithm
A supervised learning algorithm that supports multilabel classification
Takes an image and outputs one or several labels
Uses a convolutional neural net (ResNet), which can be trained from scratch, or used for transfer learning when there are not enough training images.
Apache MXNet RecordIO is the preferred input format. IP Insights supports raw images in.jpg and.png formats.
This algorithm learns the usage patterns of IPv4 addresses using an unsupervised learning algorithm.
This algorithm is used to identify associations between IPv4 addresses, various entities, such user IDs and account numbersK-means algorithms.
Clustering is achieved using an unsupervised learning algorithm
This algorithm attempts to find discrete groups within data. It does this by ensuring that members of a group are as close as possible to one another, and as far as possible from other members.
This algorithm is index-based.
Uses a non-parametric method of classification or regression.
The algorithm queries the k closest points to the sample point for classification problems and returns the most commonly used label of their class as a predicted label.
Regression problems are solved by the algorithm querying the k nearest points to the sample point and returning the average of their feature value as the predicted value.
This algorithm is unsupervised and tries to describe a collection of observations as a mixture or distinct categories.
Linear Learner is used to find a user-specified amount of topics shared between documents in a text corpus.
These supervised learning algorithms are used to solve either regression or classification problemsNeural Topic Model (NTM Algorithm)
This algorithm uses unsupervised learning to organize a corpus into topics that contain word groups based on their statistical distribution.
Topic modeling can be used to classify or summarize documents based on the topics detected or to retrieve information or recommend content based on topic similarities.Object2Vec algorithm
This is a general-purpose neural embedding method that is highly customizable
It can be used to detect low-dimensional dense embeddedments of high-dimensional objects.
A single deep neural network detects and classifies objects in images.
This algorithm uses images as input to learn and identifies every instance of objects in the image scene.
This unsupervised machine-learning algorithm attempts to reduce the number of features in a dataset, while still maintaining as much information as possible. Random Cut Forest (RCF).
It is an unsupe
