AWS SageMaker

SageMaker is a fully managed machine learning service that allows you to quickly build, train, deploy, and maintain machine learning (ML) models.
SageMaker takes the guesswork out of each step in the machine learning process, making it easier to create high-quality models.
SageMaker is designed to be highly available, with no maintenance windows or scheduled downtime.
SageMaker APIs run in Amazon’s high-availability data centres. Service stack replication is configured across three facilities in each AWS Region to provide fault tolerance in case of a server failure or Availability Zone outage.
SageMaker offers a complete workflow, but users can still use their existing tools with SageMaker.
SageMaker supports Jupyter notebooks.
SageMaker lets users select the type and number of instances used for the hosted notebook, training, and model hosting.
SageMaker Machine Learning
Generate example data
This involves preprocessing or “wrangling” example data in order to use it for model training.
Preprocessing data involves the following:
Fetch the data
Clean the data
Prepare or transform the data
Train a model
Training a model includes both training and evaluating it.
Training a model requires an algorithm; the right choice depends on many factors.
Training requires compute resources.
Evaluating the model determines whether its inferences are accurate.
File mode vs Pipe mode
The optimized protobuf recordIO format is recommended for training data.
With the recordIO format, algorithms that support it can be trained in Pipe mode.
File mode loads all of the data from S3 onto the training instance volumes.
In Pipe mode, the training job streams data directly from S3.
Streaming can speed up the start of training jobs and provide better throughput.
Pipe mode also lets you reduce the EBS volume size for training instances, because it needs only enough disk space to store the final model artifacts.
File mode requires disk space to store both the final model artifacts and the entire training dataset.
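The difference between the two modes shows up in the training job request itself. The sketch below assumes the request shape of boto3's `create_training_job` and only builds plain dictionaries; it does not call AWS. The job names, image URI, role ARN, bucket paths, instance type, and volume sizes are hypothetical placeholders, not values from these notes.

```python
def build_training_request(job_name, image_uri, role_arn, train_s3_uri,
                           output_s3_uri, input_mode="File", volume_gb=50):
    """Build a dict shaped like the arguments of boto3's create_training_job."""
    if input_mode not in ("File", "Pipe"):
        raise ValueError("input_mode must be 'File' or 'Pipe'")
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            # "File" copies the dataset to the training volume first;
            # "Pipe" streams it from S3 while the job runs.
            "TrainingInputMode": input_mode,
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
            # Content type of the protobuf recordIO format.
            "ContentType": "application/x-recordio-protobuf",
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",   # hypothetical choice
            "InstanceCount": 1,
            "VolumeSizeInGB": volume_gb,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


# Hypothetical placeholders shared by both requests.
common = {
    "image_uri": "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-algo:latest",
    "role_arn": "arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    "train_s3_uri": "s3://example-bucket/train/",
    "output_s3_uri": "s3://example-bucket/output/",
}

# File mode: the volume must hold the full dataset plus the model artifacts.
file_request = build_training_request("demo-file", input_mode="File",
                                      volume_gb=100, **common)
# Pipe mode: the volume only needs room for the model artifacts.
pipe_request = build_training_request("demo-pipe", input_mode="Pipe",
                                      volume_gb=10, **common)

# To submit for real (needs AWS credentials and the boto3 package):
# import boto3
# boto3.client("sagemaker").create_training_job(**pipe_request)
```

The only differences between the two requests are `TrainingInputMode` and the smaller `VolumeSizeInGB` that Pipe mode makes possible.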
SageMaker has a number of built-in machine learning algorithms that can be used to solve a variety of problems.
You can also run a custom training script written with a machine learning framework on SageMaker.
SageMaker allows you to bring your own model or algorithm to train or host in SageMaker. SageMaker also provides Docker images for the built-in algorithms and the deep learning frameworks that are used for inference and training.
Machine learning algorithms can be trained quickly and deployed reliably at any scale by using containers.
You can also use an algorithm you have subscribed to from AWS Marketplace.
A model is typically re-engineered before it is integrated with an application and deployed.
SageMaker supports both hosting services and batch transform.
Hosting services
Hosting services provide an HTTPS endpoint where the machine learning model is available to serve inferences.
Canary deployments are supported through ProductionVariant: multiple models can be deployed behind the same SageMaker HTTPS endpoint.
Automatic scaling is available for production variants. It dynamically adjusts the number of instances provisioned for a specific production variant in response to your workload.
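As a rough illustration of a canary split and variant autoscaling, the sketch below builds dictionaries in the shape accepted by boto3's `create_endpoint_config` (SageMaker) and by `register_scalable_target` and `put_scaling_policy` (Application Auto Scaling), without calling AWS. The endpoint, model, and variant names, the 9:1 weights, the instance type, the capacity limits, and the target value are all hypothetical.

```python
def variant(name, model_name, weight, instance_count=1):
    """One ProductionVariant entry; the instance type is a hypothetical choice."""
    return {
        "VariantName": name,
        "ModelName": model_name,
        "InitialInstanceCount": instance_count,
        "InstanceType": "ml.m5.large",
        "InitialVariantWeight": weight,
    }


def traffic_shares(variants):
    """Traffic fraction per variant: its weight divided by the sum of all weights."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


# Two models behind one endpoint: most traffic to the stable model,
# a small slice to the canary.
endpoint_config = {
    "EndpointConfigName": "demo-endpoint-config",
    "ProductionVariants": [
        variant("stable", "demo-model-v1", 9.0, instance_count=2),
        variant("canary", "demo-model-v2", 1.0),
    ],
}
shares = traffic_shares(endpoint_config["ProductionVariants"])
print(shares)

# Autoscaling is registered per variant, identified by this resource ID format.
resource_id = "endpoint/demo-endpoint/variant/stable"
scalable_target = {
    "ServiceNamespace": "sagemaker",
    "ResourceId": resource_id,
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "MinCapacity": 2,
    "MaxCapacity": 6,
}
scaling_policy = {
    "PolicyName": "demo-invocations-target",
    "ServiceNamespace": "sagemaker",
    "ResourceId": resource_id,
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        # Hypothetical target: average invocations per instance per minute.
        "TargetValue": 100.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
        },
    },
}

# To apply for real (needs AWS credentials and the boto3 package):
# import boto3
# boto3.client("sagemaker").create_endpoint_config(**endpoint_config)
# aas = boto3.client("application-autoscaling")
# aas.register_scalable_target(**scalable_target)
# aas.put_scaling_policy(**scaling_policy)
```

Shifting more traffic to the canary is then a matter of changing the variant weights; the scaling policy applies only to the variant named in `ResourceId`.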
Batch transform
Batch transform is an alternative to hosting services for inferences on whole datasets.
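A batch transform job can be sketched in the same way. The example below assumes the request shape of boto3's `create_transform_job` and only builds the dictionary; the job name, model name, S3 paths, CSV content type, and instance type are hypothetical.

```python
def build_transform_request(job_name, model_name, input_s3_uri, output_s3_uri):
    """Build a dict shaped like the arguments of boto3's create_transform_job."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": input_s3_uri,
            }},
            "ContentType": "text/csv",   # assumes CSV input
            "SplitType": "Line",          # one record per line
        },
        "TransformOutput": {
            "S3OutputPath": output_s3_uri,
            "AssembleWith": "Line",       # write one prediction per line
        },
        "TransformResources": {
            "InstanceType": "ml.m5.large",
            "InstanceCount": 1,
        },
    }


transform_request = build_transform_request(
    job_name="demo-batch-transform",
    model_name="demo-model-v1",
    input_s3_uri="s3://example-bucket/batch-input/",
    output_s3_uri="s3://example-bucket/batch-output/",
)

# To submit for real (needs AWS credentials and the boto3 package):
# import boto3
# boto3.client("sagemaker").create_transform_job(**transform_request)
```

Unlike hosting services, no persistent endpoint is involved: the job reads the whole input prefix, writes its results to the output path, and then releases its instances.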
SageMaker Security
SageMaker ensures that ML model artifacts and other system artifacts are encrypted at rest and in transit.
SageMaker supports encrypted S3 buckets for storing model artifacts and data. You can also pass a KMS key to SageMaker notebooks, training jobs, and endpoints to encrypt the attached ML storage volume.
Requests to the SageMaker API and console are made over a secure (SSL) connection.
SageMaker stores can code in ML stora