Amazon SageMaker Studio, announced by AWS CEO Andy Jassy on the second day of the AWS re:Invent conference, is intended to unify all the tools needed for machine learning. Several other SageMaker products were launched alongside it.
SageMaker made its debut at re:Invent in 2017 as a fully managed service for the machine learning (ML) process. It was upgraded at last year’s conference, which saw the addition of SageMaker Ground Truth, an automated data-labeling service.
SageMaker Studio is intended to make building models significantly more accessible to a wider range of developers. It is a web-based IDE for complete machine learning workflows, designed to let developers build, train, tune and deploy their models from a single interface and to provide a single place for all ML tools and results. Within it, you can:
- Organize, search and access notebooks, datasets, code and settings
- Create project folders to organize ML projects
- Share projects, folders and their contents
- Discuss notebooks and their results collaboratively
Jassy also announced SageMaker Notebooks (currently in preview), a managed service that lets you create and share Jupyter notebooks without provisioning instances yourself; instances are provisioned automatically when needed. This also means you can quickly switch from one hardware configuration to another.
The other new Amazon SageMaker products provided as fully managed services are:
- Experiments – Enables you to organize, track and compare thousands of ML jobs: these can be training jobs, or data processing and model evaluation jobs run with Amazon SageMaker Processing.
- Debugger – Enables you to debug and analyze complex training issues, and receive alerts. It automatically introspects your models, collects debugging data, and analyzes it to provide real-time alerts and advice on ways to optimize your training times, and improve model quality. All information is visible as your models are training.
- Model Monitor – Enables you to detect quality deviations for deployed models, and receive alerts. For example, you can easily visualize issues like data drift that could be affecting your models.
- Autopilot – Enables you to build models automatically with full control and visibility. Algorithm selection, data preprocessing, and model tuning are taken care of automatically, as is all infrastructure.
Speaking at re:Invent about the way SageMaker Studio automates the whole process, Andy Jassy said:
“With AutoML, here’s what happens: You send us your CSV file with the data that you want a model for – you can just point to the S3 location and Autopilot does all the transformation of the model … it selects the right algorithm, and then it trains 50 unique models with [slightly] different configurations of the various variables because you don’t know which ones are going to lead to the highest accuracy. SageMaker Studio [then gives you] a model leaderboard where you can see all 50 models ranked in order of accuracy … and there is a notebook underneath every single one of these models, so that when you open the notebook, it has all the recipe of that particular model.”
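The workflow Jassy describes (point to a CSV in S3, cap the run at 50 candidate models, then rank them) can be sketched in Python. This is a minimal illustration, not AWS's own sample code: the request shape is assumed to follow the boto3 `create_auto_ml_job` call, the candidate records are assumed to look like the entries returned by `list_candidates_for_auto_ml_job`, and every bucket, job name, role and column name used with it is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def build_autopilot_request(
    job_name: str,
    input_s3_uri: str,
    target_column: str,
    output_s3_uri: str,
    role_arn: str,
    max_candidates: int = 50,
) -> Dict[str, Any]:
    """Build the keyword arguments for an Autopilot job (assumed boto3 shape)."""
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [
            {
                # The CSV stays in S3; the job is only given its location.
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": input_s3_uri,
                    }
                },
                # Column the model should learn to predict.
                "TargetAttributeName": target_column,
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Cap the number of candidate models, 50 in Jassy's description.
        "AutoMLJobConfig": {
            "CompletionCriteria": {"MaxCandidates": max_candidates}
        },
        "RoleArn": role_arn,
    }


def leaderboard(candidates: List[Dict[str, Any]]) -> List[Tuple[str, float]]:
    """Rank candidates best-first, assuming a higher metric value is better."""
    ranked = sorted(
        candidates,
        key=lambda c: c["FinalAutoMLJobObjectiveMetric"]["Value"],
        reverse=True,
    )
    return [
        (c["CandidateName"], c["FinalAutoMLJobObjectiveMetric"]["Value"])
        for c in ranked
    ]


def submit(request: Dict[str, Any]) -> None:
    """Send the request to SageMaker; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    boto3.client("sagemaker").create_auto_ml_job(**request)
```

Keeping the request builder separate from `submit` means the payload can be inspected or logged before anything is sent to AWS. The ranking helper assumes an accuracy-like metric; for an error metric such as MSE the sort order would need to be reversed.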
This certainly promises to take the hard work out of machine learning and make it much more accessible, while still giving users some control over the process and a good deal of feedback to help them understand and work with the results.