SageMaker machine learning updates shared at AWS re:Invent
Amazon Web Services, Inc. (AWS) has announced new capabilities for its machine learning service, Amazon SageMaker. The announcements, made at the virtual conference AWS re:Invent, bring together powerful capabilities like faster data preparation, a purpose-built repository for prepared data, workflow automation, and greater transparency into training data to mitigate bias and explain predictions.
Amazon SageMaker removes challenges from each stage of the machine learning process, making it easier and faster for developers and data scientists to build, train, and deploy machine learning models.
“Hundreds of thousands of everyday developers and data scientists have used our industry-leading machine learning service, Amazon SageMaker, to remove barriers to building, training, and deploying custom machine learning models. One of the best parts about having such a widely-adopted service like SageMaker is that we get lots of customer suggestions which fuel our next set of deliverables,” said Swami Sivasubramanian, Vice President, Amazon Machine Learning, Amazon Web Services, Inc.
“Today, we are announcing a set of tools for Amazon SageMaker that makes it much easier for developers to build end-to-end machine learning pipelines to prepare, build, train, explain, inspect, monitor, debug, and run custom machine learning models with greater visibility, explainability, and automation at scale.”
Amazon SageMaker is already being used by leading companies to accelerate their machine learning deployments, including 3M, AstraZeneca, Bayer, Capital One, Cerner, Fidelity Investments, GE Healthcare, JPMorgan Chase, Lenovo, T-Mobile, Thomson Reuters, and Vanguard.
New machine learning capabilities
The announcements made at AWS re:Invent included:
• Data Wrangler
Data Wrangler simplifies the process of data preparation and feature engineering. With Amazon SageMaker Data Wrangler, customers can choose the data they want from their various data stores and import it with a single click. Amazon SageMaker Data Wrangler contains over 300 built-in data transformers that help customers normalize, transform, and combine features without writing any code, while Amazon SageMaker manages all of the processing infrastructure under the hood.
• Feature Store
Feature Store provides a new repository that makes it easy to store, update, retrieve, and share machine learning features. This makes it easy to organize and update large batches of features for training and smaller subsets of them for inference. That way, there is one consistent view of features for machine learning models to use, and it becomes significantly easier to build models that produce highly accurate predictions.
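The idea behind a feature store is that each entity has one canonical, time-versioned feature record, so training jobs can read full histories while inference reads only the latest values. The following is a minimal pure-Python sketch of that concept; it is an illustration of the pattern, not the actual Amazon SageMaker Feature Store API, and all names in it are hypothetical.

```python
from collections import defaultdict

class ToyFeatureStore:
    """Conceptual feature store: an append-only, time-ordered history per entity."""

    def __init__(self):
        # entity_id -> list of (event_time, feature dict), kept sorted by time
        self._records = defaultdict(list)

    def put(self, entity_id, event_time, features):
        # Append a new feature version for this entity.
        self._records[entity_id].append((event_time, dict(features)))
        self._records[entity_id].sort(key=lambda rec: rec[0])

    def get_latest(self, entity_id):
        # "Online" read: the most recent values, as an inference service would use.
        return self._records[entity_id][-1][1]

    def get_history(self, entity_id):
        # "Offline" read: the full history, as a training job would use.
        return [feats for _, feats in self._records[entity_id]]

store = ToyFeatureStore()
store.put("customer-42", 1, {"avg_spend": 10.0})
store.put("customer-42", 2, {"avg_spend": 12.5})

print(store.get_latest("customer-42"))   # → {'avg_spend': 12.5}
print(store.get_history("customer-42"))  # → [{'avg_spend': 10.0}, {'avg_spend': 12.5}]
```

The key property is the single write path: because training and inference both read from the same store, they see one consistent definition of each feature.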
• Pipelines
Pipelines is the first purpose-built, easy-to-use continuous integration and continuous delivery (CI/CD) service for machine learning. Developers can define each step of an end-to-end machine learning workflow, including data-loading steps, transformations from Amazon SageMaker Data Wrangler, features stored in Amazon SageMaker Feature Store, training configuration and algorithm set-up, debugging steps, and optimization steps. Workflows can be shared and re-used between teams, either to recreate a model or to act as a starting point for improvements.
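The workflow described above boils down to named steps executed in order, each feeding its output to the next. A conceptual pure-Python sketch of that chaining follows; this is not the Amazon SageMaker Pipelines SDK, and the step functions are hypothetical stand-ins for real load, transform, and training stages.

```python
class ToyPipeline:
    """Conceptual ML pipeline: named steps run in order, each feeding the next."""

    def __init__(self, steps):
        self.steps = steps  # list of (step_name, callable)

    def run(self, data=None):
        for _name, step in self.steps:
            data = step(data)
        return data

# Hypothetical stand-ins for real workflow stages.
def load(_):
    return [1.0, 2.0, 3.0, 100.0]          # raw samples

def transform(xs):
    return [x for x in xs if x < 50]        # drop outliers

def train(xs):
    return sum(xs) / len(xs)                # "model" = mean of the cleaned data

pipeline = ToyPipeline([("load", load), ("transform", transform), ("train", train)])
model = pipeline.run()
print(model)  # → 2.0
```

Because the pipeline is just a declared list of steps, the same definition can be re-run by another team to reproduce the model, which is the CI/CD property the service provides.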
• Clarify
Amazon SageMaker Clarify provides bias detection across the machine learning workflow, enabling developers to build greater fairness and transparency into their machine learning models. Developers can more easily detect statistical bias across the entire machine learning workflow and provide explanations for predictions their machine learning models are making.
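One common statistical bias measure in this space is the gap in positive outcome rates between two groups (often called the demographic parity difference). The stdlib-only illustration below shows the general idea; the function names and the example data are generic, not Clarify's exact metrics or API.

```python
def positive_rate(outcomes):
    # Fraction of positive (1) outcomes in a group.
    return sum(outcomes) / len(outcomes)

def parity_difference(group_a, group_b):
    # Demographic parity difference: gap in positive outcome rates
    # between two groups. A value near 0 suggests similar treatment.
    return positive_rate(group_a) - positive_rate(group_b)

# Hypothetical model predictions (1 = favorable outcome) for two groups.
group_a = [1, 1, 0, 1]   # 75% positive
group_b = [1, 0, 0, 1]   # 50% positive

gap = parity_difference(group_a, group_b)
print(round(gap, 2))  # → 0.25
```

Computing such a metric on both the training labels and the model's predictions is what lets bias be caught across the whole workflow rather than only after deployment.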
• Deep Profiling for Amazon SageMaker Debugger
Deep Profiling for Amazon SageMaker Debugger now enables developers to train their models faster by automatically monitoring system resource utilization and providing alerts for training bottlenecks.
• Distributed Training on Amazon SageMaker
New distributed training libraries on Amazon SageMaker make it possible to train large, complex deep learning models up to twice as fast as current approaches.
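One mechanism behind such speedups is data parallelism: each worker computes gradients on its own shard of the data, and the gradients are averaged before every update. The toy sketch below illustrates that idea on a one-parameter least-squares problem; it is a conceptual illustration, not the SageMaker distributed training libraries.

```python
def local_gradients(shard, w):
    # Each "worker" computes gradients on its own data shard.
    # Least-squares loss for y = w*x gives dL/dw = 2*x*(w*x - y).
    return [2 * x * (w * x - y) for x, y in shard]

# Hypothetical dataset y = 2x, split across two workers (data parallelism).
shards = [[(1.0, 2.0), (2.0, 4.0)],
          [(3.0, 6.0), (4.0, 8.0)]]
w = 0.0

for _ in range(100):
    # "All-reduce" step: pool and average gradients from every worker,
    # so all workers apply the identical update.
    grads = [g for shard in shards for g in local_gradients(shard, w)]
    w -= 0.01 * sum(grads) / len(grads)

print(round(w, 2))  # → 2.0 (the true slope)
```

Because each worker touches only its own shard, adding workers shrinks the per-step data volume each machine must process, which is where the wall-clock savings come from.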
• Edge Manager
Edge Manager allows developers to optimize, secure, monitor, and maintain machine learning models deployed on fleets of edge devices. This extends capabilities that were previously only available in the cloud by sampling data from edge devices and sending it to Amazon SageMaker Model Monitor for analysis, so developers can continuously improve model quality.
• JumpStart
JumpStart provides developers an easy-to-use, searchable interface to find solutions, algorithms, and sample notebooks. Developers new to machine learning will be able to select from complete end-to-end machine learning solutions (e.g. fraud detection, customer churn prediction, or forecasting) and deploy them directly in their Amazon SageMaker Studio environments.
HPE Acquires Determined AI to Accelerate ML Training
Determined AI is a four-year-old company that brought its product to market only in 2020. It specialises in machine learning (ML), with the aim of training AI models quickly and at any scale. HPE will combine Determined AI's software with its AI and high-performance computing (HPC) offerings, enabling ML engineers to easily implement and train ML models and deliver faster, more accurate insights from their data across almost every industry.
“As we enter the Age of Insight, our customers recognise the need to add machine learning to deliver better and faster answers from their data,” said Justin Hotard, senior vice president and general manager, HPC and Mission Critical Solutions (MCS), HPE. “AI-powered technologies will play an increasingly critical role in turning data into readily available, actionable information to fuel this new era. Determined AI’s unique open source platform allows ML engineers to build models faster and deliver business value sooner without having to worry about the underlying infrastructure. I am pleased to welcome the world-class Determined AI team, who share our vision to make AI more accessible for our customers and users, into the HPE family.”
Delivering AI at scale
According to IDC, the accelerated AI server market, which plays an important role in providing targeted capabilities for image and data-intensive training, is expected to grow by 28% each year and reach $18bn by 2024.
The computing power of HPC is also increasingly being used to train and optimise AI models, in addition to combining with AI to augment workloads such as modeling and simulation. Intersect360 Research notes that the HPC market will grow by more than 40%, reaching almost $55bn in revenue by 2024.
“Over the last several years, building AI applications has become extremely compute, data, and communication intensive. By combining with HPE’s industry-leading HPC and AI solutions, we can accelerate our mission to build cutting-edge AI applications and significantly expand our customer reach,” said Evan Sparks, CEO of Determined AI.