Amazon SageMaker Clarify provides machine learning (ML) developers with purpose-built tools to gain greater insights into their ML training data and models. SageMaker Clarify detects and measures potential bias using a variety of metrics so that ML developers can address potential bias and explain model predictions.

SageMaker Clarify can detect potential bias during data preparation, after model training, and in your deployed model. For instance, you can check for bias related to age in your dataset or in your trained model and receive a detailed report that quantifies different types of potential bias. SageMaker Clarify also provides feature importance scores that help you explain how your model makes predictions, and it produces explainability reports either in bulk or in real time through online explainability. You can use these reports to support customer or internal presentations or to identify potential issues with your model.

Detect bias in your data and model predictions

Identify imbalances in data

SageMaker Clarify is available as part of Amazon SageMaker Data Wrangler, so you can identify potential bias during data preparation without having to write your own code. You specify input features, such as gender or age, and SageMaker Clarify runs an analysis job to detect potential bias in those features. SageMaker Clarify then provides a visual report with a description of the metrics and measurements of potential bias so that you can identify steps to remediate the bias. For example, in a financial dataset that contains only a few examples of business loans to one age group as compared to others, the bias metrics will indicate the imbalance. You can then address it in your dataset and potentially reduce the risk of having a model that is disproportionately inaccurate for a specific age group.
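To make the loan example concrete, here is a minimal sketch of two pre-training bias metrics, class imbalance (CI) and difference in proportions of labels (DPL). It is plain Python for illustration only, not the SageMaker Clarify API, and the function names, group labels, and toy records are all hypothetical.

```python
def class_imbalance(facets, disadvantaged):
    """CI = (n_a - n_d) / (n_a + n_d). 0 means balanced; values near 1 mean
    the disadvantaged group is heavily underrepresented."""
    n_d = sum(1 for f in facets if f == disadvantaged)
    n_a = len(facets) - n_d
    return (n_a - n_d) / (n_a + n_d)


def label_proportion_difference(facets, labels, disadvantaged):
    """DPL = q_a - q_d, where q is the share of positive labels in each group."""
    adv = [y for f, y in zip(facets, labels) if f != disadvantaged]
    dis = [y for f, y in zip(facets, labels) if f == disadvantaged]
    return sum(adv) / len(adv) - sum(dis) / len(dis)


# Hypothetical loan records: age group, and whether the loan was approved (1) or not (0).
age_group = ["under_40"] * 8 + ["over_40"] * 2
approved = [1, 1, 1, 1, 1, 1, 0, 0, 1, 0]

ci = class_imbalance(age_group, disadvantaged="over_40")
dpl = label_proportion_difference(age_group, approved, disadvantaged="over_40")
print(f"CI = {ci:.2f}, DPL = {dpl:.2f}")
```

In this toy dataset, the positive CI reflects that one age group supplies far fewer examples, and the positive DPL reflects that it also has a lower share of approved loans.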

If your data is imbalanced, you can use SageMaker Data Wrangler to rebalance it. SageMaker Data Wrangler offers three balancing operators: random undersampling, random oversampling, and SMOTE. Read our blog post to learn more.
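As an illustration of the simplest of these operators, the following sketch performs random oversampling in plain Python. It is not how SageMaker Data Wrangler implements the operator; the function name `random_oversample`, the `group_of` parameter, and the sample rows are assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(rows, group_of, seed=0):
    """Duplicate randomly chosen rows of smaller groups until every group
    is as large as the largest one. Original rows are always kept."""
    rng = random.Random(seed)
    groups = {}
    for row in rows:
        groups.setdefault(group_of(row), []).append(row)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Sample with replacement to fill the gap up to the target size.
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced


# Hypothetical rows of (age_group, record_id).
rows = [("under_40", i) for i in range(8)] + [("over_40", i) for i in range(2)]
balanced = random_oversample(rows, group_of=lambda row: row[0])
counts = Counter(group for group, _ in balanced)
print(counts)
```

Random undersampling works the other way around, dropping rows from the larger groups, while SMOTE synthesizes new minority examples instead of duplicating existing ones.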

Screenshot of bias metrics during data preparation in SageMaker Data Wrangler

Check your trained model for bias

After you’ve trained your model, you can run a SageMaker Clarify bias analysis through Amazon SageMaker Experiments to check your model for potential bias, such as predictions that produce a negative result more frequently for one group than for another. You specify the input features, such as age, with respect to which you want to measure bias in the model outcomes. SageMaker Clarify then runs an analysis and provides you with a visual report that identifies the different types of bias for each feature, such as whether older groups receive more positive predictions than younger groups.
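The older-versus-younger comparison can be sketched with two post-training metrics: the difference in positive proportions in predicted labels (DPPL) and disparate impact (DI). The code below is an illustrative plain-Python sketch rather than the SageMaker Clarify API, and the group names and predictions are hypothetical.

```python
def positive_rate(predictions, facets, group):
    """Share of positive (1) predictions among members of one group."""
    selected = [p for p, f in zip(predictions, facets) if f == group]
    return sum(selected) / len(selected)


# Hypothetical model outputs: 1 is a positive prediction, 0 a negative one.
facets = ["older"] * 5 + ["younger"] * 5
predictions = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]

q_a = positive_rate(predictions, facets, "older")    # group receiving more positives
q_d = positive_rate(predictions, facets, "younger")  # group receiving fewer positives

dppl = q_a - q_d  # 0 means both groups receive positives at the same rate
di = q_d / q_a    # 1 means parity; values well below 1 indicate disparity
print(f"DPPL = {dppl:.2f}, DI = {di:.2f}")
```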

Fair Bayesian Optimization, an open-source method from AWS, can help mitigate bias by tuning a model’s hyperparameters. Read our blog post to learn how to apply Fair Bayesian Optimization to mitigate bias while optimizing the accuracy of an ML model.

Screenshot of bias metrics in a trained model in SageMaker Experiments

Monitor your model for bias

SageMaker Clarify helps data scientists and ML engineers monitor predictions for bias on a regular basis. Bias can be introduced or exacerbated in deployed ML models when the training data differs from the live data that the model sees during deployment. For example, the outputs of a model for predicting home prices can become biased if the mortgage rates used to train the model differ from current mortgage rates. SageMaker Clarify bias detection capabilities are integrated into Amazon SageMaker Model Monitor so that when SageMaker detects bias beyond a certain threshold, it automatically generates metrics that you can view in Amazon SageMaker Studio and through Amazon CloudWatch metrics and alarms.
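The threshold check described above can be pictured as follows. This is a simplified sketch under stated assumptions, not SageMaker Model Monitor itself: the metric (difference in positive prediction rates), the weekly windows, the 0.2 threshold, and all names are hypothetical.

```python
def rate_difference(predictions, facets, advantaged):
    """Positive prediction rate of the advantaged group minus that of everyone else."""
    adv = [p for p, f in zip(predictions, facets) if f == advantaged]
    dis = [p for p, f in zip(predictions, facets) if f != advantaged]
    return sum(adv) / len(adv) - sum(dis) / len(dis)


def find_bias_alerts(windows, advantaged, threshold):
    """Return (window_name, metric) for each window whose metric exceeds the threshold."""
    alerts = []
    for name, predictions, facets in windows:
        value = rate_difference(predictions, facets, advantaged)
        if abs(value) > threshold:
            alerts.append((name, value))
    return alerts


# Hypothetical batches of live predictions captured from a deployed model.
facets = ["older"] * 4 + ["younger"] * 4
windows = [
    ("week_1", [1, 1, 0, 0, 1, 0, 1, 0], facets),
    ("week_2", [1, 1, 1, 0, 0, 0, 1, 0], facets),
]

alerts = find_bias_alerts(windows, advantaged="older", threshold=0.2)
print(alerts)
```

In a real deployment, the flagged values would be emitted as metrics and alarms rather than returned as a list.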

Screenshot of bias monitoring in SageMaker Model Monitor

Explain model predictions

Understand which features contributed the most to model prediction

SageMaker Clarify is integrated with SageMaker Experiments to provide scores detailing which features contributed the most to your model prediction on a particular input for tabular, natural language processing (NLP), and computer vision models. For tabular datasets, SageMaker Clarify can also output an aggregated feature importance chart that provides insights into the overall prediction process of the model. These details can help determine if a particular model input has more influence than expected on overall model behavior. In addition to the feature importance scores, for tabular data you can use partial dependence plots (PDPs) to show how the predicted target response depends on a set of input features of interest.
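One common way to define per-feature contribution scores is with Shapley values, the concept behind SHAP-style attributions. The sketch below computes exact Shapley values by brute force for a tiny model, which is only feasible for a handful of features. It illustrates the idea and is not the SageMaker Clarify implementation; the toy model, feature order, and baseline are assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: features outside a coalition take their baseline value."""
    n = len(instance)

    def value(coalition):
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    scores = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of each coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {i}) - value(coalition))
        scores.append(total)
    return scores


# Hypothetical linear scoring model over three tabular features.
def toy_model(x):
    return 3 * x[0] + 2 * x[1] - x[2]


instance = [2, 1, 4]
baseline = [0, 0, 0]
scores = shapley_values(toy_model, instance, baseline)
print(scores)
```

The scores sum to the difference between the prediction for the instance and the prediction for the baseline, which is what makes them usable as an additive explanation of a single prediction.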

Screenshot of a feature importance graph for a trained model in SageMaker Experiments

Explain your computer vision and NLP models

SageMaker Clarify can also provide insights into computer vision and NLP models. For vision models, you can see which parts of the image the models found most important with SageMaker Clarify. For NLP models, SageMaker Clarify provides feature importance scores at the level of words, sentences, or paragraphs.
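To show what a word-level importance score means, here is a deliberately simple leave-one-out (occlusion) sketch: each token is masked in turn and the change in the model score is recorded. This is a simpler stand-in technique used for illustration, not the method SageMaker Clarify uses, and the lexicon-based scorer is a hypothetical toy model.

```python
def token_importance(score, tokens, mask="[MASK]"):
    """Importance of a token = full score minus the score with that token masked."""
    full = score(tokens)
    return [
        (token, full - score(tokens[:i] + [mask] + tokens[i + 1:]))
        for i, token in enumerate(tokens)
    ]


# Hypothetical sentiment scorer backed by a tiny lexicon.
LEXICON = {"great": 2.0, "slow": -1.0}


def toy_sentiment(tokens):
    return sum(LEXICON.get(token, 0.0) for token in tokens)


tokens = ["great", "service", "but", "slow"]
importances = token_importance(toy_sentiment, tokens)
print(importances)
```

The same loop applies at coarser granularity: pass a list of sentences or paragraphs instead of words, and each unit receives its own score.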

Monitor your model for changes in behavior

Changes in live data can expose new behavior in your model. For example, a credit risk prediction model trained on data from one geographical region could change the importance it assigns to various features when applied to data from another region. SageMaker Clarify is integrated with SageMaker Model Monitor to notify you through alerting systems such as CloudWatch if the importance of input features shifts, causing model behavior to change.
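One way to quantify such a shift is to compare how features rank by importance at training time against how they rank on live data. The sketch below uses a normalized discounted cumulative gain (nDCG) style score for that comparison. Treat it as an assumption-laden illustration rather than the exact SageMaker Model Monitor computation: the feature names, attribution values, and the 0.9 alert threshold are hypothetical.

```python
from math import log2


def attribution_ndcg(baseline, live):
    """Score in (0, 1]; 1.0 means live data ranks features in the same order
    as the baseline. Both arguments map feature name -> non-negative importance."""
    baseline_order = sorted(baseline, key=baseline.get, reverse=True)
    live_order = sorted(live, key=live.get, reverse=True)
    # Baseline importances, discounted by the rank each feature holds on live data.
    dcg = sum(baseline[f] / log2(rank + 2) for rank, f in enumerate(live_order))
    # Best achievable value: features in their baseline order.
    ideal = sum(baseline[f] / log2(rank + 2) for rank, f in enumerate(baseline_order))
    return dcg / ideal


# Hypothetical importances for a credit risk model in two regions.
region_a = {"income": 0.5, "age": 0.3, "region": 0.2}
region_a_later = {"income": 0.6, "age": 0.25, "region": 0.15}
region_b = {"income": 0.1, "age": 0.2, "region": 0.7}

stable = attribution_ndcg(region_a, region_a_later)
shifted = attribution_ndcg(region_a, region_b)

ALERT_THRESHOLD = 0.9  # hypothetical value chosen for this example
print(stable, shifted, shifted < ALERT_THRESHOLD)
```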

Screenshot of feature importance monitoring in SageMaker Model Monitor

Explain individual model predictions in real time

SageMaker Clarify can provide scores detailing which features contributed the most to your model’s individual prediction after the model has been run on new data. These details can help determine if a particular input feature has more influence on the model predictions than expected. You can view these details for each prediction in real time through online explainability or get a report in bulk that uses batch processing of all the individual predictions.

Use cases

Data science

Data scientists and ML engineers need tools to generate the insights required to debug and improve ML models through better feature engineering. These insights help them determine whether a model is making inferences based on noisy or irrelevant features and understand the limitations of their models and failure modes their models might encounter.

Business

The adoption of AI systems requires transparency. This is achieved through reliable explanations of the trained models and their predictions. Model explainability can be particularly important to certain industries with reliability, safety, and compliance requirements, such as financial services, human resources, healthcare, and automated transportation.

Compliance

Companies might need to explain certain decisions and take steps around model risk management. SageMaker Clarify can help detect any potential bias present in the initial data or in the model after training and help explain which model features contributed the most to an ML model’s prediction.

Customers

Bundesliga Match Facts, powered by AWS, provides a more engaging fan experience during soccer matches for Bundesliga fans around the world. With Amazon SageMaker Clarify, the Bundesliga can now interactively explain the key underlying components that led the ML model to predict a certain xGoals value. Knowing the respective feature attributions and explaining outcomes helps with model debugging and increases confidence in ML algorithms, which results in higher-quality predictions.

"Amazon SageMaker Clarify seamlessly integrates with the rest of the Bundesliga Match Facts digital platform and is a key part of our long-term strategy of standardizing our ML workflows on Amazon SageMaker. By using AWS’s innovative technologies, such as machine learning, to deliver more in-depth insights and provide fans a better understanding of the split-second decisions made on the pitch, Bundesliga Match Facts enables viewers to gain deeper insights into the key decisions in each match."

Andreas Heyden, Executive Vice President of Digital Innovations, DFL Group

Resources

Video

Understand ML model predictions and biases

Webinar

Watch a 60-minute webinar on explaining ML models

Tutorial

Follow step-by-step instructions

Blog

Learn how Amazon SageMaker Clarify helps detect bias

Example notebooks

Explore code samples

Developer guide

Read the technical documentation

Deep dive into bias detection and model explainability

Whitepaper

Fairness Measures for ML in Finance
