Amazon SageMaker Clarify

Detect bias in ML data and models, and explain model predictions

Explain how input features contributed to your model predictions in real time.

Detect potential bias during data preparation, after model training, and in your deployed model.

Identify any shifts in bias and feature importance after deployment.

Amazon SageMaker Clarify provides machine learning (ML) developers with purpose-built tools to gain greater insights into their ML training data and models. SageMaker Clarify detects and measures potential bias using a variety of metrics so that ML developers can address potential bias and explain model predictions.

SageMaker Clarify can detect potential bias during data preparation, after model training, and in your deployed model. For instance, you can check for bias related to age in your dataset or in your trained model and receive a detailed report that quantifies different types of potential bias. SageMaker Clarify also provides feature importance scores that help you explain how your model makes predictions, and it produces explainability reports either in bulk or in real time through online explainability. You can use these reports to support customer or internal presentations or to identify potential issues with your model.

How it works

Detect bias in your data and model predictions

Identify imbalances in data

SageMaker Clarify is available as part of Amazon SageMaker Data Wrangler, so you can identify potential bias during data preparation without having to write your own code. You specify input features, such as gender or age, and SageMaker Clarify runs an analysis job to detect potential bias in those features. SageMaker Clarify then provides a visual report that describes the metrics and measurements of potential bias so that you can identify steps to remediate it. For example, if a financial dataset contains only a few examples of business loans to one age group compared with others, the bias metrics will indicate the imbalance. You can then address the imbalance in your dataset and potentially reduce the risk of a model that is disproportionately inaccurate for a specific age group.
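As a rough illustration of what a pre-training bias measurement involves, the sketch below computes two simple imbalance measures on a toy loan dataset: the normalized difference in group sizes and the difference in the share of positive labels between groups. The records, the `age_group` and `approved` field names, and both function names are hypothetical; this is not the SageMaker Clarify implementation or API.

```python
def class_imbalance(records, facet, advantaged_value):
    """(n_a - n_d) / (n_a + n_d): lies in [-1, 1]; 0 means equal group sizes."""
    n_a = sum(1 for r in records if r[facet] == advantaged_value)
    n_d = len(records) - n_a
    return (n_a - n_d) / (n_a + n_d)


def label_proportion_difference(records, facet, advantaged_value, label):
    """q_a - q_d: gap in the share of positive labels between the two groups.

    Assumes both groups contain at least one record.
    """
    group_a = [r[label] for r in records if r[facet] == advantaged_value]
    group_d = [r[label] for r in records if r[facet] != advantaged_value]
    return sum(group_a) / len(group_a) - sum(group_d) / len(group_d)


# Hypothetical toy data: 8 loan applications, 'approved' is 1 or 0.
loans = (
    [{"age_group": "40+", "approved": 1}] * 4
    + [{"age_group": "40+", "approved": 0}] * 2
    + [{"age_group": "under40", "approved": 1}] * 1
    + [{"age_group": "under40", "approved": 0}] * 1
)

ci = class_imbalance(loans, "age_group", "40+")
dpl = label_proportion_difference(loans, "age_group", "40+", "approved")
print(f"class imbalance: {ci:.2f}, label proportion difference: {dpl:.2f}")
```

A value near zero for either measure suggests the two groups are similarly represented; values far from zero point to an imbalance worth investigating.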

In case of imbalances, you can use SageMaker Data Wrangler to balance your data. SageMaker Data Wrangler offers three balancing operators: random undersampling, random oversampling, and SMOTE to rebalance data in your unbalanced datasets. Read our blog post to learn more.
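To make the idea of a balancing operator concrete, here is a minimal sketch of random oversampling in plain Python. It duplicates randomly chosen rows from smaller classes until every class matches the largest one. The `random_oversample` function and the toy rows are hypothetical and do not reflect how SageMaker Data Wrangler implements its operators; SMOTE differs in that it synthesizes new minority examples by interpolation rather than duplicating existing ones.

```python
import random
from collections import Counter


def random_oversample(rows, label_key, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap to the target size.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Hypothetical unbalanced dataset: 9 negatives, 3 positives.
rows = [{"label": 0}] * 9 + [{"label": 1}] * 3
balanced = random_oversample(rows, "label")
counts = Counter(row["label"] for row in balanced)
print(counts)
```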

Check your trained model for bias

After you’ve trained your model, you can run a SageMaker Clarify bias analysis through Amazon SageMaker Experiments to check your model for potential bias, such as predictions that produce a negative result more frequently for one group than for another. You specify the input features, such as age, with respect to which you want to measure bias in the model outcomes. SageMaker Clarify then runs an analysis and provides a visual report that identifies the different types of bias for each feature, such as whether older groups receive more positive predictions than younger groups.
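The kind of comparison described above can be sketched with two simple measures on predicted labels: the difference and the ratio of positive prediction rates between two groups. The prediction lists and function names below are hypothetical, and this is an illustration of the concept rather than the report SageMaker Clarify produces.

```python
def positive_rate(predictions):
    """Share of predictions equal to 1."""
    return sum(predictions) / len(predictions)


def prediction_bias(preds_a, preds_d):
    """Compare positive prediction rates for group a and group d.

    Assumes group a receives at least one positive prediction.
    """
    q_a = positive_rate(preds_a)
    q_d = positive_rate(preds_d)
    return {"difference": q_a - q_d, "ratio": q_d / q_a}


# Hypothetical model outputs (1 = positive outcome) for two age groups.
older = [1, 1, 1, 0, 1, 1, 0, 1]
younger = [1, 0, 0, 1, 0, 0, 1, 0]

result = prediction_bias(older, younger)
print(result)
```

A difference near 0 and a ratio near 1 indicate that both groups receive positive predictions at similar rates.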

AWS open-source method Fair Bayesian Optimization can help mitigate bias by tuning a model’s hyperparameters. Read our blog post to learn how to apply Fair Bayesian Optimization to mitigate bias while optimizing the accuracy of an ML model.

Monitor your model for bias

SageMaker Clarify helps data scientists and ML engineers monitor predictions for bias on a regular basis. Bias can be introduced or exacerbated in deployed ML models when the training data differs from the live data that the model sees during deployment. For example, the outputs of a model for predicting home prices can become biased if the mortgage rates used to train the model differ from current mortgage rates. SageMaker Clarify bias detection capabilities are integrated into Amazon SageMaker Model Monitor so that when SageMaker detects bias beyond a certain threshold, it automatically generates metrics that you can view in Amazon SageMaker Studio and through Amazon CloudWatch metrics and alarms.
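The threshold idea can be sketched in a few lines: compute a bias metric for each monitoring window and flag the windows where its magnitude exceeds a chosen limit. The weekly values, the 0.10 threshold, and the function name are made up for illustration; the actual evaluation performed by SageMaker Model Monitor is not shown here.

```python
def windows_over_threshold(metric_by_window, threshold):
    """Return the windows whose bias metric magnitude exceeds the threshold."""
    return [w for w, value in metric_by_window.items() if abs(value) > threshold]


# Hypothetical bias metric values measured on live traffic, one per week.
weekly_bias = {"week-1": 0.03, "week-2": 0.05, "week-3": 0.12, "week-4": 0.15}

alerts = windows_over_threshold(weekly_bias, threshold=0.10)
print(alerts)
```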

Explain model predictions

Understand which features contributed the most to model prediction

SageMaker Clarify is integrated with SageMaker Experiments to provide scores detailing which features contributed the most to your model prediction on a particular input for tabular, natural language processing (NLP), and computer vision models. For tabular datasets, SageMaker Clarify can also output an aggregated feature importance chart which provides insights into the overall prediction process of the model. These details can help determine if a particular model input has more influence than expected on overall model behavior. For tabular data, in addition to the feature importance scores, you can also use partial dependence plots (PDPs) to show the dependence of the predicted target response on a set of input features of interest.
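For readers unfamiliar with partial dependence, the following sketch shows the underlying computation: for each value on a grid, one feature is overwritten in every row and the model's predictions are averaged. The `toy_model` scoring function and the rows are hypothetical stand-ins, and this is not SageMaker Clarify code.

```python
def partial_dependence(model, rows, feature_index, grid):
    """For each grid value, overwrite one feature in every row and
    average the model's predictions."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)  # copy so the input rows stay unchanged
            modified[feature_index] = value
            total += model(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(x):
    # Hypothetical scoring function: feature 0 is income, feature 1 is debt.
    return 0.01 * x[0] - 0.02 * x[1]


rows = [[50, 10], [80, 20], [30, 5]]
curve = partial_dependence(toy_model, rows, feature_index=0, grid=[20, 60, 100])
print(curve)
```

Plotting `curve` against the grid gives the partial dependence plot for that feature; here the averaged prediction rises steadily as the first feature increases.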

Explain your computer vision and NLP models

SageMaker Clarify can also provide insights into computer vision and NLP models. For vision models, you can see which parts of the image the models found most important with SageMaker Clarify. For NLP models, SageMaker Clarify provides feature importance scores at the level of words, sentences, or paragraphs.

Monitor your model for changes in behavior

Changes in live data can expose new behavior in your model. For example, a credit risk prediction model trained on data from one geographical region could change the importance it assigns to various features when applied to data from another region. SageMaker Clarify is integrated with SageMaker Model Monitor to notify you, through alerting systems such as CloudWatch, if the importance of input features shifts and causes model behavior to change.
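One simple way to picture a shift in feature importance is to rank features by importance on a baseline and on live data, then list the features whose rank moved. The sketch below does exactly that with made-up scores and hypothetical function names. It is a simplified stand-in, not the comparison method SageMaker Model Monitor uses.

```python
def rank_features(importance):
    """Feature names ordered from most to least important."""
    return sorted(importance, key=lambda name: abs(importance[name]), reverse=True)


def rank_changes(baseline, live):
    """Map each feature whose rank moved to (baseline_rank, live_rank).

    Assumes both dictionaries contain the same feature names.
    """
    base_rank = {name: i for i, name in enumerate(rank_features(baseline))}
    live_rank = {name: i for i, name in enumerate(rank_features(live))}
    return {
        name: (base_rank[name], live_rank[name])
        for name in base_rank
        if base_rank[name] != live_rank[name]
    }


# Hypothetical importance scores for a credit risk model.
baseline = {"income": 0.40, "debt_ratio": 0.35, "account_age": 0.15, "region": 0.10}
live = {"income": 0.38, "debt_ratio": 0.12, "account_age": 0.14, "region": 0.36}

changes = rank_changes(baseline, live)
print(changes)
```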

Explain individual model predictions in real time

SageMaker Clarify can provide scores detailing which features contributed the most to your model’s individual prediction after the model has been run on new data. These details can help determine if a particular input feature has more influence on the model predictions than expected. You can view these details for each prediction in real time through online explainability or get a report in bulk that uses batch processing of all the individual predictions.
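To show what a per-prediction feature attribution looks like, the sketch below computes exact Shapley values for a single input by enumerating every feature coalition, with features outside a coalition replaced by baseline values. The `toy_model`, the instance, and the all-zero baseline are hypothetical. Exact enumeration grows exponentially with the number of features, so it is practical only for very small models and is not how a production service would compute attributions.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return model(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(x):
    # Hypothetical model with an interaction between features 1 and 2.
    return 2.0 * x[0] + x[1] * x[2]


instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split evenly between the two features that share it.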

Use cases

Data science

Data scientists and ML engineers need tools to generate the insights required to debug and improve ML models through better feature engineering. These insights help them determine whether a model is making inferences based on noisy or irrelevant features and understand the limitations of their models and failure modes their models might encounter.

Business

The adoption of AI systems requires transparency. This is achieved through reliable explanations of the trained models and their predictions. Model explainability can be particularly important to certain industries with reliability, safety, and compliance requirements, such as financial services, human resources, healthcare, and automated transportation.

Compliance

Companies might need to explain certain decisions and take steps around model risk management. SageMaker Clarify can help detect any potential bias present in the initial data or in the model after training and help explain which model features contributed the most to an ML model’s prediction.

Customers

Bundesliga

Bundesliga Match Facts, powered by AWS, provides a more engaging fan experience during soccer matches for Bundesliga fans around the world. With Amazon SageMaker Clarify, the Bundesliga can now interactively explain which key underlying components led the ML model to predict a certain xGoals value. Knowing the respective feature attributions and explaining outcomes helps with model debugging and increases confidence in ML algorithms, which results in higher-quality predictions.

"Amazon SageMaker Clarify seamlessly integrates with the rest of the Bundesliga Match Facts digital platform and is a key part of our long-term strategy of standardizing our ML workflows on Amazon SageMaker. By using AWS’s innovative technologies, such as machine learning, to deliver more in-depth insights and provide fans a better understanding of the split-second decisions made on the pitch, Bundesliga Match Facts enables viewers to gain deeper insights into the key decisions in each match."

Andreas Heyden, Executive Vice President of Digital Innovations, DFL Group

CAPCOM

CAPCOM is a Japanese game company famous for titles such as the Monster Hunter series and Street Fighter. To maintain user satisfaction, CAPCOM needed to assure game quality and identify likely churners and their trends.

"The combination of AutoGluon and Amazon SageMaker Clarify enabled our customer churn model to predict customer churn with 94% accuracy. SageMaker Clarify helps us understand the model behavior by providing explainability through SHAP values. With SageMaker Clarify, we reduced the computation cost of SHAP values by up to 50% compared to a local calculation. The joint solution gives us the ability to better understand the model and improve customer satisfaction at a higher rate of accuracy with significant cost savings."

Masahiro Takamoto, Head of Data Group, CAPCOM

Domo

Domo is the Business Cloud, transforming the way business is managed by delivering Modern BI for All. With Domo, critical processes that took weeks, months, or more can now be done on-the-fly, in minutes or seconds, at unbelievable scale.

"Domo offers a scalable suite of data science solutions that are easy for anyone in an organization to use and understand. With Clarify, our customers are enabled with important insights on how their AI models are making predictions. The combination of Clarify with Domo helps to increase AI speed and intelligence for our customers by putting the power of AI into the hands of everyone across their business and ecosystems."

Ben Ainscough, Ph.D., Head of AI and Data Science, Domo

Varo

Varo Bank is a US-based digital bank and uses AI/ML to help make rapid, risk-based decisions to deliver its innovative products and services to customers.

"Varo has a strong commitment to the explainability and transparency of our ML models and we're excited to see the results from Amazon SageMaker Clarify in advancing these efforts."

Sachin Shetty, Head of Data Science, Varo Money

Resources

Video

Understand ML model predictions and biases

Webinar

Watch a 60-minute webinar on explaining ML models

Tutorial

Follow step-by-step instructions

Blog

Learn how Amazon SageMaker Clarify helps detect bias

Example notebooks

Explore code samples

Developer guide

Read the technical documentation

Whitepaper

Deep dive into bias detection and model explainability

Whitepaper

Fairness Measures for ML in Finance
