For most of us, technology has become so central to our daily lives that we don’t question its accuracy. We rely on our smartphone maps to show us how to navigate through unfamiliar territory. We depend on our spreadsheet programs to handle complex mathematical calculations.
Not surprisingly, many organizations expect artificial intelligence and machine learning to objectively predict outcomes and guide their decision-making when facing complex business and social issues.
Over the past several years, however, that assumption has started to come into question. Many machine learning systems are not robust enough to handle real-world complexities, including biases that can lead to inequitable results. While research on creating fair, accountable, and transparent systems is now a vibrant subfield in the machine learning community, there is work that must be done to connect state-of-the-art research trends to emerging regulations (for example, the EU AI Act) and compliance with those regulations.
Keystone has found in its practice that unfair outcomes can be driven by three common sources of bias in developing machine learning systems:

- Bias encoded in data
- Bias in the learning process
- Bias in the action-feedback sequence

Each of these requires some explanation, so let’s take them one at a time.
Bias encoded in data
Bias can be encoded in the data itself. For example, a machine learning model trained to predict the likelihood that an inmate being released will commit another crime (i.e., recidivism risk) uses arrest data, because ground-truth data on who actually commits crimes doesn’t exist. However, research shows that arrest data reflects historical prejudices in policing.
Another case of such bias: Word embeddings, a core component of Natural Language Processing (NLP) models, are mathematical representations of words that allow computers to capture semantic relationships. For example, word embeddings trained on large amounts of text data output that king – man + woman = queen, demonstrating that embeddings can capture gender relationships. But what happens when the words are “doctor,” “man,” “woman,” and “nurse”? Given the way biases are reflected in real-world text, it wouldn’t be surprising if machine learning systems built on top of word embeddings reflected those same biases.
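The analogy arithmetic above can be sketched with a few toy vectors. These are hand-picked, two-dimensional illustrations, not trained embeddings; real embeddings have hundreds of dimensions learned from text:

```python
import numpy as np

# Toy 2-D "embeddings": axis 0 loosely encodes royalty, axis 1 loosely
# encodes gender. Illustrative values only, not trained vectors.
emb = {
    "king":  np.array([0.9,  0.8]),
    "queen": np.array([0.9, -0.8]),
    "man":   np.array([0.1,  0.8]),
    "woman": np.array([0.1, -0.8]),
}

def cosine(a, b):
    """Cosine similarity, the standard measure for comparing embeddings."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# king - man + woman should land closest to queen
target = emb["king"] - emb["man"] + emb["woman"]
best = max((w for w in emb if w != "king"),
           key=lambda w: cosine(target, emb[w]))
print(best)  # queen
```

The same nearest-neighbor query run on embeddings trained from real text is exactly where occupational and gender stereotypes surface.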
Bias in the learning process
The standard machine learning pipeline optimizes for accuracy across all populations in the training or validation dataset. As a result, the model naturally weighs the majority population in the data more heavily than a minority population, producing a model that is more accurate for members of the majority. This can be especially problematic when different populations have different relationships with the label (i.e., outcome) that the machine learning model is trying to predict.
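To see how optimizing overall accuracy can favor the majority, consider a deliberately simple sketch: synthetic data where the feature-label relationship is reversed between a majority and a minority group, and a one-parameter threshold classifier fit by maximizing overall accuracy. The data and model are illustrative assumptions, not a real pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: in the majority group (90% of samples) a high feature
# value means a positive label; in the minority group (10%) the
# relationship is reversed.
n_maj, n_min = 900, 100
x_maj = rng.normal(size=n_maj)
y_maj = (x_maj > 0).astype(int)
x_min = rng.normal(size=n_min)
y_min = (x_min < 0).astype(int)

x = np.concatenate([x_maj, x_min])
y = np.concatenate([y_maj, y_min])

# "Train" a one-parameter threshold model by maximizing overall accuracy.
thresholds = np.linspace(-3, 3, 601)
best_t = max(thresholds, key=lambda t: ((x > t).astype(int) == y).mean())

pred = (x > best_t).astype(int)
acc_maj = (pred[:n_maj] == y_maj).mean()
acc_min = (pred[n_maj:] == y_min).mean()
print(f"majority accuracy: {acc_maj:.2f}, minority accuracy: {acc_min:.2f}")
```

The accuracy-optimal threshold fits the majority’s relationship, so the model is nearly perfect for the majority group and close to always wrong for the minority group, even though overall accuracy looks respectable.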
Bias in the action-feedback sequence
Finally, many machine learning applications deployed in the real world have a feedback loop. Feedback loops occur when the training data used to train a model depends on the output or action of the model, which may in turn depend on the actions of users who engage with it. For example, users of a social media platform can only engage with the content that a machine learning model recommends; their engagement is then fed back into the model, which learns from those patterns and recommends new content accordingly. Without the necessary interventions in place, recommender systems can rapidly zero in on narrow taste clusters of users and lead to issues such as echo chambers and filter bubbles.
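A minimal simulation shows how this kind of loop can concentrate recommendations. The engagement rates and the greedy, estimate-only update rule are hypothetical simplifications chosen to make the lock-in effect visible:

```python
import numpy as np

# Five content categories; the user's true engagement rates are nearly
# uniform (illustrative values).
true_engagement = np.array([0.22, 0.21, 0.20, 0.19, 0.18])

# The model starts with optimistic engagement estimates and updates the
# estimate only for the category it chooses to show: a simplified
# feedback loop, with no exploration.
scores = np.full(5, 0.5)
shows = np.zeros(5)

for _ in range(2000):
    top = int(np.argmax(scores))   # greedily recommend the top category
    shows[top] += 1
    # feedback is observed only for the shown category
    scores[top] += 0.1 * (true_engagement[top] - scores[top])

share = shows / shows.sum()
print(share.round(2))
```

Even though the user’s preferences differ by only a few percentage points, almost all recommendations collapse onto a single category, because the model only ever learns about what it already shows. Interventions such as exploration or diversity constraints are what break this cycle.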
Regulatory considerations of AI/ML bias
The challenge in identifying and mitigating bias in machine learning systems is that bias is often nuanced and hard to define. Addressing it is crucial, however, especially as more and more machine learning systems are deployed into production at scale. These systems are being used in such critical applications as providing healthcare, determining access to credit, and informing sentencing decisions. While there has been significant progress in creating frameworks and principles for what is called “Responsible AI,” companies must now start to operationalize these principles for their specific context and use case.
Keystone believes that there are four essential requirements to ensure fair and responsible machine learning systems:
Implementing Responsible AI is crucial for all organizations that currently deploy machine learning systems with real-world implications or plan to do so soon. Should you need assistance in evaluating biases in machine learning systems or developing and operationalizing responsible machine learning systems, Keystone is happy to discuss these issues. Contact us through our website, www.keystone.ai.