How to remove and detect bias to keep AI fair
Humans may be biased, but that doesn’t mean AI has to be. Algorithms learn how to behave mainly based on the kind of data they are fed. If the data has underlying bias characteristics – which are more prevalent than you might think – the AI model will learn to act on them. Despite best intentions, data scientists can easily let these biases creep in and build up over time – unless of course they are vigilant about keeping their models as fair as possible.
The problems with data
Some types of data have a higher potential to be used (likely inadvertently) to discriminate against certain groups – such as information on race, gender, or religion. But seemingly “safe” data such as someone’s zip code can also be used by AI to form biased opinions.
For example, if a bank typically doesn’t approve many loans to people in a minority neighborhood, it could learn not to market loans to other people fitting the characteristics of that zip code – thus introducing racial bias into the AI model through a back door. So even with race out of the equation, AI could still find a way to discriminate without the bank realizing it was happening.
Businesses need to carefully scrutinize the data ingested by AI. If they don't, irresponsibly used AI can proliferate, creating unfair treatment of certain populations – like unduly limiting loans, insurance policies, or product discounts to those who really need them. This isn't just ethically wrong; it becomes a serious liability for organizations that are not diligent about preventing bias in the first place.
When bias gets real
Over the past year, several high-profile incidents have highlighted the risks of unintentional bias and the damaging effect it can have on a brand and its customers, especially in the current climate when so many companies are struggling to stay afloat. Discrimination comes at a cost: lost revenue, loss of trust among customers, employees, and other stakeholders, regulatory fines, damaged brand reputation, and legal ramifications.
For instance, the American criminal justice system relies on dozens of algorithms to determine a defendant's propensity to become a repeat offender. One widely reported investigation several years ago analyzed Northpointe's Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) tool. It revealed that African American defendants were much more likely to be incorrectly categorized as having a higher risk of becoming repeat offenders, while white defendants were inaccurately judged as having a lower risk. The problem was that the algorithm relied on data that already existed in the justice system and was inherently biased against African Americans. This kind of built-in bias is more prevalent than people realize. Particularly during these times of heightened awareness of social injustice, organizations must make a concerted effort to monitor their AI to ensure that it's fair.
How to identify and prevent bias
Marketing to specific groups with similar characteristics is not necessarily biased. For instance, sending parents of young children promotions on diapers, college savings plans, life insurance, etc. is perfectly acceptable if the company is offering something of value to that group. Organizations shouldn't waste marketing dollars on a target group that will have no interest or reason to purchase their products. Similarly, targeting senior citizens with Medicare health plans or a new retirement community is fine, as long as the company is promoting something relevant and useful. This is not discrimination; it's just smart marketing.
But targeting groups can quickly become a slippery slope. The onus is on organizations to incorporate bias detection technology into all AI models, particularly in regulated industries such as financial services and insurance, where the ramifications of non-compliance can be severe. Bias detection should not be done just quarterly or even monthly. Companies must continuously monitor their self-learning AI models 24/7 to proactively flag and eliminate discriminatory behavior.
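Continuous monitoring of this kind can be surprisingly lightweight. As a minimal sketch – the class name, window size, group labels, and the 0.8 alert threshold are all illustrative assumptions, not a reference to any particular product – a rolling check of approval rates by group might look like this:

```python
from collections import deque

class BiasMonitor:
    """Rolling check of approval rates by group.

    Illustrative sketch: flags any group whose recent approval rate falls
    below `threshold` times the best-performing group's rate.
    """

    def __init__(self, window=1000, threshold=0.8):
        self.window = deque(maxlen=window)  # recent (group, approved) pairs
        self.threshold = threshold

    def record(self, group, approved):
        self.window.append((group, approved))

    def alerts(self):
        # Approval rate per group over the current window
        totals, approvals = {}, {}
        for group, approved in self.window:
            totals[group] = totals.get(group, 0) + 1
            approvals[group] = approvals.get(group, 0) + int(approved)
        rates = {g: approvals[g] / totals[g] for g in totals}
        if not rates:
            return []
        best = max(rates.values())
        # Flag any group whose rate lags the best group by more than the threshold
        return [g for g, r in rates.items() if best > 0 and r / best < self.threshold]
```

Feeding every live decision through `record` and checking `alerts` on each pass is one way to approximate the "always-on" posture described above, rather than waiting for a quarterly review.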
To prevent built-in biases, companies should start with “clean data sources” for building models. Class or behavioral identifiers like education level, credit score, occupation, employment status, language of origin, marital status, number of followers, etc. may also be inherently biased in certain situations. Organizations cannot always identify these issues without the help of technology built to watch for them.
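One simple heuristic such technology can apply is to measure how strongly a seemingly "safe" feature predicts a protected attribute. The sketch below is only an assumption of how this might be done – the function name, field names, and accuracy-gain heuristic are all illustrative, not a formal statistical test:

```python
def proxy_strength(records, feature, protected):
    """Estimate how well a 'safe' feature predicts a protected attribute.

    Returns the accuracy gain from guessing the protected value per feature
    bucket versus always guessing the overall majority class.
    0.0 means no proxy signal; larger values mean stronger proxying.
    """
    # Base rate: always guess the single most common protected value
    counts = {}
    for r in records:
        counts[r[protected]] = counts.get(r[protected], 0) + 1
    base = max(counts.values()) / len(records)

    # Conditional: guess the most common protected value within each bucket
    buckets = {}
    for r in records:
        bucket = buckets.setdefault(r[feature], {})
        bucket[r[protected]] = bucket.get(r[protected], 0) + 1
    hits = sum(max(c.values()) for c in buckets.values())
    conditional = hits / len(records)

    return conditional - base
```

A gain near zero suggests the feature carries little proxy signal; a large gain – say, a zip code that almost perfectly separates groups – is exactly the kind of back-door identifier worth flagging.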
Evaluating AI training data and simulating real-world scenarios before deploying AI will help flag potential biases before any damage can be done. This becomes even more critical with many fashionable flavors of machine learning, as potentially powerful opaque algorithms (algorithms that struggle to explain themselves) can easily conceal built-in biases.
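One way to simulate scenarios against an opaque model is a counterfactual probe: swap a single suspect feature and count how often the decision flips. This is a sketch under assumed names (`model` is any callable from a record to a boolean), not a complete fairness audit:

```python
def counterfactual_flip_rate(model, records, feature, alternatives):
    """Probe an opaque model: for each record, swap only `feature` and
    count how often the decision flips.

    A high flip rate suggests the model leans heavily on that feature
    (or on something it proxies), without needing to see inside the model.
    """
    flips = 0
    for r in records:
        original = model(r)
        for alt in alternatives:
            if alt == r[feature]:
                continue
            probe = dict(r, **{feature: alt})  # same record, one field swapped
            if model(probe) != original:
                flips += 1
                break
    return flips / len(records)
```

Run over simulated or held-out data before deployment, a probe like this can surface a zip-code dependency that the model's own feature list would never admit to.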
Another trap to avoid is detecting bias only at the level of an individual predictive model and assuming the rest will work out. That approach misses bias that emerges from the interplay between multiple models (sometimes hundreds) and the business rules that make up the company's customer strategy. Bias should be tested at the final decision a company makes with regard to a customer, not just the underlying propensities.
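A decision-level audit amounts to measuring outcome rates on the final decision function, with every model and business rule folded in. In this sketch (the `strategy` function, field names, and thresholds are hypothetical), a fair-looking score model combined with a zip-code business rule still produces disparate outcomes – something a model-only check would miss:

```python
def decision_rate_by_group(records, decide, group_key="group"):
    """Approval rate of the *final* decision, per group.

    `decide` should wrap the entire customer strategy: every propensity
    model and every business rule, so bias is measured where it lands.
    """
    totals, approved = {}, {}
    for r in records:
        g = r[group_key]
        totals[g] = totals.get(g, 0) + 1
        approved[g] = approved.get(g, 0) + int(decide(r))
    return {g: approved[g] / totals[g] for g in totals}

# Hypothetical strategy: an unbiased-looking score gate plus a zip-code
# business rule. The rule, not the model, reintroduces the disparity.
def strategy(record):
    return record["score"] >= 0.5 and record["zip"] not in {"22222"}
```

Auditing `strategy` rather than the score model alone is what "testing at the final decision" means in practice: the propensity can pass every fairness check while the combined decision still discriminates.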
Organizations need to protect their brand and treat their customers and prospects with respect, but bias detection cannot be done manually at scale. Technology exists today that can help companies address the problem head-on, at a scale no group of humans could ever hope to achieve. An always-on strategy for preventing bias in AI is critical, as customers and the media have become increasingly sensitive to the issue and will not be very forgiving.
Dr Rob Walker is vice president of decision management and analytics at Pegasystems