Machine Learning Needs Bias Training to Overcome Stereotypes

It’s no secret that there is a wide gender gap in the tech industry. According to the Center for the Study of the Workplace, women represent around 20 percent of engineering graduates but just 11 percent of practicing software engineers. Unconscious bias is one of the primary drivers of this disparity, which has led many of Silicon Valley’s leading tech companies to introduce unconscious bias training for their employees. However, it’s fair to say that their machine learning algorithms need it more.

What is unconscious bias?

In humans, unconscious biases are ingrained assumptions about particular personal attributes (including race or gender) that can influence decision making without the decision maker being explicitly aware of them. These biases are universal because they are the result of “mental shortcuts” people make based on social norms and stereotypes: this group is like X, that group does Y.

Numerous studies, including one from the UNC Kenan-Flagler Business School, have shown that unconscious biases have a significant influence on important decisions, such as hiring or promotions. In an effort to improve diversity and create a more welcoming work environment, companies are working hard to train their employees on unconscious bias, its implications and how to counteract it.

For example, Google put a majority of its employees through workshops on how to understand and stop unconscious bias, and Facebook developed an internal training course called Managing Unconscious Bias, which it released to the public.

Can an algorithm be biased?

While these companies are taking admirable and necessary measures to actively educate their tech employees about unconscious biases, the systems they are building still seem vulnerable.

Consider personalized online advertising.

Carnegie Mellon University conducted experiments on Google ads and found that significantly fewer women than men were shown online ads promising to help them get jobs paying more than $200,000. According to CMU, the clear gender gap raised questions about the fairness of online ad targeting.

Racial bias is also an issue. Latanya Sweeney, the former chief technologist at the Federal Trade Commission, uncovered racial bias on the basis of Google searches, according to a report in The Nation. Sweeney found that black-identifying names yielded a higher incidence of ads associated with “arrest” than white-identifying names.

In neither of these cases did a programmer sit down and write an explicitly sexist or racist algorithm. Instead, these biases are the work of machine learning algorithms that learn patterns automatically from the large data sets they are presented with, just as humans do. And just like humans, machine learning algorithms are susceptible to developing biases that, if not explicitly checked for and corrected, lead to discriminatory behavior.

How can an algorithm be biased?

There are many potential reasons why machine learning systems can learn discriminatory biases.

One is selection bias in the training data. If the model is trained on a dataset that is not representative of the population, then it will make poor general inferences. For example, the miscategorization of a black man by Google Photos in 2015 led many to question whether the algorithm’s training data had predominantly comprised white people.
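A simple first check for selection bias is to compare how often each group appears in the training data with how often it appears in the population the model is meant to serve. The sketch below is illustrative only: the group names, the training labels and the reference shares are all made-up assumptions, not figures from any real dataset or from Google.

```python
from collections import Counter

def group_shares(labels):
    """Return each group's share of the labels as a fraction of the total."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def representation_gaps(training_labels, population_shares):
    """Training share minus reference share, per group.

    A positive gap means the group is over-represented in the training data;
    a negative gap means it is under-represented.
    """
    train = group_shares(training_labels)
    return {
        group: train.get(group, 0.0) - share
        for group, share in population_shares.items()
    }

# Hypothetical data: 90 training examples from group_a, 10 from group_b.
TRAINING_LABELS = ["group_a"] * 90 + ["group_b"] * 10
# Hypothetical reference population the model is supposed to serve.
POPULATION_SHARES = {"group_a": 0.6, "group_b": 0.4}

gaps = representation_gaps(TRAINING_LABELS, POPULATION_SHARES)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

In this toy case group_b makes up 40 percent of the assumed population but only 10 percent of the training data, which is the kind of gap that would call for collecting more data or reweighting before training.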

Hidden variables are another factor. It might seem possible to avoid biased machine learning algorithms by making sure you don’t feed in data that could lead to such problems in the first place. If you remove race and gender data from the equation, how can the bias prevail?

Earlier this year, a study of Amazon Prime showed that predominantly black ZIP code areas were conspicuously denied same-day delivery. Amazon won’t reveal the details of how it determines eligibility for same-day delivery, but the company almost certainly does not feed “race” into its models explicitly. The problem is most likely that race turned out to be a hidden variable behind the model: there were other reasons why Amazon’s model excluded those ZIP codes, and race turned out to be highly correlated with those reasons.
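The hidden-variable effect is easy to reproduce on toy data. In the sketch below, the eligibility rule looks only at an area code and never at group membership, yet the outcome differs sharply by group because the two are correlated. The area codes, group names and counts are invented for illustration and say nothing about how Amazon's actual model works.

```python
from collections import defaultdict

def eligibility_rate_by_group(residents, eligible_areas):
    """Share of each group living in an area that the rule serves.

    `residents` is a list of (area, group) pairs. The rule itself only
    tests `area`; `group` is used afterwards, purely to audit the outcome.
    """
    totals = defaultdict(int)
    served = defaultdict(int)
    for area, group in residents:
        totals[group] += 1
        if area in eligible_areas:
            served[group] += 1
    return {group: served[group] / totals[group] for group in totals}

# Hypothetical residents: area Z1 is mostly group_a, area Z2 mostly group_b.
RESIDENTS = (
    [("Z1", "group_a")] * 8 + [("Z1", "group_b")] * 2
    + [("Z2", "group_a")] * 2 + [("Z2", "group_b")] * 8
)
# Hypothetical rule: only area Z1 gets the service.
ELIGIBLE_AREAS = {"Z1"}

rates = eligibility_rate_by_group(RESIDENTS, ELIGIBLE_AREAS)
print(rates)
```

Here 80 percent of group_a is served against 20 percent of group_b, even though the group column was never an input. Dropping a sensitive attribute from the data does not remove it from the outcome when other inputs stand in for it.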

Third, machine learning systems can discriminate by perpetuating existing social biases. Biases run rampant in our society. We know that women are heavily under-represented in the boardroom, and there are significant racial wealth gaps. If you train a machine learning algorithm on real data from the world we live in, it will pick up on these biases. And to make matters worse, such algorithms have the potential to perpetuate or even exacerbate these biases when deployed.

What can be done?

As machine learning expands into sensitive areas such as credit scoring, hiring and even criminal sentencing, it is imperative that we are careful and vigilant about keeping the algorithms fair.

Accomplishing this goal requires raising awareness about social biases in machine learning and the serious, negative consequences that they can have. Just as tech employees are educated about the negative implications of their own unconscious biases, so should they be educated about biases in the models they are building.

It also requires companies to explicitly test machine learning models for discriminatory biases and publish their results. Useful methods and datasets for performing such tests should be shared and reviewed publicly to make the process easier and more effective.
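One way such a test could look in practice is to compare the rate of favorable decisions a model hands out to each group. The sketch below is a minimal example, not a complete fairness audit: the decisions are invented, and the 0.8 cutoff for flagging a model is an illustrative assumption that any real test would need to justify for its own setting.

```python
from collections import defaultdict

def selection_rates(decisions):
    """Fraction of favorable decisions per group.

    `decisions` is a list of (group, selected) pairs, where `selected`
    is True when the model made the favorable decision.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, selected in decisions:
        totals[group] += 1
        positives[group] += int(selected)
    return {group: positives[group] / totals[group] for group in totals}

def disparity_ratio(rates):
    """Lowest selection rate divided by the highest; 1.0 means parity."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected, so no group was favored
    return min(rates.values()) / highest

# Hypothetical model outputs for two groups of ten applicants each.
DECISIONS = (
    [("men", True)] * 6 + [("men", False)] * 4
    + [("women", True)] * 3 + [("women", False)] * 7
)
THRESHOLD = 0.8  # assumed cutoff for this example only

rates = selection_rates(DECISIONS)
ratio = disparity_ratio(rates)
flagged = ratio < THRESHOLD
print(rates, ratio, flagged)
```

A check like this only measures one notion of fairness, equal selection rates, and it needs group labels for the audit even when the model itself never sees them. It is a starting point for the kind of published, repeatable testing argued for above.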

As an industry, we need more research into how machine learning algorithms can be trained to avoid undesirable social biases. This issue is a relatively new phenomenon, and the examples we have seen so far are just the tip of the iceberg. That research should aim to better understand the problem and determine what technical solutions can be deployed to minimize the risk of unconscious bias creeping into machine learning systems.

It’s time for the risks of social bias to be embedded deeply in data science codes of ethics and education.
