In 2017, society started taking AI bias seriously

From Engadget - December 21, 2017

Now, we are seeing concrete pushback. The New York City Council recently passed what may be the US' first AI transparency bill, requiring government bodies to make public the algorithms behind their decision making. Researchers, along with the ACLU, have launched new institutes to study AI prejudice, while Cathy O'Neil, author of Weapons of Math Destruction, launched an algorithmic auditing consultancy called ORCAA. Courts in Wisconsin and Texas have started to limit algorithms, mandating a "warning label" about their accuracy in crime prediction in the former case, and allowing teachers to challenge their calculated performance rankings in the latter.

"2017, perhaps, was a watershed year, and I predict that in the next year or two the issue is only going to continue to increase in importance," said Arvind Narayanan, an assistant professor of computer science at Princeton and data privacy expert. "What has changed is the realization that these are not specific exceptions of racial and gender bias. It's almost definitional that machine learning is going to pick up and perhaps amplify existing human biases. The issues are inescapable."

Narayanan co-authored a paper published in April analyzing the meaning of words according to an AI. Beyond their dictionary definitions, words have a host of socially constructed connotations. Studies have shown that people more quickly associate male names with words like "executive" and female names with "marriage," and the study's AI did the same. The software also perceived European American names (Paul, Ellen) as more pleasant than African American ones (Malik, Shereen).

The AI learned this from studying human texts -- the "common crawl" corpus of online writing -- as well as Google News. This is the basic problem with AI: Its algorithms are not neutral, and the reason they are biased is that society is biased. "Bias" is simply cultural meaning, and a machine cannot divorce unacceptable social meanings (men with science; women with arts) from acceptable ones (flowers are pleasant; weapons are unpleasant). A prejudiced AI is an AI replicating the world accurately.

"Algorithms force us to look into a mirror on society as it is," said Sandra Wachter, a lawyer and researcher in data ethics at London's Alan Turing Institute and the University of Oxford.

For an AI to be fair, then, it needs not to reflect the world, but to create a utopia, a perfect model of fairness. This requires the kind of value judgments that philosophers and lawmakers have debated for centuries, and rejects the common but flawed Silicon Valley rhetoric that AI is "objective." Narayanan calls this an "accuracy fetish" -- the way big data has allowed everything to be broken down into numbers which seem trustworthy but conceal discrimination.

The datafication of society and the Moore's Law-driven explosion of AI have essentially lowered the bar for testing any kind of correlation, no matter how spurious. For example, recent AIs have tried to determine, from a headshot alone, whether a face is gay, in one case, or criminal, in another.

Then there was AI that sought to measure beauty. Last year, the company Beauty.AI held an online pageant judged by algorithms. Out of about 6,000 entrants, the AI chose 44 winners, the majority of whom were white, with only one having apparently dark skin. Human beauty is a concept debated since the days of the ancient Greeks. The idea that it could be number-crunched in six algorithms measuring factors like pimples and wrinkles as well as comparing contestants to models and actors is naive at best. Deeply human questions were at play -- what is beauty? Is every race beautiful in the same way? -- which the scientists alone were ill-equipped to wrestle with. So instead, perhaps unwittingly, they replicated the Western-centric standards of beauty and colorism that already exist.

The major question for the coming year is how to remove these biases.

First, an AI is only as good as the training data fed into it. Data that is already riddled with bias -- like texts that associate women with nurses and men with doctors -- will create a bias in the software. Availability often dictates what data gets used. One example is the 200,000 Enron emails made public by authorities while the company was prosecuted for fraud, which reportedly have since been used in fraud detection software and studies of workplace behavior.

Second, programmers must be more conscious of biases while composing algorithms. Like lawyers and doctors, coders are increasingly taking on ethical responsibilities, but with little oversight. "They are diagnosing people, they are preparing treatment plans, they are deciding if somebody should go to prison," said Wachter. "So the people developing those systems should be guided by the same ethical standards that their human counterparts have to be."

