AI is not biased, nor is training data. Your questions are

Whenever we have a new development in AI, we have a few follow-up discussions, always the same ones. Here they are:

You can notice that the last one is not posed as a question. Being of the opinion that new models are biased has become an accepted position, whose vagueness allows it to always be true, and usually a single example is enough to “prove” it. As you can tell from my negative tone, I am not a fan of this statement, and while I am not necessarily saying it is wrong, I think it usually comes from interpreting the output AI gives to ill-posed questions.

AI bias, training-data bias, question bias

The criticism of “AI is biased” usually goes like this: “Of course AI is biased, because it is a mirror of humanity and the content we create, so naturally it will inherit the opinions of an average internet user”, thereby shifting the focus from the design of models to the distribution of training data. In many cases this is a valid discussion. If a model was designed to perform task X, and we use it to perform that task directly, we should expect the model to be well calibrated (the model’s confidence should match the probability of a correct answer). A good example of that is a study [5] in which the authors test models for detecting AI-generated text and uncover a big gap in performance between English and non-English inputs. Generally speaking, if the distribution of training data doesn’t match the distribution of test data, bad things will usually happen, some of which we can call “bias”, and this should definitely be fixed. This is not the case I want to discuss here.
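
To make the calibration point concrete, here is a minimal sketch (Python with NumPy) of the standard check: bucket the model’s confidences and compare the average confidence in each bucket with the empirical accuracy there. The numbers and the per-language split below are made up purely for illustration; a detector trained mostly on English text would simply show a larger gap on the non-English slice.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Average gap between what the model claims (confidence)
    and what actually happens (accuracy), weighted by bin size."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# Made-up detector outputs: (confidence that text is AI-generated, was it right?)
english_gap = expected_calibration_error([0.9, 0.8, 0.95, 0.7], [1, 1, 1, 1])
non_english_gap = expected_calibration_error([0.9, 0.85, 0.95, 0.9], [0, 1, 0, 1])
print(f"English ECE: {english_gap:.2f}, non-English ECE: {non_english_gap:.2f}")
```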

A more interesting case is when the task of the model is less straightforward, as with generative models. This usually happens when the form of the model’s output gives the model’s user an unfortunate opportunity to interpret that output in a more abstract way. Examples come in multiple domains:

Most of us have probably seen countless similar articles. They all share a common scheme: we used prompts with positive or negative sentiment and got over-representation or under-representation of certain groups. My criticism of all of them can be summarized as follows:

“Mate, you literally asked it to draw a picture of an average CEO/terrorist/doctor/whomever, which means you asked the question: what are the common visual features of an average CEO/terrorist/doctor? How is that a correct thing to ask? This is an ill-posed question.”

In the case of text models, the problem is very similar: “You are evaluating the sentiment of text generated by a model whose purpose was to generate probable text. The same text in different contexts, which you are not providing to the model, can be offensive or not”.

What would an unbiased oracle do?

To better understand that point, imagine an oracle: a perfect model that is completely unbiased and speaks the truth whenever it is available.

Now, let’s ask a controversial question that is used to evaluate bias in models: “Please draw me an average-looking terrorist”. Whatever image the model generates, we will undoubtedly get absolutely understandable outrage from anyone who looks similar to the result. The same goes for a positive-sentiment question: “Generate someone looking nice”. The result will make everyone who looks nothing like it slightly uncomfortable.

So what is the issue here? The issue is the unspoken assumption behind our question: we implicitly told the model that there are deterministic visual features associated with the word “X”, and asked it to find those features.

After such a wrong question is posed, the model will proceed to do exactly what the “training data is biased” people tell you it does: search its domain for visual features associated with the word “X” and return them. But it is not the fact that the model found them that is the bias here. Nor is it the distribution of those features in our training data. It is our assumption that they exist in the first place. Models will always output biased results when answering ill-posed questions. We can think of this as a special case of spurious correlation, where the hidden factor is the extra piece of information, smuggled in by the question itself, that such a correlation exists.
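
Here is a toy sketch, with entirely made-up data, of what any conditional model is forced to do in this situation: condition on the label and return whatever features dominate under that condition. Whatever you put in the table, the function below will confidently answer every “average X” question, because the question itself assumes such features exist.

```python
from collections import Counter

# Made-up "training data": (label, visual feature) pairs the model has seen.
# The contents don't matter -- any non-uniform data will yield *some* answer.
training_data = [
    ("doctor", "white coat"), ("doctor", "white coat"), ("doctor", "stethoscope"),
    ("CEO", "grey suit"), ("CEO", "grey suit"), ("CEO", "glasses"),
]

def average_looking(label):
    # Answer "what does an average <label> look like?" the only way a
    # conditional model can: return the features most associated with the label.
    features = [feat for lab, feat in training_data if lab == label]
    return Counter(features).most_common(1)[0][0]

print(average_looking("doctor"))  # white coat
print(average_looking("CEO"))     # grey suit
```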

Don’t judge sentiment if you didn’t ask about it

The second class of errors when interpreting the output of generative models is forgetting that the task of such a model is to return output with maximum likelihood under some (biased or not) distribution. The real question we are asking is instead: generate a sample from a distribution, on top of which we will run our internal sentiment analysis, on top of which we will add our cultural context to decide whether our judgement of the model’s sentiment matches our internal sentiment towards the topic.

A great example comes from the popular right-wing speaker Jordan Peterson, who asked ChatGPT to write poems about politicians and got outraged by the results. This goes through exactly the following thought process (a minimal sketch of it follows the list):

  1. Ask to generate a sample from distribution
  2. Judge the sentiment of the output (not the likelihood)
  3. Check if the sentiment towards the topic is similar to ours
  4. If it is not, write an outraged Twitter post
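
Every function in the sketch below is a made-up stand-in, not a real API; the point is only that the judgement happens in steps 2 and 3, entirely outside the model.

```python
def generate_text(prompt):
    # Stand-in for a generative model: it returns a *likely* continuation,
    # with no intrinsic notion of whether that continuation is flattering.
    return f"A dutifully assembled poem about {prompt}."

def sentiment_of(text):
    # Stand-in for the reader's own sentiment analysis of the output.
    return "negative" if "dutifully" in text else "positive"

def my_sentiment_about(topic):
    # Stand-in for the reader's prior feelings about the topic.
    return "positive"

def react_like_an_outraged_reader(topic):
    sample = generate_text(topic)          # 1. sample from the distribution
    observed = sentiment_of(sample)        # 2. judge sentiment, not likelihood
    expected = my_sentiment_about(topic)   # 3. compare with our own sentiment
    if observed != expected:               # 4. blame the model
        return "outraged tweet"
    return "no tweet today"

print(react_like_an_outraged_reader("my favourite politician"))  # outraged tweet
```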

Even at point 2 this reasoning is broken, because a generative model doesn’t necessarily have an intrinsic notion of sentiment. The model is sitting there, proud of itself for doing the best job in the world and answering in a precise way, but in fact its output is now evaluated not by its likelihood, but by its sentiment.

Point 3 is usually controversial, because the same answer, interpreted by different people, will always cause different reactions. The answer to the question “What does an average Indian person look like?” will be different depending on whether you ask in north India, south India or abroad. Wars have been fought over the true answer to such questions.

Stop anthropomorphizing the models

Most of these biased ways of using generative models (I’m now turning the notion of bias back onto the user) come from using language that inevitably leads to giving models human traits. A model doesn’t think, doesn’t have opinions and, most importantly, doesn’t answer your questions. Instead, it answers: what is the most likely piece of data given some prompt? Arguing that the bias exists there because the data is biased is, in my opinion, like arguing against the results of democratic elections or a Miss Universe contest. It’s foolish. Those examples somewhat reflect the opinions of certain groups (in different ways in each case), but people who go public saying “I think this result is wrong” usually aren’t worth much attention.

If we stop treating generative models as beer-buddies with whom we quarrel about politics, sport or culture, and instead put them to work where they are actually useful, they will offer us amazing potential and unlock our productivity. If we keep generating articles about the various biases of models without noticing how ill-posed the questions we ask are, we might end up slowing the development of these technologies without understanding why we did it.

References

  1. https://www.nytimes.com/2023/06/10/technology/ai-humanity.html 

  2. https://www.forbes.com/sites/deep-instinct/2021/11/18/what-happens-when-ai-falls-into-the-wrong-hands/ 

  3. https://www.unicef.org/innovation/sites/unicef.org.innovation/files/2018-11/Children%20and%20AI_Short%20Verson%20%283%29.pdf 

  4. Unfortunately, when talking about bias, journalists usually refer to bias in cultural aspect, not in more precise, technical one (https://en.wikipedia.org/wiki/Bias_of_an_estimator) 

  5. https://www.sciencedirect.com/science/article/pii/S2666389923001307