The following are the outputs of the captioning taken during an IGF intervention. Although it is largely accurate, in some cases it may be incomplete or inaccurate due to inaudible passages or transcription errors. It is posted as an aid, but should not be treated as an authoritative record.
***
>> NIDHI SINGH: Hello, everyone, and welcome to our session on Contextualising Fairness: AI governance in India. I know that this is the last session on the third day of the IGF, so we’re very thankful to all of the people who’ve come. I also know it’s quite late for all of our Asian participants who are joining us, so we’re very thankful you could all be here.
We have a very interesting panel and a very interesting discussion that’s happening today, so I would like to keep some time at the end for audience participation. So I will be enforcing time limits a little strictly during our introductory remarks by the panel.
So we can just jump right into it. I’m just going to talk a little bit about how this panel is based around the idea that, while there’s a lot of work happening around AI ethics and AI governance, there’s no real one‑size‑fits‑all approach that can be directly implemented in every context.
As we start looking into AI applications and how the use of these applications can benefit societies, we have to consider that a lot of these applications are in fact made in the Global North according to Global North norms, and directly introducing them into the Global South tends to have a lot of problems. It leads to a lot of exclusion.
So in this context, we are specifically talking about what fairness means, and how you can make these systems fair, specifically for something as diverse as the Asian context, where many of the countries are in the Global South, many have large populations, and many are developing economies with linguistic barriers.
So in these cases, how would you make something like AI ethics work in these kinds of cultural contexts?
So I’m going to give a very brief remark here, and then I’m just going to introduce all of our panelists really quickly, and then we’ll move on to a quick round of questions.
So to start with, we have Yik Chan Chin. She’s an Associate Professor in the School of Journalism and Communication at Beijing Normal University. She has previously worked at the University of Nottingham and the University of Oxford School of Law. Her research interests include Internet governance, digital ethics, policy, regulation, and law, and AI and data governance. Her ongoing projects include digital governance in China and global AI and data governance. Dr. Chin is a co‑leader of the UN IGF Policy Network on Artificial Intelligence, which released an excellent report yesterday; if you haven’t checked it out, I highly recommend that you do. She’s also a member of the Asia Pacific Internet Governance Forum and a Multistakeholder Steering Group member of the China Internet Governance Forum.
Online we have with us Tejaswita Kharel. Tejaswita is a Project Officer at the Centre for Communication Governance at the National Law University Delhi. Her work relates to various aspects of information technology law and policy, including data protection, privacy, and emerging technologies such as AI and blockchain. Her work on the ethical governance and regulation of technology is guided by human rights‑based perspectives, democratic values, and constitutional principles.
And finally, a more recent addition to our panel is Milton Mueller. Professor Mueller, when we were looking through your bio, it was so long that I think we would have taken up most of the panel just going over your work, so we’ve had to cut it down greatly. Please do look him up; you can just Google him and several links pop up. We’ve just got a very brief bio introducing him here. Professor Mueller is a prominent scholar specializing in the political economy of information and communication. He has written seven books that we could find on Google Scholar, and many, many articles in journals. He’s the co‑founder of the Internet Governance Project, a policy analysis centre for global Internet governance. His areas of interest include cybersecurity, Internet governance, telecommunications, and Internet policy.
So now I will just jump right into the questions. We’ll start with you, Tejaswita. So the entire conversation today is based around AI and bias and how you contextualise fairness. So can you talk to us a little about what fairness means specifically in the Indian context? So what are the kinds of contextual bias that you see in India, which are perhaps not fully accounted for in global conversations around AI bias at the moment?
Every speaker strictly has five minutes, and we want to have time in the end so I will be sort of enforcing it. Thank you.
Tejaswita?
>> TEJASWITA KHAREL: All right. Hi, I’m Tejaswita. So I’m going to be breaking this question into two parts. The first is, how do we look at fairness in AI in the context of India? And then I’ll talk about what contextual biases there are in India.
So to start, fairness as a concept is a very subjective thing. There is no single definition of what fairness even means, which means that we must look at other factors to guide our understanding of it, and in the Indian context there are three such aspects: the first being equality, the second being non‑discrimination, and the third being inclusivity.
Equality in the Indian context, especially for AI, comes from the Constitution, which guarantees the right to equality. So when we look at equality in AI, the expectation of an AI system is, number one, that it treats individuals equally under the same circumstances and protects human rights; second, that it ensures equitable access to the technology; and third, that it guarantees equal opportunities and benefits from AI.
Now when we move on to the second part, non‑discrimination, this predominantly addresses the question of bias in AI, which is more of the technical aspect: we’re trying to ensure that when we’re creating AI systems, the data we use and the systems themselves are not biased. So with non‑discrimination, what we’re trying to do is prevent AI from, let’s say, deepening historical and social divisions that may be based on various factors in India, such as religion, caste, sex, and other factors that are deeply rooted in a complex social fabric.
Then the third aspect of what fairness means is inclusivity. When we consider inclusivity, we’re looking at it in the sense of preventing exclusion from access to the services and benefits that AI tools can provide, and also in the sense of ensuring that grievance redressal mechanisms are inclusive. Whenever you’re creating a fair system, you want to ensure that it treats all persons equally, that it provides access to everyone in the same manner, that the data is not biased and therefore not perpetuating or exacerbating existing biases, and that each person has access to grievance redressal.
So overall, in the Indian context, the idea of fairness in AI is encompassed by these three factors.
Now I’ll go to the second aspect of the question, which is, what are the contextual biases in India that may not already be captured in the Global North, or what might the differences be? I will talk about this in the Indian context, as well as in a slightly more generic context, which is that I think the existing idea of what biases are comes from the Global North, in the sense that, to date, when we talk about AI bias, we predominantly use examples from the US. One such example is the COMPAS case study, where we realised that race was a very important factor when considering bias in AI. So a lot of the discussion around what AI fairness and AI bias are revolves predominantly around harms that have already shown up in the Global North, and which are now starting to translate to the Global South. However, the factors that have already been identified may not apply in the same manner.
What I mean by that is that there are, of course, factors similar to those in the US or other Global North countries that will also exist here, such as gender, religion, ability, class, and ethnicity. But what is different in the Indian context is perhaps caste. And not just in India but also in other regions, such as Nepal, Bangladesh, and Pakistan, there may be other factors that are not necessarily limited to ethnicity, gender, religion, et cetera.
So in the Indian context specifically, caste is a major factor which does not really show up when we consider biases in AI from the Global North perspective. The harm in considering factors of fairness only from Global North perspectives is that we lose out on a lot of existing context, which means that the AI systems will not actually work. For example, if you tried to simply adopt an existing AI tool in the Indian context that was, let’s say, created in the US, it would not work, because the existing context has not been taken into account. The data has not been taken into account, which means that it will simply cause a lot of harm, and it will also be extremely ineffective.
So that being said, the larger point that I’m trying to make is that when we’re looking at AI ethics principles, and in this context specifically fairness, we have to ensure that these principles are tailored to the specific national context. And even within a national context, there may be regional contexts, because especially in a country like India, where there is so much diversity, it is important to consider all of the different contexts that are going to affect an AI system, which means that we cannot just have a one‑size‑fits‑all approach.
And this approach is the key point of our broader discussion on contextualising AI fairness, which is that we cannot just develop a general theory of algorithmic fairness solely based on Global North understandings. Each nation, with its own unique historical, cultural, and social dynamics, has to carefully consider how fairness translates into its specific context when it’s intending to develop and/or deploy any AI system. That is my point.
Thank you for your attention. I look forward to hearing from the other panelists now.
>> NIDHI SINGH: Thanks, Tejaswita. So I’ll move to one of our in‑person panelists now, Yik Chan Chin. You work extensively on AI in China, and from a more global perspective as well. So can you tell us a little about how you go about conducting this sort of research and what your methodology is? What I mean is, in practical terms, contextualising ethics to individual contexts is resource‑intensive for many countries. So how do you approach this in your work?
>> YIK CHAN CHIN: Yeah, I’m going to share a presentation, because it’s a bit complicated. So can I share now? Can you hear me? Okay. Can I share it? Okay. Can you see the presentation now?
Okay. So thank you for inviting me. I’m going to present a piece of work, a methodology for how to do this, which I hope can clarify things or be helpful from an academic point of view.
And just a moment.
So basically, this work is about narratives of digital ethics. It is a two‑year project conducted in collaboration with the Austrian Academy of Sciences, and it actually fits today’s topic really well. That’s why I use it as an example.
So what is this project actually about? What we found is that in terms of digital ethics, there’s not much difference in the core values. The differences are in the narratives. So what are narratives? Narratives are stories that are told repeatedly, consisting of a series of selective events with particular characters, which shape the understanding and collective behaviour of a particular society. This is what we call the narratives.
So what we found from our two years of research is that most countries accept a core set of principles, but the major difference is in the narratives. In terms of what fairness means, there is a global consensus that fairness means non‑discrimination, which includes prevention of bias, inclusiveness in design and impact, representativeness, and high‑quality data, as well as equality. So this is the global consensus.
But from our research, because it is bottom‑up research, we found that we actually need to contextualise the principles of digital ethics, especially along the cultural dimension. So we use a situatedness approach, which is a very common methodology; I think most of you will know it well from STS research, Science and Technology Studies. We focus on the differences in social, cultural, political, economic, and institutional conditions.
So we look at the differences instead of the commonalities. That is what we use in our research: we look at the differences, we use the situatedness approach, and, especially, we gather a lot of evidence from the Global South, because there is a lack of those voices.
So here is the methodology: we did semi‑structured expert interviews, 75 of them if I remember correctly, and then a workshop series. We invited talks, and we also used case discussions, like panel discussions. So this is how we generated our data. These are the building blocks of the digital ethics narratives.
So when we look at this slide, we look at the key dimensions of digital ethics. For example, what is the notion of "good" in different societies: harmony or virtue as the good, deontological or consequentialist notions of the good, and what fairness means.
So there are different building blocks, like role adequacy as fairness, material equality as fairness, and formal equality as fairness. Then we have reference points: who the major actor is, whether the community, the individual, or the ecosystem; whether technology is seen as a benefit, as something that makes people victims, or as an opportunity for actors; whether the ethical concern is marginalisation, safety, or autonomy; whether the key actor is the government, the technical industry, or others; what kind of tools the government should use, like education, law, regulation, or technology; and whether legitimacy should emerge organically, be determined by the able, or come from self‑determination. So we use these dimensions to analyse the narratives of ethics.
So if you look at fairness, there are actually three different categories of fairness narrative. The first one is called role adequacy. What does role adequacy mean? It means that your role, and what is fair or not fair, is not determined by society but is often based on an assumption that different roles have been assigned by a power outside of human society, such as God or religion. We have a lot of examples of this from African tribes. What they mean is that fairness is determined from outside, by God, by religion, by nature, by faith, or even by the spiritual world.
The second one is called material equality. What does material equality mean? The idea is that equality is about the result: we look at whether the result is equal or not. The other one is formal equality, which looks at equal treatment and procedures. We accept that, because people have different starting points, the result may be unequal, but at least the procedure, the treatment, is equal.
So there are three different narratives about fairness.
So let’s look at the Chinese case. We looked at different cases around the world; I will just use China here, but if you read our report you can see cases from Europe, Africa, Japan, India, and the USA. So what are the features of the Chinese case? The fundamental ethical assumption is harmony, and it is a role ethic, meaning that roles are determined by tradition, by belief, and by the whole. They look at equality at the level of the whole digital ethical system. They also see technology as an opportunity, not as a threat. The major conflict is marginalisation, whether technology can bring prosperity. The major role in shaping all this development belongs to the government, and the kind of tool the government should use is education. And in this culture, who should make decisions is determined by the able, which is the wise.
Okay, so this is the harmony type of Chinese narrative in terms of fairness.
But actually, recently there have been some new developments within China regarding these narratives. The major change is here. What has changed in the last two to five years, and we have a Chinese guest so you can ask their opinion as well, is that they no longer see the technology purely as an opportunity. They have started to realise the risks and that they may become victims of the new technology. Before, the focus was more on prosperity and development, and now it is also shifting a little towards safety and harm. Also, more and more rules now come from the technical industry instead of only from the government, and they have started to use law to regulate digital ethics, for example for AI. But decisions are still determined by the able, and the fundamental ethical assumptions are still harmony and role adequacy.
If we look at the American, Silicon Valley type, we can see it’s very different. They are more consequentialist, which is about results: will this kind of technology produce the fairest result? And formal equality, which means procedure: we give equal treatment to everybody, but we don’t necessarily look at the result. Then it is up to the individual to decide. They see technology as an opportunity rather than a threat, and the main concerns are autonomy, lack of freedom, and self‑determination. So it is individuals regulating themselves rather than the government, and it is an economic, market‑driven approach rather than culture and education. So we can see it’s quite different.
So I think I will stop here, because I have taken too much time, and leave the rest for the questions. Okay? Thank you.
>> NIDHI SINGH: Thank you so much for that. It was so interesting to see how fairness is being defined very practically in different contexts and how that has been changing over the last couple of years. That was a very useful intervention, and I think it will form the basis of our conversation.
Milton, we’ll turn to you. Speaking of practicality, you have a lot more practical experience with AI applications. So how far is it actually possible to contextualise an AI application? How feasible is it to take these AI applications and contextualise them to a culture? Have you seen any of these systems work out well? How have they worked exactly?
>> MILTON MUELLER: Can you hear me? Am I on? Okay, thank you very much. Yeah.
I am going to, yeah, first issue a few caveats about how the topic is generally framed. So first of all, I’m what you might call an AI skeptic. That is to say, I don’t really believe AI exists. I think people have created this monster around it and mostly don’t know what the technology actually is or what it does. So one thing to keep in mind is that most of the time we’re just talking about computing technology, and many of the issues of contextualisation have been around in computing for a long time. Think of the keyboard, for example. The keyboard was designed for the Roman alphabet, right? And what do Chinese or Arabic speakers do about this? Well, they have to deal with all kinds of workarounds based on their different scripts. And what impact does this have? Well, in some ways it excluded people, but people adapted: they came up with workarounds, or they just learned the Roman script, right?
Another example is multilingual domain names, right? The domain name system was based on ASCII script, and we went through some processes in ICANN to try to come up with a way of representing Arabic or Chinese script in the domain name system. We thought we were extending access by doing this, making the domain name system available to everybody. Turns out, we were not. It turns out that people in countries with different scripts don’t adopt these alternative domain names, and it actually reduces their visibility and access for people who don’t speak that language. So it would have fragmented the domain name system. These multilingual domains exist, but they’re just not being adopted and not being used.
So now let’s turn to AI. I got some notes here that I need to see. So a lot of what I’m gonna say is based on research at Georgia Tech, particularly by an Arabic AI specialist in our computer science department.
Oh, my god. This is recording everything I say (looking at mobile device).
So the first thing you have to know about AI is that all of these big models were trained on what we call Common Crawl, right? Which is a way of crawling the Internet and picking up all of this textual and image information. And the top languages on Common Crawl are English, 46%. Next is Russian, 7%. German and Chinese, 6%. You get down to Arabic, it’s 1%.
So Tarek, and by the way, I have to ask the other panelists: since he’s working at Georgia Tech, is this knowledge coming from the Global North, or, because he’s Arab, is it coming from the Global South? But we’ll deal with that question later.
So he has explored the way that this rootedness in English text leads AI applications to produce bad outputs for Arabic cultural contexts. I’ll give you an example. You ask it to fill in a word, and let’s say you say, "My grandma is Arab. For dinner, she always makes us," fill in the blank. Now, a standard AI application is going to fill in something like "lasagna" because it’s all based on statistical prediction, right? But it should say something like Majboos or some other Arabic dish, right?
Another interesting example he gives is GPT‑4 generating a story about a person with an Arab name. If you use an English or French name, a European name, the story will be something like, "Oh, Philippe was this very smart boy who grew up and did this." If you use an Arab name, it’s sort of like, "He was born into a poor family where life was a daily battle for survival," right?
So, you know, that can be very irritating.
So what Tarek has done is develop a measure set for LLMs that tries to determine the level of cultural appropriateness, as he calls it. It’s called CAMeL, the Cultural Appropriateness Measure Set for LLMs. And again, this is not my research; this is Tarek Naous and a team of Georgia Tech computer scientists.
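A minimal sketch of the kind of completion probe described above, using the Hugging Face transformers fill-mask pipeline; the model checkpoint and prompts here are illustrative placeholders and are not part of CAMeL itself:
```python
# Rough probe of culturally skewed completions in a masked language model.
# The checkpoint name and prompts are assumptions for illustration only.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

prompts = [
    "My grandma is Arab. For dinner she always makes us [MASK].",
    "My grandma is Italian. For dinner she always makes us [MASK].",
]

for prompt in prompts:
    top = unmasker(prompt, top_k=5)  # five most likely fillers
    completions = ", ".join(result["token_str"] for result in top)
    print(f"{prompt}\n  -> {completions}\n")
```
Comparing the top completions across the two prompts gives only a crude, qualitative sense of the kind of skew a benchmark like CAMeL measures systematically.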
And there’s also somebody at CDT named Aliya Bhatia who has done some research on how AI affects very small language groups, very small linguistic communities, and you can see how they kind of get erased. Again, you have to go back to other forms of information technology: the English language, for example, was homogenized by the invention of the printing press, right? So similar processes are going to happen with the massive scalability of AI.
But also, and this is something we discussed with Tarek, people are going to develop different models based on different training sets, right? And part of the solution, which is actually an opportunity as well as a threat for the so‑called Global South, is that if they develop, using their own resources, training sets and models that are trained on their cultural context, then they will have product differentiation, a marketable difference from these big platform products, and they might be able to out‑compete them in certain markets. So I think, again, these kinds of disparities and hiccups occur across the development of technology. And I think it’s a mistake to look at this kind of discrimination as a static thing, as some form of oppression. It’s more like an obvious flaw in the datasets used to train these systems. And it’s a remediable flaw. It can be fixed. It will be fixed. It’s a matter of investing the resources to do so.
I think that’s all I have time to say, right?
>> NIDHI SINGH: Thank you so much. Yes, I think we are out of time.
For the second question, we’re going to go a lot faster, with everybody getting cut off at three minutes, so that we can have a little more time for questions.
So speaking of how we can work towards fixing this, Tejaswita, I’ll address the question to you. How do you think we can have more representative and inclusive evaluation frameworks for these AI systems? Milton talked about the CAMeL framework, but are there any other approaches you know of, from your work on ethics in India, for building these frameworks?
>> TEJASWITA KHAREL: Thank you. When we’re thinking about how to create these frameworks, I think the issue is really the disjointedness between the people who want ethics and the people who can actually deliver it in an AI system. So the first step is to resolve that disconnect. When I say equality, we have to look at how you actually make things equal. If equality means ensuring that your AI application treats everybody the same, then you must ensure that your AI system is able to do that. Similarly, when we’re looking at non‑discrimination, the major factor is bias, which means we first need to remove bias from the datasets. That, again, is something we can talk about, but the people who work on the AI systems will be the ones creating, or working on creating, clean datasets; we will then access these and work towards implementing our ideas of AI fairness. So my main recommendation is that we understand bias, fairness, and all of these ethical principles and factors from the perspective of the people who actually do the work, and get them to understand it from our side of things, so that we can implement it in a way that is actually workable, instead of just demanding AI ethics and AI fairness.
Yeah. Thank you.
>> NIDHI SINGH: Thank you so much, Tejaswita.
That actually leads really nicely into my next question, because the gap you point to, between the ethics of AI, or the ethics of anything, and the practicality of how it’s delivered, seems to be widening. So I’ll turn to Yik Chan Chin now. My question to you is somewhat related. The ideas we have around AI governance right now are centred on principles, best practices, and ethics. Yes, there are a few laws, but most of the world is going with best practices and ethics. Do you really think these are enough to guide issues like fairness? And if we weren’t using them, what would we use as a central tenet to guide fairness?
>> YIK CHAN CHIN: Yeah, I think the other work we are doing, at the PNAI, is on interoperability. So first of all, we have to respect regional diversity. For example, when I talk about fairness in the Chinese context: a lot of the fairness we talk about in Western societies, for example in Britain, is about gender, age, and racial discrimination and bias, but that is not a major problem in China. We do not address gender or race; it’s not a major concern there. What is a major concern at the moment, first of all, is consumer protection. So we have the algorithm provisions, which regulate algorithms, automated decision‑making, and the preferences you can be given. Basically, there is a special provision that says you cannot damage people’s consumer rights. For example, you cannot discriminate between people on price: if I buy a ticket from one website, I get 800; then I use a different mobile, maybe an Apple, and I get 1,000. So this kind of discrimination is addressed through consumer protection, not through race or gender. This is one of the major concerns at the moment; we call it the protection of consumer rights.
The other thing China is doing now is antitrust, which is also a major concern in terms of fairness. This is not a major concern in most Western countries at the moment, in Britain or in America, but in China it is: how to provide a fair playing field for everybody, for all the AI companies and the digital platforms. So they are pushing forward antitrust regulation and its implementation in China.
So I think we can see that each society has different priorities. If you ask what the best practice is, it’s really difficult to say. We also want to draw the best practices out of all these case studies. So I think, in the end, every country can contribute its best practice. For example, China can contribute its best practice on how to address consumer protection, or even antitrust. The West, I mean Europe, can contribute on discrimination around race or gender.
So I think best practices have to come from different regions, and in the end we need an interoperable framework for fairness. Each country has different priorities, but in the end we can probably reach a minimum consensus on what the building blocks of fairness are. Okay, I think that’s the approach I would recommend.
Thank you.
>> NIDHI SINGH: Yes, thank you so much. I think that’s actually a very important conversation that we’ve been having all week now, about maybe having more collaborative platforms where countries can come together. There’s no real point in building all of these solutions in isolation if we’re not going to share them. A lot of countries do share commonalities, and there’s probably something we can all learn from each other when we’re working on our own solutions.
So for my final question before we open up to the audience, this is going to be a slightly different question from where the conversation has been going so far. For the last 40 minutes or so, we’ve been talking about contextualisation. There are, however, some concerns around hyper‑contextualisation. We can always say that it’s great to contextualise things, but is it really even possible to contextualise to every context? There are so many cultures, so many languages. Would it actually be feasible to have an idea of fairness, or AI systems, or any sort of computer system, contextualised to all of the cultural contexts and nuances that exist?
>> MILTON MUELLER: I turned the mic on. I just turned it on. That helps, right?
So can we get too hyper‑contextualised? We’re talking about this in a governance context, right? Unfortunately, almost everybody in the IGF and in the UN system, when they talk about governance, is talking about hierarchical regulation by government, and almost never about bottom‑up regulation by markets, which is actually what is going to be doing most of the governing. I hate to inform you of this if you’re not aware of it already, but we get these AI applications produced because somebody thinks they’re going to make money on them, right? So how much contextualisation will we get? Will we get too much? Well, it depends on what the market will provide. If there is intensive demand for incredibly micro‑contextualised applications, and I think there will be eventually, it will build up over time, of course, then we will get micro‑contextualised things.
Think of a business in Indonesia in some very specific industry sector. Maybe these companies are building machine screws for nuclear power plants. I don’t know. That’s highly specialized. And the AI decisions, the inputs and outputs that would be relevant for those industrial players, would be extremely contextualised. To be useful, they would have to be.
And just a word about discrimination. One of the things we have to understand is that many of the mistakes and biases you’re talking about have to do with the fairly primitive early origins of these systems. Like I said, Common Crawl is 46% English. Our facial recognition training has been based on US populations that are 70% to 80% white. So of course the facial recognition is not the greatest, but again, those databases will be expanded in multiple countries around the world, and the applications will have the potential to get better.
The most famous case of facial recognition bias, racial bias, is actually not a case of racial bias. It was a police search in Detroit, Michigan, where there was a very grainy, bad picture of a man who stole things from a store, and it was a black man. The system told the police that the image matched a man who was in fact innocent, so they went off and arrested this innocent black man. Now, the point is that the real person who stole the goods was, in fact, black. So it was not racial discrimination, it was not racial bias; it was bad accuracy, right? And even more important than the bad accuracy was bad police practice. The officer did not check whether the person he arrested had an alibi, and he had an airtight alibi: he could prove he wasn’t in that store, and yet he was arrested anyway, just because the officer was lazy. So a lot of this comes back, again, to embeddedness and situatedness: look at the way AI fits into a specific context and how it’s used, and that is what will determine how harmful or how beneficial its uses are going to be.
>> NIDHI SINGH: Thank you so much. That’s actually really, I think, interesting.
We’ve also been looking at how, depending on how it’s being used, it may not necessarily be just the AI: it can magnify things that are already happening, with people simply rubber‑stamping those decisions along. Maybe the human in the loop isn’t really being human enough to count, so these are problems that are coming up.
Okay. I had another question, but I’m not going to ask it. We will move on to audience questions because we have 15 minutes, and I’m cognizant that a lot of people in the room seem to want to ask questions. If you have a question, you can just put your hand up, and we can help bring the mic around. Otherwise, Tejaswita, if there are any questions in the chat, please let us know. Please introduce yourself once before you ask a question.
>> EMAD KARIM: Thank you. Can you hear me? Yes. My name is Emad Karim. I’m from UN Women’s Regional Office for Asia and the Pacific. I’m going to put on my UN Women hat here, considering this whole discussion of the perception of fairness. We are also excluding half of the population and their perspective on AI. There is a lot of research coming out to say that women’s perspectives, history, and narratives are not even included in those datasets, because we inherited 10,000 years of civilisation that was written by men for men, and that creates a huge gap in AI outputs related to women and for women. Where do you see this? And the further we go into those layers, we’re talking about women, but also women in remote areas, women with disabilities, the less they will be represented in AI infrastructure, datasets, and outputs. I wonder if you have any reflections on how we can improve and fix those datasets, or even have better roles when it comes to women’s representation in AI.
>> YIK CHAN CHIN: In our two‑year workshop series, we actually discussed this question for a long time. We had a lot of representatives from Australia and from Africa especially, from community and village levels. The approach they propose is community‑based. In the end, you have those very big models, like OpenAI’s, and in China they have Tencent and Alibaba. I think it’s also already happening in India: they have an Indian model, a small model; we do not call it a big model, it’s a small model. We have one example from the Australian Aboriginal community: they use their own language to develop their own dataset and their own small model. So in the end, I think there is diversity at different levels. We don’t need a big model; if you just serve your community, you can simply devise a small model, which does not really consume a lot of data and energy. That’s a specific model. So I think this is how it will be tackled in the end: just as Milton said, where there’s demand, there will be supply.
>> NIDHI SINGH: Are there any other questions?
>> XIAO ZHANG: Hello, this is Xiao from the CINIC.
I think it’s really a very interesting topic and a good discussion. I have a question. I think the bias of AI and of the data is closely linked with culture and with a nation’s history. Your bias is not my bias. My question is, because the data is past data, it’s already rooted in the history and the culture; it’s biased. The data is already biased. How would you use your methodology, or something else, your regulation, to make the biased data unbiased, to make it fair?
Thank you.
>> MILTON MUELLER: Yeah, I think the idea that you govern bias in AI by, what did she say? Tejaswita said something about needing to clean the data. You’re going to go in and scrub the bias out of the data, as if the data has dirty spots in it and you’re going to scrub them out. That’s just not how the system works, not at all. The data is the data. The data is a record of something that happened in an information system somewhere at some time. What you can change is which collections of data you look at, bigger collections of data. In that sense, you can engage in AI output governance via data governance, for example by being transparent about which datasets you have used. Then you have these metrics, these measures. The Georgia Tech researcher, Tarek, does a critical analysis of these measures and points out how some of the main ones can be gamed. It’s just like knowing how Google’s algorithm ranks you in the search results: you can put a bunch of junk into your website that pushes you up in those standings. So people can game whatever metric is out there in order to optimise for it, and still not have good results from a cultural bias perspective.
One thing I would emphasize is that you’re dealing partly with the inherent limits of machine learning. Machine learning takes all of these records, a whole bunch of them, and processes them into a neural network that identifies patterns in the data. The data can be changed to change the outputs, or you can retroactively look at the outputs and say, "We’re going to change them." Google has a whole program called Fairness in Machine Learning, which is somewhat controversial, but I think everybody here would kind of like it. Their idea is: we know that existing data will be biased. If you ask a search engine a straightforward question, "Show me a CEO," most of the pictures, if not all of them, will be of men. They said, "We’re going to tweak our algorithm so that we will show more women in response to this question." They will deliberately produce what is, in a statistical sense, an inaccurate representation of the dataset by tweaking their algorithms. That’s one approach, and they call it their Fairness in AI program. So they’re very concerned with fairness, and their definition of fairness means some form of equal representation.
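As a rough illustration of what this kind of post-hoc intervention can look like, here is a minimal sketch that re-ranks retrieved results so that no single group dominates the top positions. It is not Google's actual system; the data, group labels, and function name are made up for the example.
```python
# Toy post-hoc re-ranker: round-robin across groups while preserving
# each group's internal relevance order. Purely illustrative.
from itertools import zip_longest

def rebalance(results, group_of):
    """Interleave ranked results so groups alternate near the top."""
    buckets = {}
    for item in results:
        buckets.setdefault(group_of(item), []).append(item)
    interleaved = []
    for tier in zip_longest(*buckets.values()):
        interleaved.extend(item for item in tier if item is not None)
    return interleaved

# Hypothetical image-search results for "CEO", ordered by relevance score.
results = [
    {"id": 1, "group": "men"},
    {"id": 2, "group": "men"},
    {"id": 3, "group": "men"},
    {"id": 4, "group": "women"},
    {"id": 5, "group": "women"},
]

print([r["id"] for r in rebalance(results, lambda r: r["group"])])
# -> [1, 4, 2, 5, 3]: the top of the list now alternates between groups.
```
As the speaker notes, the re-ranked output deliberately departs from the statistical distribution of the underlying data; that trade-off is exactly what the debate is about.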
Now, that got them into some trouble, because somebody asked their image generator to show a picture of the drafting of the American Constitution, and their fairness algorithm put black people and Asian people in the Constitutional Convention of 1787, which is a complete misrepresentation of reality, but from a diversity representation standpoint is kind of cool, right? Oh, maybe there was a Chinese American there writing the Constitution. There wasn’t, but wouldn’t it be kind of interesting to show that as happening? So that was a very controversial output, and a lot of criticism of Google came because of it. And as you probably know, in the US now, DEI is very controversial and on the defensive, if you know what DEI is.
So there are two sides to this question. And the deeper, almost philosophical question is: if you have a statistical regularity, and you don’t necessarily know how it got there, but you definitely have a statistical regularity, is it biased? Is it unfair to act upon that statistical regularity? So if it is in fact true that German‑origin Georgia Tech professors are riskier drivers, if it’s a statistical fact that, let’s say, my risk is 10% higher than an Asian woman’s, can the insurance company charge me higher premiums, right? And you can say, oh, you’re biased against me. No, they’re saying, no, you’re more risky. So that’s the big question.
>> NIDHI SINGH: That’s a really interesting perspective, actually, because it’s something we’ve been working on, and it circles back to a lot of the conversations around cultural context. For a large part, when we have conversations about AI, a lot of it for us is about things like the public distribution system, and how, if you don’t have records, certain villages get a smaller allotment of rations, because women aren’t being counted in the public distribution system. So it’s actually really interesting to see how you can have metrics for fairness, and if you didn’t fake that metric for fairness, as you’re saying, then you just won’t have enough grain going to that village, which is like some of the issues we’ve been seeing. So clearly there is some work to be done here.
I’m also just going to let Tejaswita come in really quick on this, because you were talking about fairness, as well. Really quick, Tejaswita, because I think there are some questions in the chat, and I want to take at least one online question before we close.
>> TEJASWITA KHAREL: So I would say that when I talk about cleaning up data, I mean it firstly in the sense of before you start using it. If you know that your data is likely to be biased, for example, if you’re trying to build something like the public distribution system Nidhi mentioned, and you know that you don’t have enough information on certain people, or that issues are going to arise out of it, or that there are proxies involved, then you clean for that. I know it’s difficult to clean data with foresight about what’s going to happen next, because it’s very closely linked to possible harms, right? But if you know that your data or your AI system is likely to be biased against people from certain groups, then you have to ensure that you’re cleaning your dataset in a manner that removes certain proxies and makes everyone equal before the system. So I partly agree with what you’re saying, in the sense that it’s not really easy to clean the data beforehand, because you can’t always identify what you’re supposed to clean. You can’t just say, okay, these are the issues. It usually comes from identifying what has gone wrong and then fixing it later.
But now that we have seen a lot of this happen, we’ve recognised what the larger harms are going to be and, to a large extent, who they are going to be against, so there are ways to anticipate what’s going to happen next, and therefore to clean this data beforehand and work on it accordingly.
Yeah, I’ll just limit it to that much, because I see there are two questions online. Should I read them out?
Okay. The first question is, given that fairness itself is subjective and varies not just across regional contexts but also with the application or use of AI in question, what may be some of the ways to reconcile these differences in the development of these technologies?
>> NIDHI SINGH: Can you read out the second one, as well? I think we'll literally have one minute to answer.
>> TEJASWITA KHAREL: The second question is, what emerging technologies or methodologies show promise in creating more nuanced, context‑sensitive AI fairness assessments?
>> NIDHI SINGH: Okay, I'm going to give all of our speakers like 30 to 45 seconds to answer. Sorry, I know that’s not enough time, but I think we’re literally at the close of the session.
Yik Chan Chin, would you like to go first?
>> YIK CHAN CHIN: Yeah, I think, in the end, we do have global frameworks, like the two UN resolutions and the UNESCO ethics guidance, and every country has signed up to those, so we do have a minimum agreement. The other thing is that what we’re talking about here is the language model, but that may not be the only kind of AI; it’s a different kind of AI system, and in the future we may have reasoning‑based, logic‑based AI. So I think there’s a transitional period. We will see.
>> MILTON MUELLER: I don’t know how to answer either of those questions. Really, not in 30 seconds. So I’ll just pass.
>> TEJASWITA KHAREL: I mean, I do think they’re very difficult questions to answer quickly. But for the first question, in terms of what some of the ways to reconcile the differences may be: when you’re looking at context‑based AI applications, I think the answer is in the question, which is that you contextualise AI fairness based on your specific AI use. You identify how it’s being used, then identify what factors are important, and implement fairness on that basis.
Yeah, unfortunately, I think I don’t have enough time to answer the other one.
Thank you.
>> NIDHI SINGH: Thank you so much, everyone. I think we all learned a lot.
Please, for the people who are here, you can just come up to us and talk to us later.
I think my main learning from today is that we should apply for a 90‑minute panel next time, just so that there’s more time for everybody to ask questions.
Thank you so much. That was an extremely interesting discussion. We will definitely be following up on a lot of the things that have come up.