Increasingly, data and technologies such as artificial intelligence (AI) and machine learning are involved in everyday decisions in business and society. From tools that sort our online content feeds to image moderation systems and healthcare, algorithms power our daily lives.

But with new technologies come questions about how these systems can be used for good – and it is up to data scientists, software engineers and entrepreneurs to tackle these questions.

To learn about issues such as ethical AI and using technology for good, we speak with Rayid Ghani, professor in the Machine Learning Department of the School of Computer Science at Carnegie Mellon University and former Chief Scientist at Obama for America 2012.

Professor Ghani has an extraordinary background at the intersection of data science and ethics, making this an exciting and unique show!


Rayid Ghani is a Distinguished Career Professor in the Machine Learning Department and the Heinz College of Information Systems and Public Policy at Carnegie Mellon University.

Rayid is a reformed computer scientist and wanna-be social scientist, but mostly just wants to increase the use of large-scale AI/Machine Learning/Data Science in solving large public policy and social challenges in a fair and equitable manner. Among other areas, Rayid works with governments and non-profits in policy areas such as health, criminal justice, education, public safety, economic development, and urban infrastructure. Rayid is also passionate about teaching practical data science and started the Data Science for Social Good Fellowship that trains computer scientists, statisticians, and social scientists from around the world to work on data science problems with social impact.

Before joining Carnegie Mellon University, Rayid was the Founding Director of the Center for Data Science & Public Policy, Research Associate Professor in Computer Science, and a Senior Fellow at the Harris School of Public Policy at the University of Chicago. Previously, Rayid was the Chief Scientist of the Obama 2012 Election Campaign where he focused on data, analytics, and technology to target and influence voters, donors, and volunteers. In his ample free time, Rayid obsesses over everything related to coffee and works with non-profits to help them with their data, analytics and digital efforts and strategy.

Transcript

Rayid Ghani: How do machine learning and AI systems get bias introduced? The primary source is the world. The world is biased. The world is sexist. The world is racist. So, if all your data were totally accurate – if it reflected exactly what happened in the world, perfectly accurately – the data would not be biased, but the processes we live in, the world we live in, is biased. Just to be very clear; that's the major source of bias.

About Rayid Ghani and technology for good

Michael Krigsman: "Data Science, AI, and Machine Learning for Good." That's Rayid Ghani of Carnegie Mellon University.

Rayid Ghani: My work is at the intersection of all the machine learning, AI, data science buzzwords and all the social impact, public policy buzzwords, so the intersection of those areas. Most of my work is anchored around problems that I work on with governments and nonprofits around health, education, criminal justice, policing, transportation, and the environment.

With those problems, I then also focus on research areas that come up around fairness and explainability, and on teaching that provides students experiential learning opportunities to tackle those problems (through partnerships with governments and nonprofits).

Why is responsible AI important?

Michael Krigsman: Can you tell us why you're so passionate about this and why it's such an important set of issues?

Rayid Ghani: For me, the starting point was really trying to tackle problems that a lot of these organizations are facing when they're trying to improve outcomes for society, for people. They're trying to look at public health outcomes around improving people's health or providing people employment opportunities – problems like, which kids might get lead poisoning, and how do we act preventatively to reduce that risk; or which people might become unemployed, and what do we do proactively?

What skills do we provide them? What support do we provide them? What people are going to be subjected to police misconduct, and what can we do – again, preventative things?

That was the starting point for me: here are some big issues in society. How can we use data and evidence to help tackle these problems?

But as I started getting into those problems, I realized that it wasn't off the shelf. It wasn't, we just build something and it works. It required a lot of new things that had to be done and really trying to understand how we do these things in an equitable manner, in an ethical manner, and not just propagate the same old things that have been happening in the past.

It's really coming very much from a problem-centered, human-centered view of people going through these problems. Can we help organizations better improve outcomes for them? Then running into challenges when doing that and tackling those challenges through these approaches.

What are the ethical challenges in data science and AI?

Michael Krigsman: What are some of the core challenges that arise in relation to data science, machine learning, AI, and so forth?

Rayid Ghani: There isn't quite a shared understanding. I'm pretty inclusive in how I use these terms. For me, they're interchangeable. They're all part of the same thing.

Yes. They're all trying to use data and evidence to improve decisions. In that context, we treat them very broadly.

When we go a little bit deeper, we find that a lot of these technologies are very good at very narrow and very specific tasks and very specific contexts. The more narrow the task, the better these systems are going to be.

They also heavily rely on data, and the data is often curated, again, very carefully by people. They also heavily rely on a lot of design choices that are made by people.

As much as we think about these technologies as, "Oh, they can do a lot of things," the fundamental approach, the way they work, is they rely on a lot of choices people make, designers of these systems make, developers of these systems make.

That's where a lot of the challenges come, challenges around: Will they result in fair and equitable outcomes? Will they be understandable by the people who are using them and, more importantly, who are being affected by them? Are they robust to changes?

Just because some system is working today at being fair, does it then continue that way? Is it respecting the privacy of the individuals who are being impacted?

Is transparency involved there? What does transparency even mean when you're dealing with some of these issues that are not commonly well understood by people?

Are they inclusive? Who is accountable for the use of these: the people developing them, the people using them, the people being affected by them?

There's a whole series of ethical values that we want to design these systems to achieve. The question becomes, what values should they be, whose values, who should be the stakeholders involved in soliciting these values and getting to a consensus if they don't agree, and then how should we build these systems to achieve those values once you've figured out what they need to do? Those are a pretty massive set of challenges based on what we've seen so far.

What is the source of bias in AI?

Michael Krigsman: Can you give us a couple of examples of where bias or inequitable results arise because of these issues you were just describing?

Rayid Ghani: There are places we're all sort of used to seeing, and none of those places are any different from where today's human systems are creating biases.

If you take healthcare as an example, there are a lot of systems being developed to do things like early warning systems and early diagnosis of diseases. Some of the work we were doing a few years back with hospital systems was around whether we could detect which patients might be at risk of diabetes. We wanted to detect that early in order for the physicians to provide lifestyle change support and change people's behaviors, so we could prevent that from happening.

Now, at that level, that sounds great. Why would we not want to do that?

The question is if you've got limited resources to allocate to people in order to prevent diabetes – that hospital has resources to do additional outreach programs for, let's say, 1,000 patients – which patients should they be? These AI systems flag individuals as high risk, but they're going to make several types of mistakes. No system is perfect.

They might flag somebody who is actually not at risk. It's called a false positive. Or they might omit somebody who is actually at risk. That's called a false negative.

They make all types of mistakes. But if you make those mistakes randomly, you sort of miss some people. You flag some incorrectly. That's sort of the nature of these systems.
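The two error types described above can be sketched in a few lines. This is a minimal illustration with made-up labels and predictions, not output from any real system:

```python
# Minimal sketch: counting false positives and false negatives.
# The labels and predictions below are fabricated for illustration.

actual =    [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = patient truly at risk
predicted = [1, 1, 0, 1, 0, 0, 0, 0]  # 1 = system flagged as high risk

# Flagged as at risk, but actually not at risk
false_positives = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
# Actually at risk, but missed by the system
false_negatives = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

print(false_positives)  # 1
print(false_negatives)  # 2
```

If these mistakes fell randomly across the population, they would be the unavoidable cost of an imperfect system; the equity problem Ghani describes next arises when they concentrate on particular groups.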

But here, you're disproportionately making mistakes on certain types of people who have systematically been disadvantaged. Take people in the U.S. healthcare system. Typically, Black and brown people underuse hospitals. They don't have access to hospitals as much. They don't go to the hospital as much. They get underdiagnosed. It's very well known.

While that's a fact, when an AI system is built on top of that, it propagates it. If somebody is underdiagnosed, the AI system is going to further under-diagnose those individuals and not flag them as high risk, which means the physicians, in addition to what they were doing before, are even less likely to provide them with the support programs, with the interventions. That's going to result in even more underdiagnosis.

That's an example of a system that's built to replicate the past, which is what most AI and machine learning, by default, are. You give it training data, and you ask it to build a system that best replicates that data (in the past).

Now, when new patients come to you, and you score these patients to flag them as high risk, if your historical data was biased (in this case because certain people are being underdiagnosed for various reasons), then your AI system will just propagate that. That causes issues around equity and fairness and leads to disparate outcomes for people who already are undergoing such issues historically.

That's one example, but the same thing repeats around the criminal justice system, or if you're trying to use a system to figure out who needs mental health services or social services. You might miss people, disproportionately.

The same for the work that we're doing around child welfare or in education and after-school programs to get students to graduate on time from high school. If we miss students disproportionately from different backgrounds, that ends up hurting those groups disproportionately and making the inequities that exist today even worse.

Michael Krigsman: We have a couple of really interesting questions that have popped up on both Twitter and LinkedIn. I'm just going to start taking them in order.

What are some examples of AI ethical issues in healthcare?

The first question is from Suman Kumar Chandra on LinkedIn. He says, "Can you give an example of how human intervention and choices will affect/impact the outcomes that you were just describing, those negative outcomes?"

Rayid Ghani: What happens is when we build a machine learning system, we make a series of choices. Let's say we make thousands of choices. It starts by thinking about how we formulate that problem.

There's a pretty popular paper now (that came out a couple of years ago) around healthcare. What the paper was doing was predicting which individuals in the healthcare system will have additional healthcare needs. Who is going to be more in need? They formulated that problem.

When you build machine learning models, one of the key things we use are called labels or outcomes. It's not as if an outcome is handed to us. We formulate the problem and define an outcome.

In this case, what they thought was, "Oh, how much did this person cost? How much money was spent on them by the healthcare system?" It seems like a reasonable proxy for their needs.

High cost, high needs, so let's just build a system. The intention was to predict need. The actual design choice made was to predict cost because they thought it would be a proxy and a reasonable proxy.

Now, what happened was, when they built that system, it predicted who is going to have high cost. Well, again, there are going to be some people who will have high needs but low cost because they didn't use the system very much.

They needed the system; they didn't use it. The system failed them. They didn't have access to it. They didn't order the right tests, they went to places that didn't do the right tests, or they were in areas where the cost was less.

Now, what happened was that while that system was predicting cost, the assumption was that it was predicting need. The problem could have been formulated in different ways – thinking about what would be good proxies for actual needs instead of just cost. That was a very basic choice.

Then, additionally, we make choices when we link data sources together. A common component of most machine learning systems is linking different data sources, and we do this through record linkage or matching.

When we do that, we make mistakes sometimes. We link people that shouldn't be linked together, and we miss people that should be linked together.

Those mistakes are not random. Often, people whose names are long, with lots of consonants together, get mistyped and mis-linked. People with very similar, common names like Joe and Mike get linked together even though they might not be the same person.

Ideally, that mistake wouldn't happen. But, in practice, it does happen.

What happens is, if you link two people together who are not the same, you've added interactions. You've added somebody else's information to that person's record. If you miss somebody, you've removed information.

If you're now applying for a loan and your records were not linked together, you might have had extra jobs and income that the system now doesn't know about, and it might predict that you're not able to pay back the loan. You don't have the credit history, and you might not get the loan. Or if somebody else with a similar name has a bankruptcy and you've linked their data together, now you've added a bankruptcy to your record.
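A toy sketch of the linkage failure described here, with invented names and records: matching on name alone merges two different people, so one person's bankruptcy lands on another's record.

```python
# Toy sketch: naive record linkage that matches on name alone.
# Names and values are invented; real linkage uses more fields
# (date of birth, address) and probabilistic matching.

records_a = [{"name": "Joe Smith", "income": 52000}]
records_b = [{"name": "Joe Smith", "bankruptcy": True}]  # a different Joe Smith

linked = []
for a in records_a:
    for b in records_b:
        if a["name"] == b["name"]:   # matching on name alone is too coarse
            merged = {**a, **b}      # someone else's bankruptcy now sits on this record
            linked.append(merged)

print(linked[0])  # {'name': 'Joe Smith', 'income': 52000, 'bankruptcy': True}
```

The reverse failure (missing a true match because a long name was mistyped) silently drops jobs, income, or history from a person's record, with the consequences Ghani describes for loan decisions.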

How you link things together, the mistakes you make, how you design those things – the same applies to simple things, like missing information about people. Somebody's income is missing, and we do something called imputation (filling in missing values).

If you don't think about it carefully and your data is, let's say, 80% males and 20% non-males, and you are imputing a non-male's missing income, you might do something really naïve and bad and just say, "I'm going to take the mean of everyone." Now you've made non-males look a lot like males, because 80% of the data is male.
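A minimal numeric sketch of that imputation choice, with fabricated incomes: when one group dominates the data, the global mean is pulled toward that group, so imputing with it makes the minority group look like the majority.

```python
# Sketch: global-mean vs. group-wise imputation.
# Incomes are fabricated; the majority group dominates the global mean.

incomes = {
    "male": [80000, 90000, 85000, 95000],  # 80% of the observed data
    "non_male": [50000],                   # the one observed non-male income
}

all_observed = incomes["male"] + incomes["non_male"]
global_mean = sum(all_observed) / len(all_observed)              # pulled toward the male mean
group_mean = sum(incomes["non_male"]) / len(incomes["non_male"])  # non-male mean only

print(global_mean)  # 80000.0 -- what naive imputation would fill in
print(group_mean)   # 50000.0 -- a group-aware alternative
```

Neither choice is automatically right; the point is that it is a choice the developer makes, with downstream consequences for whichever group the filled-in value misrepresents.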

None of those choices are inherent in machine learning, in the data you got, in the problem you're solving. It's a choice that you made as a machine learning or AI system developer.

What we do in our projects, in our training, in our programs, we discuss each choice we make at each step of the process, of the ML process, the AI process, and talk about what are the design choices we're making – What options do we have? What are the ethical implications of each choice downstream (two months later, four months later, a year later) on the people who are being impacted? – and try to make those choices consciously to minimize that risk that we're taking at this point.

As opposed to doing kind of an upfront or a retrospective, we do a continuous ethics conversation around each phase of the project. These choices that seem very small have pretty high impact on ethical and equity issues when applied to people in these contexts.

Michael Krigsman: Please subscribe to our newsletter by hitting the subscribe button at the top of our website. If you're watching on YouTube, then you should subscribe to our YouTube channel.

We have a couple of questions now from Tim Crawford. I know Tim. He is very prominent in CIO circles.

What is the impact of culture in driving socially responsible AI?

Tim's first question, he says, "What is the impact of culture in driving AI, or is it the reverse that AI is driving culture?" The interplay between culture, AI, and addressing the issues you've just been describing.

Rayid Ghani: I think there are two types of culture there. Let's talk about both.

One is the culture of the teams that are developing and designing these systems. The second is the society that these systems are being designed for.

Ideally, those two have high overlap. The people designing these systems are in the community that these systems are being designed ideally with and not for.

Unfortunately, today that is not true. Most of us developing these types of systems are not part of the communities they're necessarily being developed for, and they're not being developed with them. They're being developed for them.

I think that's one dimension we need to be careful about: we need to develop the talent, the people who work with communities, so that these systems are co-created with those communities and communities are part of the process. That's not the case today.

The other piece that I think is important is the teams themselves. We need to have the right environment set up for these teams, the right culture, so that these types of discussions happen.

What we often hear about is people are worried about raising these types of concerns in engineering teams. "Well, I have an issue. I have a concern about some of the choices we're making because they might have ethical issues," and they don't have a good setup for bringing up these issues, for having a framework.

I think we need to have the right environment to do this. We need to have the right people with the right training so that they know how to have these discussions and that they've had them before.

I think, in addition to that, we do need more processes and checklists. We need to make sure that we follow the process. Have we had conversations with all the stakeholders?

We often talk to people. At least in the spaces I work in, where we're working on these types of societal issues, we talk to people who are using the system, but we may not talk to people who are being impacted by the system. Having a process that makes them a critical part of the work is important. I think we're getting there, but we don't have repeatable, standardized processes to do that yet. I think we need to get there.

I think the tech and AI development culture of today is very much build fast, fail fast, and break things. It doesn't necessarily work when we're dealing with fairly critical issues around society – again, health, education, and criminal justice. We need to be much more conscious of the impact we're having on people, and we need development environments and cultures where this can happen much more responsibly.

How can we address human bias when it comes to AI and machine learning?

Michael Krigsman: We have a couple of questions on the issues of bias. I want to go back. Tim Crawford has a second question, which he says, "Do you introduce new types of bias as you're trying to shape the data to remove the original bias that you were trying to address?" Let's talk about that first.

Rayid Ghani: Data is not the only or, in many cases, even the major source of bias in those types of systems. You step back and say, "Well, how do machine learning and AI systems get bias introduced in them?"

The primary source is the world. The world is biased. The world is sexist. The world is racist.

If all your data were totally accurate – if it reflected exactly what happens in the world, perfectly accurately – the data would not be biased. But the processes that we live in, the world we live in, is biased.

Just to be very clear; that's the major source of bias. That bias comes in, and the world generates data that embeds all of its biases in there.

Then we make choices as system designers. We use certain data sources because they're available to us.

If I'm in a hospital, again, I'm going to use my EHR system because that's what I have. I may not go out and say, "Well, there are a lot of people in my community who don't come to my hospital. Maybe I should go collect data about them, with them, and understand what I'm missing." Or I might have access to social media data, and I will just hit that data and assume it reflects the entire world.

I make a lot of choices around data and, to Tim's point, if I add a data source, I am reducing one type of bias. But I may be adding another type of bias. The next source of bias becomes the design choice that I talked about, how my ML is working and what choices I'm making there.

Let's say I've magically figured out some way of getting everything right up to now. The last piece in most of these systems is often the human intervention. It's, how did the physicians intervene? How did the social service worker intervene?

That intervention, the action that we take – an unemployment counselor connecting somebody with a training program, or the after-school program coordinator enrolling a child into a program to help them graduate on time, or a health inspector going into a house and looking for lead hazards – my AI system can be perfectly fair, but that intervention can be unfair.

What if I'm doing outreach for a mental health program? My outreach is by phone, in English. It doesn't matter how fair my machine learning model is. I'm going to disproportionately miss people who are not reachable by phone, whom I don't have accurate numbers for, and who don't speak English. That's not an AI problem, but it's the impact of my system.

I think it's one of the things that we kind of need to keep in mind, and it's not a novel idea. It's a pretty simple idea. We're looking at a larger system of which our machine learning or AI is a tiny component.

We need to look at the whole thing and not just focus on let's make the data unbiased, let's make our model unbiased. We need to think about how we measure the equity in outcomes of the people we're trying to help.

Then how do we figure out which component of our system is responsible? Is it the human component? Is it the intervention component? Is it the designer component? Is it the machine learning component? Focus on the things that will help us get there.

I went on a bit of a tangent because the question was about whether removing one type of bias might introduce another. You're right: you want to set up your framework in a way that you are auditing the types of biases you're introducing.

You may never get to a totally unbiased system. But that's not the immediate goal. When we build a system, we want to compare that system to the status quo, not to perfection.

We want to measure and say, "Look. Is this system better enough than the status quo for it to be worth implementing?" If it is, let's get it there.

Yes, we introduce some biases. We reduce some biases. Let's go through a process where we measure that change and figure out if this change is better enough to implement.

What's important is to figure out who should be part of that discussion. The people impacted by it are the primary ones, along with the people who are using it and the people who will be taking action on it.

If we set up our framework right, we should be able to make an informed decision about, again, whether we are better off than the status quo – than the system that's being used today (whether it's human-driven, machine-driven, or a combination). That's at least how I think about it: setting up that baseline based on today, trying to improve over it, measuring that, running pilots, and then seeing if this is worth implementing now while we continue to improve toward the perfection that we want.

How can we avoid human bias in AI algorithms and data?

Michael Krigsman: Arsalan Khan—who is a regular listener of CXOTalk and asks questions that get right to the point—comes back, responds to you, and says, "Okay, fine. So, how do we avoid bias in the data and in the algorithms that power the AI?" Given everything you're talking about, what do we actually do?

Rayid Ghani: We do basically three things. We set up an audit process of our system. During the development process (when we're building our systems), we keep auditing them to see what biases do they have, on which groups, and how do they compare to the status quo.

That gives us an accurate assessment, as accurate as we can, of how good or bad is our current thing that we've developed so far. If it's not good enough, if it's not better enough, then there's a series of methods and tools that exist to try to reduce that bias.

If the bias is coming from the data, we might need to collect additional data from sources, so we go out, collect, and get data. If the bias is coming from design choices we made, we make different design choices.
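A minimal sketch of the kind of audit step described above: computing an error rate per group during development so disparities surface before deployment. The data is fabricated, and real audits use metrics and tools chosen with stakeholders; this only shows the shape of the check.

```python
# Sketch of a per-group audit: false negative rate (missed at-risk people)
# by group. All rows are fabricated for illustration.

rows = [
    # (group, actually_at_risk, flagged_by_system)
    ("white", 1, 1), ("white", 1, 1), ("white", 1, 0), ("white", 0, 0),
    ("non_white", 1, 0), ("non_white", 1, 0), ("non_white", 1, 1), ("non_white", 0, 0),
]

def false_negative_rate(group):
    """Share of truly at-risk people in `group` the system failed to flag."""
    at_risk = [(a, f) for g, a, f in rows if g == group and a == 1]
    missed = sum(1 for a, f in at_risk if f == 0)
    return missed / len(at_risk)

for group in ("white", "non_white"):
    print(group, false_negative_rate(group))
# Here mistakes fall twice as often on one group -- the audit flags it.
```

Running an audit like this at every development iteration, and comparing the numbers against the status quo process, is the continuous checking Ghani describes, rather than a one-time retrospective review.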

If we develop a model that is—

Let me give you an example of a specific project. This is work we're doing with Los Angeles, and this is with the city attorney's office in Los Angeles.

Their goal was to reduce misdemeanor recidivism – the number of people who keep coming back into the criminal justice system.

A lot of these people, they needed different diversion programs, different social services programs.

Because they weren't being provided with those, they kept getting stuck in the system.

The city attorney's office said, "Look. If you can help us figure out which individuals are likely to be at risk of coming back into the system, we're going to figure out programs that we can provide and connect them with so that it reduces their risk of cycling into the system."

That's sort of a common project we've done with several different organizations – some in Kansas and some other cities and counties – to help them do preventative outreach programs, mental health assistance programs, and social services programs for people who may otherwise be at risk from the criminal justice system.

The setup was that they could help a couple of hundred people a month. Could we give them a list of the people they should be prioritizing?

The first system we built was focused on efficiency. The idea was, can we build a system as accurate as possible so, when we give them a list of 150, as many of them as possible are going to come back into the system, so that all their resources are allocated towards people (or as many of them towards people) who are actually at risk?

Now, you might call it the most accurate system. But if you think about it from ethical values, it's an efficient system. It's maximizing the number of people you're helping with limited resources.

When we looked at the system, we found that it was accurate, but it was more accurate for white people than non-white people. It turns out that, in LA, the recidivism rate was much higher for non-white people than white. We're starting with this recidivism rate for non-white, this one for white, and we have a system which is accurate, but differentially accurate.

Now, if you play it out over the years: it starts off here, and it's more accurate. It's accurate for both groups. It reduces the recidivism rate for both. But it reduces the white rate faster than the non-white rate.

So you start here, and over time you increase the disparities in the recidivism rate. It is the most efficient option – 80% of the people it flags come back – but it increases disparities.

We built option number two, where we tweaked the algorithms to be equally accurate for both groups. It was slightly less efficient – about two percentage points – and here's what that did. It reduced recidivism rates for both groups, but equally.

It didn't increase disparities, but it didn't decrease them either. It kind of propagated, in some ways, the status quo disparity, at a 2% additional cost.

We built option number three, where we made the accuracy for non-white people proportional to their needs. Again, it was accurate for both, but more so for non-white than white people. It started from the same place, but here's what it would do: it would decrease the disparities and get to equity in recidivism rates, and it was about the same cost as number two.

Basically, we provided this menu option:

  • Option number one focused on efficiency. It's 80% efficient, but it increases disparity.
  • Option number two, equality: slightly less efficient (78%). It preserves status quo disparities. It doesn't increase them; it doesn't decrease them.
  • Option number three, equity: the same efficiency as number two, but it decreases the disparities to get to equity. What do you want to do?
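The menu above can be summarized as plain data. The efficiency figures are the approximate ones quoted in the conversation; the structure itself is only illustrative, not the actual decision tool used in Los Angeles:

```python
# Sketch: the policy menu as data. Efficiency values are the approximate
# figures quoted above; "disparity_effect" is each option's impact on the
# gap in recidivism rates between groups.

options = {
    1: {"focus": "efficiency", "efficiency": 0.80, "disparity_effect": "increases"},
    2: {"focus": "equality",   "efficiency": 0.78, "disparity_effect": "preserves"},
    3: {"focus": "equity",     "efficiency": 0.78, "disparity_effect": "decreases"},
}

# A stakeholder who values reducing disparities picks the option that
# decreases them -- the choice LA's city attorney's office made:
chosen = next(k for k, v in options.items() if v["disparity_effect"] == "decreases")
print(chosen)  # 3
```

Framing the trade-off this way keeps the technical bias-correction work behind the scenes while presenting stakeholders with the policy choice itself, which is the point Ghani makes next.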

Now, behind this, there's a bunch of technical work, and I'd point you to the papers. If you search for "Los Angeles," "fairness," and my name, you'll find them. They cover the technical details: we had to develop certain algorithms to deal with the bias in our models, in our predictions.

From the policy side, providing a policy menu and saying, "Well, here's the cost, here's the outcome. What do you care about?" that's sort of a tangible thing you can do and work with the different stakeholders to figure out what do we want to do.

They ended up choosing to go with number three, and that was the right thing to do. But we often don't give business managers and policymakers these types of choices. We give them lots of technical jargon and don't make it clear that, yes, there's a lot of technical machinery in the back – a lot of different algorithms doing bias correction – but in the front, those choices are really policy choices, and they rely on the values you care about.

Our job as AI and machine learning designers and developers is really mapping from those values to the code that we write and the models that we build so that the outcome is aligned with these values – and the process has to be much more inclusive. That's an example of a project where, very tangibly, we had to figure out how to do this.

What skills are needed to create explainable AI and focus on AI ethics and society?

Michael Krigsman: We then have a question following up from this. This is on LinkedIn from Suman Kumar Chandra. He comes back and says, "Okay. Given all of this, on the skillset front, what skills are needed in this field to add value in data-driven equity, fairness, justice (apart from the engineering skills)?"

Rayid Ghani: Without the people who have these skills, we're not going to get there. Yes, you need the technical skills, the math skills, the data skills, the programming skills, the engineering skills.

But not every machine learning person needs to be an expert in ethics. Ideally, every machine learning and AI person would also be an ethicist, but that takes years of training and experience.

I think what we need to be aware of is we have to build teams that consist of people who have that training and that background. We need to be able to have a conversation. We need to be able to be a little bit more people-driven.

The human-centered design is another buzzword, but that's something we need to learn how to do because all of these systems (that at least I'm talking about) are designed to achieve certain outcomes for humans. They're designed for humans, to work with humans, and to impact human outcomes.

The skills to work with communities, to understand their values, to reach consensus, to work with people, and to work with social scientists come up in a lot of the projects that I work on. I run a program called Data Science for Social Good.

That's actually going on right now. We have about 24 fellows (mostly grad students) coming in from around the world, working on projects like the ones I'm talking about. They come from computer science, social science, stats, math, and engineering because we need those skills. We need people from design.

Usually, when I talk about these skills, there's a large set that we need: an understanding of law, ethics, communication, managing teams and programs, and working with community organizations. It doesn't mean we all have to be experts in all of them. We have to understand the need for them, and we have to be able to work in teams with people who have deep expertise there.

The short answer is that you don't need to be an expert. You need to be able to collaborate with people who have those skills, bring them into your teams, be aware of that need, and be open to their input.

I think the best way to do it, honestly, is to work with people like that. It's very hard to pick up that skillset by yourself and be an expert in all of these areas. For those of you building teams in your organizations, having a set of people who hold that collective expertise and can talk to and work with each other is the recipe (at least the one we've found) that helps us when we embark on these projects.

What kinds of policy governance is needed to oversee AI algorithms and reduce human bias?

Michael Krigsman: Arsalan comes back, and he wants to know about governance. He says, "Since everyone is creating their own AI, do we need a broader governance structure for AI even at the federal level or through a third party that determines which AI is good and which AI is bad (from an equity standpoint)?"

Rayid Ghani: I wouldn't call them longstanding conversations because "long" in AI doesn't mean very much, but that conversation has been happening for the last couple of years.

Some countries are farther ahead, and some agencies are farther ahead. Europe has been further along in thinking about what the regulations around governance should look like.

The U.S. has been a little bit more distributed. Every company has a few of these frameworks. Every city has at least one, if not more. Federal agencies have started to create their own.

A couple of months back, NIST (the National Institute of Standards and Technology) came out with what was a good start towards a governance framework for these systems.

There are going to be many of them, and they are all going to be somewhat similar. One thing we still need (we're getting there, but we're not there yet) is operational detail. They're all very high level. We would probably agree with all of them, but they're not necessarily operational. They offer lots of good frameworks.

Now, if you say, "Okay. What do I do tomorrow? Give me a tangible set of things to do," they won't get you there today. They're working towards that.

That being said, for me, the very simple way I describe it is: one AI is better than another if it achieves the values we care about more reliably. Now, I'm punting on your question because I'm taking a question about building the AI and putting it on the values side.

Which values I want is not an AI question. If we all agreed on this ("Here are the values our system needs to have: we want it to be equitable for these types of people, over this period, for these types of outcomes; we're going to respect this type of privacy; these people need to be accountable; and it needs to be transparent in this manner"), then we could compare different systems against those requirements because the values already exist.

If you take federal agencies (the Consumer Financial Protection Bureau, the Environmental Protection Agency, the Federal Trade Commission, the Food and Drug Administration), all of those agencies have these values. Those values govern their human-driven systems today. They know what fair and unfair lending look like, what the process for drug approval looks like, what counts as a hazardous waste violation, what antitrust means, and what the privacy rules are.

They know the values. What they don't have is the expertise: the people, the tools, and the processes to do investigations and audits when AI systems are involved. That's what they're trying to figure out right now: how to do that.

They're making a good start. A lot of the values exist, but we haven't updated them to reflect what's needed when the thing on the other side is not a human process but an AI-supported process. Not an AI-automated process.

None of these things are automated. They're still ideally, and for good reason, collaborative. AI is supporting humans making decisions.

I think the better one is the one that better helps you achieve those values. The question is, can we design tools and processes? Can we train people to create a system that's better able to evaluate, audit, compare, and monitor that? Because, again, unlike a lot of other technologies, what's different about AI is that it's not static.

Even if it is static, the world is not static. The world applied to an AI system will give you different results over time, so you need to be constantly monitoring and updating. You have to have the people who are also doing these updates.
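
The monitoring Ghani describes can start very simply. Here is a minimal sketch (the rates and tolerance are invented for illustration) that compares the share of positive predictions today against a deployment-time baseline and flags drift:

```python
# Sketch (hypothetical numbers): the model may be static, but the world
# is not, so monitor deployed predictions against a deployment-time
# baseline and flag when they drift.
def drift_alert(baseline_rate: float, current_rate: float,
                tolerance: float = 0.05) -> bool:
    """Flag when the share of positive predictions moves too far."""
    return abs(current_rate - baseline_rate) > tolerance

print(drift_alert(0.20, 0.22))  # False: within tolerance
print(drift_alert(0.20, 0.31))  # True: the input population has shifted
```

A real deployment would track many such statistics (per feature, per group, per outcome), but the principle is the same: define a baseline, measure against it continuously, and have people ready to act on the alerts.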

Michael Krigsman: What happens if policymakers are trying to develop governance frameworks, what have you, and they just simply don't understand AI? What do we do about this? Very quickly.

Rayid Ghani: You should work with them, collaborate with them, help them. They shouldn't be developing in isolation, and neither should the private sector (in isolation). I think this is a collaborative exercise, so we need to work together.

It needs to be a transparent and open process. It shouldn't just be built on the side and thrown over the wall with, "This is the governance process." That's my very short answer.

Michael Krigsman: Arsalan comes back, and he says, "We know AI has great potential, but how do you avoid people who veto the recommendations by the AI? What do you do when the AI is making a recommendation you know, as an expert, is the right recommendation, and the people who are in charge are saying, 'No, no, no. That's wrong.'?"

Rayid Ghani: It's not an autonomous system that's making a decision. It's supporting a human.

Sometimes, the AI is right and the human can use that help. Many times, most of the time the AI is wrong and the human needs to be able to override the AI.

I think the question becomes: how do we design our systems to not be this one-shot "I'm going to give you a decision. I'm going to give you a recommendation"?

Really, I talk to caseworkers, social service workers, unemployment counselors, high school counselors: people on the front line doing really important work whom we want to support with these tools. They don't want an answer. They want more information. They want to make an informed decision.

The way to design these systems is to be much more collaborative, to have a conversation with them and say, "I think this is the recommendation."

They might ask and say, "Well, why do you think so?"

"It was because of this."

It's like, "Yeah, but last time we did this with this person, here's what the outcome was. It didn't work for them."

It's like, "Yeah, but this person is—"

Think of it as designing these systems to be much more interactive. The way I look at it: if I build a system that is (as you said) 100% accurate, and the person taking the output overrides it every time, then my impact is zero. I need this person to take action, but only when that action is the right action.

I think today's AI, at least in these areas that we're working on, isn't reliable and robust enough for me to say, "I want to build something that you follow every single time blindly." I don't trust it. It's not there yet.

I really want this to be explainable and transparent. It's not just transparency because transparency doesn't mean comprehensibility.

I want something that this person can interact with, figure out when it's wrong, why it's wrong, override it when it makes sense, correct it when it makes sense, follow it when it's right. Really, together, get towards outcomes that we want.

That's not easy. What I'm saying we should build is not easy, but I also think it's too dangerous to say, "I trust my system to be right all the time; this human just doesn't follow it." I don't know of an AI system that is so good and so right that the only thing left to fix is the humans.

I think we need to develop things that are a little bit more aware of their uncertainty, a little bit more aware of when they're right, when they're wrong, and able to explain a little bit of their reasoning in order to help the human collaborator better assess its limitations and its benefits and take action. At the end, we want the right action to be taken regardless of where it came from. That would be my advice there.

How can we increase AI explainability?

Michael Krigsman: What advice do you have for folks on the subject of--? I'm laughing because I want to see how you respond to this one. In one sentence – algorithm, transparency, and explainability – what advice do you have?

Rayid Ghani: Figure out what you need this transparency and explainability for. What's the purpose? What's the goal?

Who do you need it for? Who is the consumer of this?

How are you going to measure if your system is transparent and explainable?

Anchor it on a real problem. Don't do something abstract.

Take the problem you're trying to solve in that system. Figure out who the user is that you want this transparency and explainability for. How would you evaluate whether it is transparent and explainable? What does that help you achieve?

There are some writings, some papers, that we have that go a little bit more into how you do that. I'm happy to point people to that if you're interested.

Michael Krigsman: We have a lot of technologists and data science folks who listen to CXOTalk. What advice do you have for technologists, including chief information officers, on the topics of fairness and equitability and what they need to focus on? Again, just in one sentence, please.

Rayid Ghani: Build your teams to have this collection of skills and expose them to the entire flow: the goal of the product or service you're developing, the formulation of the problem, the identification of data sources, the development, the evaluation, the monitoring, and the deployment. We often silo our people and keep them very single-disciplinary.

If we want to build systems that have high, equitable impact, we need to broaden the teams and make them much more diverse and inclusive so they reflect the problems we're working on. We also need to give them visibility into the entire chain because the more of that they have, and the better equipped they are with the right environment, the more likely they are to produce the outcomes and systems we care about.

Michael Krigsman: Okay. With that, we are out of time. I want to say a huge thank you to Rayid Ghani. He is with the machine learning department in computer science at Carnegie Mellon University. Rayid, thank you so much for taking time with us today.

Rayid Ghani: No, thank you. These questions and this conversation were great. Thanks for having me here.

Michael Krigsman: A huge thank you to everybody who watched. Your questions, as always, are phenomenal. You're a great audience. I learn so much from you guys every single time. I feel like I'm the proxy or the channel for you guys who are watching, so lucky me, and lucky you for getting to ask questions as well.

Now, before you go, please subscribe to our newsletter by hitting the subscribe button at the top of our website. If you're watching on YouTube, then you should subscribe to our YouTube channel, and we'll keep you notified. You'll get to hear about all these great, upcoming episodes.

Everybody, thank you so much for watching. I hope you have a great day, and we'll see you soon. 

Rayid Ghani: How do machine learning and AI systems get bias introduced? The primary source is the world. The world is biased. The world is sexist. The world is racist. So, if all your data were totally accurate and reflected exactly what happened in the world, perfectly accurately, the data itself would not be biased. But the processes and the world we live in are biased. Just to be very clear: that's the major source of bias.

About Rayid Ghani and technology for good

Michael Krigsman: "Data Science, AI, and Machine Learning for Good." That's Rayid Ghani of Carnegie Mellon University.

Rayid Ghani: My work is at the intersection of all the machine learning, AI, data science buzzwords and all the social impact, public policy buzzwords, so the intersection of those areas. Most of my work is anchored around problems that I work on with governments and nonprofits around health, education, criminal justice, policing, transportation, and the environment.

From those problems, I then also focus on the research areas that come up around fairness and explainability, and on teaching that provides students experiential learning opportunities to tackle those problems (in partnership with governments and nonprofits).

Why is responsible AI important?

Michael Krigsman: Can you tell us why you're so passionate about this and why it's such an important set of issues?

Rayid Ghani: For me, the starting point was really trying to tackle the problems a lot of these organizations face when they're trying to improve outcomes for society, for people. They're looking at public health outcomes, improving people's health, or providing employment opportunities. Problems like: which kids might get lead poisoning, and how do we act preventatively to reduce that risk? Or which people might become unemployed, and what do we do proactively?

What skills do we provide them? What support do we provide them? Which people are going to be subjected to police misconduct, and what can we do preventatively?

That was the starting point for me: here are some big issues in society. How can we use data and evidence to help tackle them?

But as I started getting into those problems, I realized that it wasn't off the shelf. It wasn't that we just build something and it works. It required a lot of new work, and really trying to understand how to do these things in an equitable and ethical manner, not just propagate the same things that have happened in the past.

It really comes from a problem-centered, human-centered view of the people going through these problems. Can we help organizations better improve outcomes for them? Then it's about running into challenges along the way and tackling those challenges through these approaches.

What are the ethical challenges in data science and AI?

Michael Krigsman: What are some of the core challenges that arise in relation to data science, machine learning, AI, and so forth?

Rayid Ghani: There isn't quite a shared understanding of these terms. I'm pretty inclusive in how I use them. For me, they're interchangeable. They're all part of the same thing.

Yes, they're all trying to use data and evidence to improve decisions. In that context, we treat them very broadly.

When we go a little bit deeper, we find that a lot of these technologies are very good at very narrow, very specific tasks in very specific contexts. The narrower the task, the better these systems are going to be.

They also heavily rely on data, and the data is often curated, again, very carefully by people. They also heavily rely on a lot of design choices that are made by people.

As much as we think of these technologies as being able to do a lot of things, the fundamental way they work is that they rely on a lot of choices made by people, by the designers and developers of these systems.

That's where a lot of the challenges come, challenges around: Will they result in fair and equitable outcomes? Will they be understandable by the people who are using them and, more importantly, who are being affected by them? Are they robust to changes?

Just because some system is working today at being fair, does it then continue that way? Is it respecting the privacy of the individuals who are being impacted?

Is transparency involved there? What does transparency even mean when you're dealing with some of these issues that are not commonly well understood by people?

Are they inclusive? Who is accountable for the use of these: the people developing them, the people using them, the people being affected by them?

There's a whole series of ethical values that we want to design these systems to achieve. The question becomes, what values should they be, whose values, who should be the stakeholders involved in soliciting these values and getting to a consensus if they don't agree, and then how should we build these systems to achieve those values once you've figured out what they need to do? Those are a pretty massive set of challenges based on what we've seen so far.

What is the source of bias in AI?

Michael Krigsman: Can you give us a couple of examples of where bias or inequitable results arise because of these issues you were just describing?

Rayid Ghani: There are places we're all used to seeing this, and none of them are any different from where today's human systems create biases.

If you take healthcare as an example, a lot of systems are being developed for things like early warning and early diagnosis of diseases. Some of the work we were doing a few years back with hospital systems was around whether we could detect which patients might be at risk of diabetes. We wanted to detect that early so that physicians could provide lifestyle-change support and change people's behaviors, so we could prevent it from happening.

Now, at that level, that sounds great. Why would we not want to do that?

The question is, if you've got limited resources to allocate to preventing diabetes (say the hospital has resources to run additional outreach programs for 1,000 patients), which patients should they be? These AI systems flag individuals as high risk, but they're going to make several types of mistakes. No system is perfect.

They might flag somebody who is actually not at risk; that's called a false positive. Or they might miss somebody who is actually at risk; that's called a false negative.

They make all types of mistakes. But if you make those mistakes randomly, you sort of miss some people. You flag some incorrectly. That's sort of the nature of these systems.

But what if you're disproportionately making mistakes on certain types of people who have been systematically disadvantaged? Take the U.S. healthcare system. Typically, Black and brown people underuse hospitals. They don't have as much access to hospitals, they don't go as often, and they get underdiagnosed. It's very well known.

While that's a fact, when an AI system is built on top of that data, it propagates it. If somebody is underdiagnosed, the AI system will further underdiagnose those individuals and not flag them as high risk, which means the physicians are even less likely to provide them with the support programs and interventions. That results in even more underdiagnosis.
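
This kind of disparity only shows up if you audit errors per group, not just overall. A minimal sketch (the records and group names are entirely made up) that computes false negative rates, the at-risk people a model fails to flag, separately for each group:

```python
# Sketch (hypothetical data): the same model can hide very different
# error rates across groups. Compute per-group false negative rates.
from collections import defaultdict

# (group, actually_at_risk, flagged_by_model): illustrative records
records = [
    ("group_a", True, True), ("group_a", True, True),
    ("group_a", True, False), ("group_a", False, False),
    ("group_b", True, False), ("group_b", True, False),
    ("group_b", True, True), ("group_b", False, False),
]

missed = defaultdict(int)   # at-risk people the model failed to flag
at_risk = defaultdict(int)  # all at-risk people, per group

for group, truly_at_risk, flagged in records:
    if truly_at_risk:
        at_risk[group] += 1
        if not flagged:
            missed[group] += 1

for group in at_risk:
    fnr = missed[group] / at_risk[group]
    print(f"{group}: false negative rate = {fnr:.2f}")
```

In this toy data, group_b's false negative rate is double group_a's even though both groups have the same number of at-risk people, which is exactly the underdiagnosis pattern described above.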

That's an example of a system that's built to replicate the past, which is what most AI and machine learning, by default, are. You give it training data, and you ask it to build a system that best replicates that data (in the past).

Now, when new patients come to you, and you score these patients to flag them as high risk, if your historical data was biased (in this case because certain people are being underdiagnosed for various reasons), then your AI system will just propagate that. That causes issues around equity and fairness and leads to disparate outcomes for people who already are undergoing such issues historically.

That's one example, but the same thing repeats in criminal justice, or if you're trying to use a system to figure out who needs mental health or social services. You might miss people disproportionately.

The same for the work that we're doing around child welfare or in education and after-school programs to get students to graduate on time from high school. If we miss students disproportionately from different backgrounds, that ends up hurting those groups disproportionately and making the inequities that exist today even worse.

Michael Krigsman: We have a couple of really interesting questions that have popped up on both Twitter and LinkedIn. I'm just going to start taking them in order.

What are some examples of AI ethical issues in healthcare?

The first point is Suman Kumar Chandra. This is from LinkedIn. He says, "Can you give an example of how human intervention and choices will affect/impact the outcomes that you were just describing, those negative outcomes?"

Rayid Ghani: What happens is when we build a machine learning system, we make a series of choices. Let's say we make thousands of choices. It starts by thinking about how we formulate that problem.

There's a pretty well-known paper (that came out a couple of years ago) in healthcare. The paper was about predicting which individuals in the healthcare system would have additional healthcare needs. Who is going to be more in need? That's how they formulated the problem.

When you build machine learning models, one of the key things we use are called labels or outcomes. It's not as if an outcome is handed to us. We formulate the problem and define an outcome.

In this case, what they used was, "How much did this person cost? How much money was spent on them by the healthcare system?" It seemed like a reasonable proxy for their needs.

High cost, high needs, so let's just build a system. The intention was to predict need. The actual design choice made was to predict cost because they thought it would be a proxy and a reasonable proxy.

Now, what happened was, when they built that system, it predicted who was going to have high cost. But there are going to be some people who have high needs but low cost because they didn't use the system very much.

They needed the system; they didn't use it. The system failed them. They didn't have access to it. They didn't order the right tests, they went to places that didn't do the right tests, or they were in areas where the cost was less.

So while that system was predicting cost, the assumption was that it was predicting need. We could have formulated the problem differently and thought harder about what good proxies for actual need are, instead of just cost. That was a very basic choice that was made.

Then, additionally, we make choices when we link data sources together. Linking different data sources is a common component of most machine learning systems; we do this through record linkage or matching.

When we do that, we make mistakes sometimes. We link people that shouldn't be linked together, and we miss people that should be linked together.

Those mistakes are not random. Often, people whose long names have lots of consonants together get mistyped and mis-linked. Meanwhile, people with very similar, common names like Joe and Mike get linked together even though they might not be the same person.

Ideally, that mistake wouldn't happen. But, in practice, it does happen.

What happens is, if you link two people together who are not the same, you've added somebody else's information to this person's record. If you miss a link, you've removed information.

If you're now applying for a loan and your name was not linked together, you might have had extra jobs and income that the system doesn't know about you now, and it might predict that you're not able to pay back this loan. You don't have the credit history, and you might not get the loan. If somebody else with a similar name to you has a bankruptcy and you've linked their data together, now you've added a bankruptcy to the record.
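
Both failure modes fall out of naive string-similarity matching. A minimal sketch (the names are invented and the 0.85 threshold is arbitrary; difflib's ratio stands in for a real matcher): short, common names sail over the threshold and get falsely merged, while a shortened or mistyped long name falls below it and the link is missed.

```python
# Sketch: naive string-similarity record linkage. The names and the
# 0.85 threshold are invented; difflib stands in for a real matcher.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

THRESHOLD = 0.85  # arbitrary cutoff for declaring "same person"

# Two different people with short, common names: falsely merged.
print(similarity("Jon Smith", "Jo Smith") > THRESHOLD)  # True

# One person, once recorded under a shortened name: link missed.
print(similarity("Oluwaseun Adeyemi-Ogunleye", "Seun Adeyemi") > THRESHOLD)  # False
```

Because the errors correlate with name length and structure, and name structure correlates with ethnicity, the mistakes land disproportionately on particular communities, which is the point being made above.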

How you link things together, the mistakes you make, and how you design those steps all matter, down to simple things like missing information about people. If somebody's income is missing, we do something called imputation (filling in missing values).

If you don't think about it carefully and your data is, let's say, 80% male and 20% non-male, and you have a non-male person with missing income, you might do something really naïve and just take the mean of everyone. Because 80% of the data is male, you've now made non-males look a lot like males through the imputed income.
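
Here is that imputation failure in miniature (all numbers invented): the global mean is dominated by the majority group, while a group-aware mean is one simple alternative.

```python
# Sketch (toy numbers): naive global-mean imputation vs. a group-aware
# mean. None marks a missing income value.
incomes = {
    "male": [50, 60, 70, 80],
    "non_male": [30, 40, None, None],
}

observed = [v for vals in incomes.values() for v in vals if v is not None]
global_mean = sum(observed) / len(observed)  # dominated by the male group

non_male_observed = [v for v in incomes["non_male"] if v is not None]
group_mean = sum(non_male_observed) / len(non_male_observed)

print(global_mean)  # 55.0: what naive imputation fills in for everyone
print(group_mean)   # 35.0: what a group-aware imputation would use
```

Neither choice is forced by the data; which one you make is exactly the kind of design decision with downstream equity consequences being described here.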

None of those choices are inherent in machine learning, in the data you got, in the problem you're solving. It's a choice that you made as a machine learning or AI system developer.

What we do in our projects, in our training, in our programs, we discuss each choice we make at each step of the process, of the ML process, the AI process, and talk about what are the design choices we're making – What options do we have? What are the ethical implications of each choice downstream (two months later, four months later, a year later) on the people who are being impacted? – and try to make those choices consciously to minimize that risk that we're taking at this point.

As opposed to doing an upfront or retrospective review, we have a continuous ethics conversation around each phase of the project. Choices that seem very small can have a pretty high impact on ethics and equity when applied to people in these contexts.

Michael Krigsman: We have a couple of questions now from Tim Crawford. I know Tim. He is very prominent in CIO circles.

What is the impact of culture in driving socially responsible AI?

Tim's first question, he says, "What is the impact of culture in driving AI, or is it the reverse that AI is driving culture?" The interplay between culture, AI, and addressing the issues you've just been describing.

Rayid Ghani: I think there are two types of culture there. One is the culture of the teams designing and developing these systems. The second is the culture of the society these systems are being designed for. Let's talk about both.

Ideally, those two overlap heavily: the people designing these systems are part of the community the systems are being designed with, ideally, and not just for.

Unfortunately, that is not true today. Most of us developing these types of systems are not part of the communities they're being developed for, and the systems are not being developed with those communities. They're being developed for them.

I think that's one dimension we need to be careful about: we need to develop the talent, people who work with communities, so that these systems are co-created with those communities and the communities are part of the process. That's not the case today.

The other piece that I think is important is the teams themselves. We need to set up the right environment and the right culture for these teams so that these types of discussions happen.

What we often hear is that people are worried about raising these types of concerns in engineering teams: "Well, I have a concern about some of the choices we're making because they might have ethical implications," and there isn't a good setup or framework for bringing up those issues.

I think we need to have the right environment to do this. We need to have the right people with the right training so that they know how to have these discussions and that they've had them before.

I think, in addition to that, we do need more processes and checklists. We need to make sure that we follow the process. Have we had conversations with all the stakeholders?

At least in the spaces I work in on these societal issues, we often talk to the people who are using the system, but we may not talk to the people who are being impacted by the system. Having a process that makes them a critical part of it is important. I think we're getting there, but we don't yet have repeatable, standardized processes for doing that. We need to get there.

I think the tech and AI development culture of today is very much "build fast, fail fast, break things." That doesn't necessarily work when we're dealing with fairly critical issues in society: again, health, education, and criminal justice. We need to be much more conscious of the impact we're having on people and create development environments and cultures where this can happen much more responsibly.

How can we address human bias when it comes to AI and machine learning?

Michael Krigsman: We have a couple of questions on the issues of bias. I want to go back. Tim Crawford has a second question, which he says, "Do you introduce new types of bias as you're trying to shape the data to remove the original bias that you were trying to address?" Let's talk about that first.

Rayid Ghani: Data is not the only or, in many cases, even the major source of bias in these types of systems. Step back and ask, "How do machine learning and AI systems get bias introduced into them?"

The primary source is the world. The world is biased. The world is sexist. The world is racist.

If all your data were totally accurate and reflected exactly what happens in the world, perfectly accurately, the data itself would not be biased. But the processes we live with, the world we live in, are biased.

Just to be very clear: that's the major source of bias. The world generates data that embeds all of its biases.

Then we make choices as system designers. We use certain data sources because they're available to us.

If I'm in a hospital, again, I'm going to use my EHR system because that's what I have. I may not go out and say, "Well, there are a lot of people in my community who don't come to my hospital. Maybe I should go collect data about them, with them, and understand what I'm missing." Or I might have access to social media data and just use it, assuming it reflects the entire world.

I make a lot of choices around data and, to Tim's point, if I add a data source, I may be reducing one type of bias but adding another. The next source of bias is the design choices I talked about: how my ML works and what choices I make there.

Let's say I've magically figured out some way of getting everything right up to now. The last piece in most of these systems is often the human intervention. It's, how did the physicians intervene? How did the social service worker intervene?

That intervention, the action we take (an unemployment counselor connecting somebody with a training program, an after-school coordinator enrolling a child in a program to help them graduate on time, or a health inspector going into a house to look for lead hazards), can be unfair even if my AI system is perfectly fair.

What if I'm doing outreach for a mental health program? My outreach program is by phone, in English. It doesn't matter how fair my machine learning model is. I'm going to disproportionately miss out on people who are not reachable by phone, that I don't have accurate numbers for, and that don't speak English. That's not an AI problem, but it's the impact of my system.

I think it's one of the things that we need to keep in mind, and it's not a novel idea. It's a pretty simple idea: we're looking at a larger system of which our machine learning or AI is a tiny component.

We need to look at the whole thing and not just focus on making the data unbiased or making our model unbiased. We need to think about how we measure equity in the outcomes of the people we're trying to help.

Then how do we figure out which component of our system to fix? Is it the human component? Is it the intervention component? Is it the design component? Is it the machine learning component? We focus on the things that will help us get there.

I went on a little bit of a tangent because the question was around whether changing one type of bias might lead to another type of bias. You were right, though: you want to set up your framework in a way that you are auditing the types of biases you're introducing.

You may never get to a totally unbiased system. But that's not the immediate goal. When we build a system, we want to compare that system to the status quo, not to perfection.

We want to measure and say, "Look. Is this system better enough than the status quo for it to be worth implementing?" If it is, let's get it there.

Yes, we introduce some biases. We reduce some biases. Let's go through a process where we measure that change and figure out if this change is better enough to implement.

What's important is to figure out who should be part of that discussion: primarily the people who are impacted by it, along with the people who are using it and the people who will be taking action on it.

If we set up our framework right, we should be able to make an informed decision about, again, whether we are better off than the status quo, than the system that's being used today (whether it's human-driven, machine-driven, or a combination). That's at least how I think about it: setting up that baseline based on today, trying to improve over it, measuring that, running pilots, and then seeing whether this is worth implementing now, while we continue to improve and get towards the perfection that we want.

How can we avoid human bias in AI algorithms and data?

Michael Krigsman: Arsalan Khan—who is a regular listener of CXOTalk and asks questions that get right to the point—comes back, responds to you, and says, "Okay, fine. So, how do we avoid bias in the data and in the algorithms that power the AI?" Given everything you're talking about, what do we actually do?

Rayid Ghani: We do basically three things. We set up an audit process for our system. During the development process (when we're building our systems), we keep auditing them to see what biases they have, for which groups, and how they compare to the status quo.

That gives us an assessment, as accurate as we can make it, of how good or bad the thing we've developed so far is. If it's not good enough, if it's not "better enough," then there's a series of methods and tools that exist to try to reduce that bias.

If the bias is coming from the data, we might need to collect additional data from sources, so we go out, collect, and get data. If the bias is coming from design choices we made, we make different design choices.
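The audit loop Rayid describes (measure biases per group, compare against the status quo, iterate) can be sketched roughly as follows. This is a minimal illustration, not the team's actual tooling; the metric (recall per group), the group labels, and the example data are all invented for the sketch.

```python
# Sketch of a per-group bias audit: compare a model's recall
# (true-positive rate) across demographic groups, one of many
# possible disparity metrics. All names and data are illustrative.

def recall_by_group(y_true, y_pred, groups):
    """Return {group: recall} for binary labels and predictions."""
    stats = {}
    for g in set(groups):
        tp = sum(1 for t, p, gr in zip(y_true, y_pred, groups)
                 if gr == g and t == 1 and p == 1)
        pos = sum(1 for t, gr in zip(y_true, groups) if gr == g and t == 1)
        stats[g] = tp / pos if pos else None  # None when no positives exist
    return stats

def disparity_ratio(stats, reference):
    """Each group's recall relative to a chosen reference group."""
    ref = stats[reference]
    return {g: (r / ref if r is not None else None) for g, r in stats.items()}

if __name__ == "__main__":
    y_true = [1, 1, 0, 1, 0, 1, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 0, 1]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    stats = recall_by_group(y_true, y_pred, groups)
    print(stats)                      # group "a": 2/3 recall, group "b": 1/2
    print(disparity_ratio(stats, "a"))
```

Running such an audit at each development iteration, against the same baseline, is what turns "is it better enough than the status quo?" into a measurable question.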

If we develop a model that is—

Let me give you an example of a specific project. This is work we're doing with Los Angeles, and this is with the city attorney's office in Los Angeles.

Their goal was to reduce misdemeanor recidivism, the number of people coming back into the criminal justice system.

A lot of these people, they needed different diversion programs, different social services programs.

Because they weren't being provided with those, they kept getting stuck in the system.

The city attorney's office said, "Look. If you can help us figure out which individuals are likely to be at risk of coming back into the system, we're going to figure out programs that we can provide and connect them with so that it reduces their risk of cycling back into the system."

That's sort of a common project we've done with several different organizations – some in Kansas and some other cities and counties – to help them run preventative outreach programs, mental health assistance programs, and social services programs for people who may otherwise be at risk of entering the criminal justice system.

The constraint was that they could help a couple of hundred people a month. Could we give them a list of people they should be prioritizing?

The first system we built was focused on efficiency. The idea was, can we build a system as accurate as possible so, when we give them a list of 150, as many of them as possible are going to come back into the system, so that all their resources are allocated towards people (or as many of them towards people) who are actually at risk?

Now, you might call it the most accurate system. But if you think about it in terms of ethical values, it's an efficient system. It's maximizing the number of people you're helping with limited resources.

When we looked at the system, we found that it was accurate, but it was more accurate for white people than non-white people. It turns out that, in LA, the recidivism rate was much higher for non-white people than white people. We're starting with one recidivism rate for non-white people and a different one for white people, and we have a system which is accurate, but differentially accurate.

Now, if you play it out over the years, it starts off here, and it's more accurate. It's accurate for both. It reduces the rate for both. But it reduces the white rate faster than the non-white rate.

You start here, and you increase the disparities over time in the recidivism rate. It is the most efficient option: 80% of these people come back, but it increases disparities.

We built option number two, where we tweaked the algorithms to be equally accurate for both groups. It was slightly less efficient, a full two percentage points, and here's what that did: it reduced recidivism rates for both groups, but equally.

It didn't decrease disparities, but it didn't increase them either. It kind of propagated, in some ways, the status quo disparity, at a 2% additional cost.

We built option number three, which was, we increased the accuracy for non-white people proportional to their needs. Again, it was accurate for both, but more for non-white than white. It started from the same place, but now this is what it would do; it would decrease the disparities and get to equity in recidivism rates, and it was about the same cost as number two.

Basically, we provided this menu option:

  • Option number one focused on efficiency. It's 80% efficient, but it increases disparity.
  • Option number two, equality, meh, you know, 2% less efficient (78%). It preserved status quo disparities. It doesn't increase. It doesn't decrease.
  • Option number three, equity: the same efficiency as number two, but it decreases the disparities to get to equity. What do you want to do?
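The menu above can be captured as plain data so the trade-offs stay legible to non-technical stakeholders. This is a minimal sketch using the rough figures cited in the conversation (80% vs. 78%); the option names and the structure are illustrative assumptions, not the project's actual artifact.

```python
# Sketch of the "policy menu" framing: each option pairs an efficiency
# figure with a disparity effect, so the choice reads as a policy
# decision rather than technical jargon. Numbers are illustrative.

OPTIONS = [
    {"name": "efficiency", "efficiency": 0.80, "disparity_effect": "increases"},
    {"name": "equality",   "efficiency": 0.78, "disparity_effect": "preserves"},
    {"name": "equity",     "efficiency": 0.78, "disparity_effect": "decreases"},
]

def efficiency_cost(option, baseline="efficiency"):
    """Percentage points of efficiency given up vs. the baseline option."""
    base = next(o for o in OPTIONS if o["name"] == baseline)["efficiency"]
    opt = next(o for o in OPTIONS if o["name"] == option)
    return round((base - opt["efficiency"]) * 100, 1)

if __name__ == "__main__":
    for o in OPTIONS:
        print(f'{o["name"]}: {o["efficiency"]:.0%} efficient, '
              f'{o["disparity_effect"]} disparities, '
              f'costs {efficiency_cost(o["name"])} points vs. pure efficiency')
```

The point of laying it out this way is exactly what the transcript argues: the algorithmic work happens behind the menu, but the choice among rows is a values choice.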

Now, behind all of this, there's a bunch of technical work, and I'd point you to our papers. If you search for "Los Angeles," "fairness," and my name, you'll find them. They cover the technical details. We had to develop certain algorithms to deal with the bias in our models, in our predictions.

From the policy side, providing a policy menu and saying, "Well, here's the cost, here's the outcome. What do you care about?" is a tangible thing you can do, working with the different stakeholders to figure out what to do.

They ended up choosing to go with number three, and that was the right thing to do. But we often don't give business managers and policymakers these types of choices. We give them lots of technical jargon and don't make it clear that, yes, there's a lot of technical stuff in the back, a lot of different algorithms doing a lot of bias correction, but in the front, those choices are really policy choices, and they depend on the values you care about.

Our job as AI and machine learning designers and developers is really mapping from those values to the code that we write and the models that we build, so that the outcome is aligned with those values; and, along the way, the process has to be much more inclusive. That's an example of a project where, very tangibly, we had to figure out how to do this.

What skills are needed to create explainable AI and focus on AI ethics and society?

Michael Krigsman: We then have a question following up from this. This is on LinkedIn from Suman Kumar Chandra. He comes back and says, "Okay. Given all of this, on the skillset front, what skills are needed in this field to add value in data-driven equity, fairness, justice (apart from the engineering skills)?"

Rayid Ghani: Without people who have these skills, we're not going to get there. I think that's where the skills come in. Yes, you need the technical skills, the math skills, the data skills, the programming skills, the engineering skills.

Not every machine learning person needs to be an expert in ethics. Ideally, every machine learning and AI person would also be an ethicist, but that takes years of training and experience.

I think what we need to be aware of is we have to build teams that consist of people who have that training and that background. We need to be able to have a conversation. We need to be able to be a little bit more people-driven.

Human-centered design is another buzzword, but it's something we need to learn how to do because all of these systems (at least the ones I'm talking about) are designed to achieve certain outcomes for humans. They're designed for humans, to work with humans, and to impact human outcomes.

The skills to work with communities, to understand their values, to reach consensus, to work with people, to work with social scientists: those come up in a lot of the projects that I work on. I run a program called Data Science for Social Good.

It's actually going on right now. We have about 24 fellows (mostly grad students) coming in from around the world, working on projects like the ones I'm talking about. Those students come from computer science, from social science, and from stats, math, and engineering, because we need those skills. We need people from design.

Usually, when I talk about these skills, there's a large set that we need: an understanding of law, ethics, communication, managing teams, managing programs, and working with community organizations. It doesn't mean we all have to be experts in all of them. We have to understand the need for them, and we have to be able to work in teams with people who have deep expertise there.

That would be the mini answer: you don't need to be an expert. You need to be able to collaborate with people who have those skills, bring them into your teams, be aware of the need, and be open to their input.

I think the best way to do it, honestly, is to work with people like that. It's very hard to pick up that skillset by yourself and be an expert in all of these areas. But for those of you who are building teams in your organizations, having a set of people with that collective expertise, who can talk to each other and work with each other, is the recipe (at least in our experience) that helps when we embark on these projects.

What kind of policy governance is needed to oversee AI algorithms and reduce human bias?

Michael Krigsman: Arsalan comes back, and he wants to know about governance. He says, "Since everyone is creating their own AI, do we need a broader governance structure for AI even at the federal level or through a third party that determines which AI is good and which AI is bad (from an equity standpoint)?"

Rayid Ghani: I wouldn't call them longstanding conversations, because "long" in AI doesn't mean very much, but for the last couple of years, that conversation has been happening.

Some countries are farther ahead. Some agencies are farther ahead. Europe has been further ahead in thinking about what are some of the regulations around governance.

The U.S. has been a little bit more distributed. Every company has a few of these frameworks. Every city has at least one, if not more. Federal agencies have been starting to create them.

A couple of months back, NIST (the National Institute of Standards and Technology) came out with what was a good start towards a governance framework for these systems.

There are going to be many of them, and they are all going to be somewhat similar. One thing we still need (we're getting there, but we're not there yet) is operational guidance. They offer lots of good frameworks, and we would probably agree with all of them, but they're all very high level and not necessarily operational.

Now, if you say, "Okay. What do I do tomorrow? Give me a tangible set of things to do," they won't get you there today. They're working towards that.

That being said, for me, the very simple way to describe which AI is better than another: it's the one that more reliably achieves the values we care about. Now, I'm just punting on your question, because I'm taking the question of building the AI and putting it on the values side.

The values we want are not an AI question. If we all agreed, "Here are the values our system needs to have. We want it to be equitable for these types of people, over this period, for these types of outcomes; we're going to respect this type of privacy; these people need to be accountable; and it needs to be transparent in this manner," then we could compare different systems against those requirements, because the values already exist.

If you take federal agencies – take the Consumer Financial Protection Bureau, the Environmental Protection Agency, the Federal Trade Commission, the Food and Drug Administration – all of those agencies have these values. Those values apply to their human-driven systems today. They know what fair lending and unfair lending are, what the process for drug approval looks like, what a hazardous waste disposal violation is, what antitrust is, all of the privacy rules.

They know the values. What they don't have is the expertise: the people, the tools, and the processes to do investigations and audits when AI systems are involved. That's what they're trying to figure out right now: how to do that.

They're making a good start. A lot of the values exist, but we haven't updated them to reflect what's needed when the thing on the other side is not a human process but an AI-supported process, not an AI-automated process.

None of these things are automated. They're still ideally, and for good reason, collaborative. AI is supporting humans making decisions.

I think that the one that's better is the one that better helps you achieve those values. The question is, can we design tools and processes? Can we train people to create a system that's better able to evaluate that, to audit that, to compare that, and to monitor that because, again, unlike a lot of other technologies, what's different about AI also is that it's not static.

Even if it is static, the world is not static. The world applied to an AI system will give you different results over time, so you need to be constantly monitoring and updating. You have to have the people who are also doing these updates.
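That monitoring loop can be sketched very simply: recompute the accepted metric on each new batch of outcomes and flag when performance drifts beyond a tolerance. The metric, the threshold, and the numbers here are invented for illustration, not taken from any real deployment.

```python
# Minimal sketch of ongoing model monitoring: flag drift whenever a
# recent performance metric falls more than `tolerance` below the
# baseline accepted at deployment. All values are illustrative.

def check_drift(baseline_metric, recent_metric, tolerance=0.05):
    """True if recent performance dropped beyond the tolerance."""
    return (baseline_metric - recent_metric) > tolerance

def monitor(baseline_metric, monthly_metrics, tolerance=0.05):
    """Return the indices of the batches in which drift was flagged."""
    return [i for i, m in enumerate(monthly_metrics)
            if check_drift(baseline_metric, m, tolerance)]

if __name__ == "__main__":
    # Performance slowly degrades as the world shifts under the model.
    print(monitor(0.80, [0.79, 0.77, 0.74, 0.72]))  # flags months 2 and 3
```

A flag like this doesn't fix anything by itself; it tells the people doing the updates that the world has moved and the model needs attention, which is the point being made above.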

Michael Krigsman: What happens if policymakers are trying to develop governance frameworks, what have you, and they just simply don't understand AI? What do we do about this? Very quickly.

Rayid Ghani: You should work with them, collaborate with them, help them. They shouldn't be developing in isolation, and neither should the private sector (in isolation). I think this is a collaborative exercise, so we need to work together.

It needs to be a transparent and open process. It shouldn't be just built on the side and thrown over saying, "This is the governance process." That's kind of my very short answer.

Michael Krigsman: Arsalan comes back, and he says, "We know AI has great potential, but how do you avoid people who veto the recommendations by the AI? What do you do when the AI is making a recommendation you know, as an expert, is the right recommendation, and the people who are in charge are saying, 'No, no, no. That's wrong.'?"

Rayid Ghani: It's not an autonomous system that's making a decision. It's supporting a human.

Sometimes, the AI is right and the human can use that help. Many times, most of the time the AI is wrong and the human needs to be able to override the AI.

I think the question becomes: how do we design our systems not to be this one-shot "I'm going to give you a decision. I'm going to give you a recommendation"?

When I talk to caseworkers, social service workers, unemployment counselors, high school counselors, people on the front line doing really important work whom we want to support with these tools, they don't want an answer. They want more information. They want to make an informed decision.

The way to design these systems is to be much more collaborative, to have a conversation with them and say, "I think this is the recommendation."

They might ask and say, "Well, why do you think so?"

"It was because of this."

It's like, "Yeah, but last time we did this with this person, here's what the outcome was. It didn't work for them."

It's like, "Yeah, but this person is—"

Think of it as designing these systems to be much more interactive. The way I look at it: if I build a system that is (as you said) 100% accurate, and the person taking the output overrides it every time, then my impact is zero. I need this person to take action, but only when that action is the right action.

I think today's AI, at least in these areas that we're working on, isn't reliable and robust enough for me to say, "I want to build something that you follow every single time blindly." I don't trust it. It's not there yet.

I really want this to be explainable and transparent. And not just transparent, because transparency doesn't mean comprehensibility.

I want something that this person can interact with: figure out when it's wrong and why it's wrong, override it when that makes sense, correct it when that makes sense, follow it when it's right, and really, together, get towards the outcomes that we want.

That's not easy. What I'm saying we should build is not easy. But I also think it's too dangerous to say, "I trust that my system is right all the time; this human just doesn't follow it." I don't know of an AI system that is so good, so right, that the only thing to change is the humans.

I think we need to develop things that are a little bit more aware of their uncertainty, a little bit more aware of when they're right, when they're wrong, and able to explain a little bit of their reasoning in order to help the human collaborator better assess its limitations and its benefits and take action. At the end, we want the right action to be taken regardless of where it came from. That would be my advice there.

How can we increase AI explainability?

Michael Krigsman: What advice do you have for folks on the subject of--? I'm laughing because I want to see how you respond to this one. In one sentence – algorithm, transparency, and explainability – what advice do you have?

Rayid Ghani: Figure out what you need this transparency and explainability for. What's the purpose? What's the goal?

Who do you need it for? Who is the consumer of this?

How are you going to measure if your system is transparent and explainable?

Anchor it on a real problem. Don't do something abstract.

Take the problem you're trying to solve with that system. Figure out who the user is that you want this transparency and explainability for. How would you evaluate whether it is transparent and explainable? What does that help you achieve?

There are some writings, some papers, that we have that go a little bit more into how you do that. I'm happy to point people to that if you're interested.

Michael Krigsman: We have a lot of technologists and data science folks who listen to CXOTalk. What advice do you have for technologists, including chief information officers, on the topics of fairness and equitability and what they need to focus on? Again, just in one sentence, please.

Rayid Ghani: Build your teams to have this collection of skills, and expose them to the entire flow: the goal of the product or service you're developing, the formulation of it, the data source identification, the development, the evaluation, the monitoring, and the deployment. We often silo our people and keep them very single-disciplinary.

If we want to build systems that have high, equitable impact, we need to broaden the teams and make them much more diverse and inclusive, to reflect the problems we're working on. But we also need to give people visibility into the entire chain, because the more they have that, and the more they're equipped with the right environment, the more likely they are to produce the outcomes and the systems that we care about.

Michael Krigsman: Okay. With that, we are out of time. I want to say a huge thank you to Rayid Ghani. He is with the Machine Learning Department in the School of Computer Science at Carnegie Mellon University. Rayid, thank you so much for taking time with us today.

Rayid Ghani: No, thank you. These questions and this conversation were great. Thanks for having me here.

Michael Krigsman: A huge thank you to everybody who watched. Your questions, as always, are phenomenal. You're a great audience. I learn so much from you guys every single time. I feel like I'm the proxy or the channel for you guys who are watching, so lucky me, and lucky you for getting to ask questions as well.

Now, before you go, please subscribe to our newsletter by hitting the subscribe button at the top of our website. If you're watching on YouTube, then you should subscribe to our YouTube channel, and we'll keep you notified. You'll get to hear about all these great, upcoming episodes.

Everybody, thank you so much for watching. I hope you have a great day, and we'll see you soon.