Data ethics are becoming one of the Chief Data Officer’s core responsibilities, so if you haven't codified your approach to ethical considerations related to data, now is the time to start.

The problem is, most people have only a vague understanding of what data ethics is, even though they know how important it is. Officially it’s defined as systemizing, defending, and recommending concepts of right and wrong conduct in relation to data, in particular personal data. I prefer Gartner’s definition of data ethics: a system of values and moral principles related to the responsible collection, use, and sharing of data.

But even if your definition isn’t this specific or your understanding very deep, a fair-minded, morally driven use of data is critical for every individual and for every organization. Strong data ethics build stronger organizations. Here’s how:

  1. Data ethics help you remember the people behind the data. Working ethically with data means taking the time to understand how your data may impact other people. This data ethics principle stood out to me in the early days of reporting on COVID-19 when we saw infections and deaths represented as marks on a chart. But these were individuals and families, and data ethics are there to remind us to infuse humanity into our work. Cathy O'Neil's book Weapons of Math Destruction suggests a Hippocratic oath for data scientists, and I made this oath part of our values at a company where I recently worked. As a data scientist or analyst, you’re supposed to understand how your data is going to impact other people. And it's not just what the data is, it's how you talk about it and how it's implemented.

A few ways to get started: 

  • Put together a reading list on data ethics for your team.
  • Make a data ethics discussion part of your weekly or monthly team meeting agenda.
  • Deputize someone on your team to collect and share recent questionable data ethics examples in the world at large. You won’t have to search too long to find examples.

  2. Data ethics guide you in how to model your data. As the old proverb goes, “The road to hell is paved with good intentions,” and the same thing can be true of data. There's no data scientist in the world, I don't think, who sits down at their computer and says, “Today I think I’ll build an algorithm that discriminates against a certain population, and that creates systemic inequalities based on my model.” And yet these algorithms are created all the time. Why? Because most people think that data is objective, and it isn’t. Data ethics teach us to dig deeper when we hear responses such as, “Well, the model told me X, Y, and Z, so that’s just what the data said.” They also teach us not to cherry-pick the results of our models just because they support a hypothesis that we like.

A few ways to get started: 

  • Work on a plan to operationalize machine learning and AI in a way that encourages more ethical use and treatment of data.
  • Look for unspoken or implied assumptions that might be driving the design and presentation of your data. For example: if you test a facial recognition tool on 80,000 male faces but only 5,000 female faces, don’t expect equal accuracy in your results. 
  • Establish data governance standards in regard to data collection ethics. For example, set standards that allow for securing informed consent for data collection, anonymizing PII (personally identifiable information) and ensuring legal compliance.
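
To make the facial recognition example above concrete, here is a minimal Python sketch of a representation check you could run before trusting a model's results. It is an illustration under stated assumptions, not a standard audit method: the function name `representation_report`, the 30% threshold, and the group labels are all hypothetical, and the right threshold depends on your own data and use case.

```python
from collections import Counter


def representation_report(groups, min_share=0.3):
    """Return each group's count and share of the dataset.

    A group is flagged when its share falls below min_share.
    The 0.3 default is an arbitrary, illustrative threshold.
    """
    counts = Counter(groups)
    total = sum(counts.values())
    report = {}
    for group, count in counts.items():
        share = count / total
        report[group] = {
            "count": count,
            "share": round(share, 3),
            "underrepresented": share < min_share,
        }
    return report


# Hypothetical labels mirroring the example above:
# 80,000 male faces and 5,000 female faces.
labels = ["male"] * 80_000 + ["female"] * 5_000
report = representation_report(labels)

for group, stats in report.items():
    print(group, stats)
```

With these numbers, female faces make up roughly 6% of the data, so the check flags that group. A flag like this does not prove the model is biased, but it is a prompt to dig deeper before accepting "that's just what the data said."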

  3. Data ethics demand that we set the highest standards for our use of data. Data ethics in practice are a function of your own self-discipline. Some possess this discipline more than others, but that’s not to say that it can’t be learned. As data leaders, we owe it to our organizations to drive ethical standards all the way across the business. That means having conversations with peers about data definition, capture, use, and sharing. It also means crafting policies that help us manage data based on its ethical implications.

A few ways to get started: 

  • Conduct cross-functional meetings on organizational data standards that include Legal, Finance, Marketing, Product Development, and other areas that have a particular focus on data.
  • Appoint or hire a chief ethics officer to make data ethics a more singular focus.
  • Call out and reward team members who demonstrate the high data ethics standards your organization values. 

Of course, we don’t want to run every shred of data through a painstaking, clinical-trial-like process that slows business to a crawl, but we also can’t afford to forget that data increasingly shapes our creditworthiness, our ability to buy a house, and even the funding our governments allocate based on census results.

In the absence of national oversight bodies that govern the ethical use of data, we need to become more serious about holding ourselves and our organizations to high standards and calling out instances where data is used to exclude or deceive. In my mind that’s the only way forward.