As Krystal Tsosie, a Navajo geneticist and bioethics scholar at Vanderbilt University, recalls, her career path toward a PhD in cancer biology was set. Then the lure of data science and genetics opened her eyes to new possibilities.

“While I was studying cancer biology, I actually patented a couple of technologies related to tumor targeting of cancer cells with some drug analogs we created,” she says. “But then I had this fundamental identity crisis. I knew that if I went into cancer biology long term and I invented something or played a key part in developing a therapeutic, I’d probably be in my seventies before it got through clinical trials. And that drug, if it did make it to market, would probably not benefit people from my own community because Indigenous peoples do not get access to top-tier drugs when they’re new to the market.”

Tsosie explains that a large part of her new career momentum toward genetics stemmed from reading about the unfair use of information gathered from people in her Diné (Navajo) tribe and other Indigenous communities. 

“When I went back to graduate school in the pursuit of a master’s in bioethics, I learned about all these genetic controversies in Native American and Indigenous communities related to the technologies being misused, and I realized that genetics and precision medicine are not going to go away.

“In fact,” she adds, “it became clear that Indigenous peoples would be increasingly called upon to be included in datasets. And the best way to ensure that our inclusion was ethical was to ensure that people who look like us were at the helm of the research. This has been the push.” 

Tsosie describes her research trajectory as one built on ethically centered, equity-centered, and justice-centered approaches to genetics, precision medicine, and health.





IndigiData Curriculum


The IndigiData curriculum places a strong emphasis on data ethics, contextualizing the importance and future of informatics skills for Indigenous peoples within the framework of health, culture, environment, and data.


Data Inspiration: Biomedical Ethics

For Oliver Bear Don’t Walk IV, a member of the Apsáalooke Nation and a PhD candidate in Biomedical Informatics at Columbia University, it was a career fair during his senior year that redirected his path toward data science.

“I majored in math and computational science as an undergrad,” he remembers, “and when I entered my senior year, I started to go to career fairs. I saw a lot of recruiters whose pitch was essentially, ‘We have a large amount of data on consumers that you can work with. We want to make sure they buy the latest phone or click on an ad that we serve to them.’”

Although Bear Don’t Walk considered these to be interesting problems, they didn’t promise the kind of impact that excited him. “A friend of mine told me about biomedical informatics, and I realized that a lot of the tools I had in my toolbox could translate well to working with health data,” he says. “The reason I got into it was because if someone is giving me their data to work with, they better get something out of that. There’s a trust there, and I should hope that whatever I do, they can benefit somehow.”

Bear Don’t Walk also sees a tight link between data privacy and the way data has been appropriated from people in Indigenous communities. “The idea of biocommercialism, taking someone’s data and profiting from it, has helped me to re-contextualize things like secondary use of electronic health record (EHR) data,” he says. “When you and a hospital sign papers that say that your medical data can be repurposed for secondary use research, and you don’t have to be consulted after this point, I wonder how much we actually understand of what we’re signing? I think there should be a way for patients to go back and say, ‘Hey, I’d actually prefer that my data not be used in this way, or not for this research, or not used at all.’ I want to make sure that the value systems are aligned between researcher and patient, and that the patient always feels empowered about their data.”

Looking to More Holistic Data Types

Tsosie and Bear Don’t Walk have both used the example of diabetes to illustrate the need for a more holistic view of data. Diabetes is well studied in Native populations, but researchers’ early conclusion, that a genetic predisposition was likely at play, betrayed a shortsighted and incomplete approach to the data.

“Diabetes is something that is heavily studied in Indigenous communities, particularly in the Southwest,” Tsosie says, “and one population in general in the Southwest is so over-studied because they have the highest adult-onset prevalence of Type 2 diabetes. Yet that disease didn’t exist in that community until a western commodity diet was imposed upon them.”

Bear Don’t Walk agrees. “There are a lot of other sociocultural factors, including colonialism, that went into making diabetes more prevalent in these communities. Not to say that there’s no genetic factor. But it’s important to remember that the lens through which you’re viewing science and doing science is not unbiased.” 

Coming Together for a Purpose

Based on such shared understandings, the paths of Tsosie and Bear Don’t Walk merged when they became two of several organizers of the first IndigiData workshop, held virtually over one week in July 2021 on the theme of “empowering the next generation of Indigenous data scientists.”

“It’s critical to have cross-disciplinary conversations in which scientists are willing to listen, as well as to grant agency to individuals who can make decisions that are ethically-minded,” Tsosie says of the workshop. “We need to focus on the empowerment of data and data-driven decisions as well as focusing attention to the need of ethics.”

Both data scientists intend to continue this conversation against the backdrop of next year’s IndigiData, which is currently funded by the National Science Foundation through 2025.