Too Much Trust in AI Poses Unexpected Threats to the Scientific Process

It’s vital to “keep humans in the loop” to avoid humanizing machine-learning models in research

Illustration of Lisa Messeri and Molly Crockett by Shideh Ghandeharizadeh

Machine-learning models are quickly becoming common tools in scientific research. These artificial-intelligence systems are helping bioengineers discover new potential antibiotics, veterinarians interpret animals’ facial expressions, papyrologists read words on ancient scrolls, mathematicians solve baffling problems and climatologists predict sea-ice movements. Some scientists are even probing large language models’ potential as proxies or replacements for human participants in psychology and behavioral research. In one recent example, computer scientists ran ChatGPT through the conditions of the Milgram shock experiment—a study on obedience, begun in 1961, in which people gave what they believed were increasingly painful electric shocks to an unseen person when told to do so by an authority figure—and other well-known psychology studies. The AI model responded similarly to humans: 75 percent of simulated participants administered shocks of 300 volts or more.

But relying on these machine-learning algorithms also carries risks. Some of those risks are commonly acknowledged, such as generative AI’s tendency to produce occasional “hallucinations” (factual inaccuracies or nonsense). AI tools can also replicate and even amplify human biases about characteristics such as race and gender. And the AI boom, which has given rise to complex, trillion-parameter models, requires water- and energy-hungry data centers that are likely to have high environmental costs.

One big risk is less obvious, though potentially very consequential: humans tend to attribute a great deal of authority and trustworthiness to machines. This misplaced faith could cause serious problems when AI systems are used for research, according to a recent paper in Nature.


“These tools are being anthropomorphized and framed as humanlike and superhuman. We risk inappropriately extending trust to the information produced by AI,” says Molly Crockett, a cognitive psychologist and neuroscientist at Princeton University and a co-author of the study. AI models are human-made products, and they “represent the views and positions of the people who developed them,” says Lisa Messeri, a Yale University sociocultural anthropologist who worked with Crockett on the paper. Scientific American spoke with both researchers to learn more about the ways scientists use AI—and the potential effects of trusting this technology too much.

An edited transcript of the interview follows.

Why did you write this paper?

LISA MESSERI: [Crockett] and I started seeing and sharing all sorts of large, lofty promises of what AI could offer the scientific pipeline and scientific community. The moment we really started to think we needed to write something was when we saw claims that large language models could become substitutes for human subjects in research. These claims, given our years of conversation, seemed wrongheaded.

MOLLY CROCKETT: I have been using machine learning in my own research for several years, and advances in AI are enabling scientists to ask questions we couldn’t ask before. But as I’ve been doing this research and observing that excitement among colleagues, I have developed a sense of uneasiness that’s been difficult to shake.

Beyond using large language models to replace human participants, how are scientists thinking about deploying AI?

CROCKETT: Previously we helped write a response to a study in [the Proceedings of the National Academy of Sciences USA] that claimed machine learning could be used to predict whether research would [be replicable] just from the words in a paper. That struck us as technically implausible. But more broadly, we’ve discovered that scientists are talking about using AI tools to make their work more objective and to be more productive.

We found that both those goals are quite risky and open up scientists to producing more while understanding less. The worry is that we’re going to think these tools are helping us to understand the world better, when in reality, they might be distorting our view.

MESSERI: We categorize the AI uses we observed in our review into four categories: the Surrogate, the Oracle, the Quant and the Arbiter. The Surrogate is what we’ve already discussed—it replaces human subjects. The Oracle is an AI tool that is asked to synthesize the existing corpus of research and produce something, such as a review or new hypotheses. The Quant is AI that is used by scientists to process the intense amount of data out there—maybe produced by those machine surrogates. AI Arbiters are like [the tools described] in the PNAS replication study Crockett mentioned—tools for evaluating and adjudicating research. We call these visions for AI because they’re not necessarily being executed today in a successful or clean way, but they’re all being explored and proposed.

You’ve pointed out that even if AI’s hallucinations and other technical problems are solved, risks remain.

CROCKETT: The overarching metaphor we use is this idea of monoculture, which comes from agriculture. Monocultures are very efficient. They improve productivity. But they’re vulnerable to being invaded by pests or disease; you’re more likely to lose the whole crop when you have a monoculture versus diversity in what you’re growing. Scientific monocultures, too, are vulnerable to risks such as errors propagating throughout the whole system. This is especially the case with the foundation models in AI research, where one infrastructure is being used and applied across many domains. If there’s some error in that system, it can have widespread effects.

We identify two kinds of scientific monocultures that can arise with widespread AI adoption. The first is the monoculture of knowing. AI tools are suited to answer only certain kinds of questions. Because these tools boost productivity, the overall set of research questions being explored could become tailored to what AI is good at.

Then there’s the monoculture of the knower, where AI tools come to replace human thinkers. And because AI tools have a specific standpoint, this shift eliminates the diversity of human perspectives from research production. When you have many kinds of minds working on a problem, you’re more likely to spot false assumptions or missed opportunities. Both monocultures could lead to cognitive illusions.

What do you mean by “illusions”?

MESSERI: One example that’s already out there in psychology is the illusion of explanatory depth. Basically, when someone in your community claims they know something, you tend to assume you know that thing as well.

In your paper, you cite research demonstrating that using a search engine can trick someone into believing they know something when really they only have online access to that knowledge. And students who use AI assistant tools to respond to test questions end up thinking they understand a topic better than they do.

MESSERI: Exactly. Building off that illusion of explanatory depth, we also identify two others. One is the illusion of exploratory breadth, where someone thinks they’re examining more than they are. There are an infinite number of questions we could ask about science and about the world. We worry that with the expansion of AI, the questions that AI is well suited to answer will be mistaken for the entire field of questions one could ask. Then there’s the risk of the illusion of objectivity. Either there’s an assumption that AI represents all standpoints, or there’s an assumption that AI has no standpoint at all. But at the end of the day, AI tools are created by humans coming from a particular perspective.

How can scientists avoid falling into these traps? How can we mitigate these risks?

MESSERI: There’s the institutional level where universities and publishers dictate research. These institutions are developing partnerships with AI companies. We have to be very circumspect about the motivations behind that. One mitigation strategy is just to be incredibly forthright about where the funding for AI is coming from and who benefits from the work being done on it.

CROCKETT: At the institutional level, funders, journal editors and universities can be mindful of developing a diverse portfolio of research to ensure that they’re not putting all the resources into research that uses a single AI approach. In the future it might be necessary to consciously protect resources for the kinds of research that can’t be addressed with AI tools.

And what type of research is that?

CROCKETT: Well, as of right now, AI cannot think like a human. Any research about human thought and behavior, as well as qualitative research, is not addressable with AI tools.

Would you say that in the worst-case scenario, AI poses an existential threat to human scientific knowledge production? Or is that an overstatement?

CROCKETT: I don’t think it’s an overstatement. I think we are at a crossroads: How do we decide what knowledge is, and how do we proceed in the endeavor of knowledge production?

Is there anything else you think is important for the public to really understand about what’s happening with AI and scientific research?

MESSERI: From the perspective of reading media coverage of AI, it seems as though this is some preordained, inevitable “evolution” of scientific and technical development. But as an anthropologist of science and technology, I would really like to emphasize that science and tech don’t proceed in an inevitable direction. It is always human-driven. These narratives of inevitability are themselves a product of human imagination and come from mistaking the desire by some for a prophecy for all. Everyone, even nonscientists, can be part of questioning this narrative of inevitability by imagining the different futures that might come true instead.

CROCKETT: Being skeptical about AI in science doesn’t require being a hater of AI in science. We love science, and I’m excited about AI’s potential for science. But just because an AI tool is being used in science does not mean that it is automatically better science.

As scientists, we are trained to deny our humanness. We’re trained to think human experience, bias and opinion have no place in the scientific method. The future of autonomous AI “self-driving” labs is the pinnacle of realizing that sort of training. But increasingly, we are seeing evidence that diversity of thought, experience and training in humans who do the science is vital for producing robust, innovative and creative knowledge. We don’t want to lose that. To keep the vitality of scientific-knowledge production, we need to keep humans in the loop.

Lauren Leffer is a contributing writer and former tech reporting fellow at Scientific American. She covers many subjects, including artificial intelligence, climate and weird biology, because she's curious to a fault. Follow her on X @lauren_leffer and on Bluesky @laurenleffer.bsky.social

This article was originally published with the title “The Risks of Trusting AI” in Scientific American Magazine Vol. 330 No. 6 (June 2024), p. 80
doi:10.1038/scientificamerican062024-5fnQbblMz0TZyG6Nx3cIe4