This article was originally published on The Conversation.
Since the release of ChatGPT in late 2022, millions of people have started using large language models to access knowledge. Their appeal is easy to understand: ask a question, get a sophisticated synthesis and move on – learning feels effortless.
However, a new paper I co-authored provides experimental evidence that this ease may come at a cost: When people rely on large language models to summarize information on a topic, they develop shallower knowledge about it than when they learn through a standard Google search.
My co-author Jin Ho Yun and I, both professors of marketing, reported this finding in a paper based on seven studies with more than 10,000 participants. Most studies used the same basic paradigm: Participants were asked to learn about a topic – such as how to grow a vegetable garden – and were randomly assigned to do so using either ChatGPT or the “old-fashioned way” – by navigating links from a standard Google search.
No restrictions were placed on how they used the tools; they could search Google for as long as they wanted and continue to prompt ChatGPT if they felt they needed more information. Once they completed their research, they were asked to write advice to a friend on the topic based on what they had learned.
The data revealed a consistent pattern: People who learned about a topic through an LLM rather than a web search felt they learned less, subsequently put less effort into writing their advice, and ultimately wrote advice that was shorter, less factual and more general. In turn, when this advice was presented to an independent sample of readers who were unaware of which tool had been used to research the topic, they found the advice less informative and less useful, and they were less likely to adopt it.
We found these differences to be robust across contexts. For example, one possible explanation for why LLM users wrote shorter and more general advice is that LLM results exposed them to less diverse information than Google results did. To control for this possibility, we ran an experiment in which participants were exposed to the same set of facts in their Google and ChatGPT search results. Similarly, in another experiment we held the search platform – Google – constant and varied whether participants learned from standard Google results or from Google’s AI Overview feature.
The findings confirmed that, even when facts and platform were held constant, learning from synthesized LLM responses yielded shallower knowledge than gathering, interpreting and synthesizing information for oneself via standard web links.
Why it matters
Why did LLM use produce shallower learning? One of the most basic principles of skill development is that people learn best when they are actively engaged with the material they are trying to learn.
When we learn about a topic through a Google search, we face a lot of “friction”: we have to navigate different web links, read informational sources, and interpret and synthesize them ourselves.
While more challenging, this friction leads to a deeper, more foundational mental representation of the subject at hand. But with LLMs, this entire process is done on the user’s behalf, transforming learning from an active process into a passive one.
What will happen next?
Clearly, we do not believe that the solution to these issues is to avoid using LLMs, especially given the undeniable benefits they provide in many contexts. Rather, our message is that people need to become smarter or more strategic users of LLMs – which starts with understanding the domains in which LLMs are beneficial versus detrimental to their goals.
Need a quick, factual answer to a question? Feel free to use your favorite AI co-pilot. But if your aim is to develop deep and generalizable knowledge in a field, relying on LLM synthesis alone will be less helpful.
As part of my research on the psychology of new technology and new media, I am also interested in whether LLM-based learning can be made more active. In another experiment we tested this by pairing participants with a special GPT model that offered real-time web links alongside its synthesized responses. However, we found that once participants received the LLM summary, they were not motivated to delve deeper into the original sources. As a result, they still developed shallower knowledge than those who used a standard Google search.
Based on this, in my future research I plan to study generative AI tools that create healthy friction in learning tasks – specifically, investigating what types of guardrails or speed bumps most successfully motivate users to learn actively beyond easy, synthesized answers. Such tools seem particularly important in secondary education, where a major challenge for teachers is how to help students develop foundational reading, writing and mathematics skills while also preparing them for a world in which LLMs are likely to be an integral part of their daily lives.
Research Brief is a short article on interesting academic work.
Shiri Melumad, Assistant Professor of Marketing, University of Pennsylvania
This article is republished from The Conversation under a Creative Commons license. Read the original article.