Can generative artificial intelligence systems like ChatGPT genuinely create original ideas? A new study led by Professor Karim Jerbi from the Department of Psychology at the Université de Montréal, with participation from renowned AI researcher Yoshua Bengio, takes on that question at an unprecedented scale. The research is the largest direct comparison ever conducted between human creativity and the creativity of large language models.
The study, published in Scientific Reports (Nature Portfolio), points to a meaningful shift. Generative AI systems have now reached a level where they can outperform the average human on certain creativity measures. At the same time, the most creative people still show a clear and consistent advantage over even the strongest AI models.
AI Reaches Average Human Creativity Levels
Researchers evaluated several leading large language models, including ChatGPT, Claude, Gemini, and others, and compared their performance with results from more than 100,000 human participants. The findings highlight a clear turning point. Some AI systems, including GPT-4, exceeded average human scores on tasks designed to measure divergent linguistic creativity.
“Our study shows that some AI systems based on large language models can now outperform average human creativity on well-defined tasks,” explains Professor Karim Jerbi. “This result may be surprising – even unsettling – but our study also highlights an equally important observation: even the best AI systems still fall short of the levels reached by the most creative humans.”
Further analysis by the study’s co-first authors, postdoctoral researcher Antoine Bellemare-Pépin (Université de Montréal) and PhD candidate François Lespinasse (Université Concordia), revealed a striking pattern. While some AI models now outperform the average person, peak creativity remains firmly human.
In fact, when researchers examined the most creative half of participants, their average scores surpassed those of every AI model tested. The gap grew even larger among the most creative 10 percent of participants.
“We developed a rigorous framework that allows us to compare human and AI creativity using the same tools, based on data from more than 100,000 participants, in collaboration with Jay Olson from the University of Toronto,” says Professor Karim Jerbi, who is also an associate professor at Mila.
How Scientists Measured Creativity
The researchers used a test called the Alternate Uses Task (AUT). Participants were asked to list as many unusual uses as possible for everyday objects, like a brick or a paperclip. The originality and diversity of responses were then analyzed. This method is a well-established way to assess divergent thinking, a key component of creativity.
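To make the idea of scoring "diversity" concrete, here is a toy sketch in Python. It is not the study's scoring method, which the article does not describe. It assumes a deliberately simple stand-in: the average word-level Jaccard distance between every pair of responses. The function names and the sample responses are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def tokens(response: str) -> set[str]:
    """Lowercase a response and split it into a set of words."""
    return set(response.lower().split())


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """Return 1 - |a & b| / |a | b|; 0.0 means identical word sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def diversity_score(responses: list[str]) -> float:
    """Mean pairwise Jaccard distance across all responses (0.0 to 1.0).

    Toy proxy only (an assumption of this sketch): real creativity
    scoring typically relies on human raters or semantic models.
    """
    pairs = list(combinations([tokens(r) for r in responses], 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical answers for the object "brick"
similar = ["build a wall", "build a house", "build a garden wall"]
varied = ["doorstop", "grind into pigment", "bookend for heavy atlases"]

print(diversity_score(similar))
print(diversity_score(varied))
```

Under this simplification, a list of near-identical uses scores lower than a list of unrelated ones. Word overlap says nothing about whether an idea is original or meaningful, which is why a proxy like this can only hint at what a full assessment captures.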
The study’s findings suggest that AI is getting better at generating novel ideas, but it still struggles with the kind of nuanced, imaginative thinking that characterizes human creativity. While AI can produce a large volume of ideas, the most creative humans consistently generate ideas that are both original and meaningful.
“AI can now match the average human in terms of quantity, but it doesn’t yet possess the quality of creativity seen in exceptional individuals,” adds Professor Jerbi. “This suggests that there’s still something uniquely human about the creative process.”
The research team plans to continue exploring the differences between human and AI creativity, with the goal of better understanding the cognitive mechanisms that underlie creative thought. This could lead to new insights into how to foster creativity in both humans and machines.