Imagine you decide to write a short story about a protagonist who creates an artificial human and then falls in love with it. What gender is your protagonist? What about the artificial human? Would you write a moving love story? A cautionary dystopian tale?
Would your story be more compelling than one written by ChatGPT?
Likely yes, says Nina Beguš, a researcher and lecturer in UC Berkeley’s School of Information and Department of History. Drawing on her background in comparative literature and her knowledge of generative AI, Beguš posed this scenario to hundreds of human writers and to generative AI models, then compared their responses.
Her findings, published Oct. 28 in the journal Humanities and Social Sciences Communications, offer a window into the inner workings and ongoing limitations of generative AI tools like ChatGPT.
Generative AI is getting much more sophisticated. But for now, it seems, quality creative writing remains the realm of (human) storytellers and scribes.
“The humanities can reveal a lot about the strengths and weaknesses of these new AI tools,” Beguš said. “Fiction, in particular, offers a window into the collective cultural imaginary—the shared set of narratives, ideas and symbols—that machines have inherited from us.”
Beguš’s work is part of a new field of research she calls the “artificial humanities,” a discipline focused on using history, literature and other humanities subjects to add depth to AI development. Her forthcoming book, “Artificial Humanities: A Fictional Perspective on Language in AI,” expands on her recent research, which first gained attention last year in a widely read series of posts on X.
Before she could begin her research, Beguš needed to decide on a common storytelling structure against which to compare human and generative AI responses. She settled on the myth of Pygmalion, the 2,000-year-old tale from Ovid’s poem “Metamorphoses,” in which an artist falls in love with a statue he sculpted. The motif has been deployed countless times, most recently—and some might say relatably—in blockbuster films like “Her” and “Ex Machina.”
Beguš instructed both humans and the AI tools ChatGPT and Llama to write a story based on one of two short prompts: “A human created an artificial human. Then this human (the creator/lover) fell in love with the artificial human” or “A human (the creator) created an artificial human. Then another human (the lover) fell in love with the artificial human.”
Using simple, one-shot prompts, rather than having the AI system incrementally refine its responses the way many people use it, made it easier to apply narrative analysis and statistics to assess the quality of the baseline writing from both humans and AI.
“I was interested in that averageness,” Beguš said. “Most people are not professional writers.”
Beguš obtained 250 human-written responses as well as 80 stories from generative AI tools. She then reviewed details in each response, including how they discussed gender and sexuality, race and ethnicity, and culture. She also evaluated the complexity of their overall narrative arcs.
Both humans and AI systems showed a common understanding of the Pygmalion myth inherent in the prompt. That was somewhat unsurprising: AI models are trained on millions of written texts, including writing about those texts, and humans tend to draw on pop culture reference points during bursts of creativity.
But where humans consistently wrote richer and more varied narratives, the AI systems generated similar versions of the same story over and over, with only slight alterations. The AI narratives were formulaic, lacked tension and were rife with clichés.
“The characters were flat, generic and unmotivated,” Beguš said.
But there was a surprise.
Early versions of ChatGPT did not indicate whether the humans or their creations were male or female. But newer AI models, like GPT-4, which were built with more information about 21st-century progressive human values, produced more inclusive writing. One-quarter of those stories included same-sex love interests. One even included a polyamorous relationship.
“They paved the way for deeper understanding of love and humanity and what it means to be human,” Beguš said of more recent AI tools.
By comparison, just 7% of human-created stories featured same-sex relationships.
“Large language models mimic human values,” she said. “This paper shows that the values from training data can be overridden by technologists’ choices made during the process of value alignment.”
For Beguš, the intersection of AI and literature predates the popular generative AI hype of the past two years. As far back as 2010, she wondered how the looming AI bonanza might reflect centuries of art and literature. AI was in its infancy then, so she largely shelved the question until 2020, when basic chatbots using AI became more accessible for testing her ideas.
Scholars in the humanities are the wordsmiths, she reasoned. Why shouldn’t they also be part of the exploration of AI?
“In the humanities, for centuries, we have been exploring and have become the experts on language, on writing, on what it means to be human,” Beguš said. “So this all just kind of naturally came together.”
It’s more than just an academic exercise, she said. AI is changing—and has changed—how we interact with writing. Universities are increasingly providing access to ChatGPT and teaching students how to use it effectively in their work. Some professors now include Beguš’s work on AI and the humanities in their course syllabi.
It is important, she said, to explore what role the humanities may play in the future of AI development. Beguš thinks about that often, along with the role AI will play in her own life as a reader and writer.
“I wonder if my grandchildren will be shocked when I tell them, ‘Your grandma used to write from scratch,’” she said. “But then again, writing is such an essential human activity. We have been taught to write since preschool. We connect our thought process with writing.”
That’s why Beguš says it’s essential that scholars from the humanities help develop future AI tools.
“We need quality writers to create quality stories,” she said. “I’m really curious about what insight writers will be able to get from machines, if there’s something that is actually valuable, that is worthwhile. So far, I don’t think there has been much.
“But nonetheless, this technology is transformative of writing.”
More information:
Experimental narratives: A comparison of human crowdsourced storytelling and AI storytelling, Humanities and Social Sciences Communications (2024). DOI: 10.1057/s41599-024-03868-8
Provided by
University of California – Berkeley
Citation:
Storytelling study provides a window into the inner workings, ongoing limitations of generative AI tools like ChatGPT (2024, October 28)