If you ask GPT-4 to write a paragraph in the style of Carmen Machado, Margaret Atwood, or Alexander Chee, it will do it well, and for good reason: it has most likely absorbed all of their work during training and is now applying their brilliance to its own output. But these writers, and thousands of others, are not happy about it.
In an open letter signed by more than 8,500 authors of fiction, nonfiction, and poetry, the tech companies behind large language models such as ChatGPT, Bard, and LLaMa are accused of using writers' work without permission or compensation.
"These technologies mimic and replicate our language, stories, styles, and ideas. Millions of copyrighted books, articles, essays, and poems provide 'food' for AI systems, an endless 'smorgasbord' that generates no bill," the letter reads.
Although the AI systems have been shown to quote and parody the works of the authors in question, their developers have not substantively addressed where those works came from. Were the models trained on samples culled from bookstores and reviews? Did the companies borrow each book from the library? Or did they just download a bunch of illegal archives, like Libgen?
One thing's for sure: they didn't go to the publishers to get authorization - no doubt the preferred method, and arguably the only legal and ethical one. The authors write:
"Not only does the Supreme Court's recent decision in Warhol v. Goldsmith make it clear that AI use is highly commercial, which is antithetical to fair use, but no court would excuse copying a work from an illegal source as fair use. By embedding our work into your systems, generative AI could harm our profession by flooding the market with mediocre machine-written books, stories, and news based on our work."
In fact, we've already seen this happen. Recently, a number of extremely low-quality AI-generated works have made it onto Amazon's Young Adult Literature bestseller list; publishers are being inundated with mass-generated works; and every day, the content of this website (and soon this post) is being scraped in order to be adapted for use in search engine optimization.
These malicious actors are using tools, APIs, and proxies developed by companies such as OpenAI and Meta, which in this case are arguably malicious actors themselves. After all, who else would knowingly steal millions of works to power a new commercial product? (There's Google, of course - but there's a fundamental difference between search indexing and AI ingestion, and Google Books at least has the excuse that it's supposed to be a specialized index.)
The open letter warns that fewer and fewer authors are able to make a living from their writing because of the complexities and narrow profit margins of mass publishing, and that the situation is untenable for newer authors, "particularly younger authors and voices from underrepresented groups."
The letter asks these companies to do the following:
1. Obtain permission to use our copyrighted material in generative AI programs.
2. Fairly compensate writers for the past and ongoing use of our work in generative AI programs.
3. Fairly compensate writers for the use of our works in AI outputs, whether or not those outputs are infringing under current law.
There is no threat of legal action here - as Authors Guild CEO (and signatory) Mary Rasenberger told NPR, "Litigation is a huge expense. Litigation takes a long time." And AI is hurting authors now.
What company is going to be the first to say, "Yes, our AI is based on stolen work, and we're sorry, and we're going to pay for it"? There seems to be little incentive to do so. Most people neither realize nor care that LLMs may have been built by illegal means, or that they may actually contain and reproduce copyrighted works. The (very similar) problem is easier to see with image generators, where the output visibly reproduces an artist's distinctive style, and there has been some backlash there.
But the subtle danger of using all of George Saunders' or Diana Gabaldon's work as "food" for artificial intelligence may not spur as many people to action - although there are many authors who are ready to fight.