How AI worries writers including Game of Thrones author George R.R. Martin and John Grisham, and why they are suing ChatGPT developer OpenAI

When novelist Douglas Preston first started messing around with ChatGPT, he gave the AI software a challenge: could it write an original poem based on a character from some of his books?

“It came out with this terrific poem written in iambic pentameter,” Preston recalls. The result was impressive – and concerning.

“What really surprised me was how much it knew about this character; way more than it possibly could have gleaned from the internet,” Preston says.

The adventure writer suspected that the chatbot had somehow absorbed his work, presumably as part of the training process by which an artificial intelligence model ingests lots of data that it then synthesises into seemingly original content.

“That was a very disturbing feeling,” Preston says, “not unlike coming home and finding that someone’s been in your house and taken things.”

Those worries led Preston to sign on to a proposed class action lawsuit accusing OpenAI, the developer behind ChatGPT and a major player in the growing AI industry, of copyright infringement.

Preston is joined in the suit by a host of other big-name authors, including John Grisham, Jonathan Franzen, Jodi Picoult and George R.R. Martin – the notoriously slow-to-publish Game of Thrones author who, Preston says, joined out of frustration that fans were using ChatGPT to preemptively generate the last book in his series.

OpenAI, for its part, has contended that training an AI system falls under fair use protections, especially given the extent to which AI transforms the underlying training data into something new.

A spokesman for OpenAI said the firm respects authors’ rights and believes they should “benefit from AI technology”.

“We’re having productive conversations with many creators around the world, including the Authors Guild, and have been working cooperatively to understand and discuss their concerns about AI,” the spokesman said, referring to America’s oldest and largest organisation for published writers.

“We’re optimistic we will continue to find mutually beneficial ways to work together to help people utilise new technology in a rich content ecosystem.”

Nevertheless, the publishing industry is pushing back as it reckons with a software boom that’s given anyone with Wi-Fi the power to automatically generate reams of text.

In addition to Preston’s suit, various other groups of authors are pursuing their own proposed class action suits against OpenAI.

“Everybody’s realising to what extent their data, their information, their creativity, has been absorbed,” says Ed Nawotka, an editor at American trade news publication Publishers Weekly. There is, in the industry, a degree of “abject panic”, he says.

In one recent pair of lawsuits, American comedian and actress Sarah Silverman accused OpenAI as well as Meta – Facebook’s parent company and a major AI developer itself – of copyright infringement. The two companies have since pushed to get most of the claims in Silverman’s cases dismissed.

A different suit recently found Paul Tremblay (The Cabin at the End of the World) and Mona Awad (Bunny) suing OpenAI for copyright violations – the company is trying to get that one mostly dismissed too – while Michael Chabon (The Yiddish Policemen’s Union) is a plaintiff in two additional legal actions that are targeting OpenAI and Meta, respectively.

And in July, the Authors Guild – a professional trade group, not a labour union – sent several technology companies an open letter calling for consent, credit and fair compensation when writers’ works are used to train AI models.

Among the signatories were Margaret Atwood, Dan Brown, James Patterson, Suzanne Collins, Roxane Gay and Celeste Ng.

That’s all on top of the nearly five-month strike by Hollywood screenwriters, which recently ended with, among other things, new rules on the use of AI for script generation. A separate strike, still ongoing, has found screen actors rallying around AI concerns of their own.

The lawsuit in which Preston is involved, which features 17 other named plaintiffs including the Authors Guild, claims that OpenAI copied the authors’ works “without permission or consideration” to train AI programs that now compete with those authors for readers’ time and money.

The suit also takes issue with ChatGPT’s generation of derivative works, or “material that is based on, mimics, summarises, or paraphrases [the] Plaintiffs’ works, and harms the market for them”.

The plaintiffs are seeking damages for their lost licensing opportunities and “market usurpation”, as well as an injunction against future such practices, on behalf of American fiction authors whose copyrighted works were used to train OpenAI software.

“They didn’t ask our permission, and they aren’t compensating us,” Preston says of OpenAI. “What they’ve done is created a very valuable commercial product which can reproduce our voices. … It’s basically theft of our creative work on a grand scale.”

Since the plaintiffs’ books aren’t freely available on the open web, he adds, OpenAI “almost certainly” accessed them via alleged piracy sites such as the file-sharing platform LibGen.

OpenAI declined to say whether the plaintiffs’ books were part of ChatGPT’s training data or were accessed via file-sharing sites such as LibGen. In a statement to the US Patent and Trademark Office cited in the Authors Guild suit, OpenAI stated that modern AI systems are sometimes trained on publicly available data sets that include copyrighted works.

Michael Connelly, the author of the Harry Bosch series of crime novels and another plaintiff in the Authors Guild lawsuit, frames those concerns as a matter of control: “control of your own work, your own property.”

Connelly never got to decide whether his books would be used to train an AI, he says, but if he’d been asked – even if there were money on the table – he would probably have opted out.

The idea of ChatGPT writing an unofficial Bosch sequel strikes him as a violation; even when Amazon adapted the series into a TV show, he says, he had some control over the scripts and casting.

“These characters belong to us,” Connelly says. “They come out of our heads. I even put stuff in my will about [how] no other author can carry the Harry Bosch torch after I’m gone. He’s mine, and I don’t want anyone else telling his story. I certainly don’t want a machine telling it.”

But whether the law will allow the machines to do so is a different question.

The various lawsuits against OpenAI allege copyright violations. But copyright law – and especially fair use, the area of law governing when copyrighted work can be incorporated into other endeavours, such as for the sake of education or criticism – still doesn’t offer a cut-and-dried answer to how these lawsuits will shake out.

“We’ve got kind of a push and pull right now in the case law,” says intellectual property lawyer Lance Koonce, a partner at the law firm Klaris, pointing to two recent US court rulings that offer competing models of fair use.

In one, Authors Guild vs. Google, a federal appeals court held that Google was allowed to digitise millions of copyrighted books to make them searchable.

In the other, Andy Warhol Foundation for the Visual Arts Inc. vs. Goldsmith, the US Supreme Court found that the titular pop artist’s incorporation of a photographer’s work into his own art didn’t fall under fair use because Warhol’s art was commercial and had the same basic purpose as the original photo.

“These AI cases – and especially the Authors Guild case [against OpenAI] – fall into that tension,” Koonce says.

In its patent office statement, OpenAI argued that training artificial intelligence software on copyrighted works “should not, by itself, harm the market for or value of copyrighted works” because the works are being consumed by software rather than real people.

Outside of legal avenues, stakeholders are already pitching solutions to this tension.

Suman Kanuganti, the chief executive of AI messaging platform Personal.ai, says the tech industry will probably adopt some sort of attribution standard that allows people who contribute to an AI’s training data to be identified and compensated.

“Once you build the models with known, authenticated data units, then technologically, it’s not a challenge,” Kanuganti says. “And once you solve that problem … the economic association then becomes easier.”

Preston, the adventure novelist, agrees that there may yet be a path forward.

Licensing books to software developers through a centralised clearing house could provide authors with a new income stream while also securing high-quality training data for AI companies, he says, adding that the Authors Guild tried to set up such an arrangement with OpenAI at one point but that the two sides were unable to reach an agreement.

“We were trying to get them to sit down with us in good faith; we’re not opposed at all to AI,” Preston says. “It’s not a zero-sum game.”
