Ethics & Responsibility

Copyright and IP Concerns

The unresolved legal questions about training data and AI-generated content

What it is

LLMs are trained on vast amounts of copyrighted content (books, articles, code, art), much of it used without explicit licensing from creators. Whether this constitutes fair use or copyright infringement is being actively litigated (e.g., The New York Times v. OpenAI, Getty Images v. Stability AI).

On the output side: the US Copyright Office has stated that purely AI-generated works lack the human authorship required for copyright protection, though works with significant human creative input may qualify.

Several related issues remain open. Models can reproduce training data verbatim (memorization), generated code might reproduce GPL-licensed code and raise license-compliance questions, and the legal status of AI-assisted creative work is unsettled.
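To make the memorization concern concrete, here is a minimal sketch of how one might flag verbatim overlap between a model's output and a known reference text. It is an illustration under stated assumptions, not a legal test: the function names (`ngrams`, `verbatim_overlap`), the whitespace tokenization, and the choice of `n` are all hypothetical, and a high score only suggests that a passage deserves human review.

```python
def ngrams(tokens, n):
    """Return the set of all n-token windows in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def verbatim_overlap(generated, reference, n=8):
    """Fraction of n-grams in `generated` that also appear in `reference`.

    Assumption: lowercased whitespace tokenization is good enough for a
    rough check. Returns 0.0 when `generated` is shorter than n tokens.
    """
    gen = ngrams(generated.lower().split(), n)
    if not gen:
        return 0.0
    ref = ngrams(reference.lower().split(), n)
    return len(gen & ref) / len(gen)


if __name__ == "__main__":
    reference = "the quick brown fox jumps over the lazy dog near the river bank"
    copied = "the quick brown fox jumps over the lazy dog near the river bank"
    original = "a slow green turtle crawls under the busy bridge toward the sea"
    print(verbatim_overlap(copied, reference, n=4))
    print(verbatim_overlap(original, reference, n=4))
```

In practice a check like this would run against a corpus of license-sensitive text or code rather than a single string, and the threshold for escalating a match to review is a policy decision rather than something the score settles on its own.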

Why it matters

For AI products in commercial contexts, copyright concerns affect what training data can be used, what outputs can be commercially exploited, and what disclosures clients need. Enterprises increasingly want AI vendors to indemnify them against IP claims. Understanding the legal landscape helps you advise clients on risk and make informed decisions.

Resources

Who Owns the Output? AI Copyright & IP Explained w/ Chris Paniewski

youtube.com · Practical startup-oriented discussion of AI copyright and IP issues with legal expert Chris Paniewski. Covers output ownership and liability.

15 min

'No More Copyright Protection For Anyone': Author David Baldacci Rips Big Tech Over AI Copyright

youtube.com · Bestselling author's perspective on how AI threatens creative professionals. Strong on the content creator side of the debate.

10 min

Generative AI Has a Visual Art Problem

theverge.com · Covers the Getty Images and artist lawsuits against Stability AI. Strong on the visual art side of the copyright debate.

10 min

copyright.gov · Official U.S. government guidance on AI-generated works and copyright. Primary source for understanding the legal framework.

10 min

brookings.edu · Policy analysis arguing that copyright law alone is insufficient to address AI's impact on creative industries. Proposes broader policy solutions.

12 min