AI Problems Index

AI and Copyright

Examining claims about AI image generation, copyright, and the ethics of machine learning

Original Paper
The claims being examined

Goetze, T. (2024)

"The Greatest Art Heist in History: How Generative AI Steals from Artists"

Key Rebuttal Points
Technical and legal corrections

Models use stochastic latent diffusion, not image databases; they don't store or "steal" images

Style is not protected under copyright law (17 U.S.C. §102(b))

Licensed, consent-based training already exists (Shutterstock-OpenAI, Adobe Firefly)

U.S. Copyright Office has not declared dataset training infringing; fair-use analysis is ongoing

Job-loss figures are speculative; no peer-reviewed study shows 90% displacement

Autoencoders vs. Latent Diffusion
Understanding the technical architecture

The paper incorrectly characterizes diffusion models as systems that "reverse-engineer" captions into images through memorization. This fundamentally misunderstands how these models work.

How Latent Diffusion Actually Works:

  • Models like Stable Diffusion use a denoising diffusion process that iteratively refines random noise in the latent space of a variational autoencoder
  • Sampling is stochastic, not deterministic; the weights do not store JPEGs
  • Quantitative studies find unintentional memorization in ≲0.03% of prompts, not systematic copying
  • DALL-E 2's unCLIP architecture first generates an image embedding and then decodes it; it is not a lookup or "photocopying" process
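The iterative, stochastic character of denoising can be illustrated with a deliberately simplified sketch. This is a toy in one dimension, not a real model: the hypothetical `toy_denoise` function stands in for a learned denoiser, and the "latent" is a single float. The point is that generation starts from pure noise, injects fresh randomness at every step, and so produces a different result on every run, even for the same "prompt" (here, the target value).

```python
import random

def toy_denoise(steps: int = 50, target: float = 1.0) -> float:
    """Toy reverse-diffusion loop: refine pure noise toward `target`.

    The drift term is a stand-in for a learned denoising network;
    Gaussian noise injected each step makes sampling stochastic.
    """
    x = random.gauss(0.0, 1.0)               # initial latent: pure noise
    for t in range(steps, 0, -1):
        noise_scale = t / steps              # injected noise shrinks as t -> 0
        drift = (target - x) / t             # denoising step toward the target
        x = x + drift + random.gauss(0.0, 0.1) * noise_scale
    return x

# Two runs with the same "prompt" yield different samples:
a = toy_denoise()
b = toy_denoise()
print(a, b)
```

Nothing in this loop consults a database of stored examples; the output is synthesized step by step from noise, which is the behavior the bullets above describe.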

Environmental Cost Cherry-Picking
Incomplete analysis of computational impact

The paper selectively highlights environmental costs of AI training without providing important context:

  • No baseline comparison is given (e.g., the energy cost of artist workstations, rendering, and logistics)
  • Latent diffusion models require roughly one-tenth the inference FLOPs of pixel-space diffusion
  • Successive model generations require less energy per image as efficiency improvements compound
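Where the efficiency gain comes from can be shown with back-of-envelope arithmetic. The tensor shapes below are illustrative assumptions (a Stable-Diffusion-style 8x spatial downsampling to a 4-channel latent), not measurements: the denoising network in latent diffusion operates on a compressed latent rather than raw pixels, so each step touches far fewer values.

```python
# Illustrative shapes (assumptions, not measured numbers):
pixel_elems = 512 * 512 * 3   # pixel-space diffusion: full-resolution RGB image
latent_elems = 64 * 64 * 4    # latent diffusion: 8x downsampled, 4-channel latent

ratio = pixel_elems / latent_elems
print(f"~{ratio:.0f}x fewer elements per denoising step")
```

The realized FLOP savings (roughly one-tenth, per the bullet above) is smaller than this raw element ratio because the latent-space network is comparatively wide and the one-time autoencoder decode adds its own cost; the sketch only shows why per-step compute drops so sharply.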

Conclusion

The framing of AI image generation as "theft" relies on technical misconceptions, legal overreach, and flawed analogies to physical property. While legitimate concerns exist around proper attribution, consent, and economic transitions, these are better addressed through targeted policy approaches rather than sweeping moral condemnation. The ongoing development of licensing models, opt-out mechanisms, and artist-centric platforms demonstrates that ethical AI development is possible without abandoning the technology altogether.

Sources

Rombach, R., et al. (2022). "High-Resolution Image Synthesis with Latent Diffusion Models"
Goetze, T. (2024). "The Greatest Art Heist in History: How Generative AI Steals from Artists"