AI vs Artists: Copyright Showdown
Plus: Reddit sues AI, Meta automates moderation, and more.
Midweek Menu
🎨 Artists vs AI in court, and what Getty’s new lawsuit means for creative rights.
🗳️ Should AI companies pay artists for training data, or is “fair use” fair enough?
🧠 Lawsuits, UK privacy shakeups, and the data that fuels your favorite AI.
🛠️ A builder’s snack on creative hustle, plus why originality still matters.

Your next AI breakthrough starts with better hiring.
The gap between AI ambition and real outcomes often comes down to hiring. Athyna bridges that gap—with top-tier AI/ML professionals sourced from emerging markets and vetted for excellence.
From core engineers to prompt experts to research talent, we deliver candidates in under 5 days and support the entire process: brief, onboarding, contracts, and beyond.
No upfront fees. Just scalable, enterprise-grade hiring built for serious AI teams that need to move now.
*This is sponsored content

The Laboratory
AI vs artists: The battle for originality
If you take the saying “there is nothing new under the sun” seriously, it suggests that everything has been done or experienced before: originality is hard to achieve, and history repeats itself. The biblical saying leaves us wondering what exactly art is. Is it a faithful reproduction of something older, a new perspective on familiar things, or just a remake of something someone else created?
Whatever it may be, the saying came to mind as I wondered about the impact of AI on artists, and whether a picture, a song, or an article written by AI can be considered art. Does it violate the rights of human artists, given that AI uses their work to create something “new”? The thought was triggered by the news that Getty’s copyright lawsuit against Stability AI is set to start in the UK.
The lawsuit is not the first one to challenge the use of copyrighted work by AI companies to train models, but it could set the tone for future legislation and the evolution of copyright law.
The debate on copyright and AI
Copyright law was first established in 1710 to deal with the fallout of the invention of the printing press. At the time, the law aimed to protect publishers against unauthorised publication while encouraging learning. Since then, copyright law has been amended and evolved as new technologies have emerged, from the printing press to the photocopier, recording devices, and the Internet.
Since the launch of OpenAI’s ChatGPT, the use of copyrighted works such as books, news articles, and images to train models has resulted in several legal cases. At their core, all of them struggle to determine whether using copyrighted works to train and improve AI models for the benefit of their users trumps the right of artists not to have their work fed into a machine, and if so, whether artists should be compensated for it.
The more prominent of these cases are:
Getty Images’ lawsuit against Stability AI in the U.K. for using millions of its images to train Stable Diffusion without a license.
Reddit’s lawsuit against Anthropic for allegedly scraping over 100,000 pages of Reddit content to train Claude.
The New York Times’ lawsuit against OpenAI and Microsoft in late 2023 for using its articles to train AI, claiming it competes directly with its products.
A group of authors filed a lawsuit against Meta, alleging the company used pirated copies of their books to train its LLaMA model.
But this brings us to the question of why AI companies need so much data to train their models.
Why AI models hunger for human data
Training AI models requires vast amounts of data, which could include millions of lines of text, images, or transaction records. These are used as “experiences” from which the AI learns patterns. Simply put, data is AI’s teacher. But like any good teacher, it matters what data you’re feeding the model.
This brings us to the problem of finding clean data to train models on. Clean data here means data free of typos, duplicates, mislabeled items, and missing values. The cleaner the data AI models are trained on, the better and more reliable their performance will be.
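To make the idea of “cleaning” concrete, here is a minimal sketch of the steps just described, written against a hypothetical dataset of labeled text records (the field names `text` and `label` are made up for illustration):

```python
# Minimal illustration of basic data cleaning: drop records with
# missing values, normalize whitespace and label casing, and
# remove duplicates. Field names here are hypothetical.

def clean_records(records):
    seen = set()
    cleaned = []
    for rec in records:
        # Drop records with a missing text or label
        if not rec.get("text") or not rec.get("label"):
            continue
        # Normalize whitespace in the text and casing in the label
        text = " ".join(rec["text"].split())
        label = rec["label"].strip().lower()
        # Skip exact duplicates (after normalization)
        key = (text, label)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"text": text, "label": label})
    return cleaned

raw = [
    {"text": "A  sunny   day", "label": "Positive"},
    {"text": "A sunny day", "label": "positive"},  # duplicate once normalized
    {"text": "", "label": "negative"},             # missing text
    {"text": "Rainy morning", "label": None},      # missing label
]
print(clean_records(raw))  # only the first record survives
```

Real pipelines use far more sophisticated filtering, but the principle is the same: every duplicate, gap, or mislabel that survives cleaning is a bad “lesson” the model will learn.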
Examples of how a lack of clean data can wreak havoc on AI models were seen in 2016, when Tay, Microsoft’s AI chatbot designed to develop conversational understanding by interacting with humans, started spewing lewd and racist tweets. In business, muddy data can lead to wasted marketing budgets, faulty customer insights, or regulatory penalties that challenge the financial viability of developing AI models.
AI models, especially large language models (LLMs) and image generation systems, often rely on copyrighted materials for training because these materials represent a large portion of high-quality, human-created content available on the internet.
What human creators stand to lose
For artists, musicians, and writers, the stakes in the AI copyright debate have become deeply personal. Many view their years, sometimes decades, of creative work as being reduced to raw material for AI systems, without any consent, credit, or compensation.
Famous writers like George R.R. Martin and comedian Sarah Silverman joined lawsuits against companies like OpenAI and Meta, alleging that these companies used copyrighted books without permission to train large language models.
These cases have since been consolidated before a New York federal court in April 2025, merging claims from Silverman, The New York Times, and other authors, a move aimed at streamlining litigation and setting clear legal standards.
And it is not just individual writers: unions representing screenwriters and journalists have also entered the fray. Press organizations have warned that AI-generated content could flood the media landscape, diluting journalistic integrity and undermining livelihoods.
AI firms push back with policy shifts
AI companies have responded to copyright infringement allegations with a mix of legal defenses, policy changes, and industry partnerships.
Most AI companies, including OpenAI, Meta, and Google, have argued that their use of copyrighted material falls under “fair use”, a legal doctrine that allows for limited use of copyrighted material without permission for purposes like commentary, research, or education.
AI firms also assert that their models don’t “store” the original copyrighted content, but rather extract patterns and representations that allow them to generate new, not copied, outputs.
Some companies, in response to legal pressure, have also shifted toward licensing deals to secure access to high-quality content.
Another strategy adopted by AI companies is to provide opt-out provisions that allow creators to stop their data from being scraped for use in training AI models. However, not all these moves have panned out the way AI companies or creators would like.
Recently, it was reported that Google didn’t want to give publishers the choice to keep their content out of AI Search results. The news came after Google had discussed offering publishers more granular control over how website data would be used in AI Search features, instead of the illusion of choice they eventually received.
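In practice, these opt-outs usually work at the crawler level: a publisher lists AI training crawlers in the site’s robots.txt file and disallows them. A sketch of what such a file might look like, using the publicly documented user-agent names for OpenAI’s, Anthropic’s, and Google’s AI crawlers (note that compliance is voluntary, which is part of what creators object to):

```text
# Block common AI training crawlers (user-agent names as publicly documented)
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

# Google-Extended controls use of content for Google AI training,
# without affecting normal search indexing
User-agent: Google-Extended
Disallow: /

# Everything else (including regular search crawlers) stays allowed
User-agent: *
Allow: /
```

A robots.txt file is a request, not an enforcement mechanism, which is why licensing deals and lawsuits remain the sharper tools in this fight.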
Getty case signals a larger battle
The fight between AI and copyright isn’t just a legal skirmish; it’s a battle over how we define creativity in the digital age. As machines learn from human expression, the question isn’t only whether they can replicate it, but whether they should. Lawsuits from authors, publishers, and platforms like Getty Images aren’t just about royalties; they’re about consent, recognition, and the value of creative labor.
AI companies, meanwhile, argue that without access to human-made content, innovation would stall. The courts will decide the legality, but the larger societal reckoning over who gets to shape the future of art, literature, and journalism is just beginning. For now, human creators are drawing a line in the sand, demanding not just compensation, but a voice in shaping the tools that will inevitably redefine the creative process. Whether that voice is heard may determine what “originality” means in the age of artificial intelligence.


Wednesday Poll
🎨 Should AI Companies Pay Artists for Training Data?

Quick Hits
Reddit filed a landmark lawsuit against Anthropic, claiming the AI startup scraped 100,000+ pages of Reddit content to train its Claude models. The outcome could redefine how AI companies access social data.
Meta plans to automate up to 90% of its privacy and safety risk assessments across platforms like Facebook and Instagram, raising concerns among UK safety campaigners.
Apple unveils new visionOS features at WWDC. Spatial widgets, lifelike Personas, and more—the latest updates make Apple’s mixed-reality OS smarter and more immersive.

Level up your brand vision using The Swipe’s curated classics.
Feel stuck on your latest campaign? Get inspired by the wildest brand experiments captured in The Swipe, from comedic guerrilla marketing to clever influencer collaborations.
Discover how witty one-liners, stunning visuals, or interactive elements make a lasting impact in today’s crowded marketplace. Our curated approach highlights proven tactics with a dash of risk-taking, showing you how to capture attention while staying authentic.
With The Swipe, brainstorming your next winning idea just got easier.
*This is sponsored content

Brain Snack for the Builders
💡 If your AI tool can’t explain what makes it unique in a single sentence, you’ve got a feature, not a product. TL;DR: Ditch the jargon, find the hook, and let your users finish the story.

Meme of the Day


Rate this edition
What did you think of today's email?
