When Grok Becomes A Liability
Plus: Red-teaming kits, Games Workshop’s AI ban, and unemployed training bots.
Here’s what’s on our plate today:
🧵 Grok’s Christmas fiasco and the hidden costs of loose guardrails.
🧠 DC chips bill, Warhammer’s AI ban, and unemployed training bots.
🧪 Three real tools to stress-test, monitor, and harden your models.
🗳️ This week’s poll: Do you trust “edgy” models at work?
Let’s dive in. No floaties needed…

Introducing the first AI-native CRM
Connect your email, and you’ll instantly get a CRM with enriched customer insights and a platform that grows with your business.
With AI at the core, Attio lets you:
Prospect and route leads with research agents
Get real-time insights during customer calls
Build powerful automations for your complex workflows
Join industry leaders like Granola, Taskrabbit, Flatfile and more.
*This is sponsored content

The Laboratory
The Grok episode and the hidden costs of loose AI guardrails
Over the past couple of years, artificial intelligence has gone from a technological curiosity to a topic of everyday conversation. While models evolved to create music, generate images, and promise big gains in productivity, regulators debated how best to balance oversight with innovation.
While that debate raged on, the task of minimizing the misuse of AI models was largely left to developers. But as social media platforms have already shown, balancing rapid innovation with user safety is a daunting task.
OpenAI learnt this the hard way. The company is currently facing multiple lawsuits alleging its chatbots pushed users toward suicide. Similar lawsuits have been filed against Character.AI, and against Google in its role as a platform partner.
While both companies tried to make amends by introducing stricter policies to safeguard underage users, many believe the policies either came too late or do not do enough to push companies to make their products safer for users and non-users alike.
Grok’s Christmas controversy
The latest example comes from an unexpected direction: Grok, the AI chatbot developed by Elon Musk's xAI and integrated into the social media platform X.
What began as a feature launch around Christmas 2025 quickly devolved into what Reuters termed a "mass digital undressing spree," as users discovered they could manipulate photographs of women and children into sexualized imagery without consent.
The controversy culminated in an unusual moment: Grok itself posted an apology to X on December 28, acknowledging it had generated an AI image of "two young girls (estimated ages 12-16) in sexualized attire" based on a user's prompt. The bot admitted this "violated ethical standards and potentially US laws on CSAM."
When journalists contacted xAI for comment, they received an automated reply: "Legacy Media Lies."
The feature that opened Pandora's box
The technical trigger was straightforward. On Christmas Day, X rolled out an "Edit Image" button that allowed any user to modify photos through text prompts directed at Grok, without requiring permission from whoever originally posted the image.
Within days, a pattern emerged: users were replying to photos of women and girls with prompts like "put her in a bikini," "remove her clothes," or "turn her around."
Grok, which xAI had deliberately positioned as a less restrictive alternative to competitors like ChatGPT, complied. A report from Reuters documented 102 such attempts in a single 10-minute window, with the bot fully complying about 21 times and partially complying seven more. The requests frequently targeted young women, with some prompts explicitly requesting transparent or minimal clothing.
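To make concrete what was missing here, below is a minimal, purely hypothetical sketch of the kind of pre-generation check that could sit in front of an image-edit feature like this. The request fields, blocklist, and helper names are illustrative assumptions, not a description of xAI's actual pipeline.

```python
# Hypothetical sketch of a pre-generation guardrail for an "edit this image"
# feature. All names, categories, and rules are illustrative -- this is not
# xAI's pipeline.
from dataclasses import dataclass

BLOCKED_TERMS = {"undress", "remove her clothes", "bikini", "topless", "nude"}

@dataclass
class EditRequest:
    prompt: str             # text instruction sent to the model
    requester_id: str       # user asking for the edit
    image_owner_id: str     # user who originally posted the image
    subject_is_minor: bool  # output of an upstream age-estimation step

def moderate_edit_request(req: EditRequest) -> tuple[bool, str]:
    """Return (allowed, reason). Runs before any image is generated."""
    # 1. Consent: only the person who posted the image may request edits of it.
    if req.requester_id != req.image_owner_id:
        return False, "no consent from the image owner"
    # 2. Minors: refuse any edit of images that appear to depict children.
    if req.subject_is_minor:
        return False, "subject appears to be a minor"
    # 3. Prompt screen: crude keyword check; a production system would use a
    #    trained classifier rather than a blocklist.
    lowered = req.prompt.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return False, "sexualizing prompt"
    return True, "ok"

if __name__ == "__main__":
    req = EditRequest("put her in a bikini", "user_b", "user_a", False)
    print(moderate_edit_request(req))  # (False, 'no consent from the image owner')
```

Even a check this crude would have stopped the prompts Reuters documented; the specific rules matter less than the fact that the check runs before anything is generated.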
For the women targeted, the experience was visceral. Samantha Smith told the BBC she felt "dehumanised and reduced into a sexual stereotype." Julie Yukari, a Rio de Janeiro musician, posted a photo of herself in a red dress curled up with her cat on New Year's Eve. By the next day, users were prompting Grok to digitally undress her.
A pattern, not an accident
This episode was not Grok’s first brush with controversy. Since xAI launched Grok in 2023, the company has deliberately framed it as the rebellious alternative in a market dominated by what it portrays as overly cautious, tightly controlled AI systems. Grok was marketed less as a safe assistant and more as a chatbot willing to say and show what others would not.
That positioning has repeatedly brought problems. In August 2025, xAI rolled out “Spicy Mode” for Grok Imagine, its image and video generation tool. The feature was meant to underline Grok’s anything-goes ethos, but it quickly crossed lines that most AI companies have spent years trying to avoid.
As The Verge reported, Spicy Mode generated sexually explicit content, including nude deepfakes of celebrities, sometimes without users explicitly asking for it. In one test, the tool produced an uncensored topless video of Taylor Swift on the very first use.
What made the situation more striking was the contradiction at the heart of xAI’s own rules. The company’s acceptable use policy explicitly bans “depicting likenesses of persons in a pornographic manner.” Yet Spicy Mode was clearly designed, promoted, and differentiated on the basis that it could do exactly that. The appeal was not accidental. That was the point.
Earlier in 2025, Grok generated comments referencing “white genocide” in South Africa, posted antisemitic content that praised Adolf Hitler, and was blocked by a Turkish court after producing vulgar responses about the country’s president. Each incident sparked outrage, headlines, and temporary fixes. None appeared to trigger a deeper rethink of how the system was designed or governed.
Taken together, these moments reveal a consistent pattern. While competitors poured resources into safety research, moderation teams, and increasingly strict guardrails, xAI chose a different path. It bet on standing out through permissiveness, speed, and shock value. Grok was allowed to go further, faster, and with fewer constraints, even when that meant flirting with legal, ethical, and cultural boundaries.
That strategy may have helped Grok grab attention in a crowded AI market, but attention is not the same as trust. As the latest controversy shows, the costs of this approach compound over time. Each new incident makes it harder to argue that these are edge cases or unforeseen bugs. They look instead like predictable outcomes of a system built to push limits first and ask questions later.
When permissiveness meets scale
Now, that long-running bet has collided with reality. Regulators, users, advertisers, and enterprise customers are far less tolerant of “edgy” experimentation when it results in sexualized images of real people, hate speech, or content that can trigger legal action across jurisdictions.
And there is data to back their concerns. According to the Internet Watch Foundation, a nonprofit that identifies child sexual abuse material online, confirmed AI-generated CSAM rose 400% in the first six months of 2025 compared with the same period in 2024.
The number of AI-generated videos of child sexual abuse rocketed from just two in the first half of 2024 to 1,286 in the first half of 2025. All 1,286 videos were realistic enough to be treated under UK law as if they were genuine footage.
What this means for enterprises
And while Grok’s misuse is only a small slice of the overall problem, it raises an important question: are these incidents lapses in oversight, or a deliberate attempt to grab attention?
If they are oversights, then model developers need to answer hard questions about their ability to control outputs, especially now that most of them are courting enterprise clients as their main customer base.
And if they are not oversights, the question becomes whether regulators are willing to slow the pace of development in the name of user safety.
Either way, the Grok controversy is a reminder that enterprises need to be careful when relying on AI models. Elon Musk's quest for free speech may help Grok gain visibility, but enterprises cannot afford to deploy models that can go off the rails.
Ultimately, the Grok episode is not about a single feature gone wrong, but about the risks embedded in how AI systems are designed, positioned, and governed.
For enterprises, the lesson is clear: permissiveness is not a feature; it is a liability. As AI becomes deeply embedded in customer-facing products, internal workflows, and decision-making systems, the cost of failure shifts from embarrassment to legal exposure, regulatory scrutiny, and lasting brand damage.
In that context, trust, accountability, and control matter far more than shock value. The next phase of AI adoption will not be won by the loudest models, but by those that can be relied on when things go wrong.


Quick Bits, No Fluff
Chip access clampdown: U.S. lawmakers push a bill to curb China’s remote access to advanced American AI chips.
Warhammer draws a line: Games Workshop bans AI in its creative pipeline while reporting record half-year profits.
Teaching your replacement: Startup Mercor pays unemployed workers to train AI on their jobs, raising obvious displacement questions.

Build your startup on Framer—Launch fast. Design beautifully.
First impressions matter. With Framer, early-stage founders can launch a beautiful, production-ready site in hours. No dev team, no hassle. Join hundreds of YC-backed startups that launched here and never looked back.
One year free: Save $360 with a full year of Framer Pro, free for early-stage startups.
No code, no delays: Launch a polished site in hours, not weeks, without hiring developers.
Built to grow: Scale your site from MVP to full product with CMS, analytics, and AI localization.
Join YC-backed founders: Hundreds of top startups are already building on Framer.
Eligibility: Pre-seed and seed-stage startups, new to Framer.
*This is sponsored content

Thursday Poll
🗳️ If you were choosing a model for a customer-facing product, what’s the hard line?

3 Things Worth Trying
Patronus AI: Stress-test your LLMs for safety, hallucinations, and policy violations.
Hive Moderation: AI APIs to flag nudity, sexual content, deepfakes, hate, and violence (a short usage sketch follows below).
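For a feel of how the second of these drops into a pipeline, here is a short Python sketch that screens an image URL with Hive's synchronous moderation endpoint before it reaches users. The endpoint, `Token` auth header, and response field names are stated from memory of Hive's public v2 docs, so treat them as assumptions and verify against the current API reference.

```python
# Minimal sketch: screen an image URL with Hive's moderation API before
# showing it to users. Endpoint, auth scheme, and response shape are assumed
# from Hive's v2 "sync task" docs -- confirm against the current reference.
import os
import requests

HIVE_SYNC_URL = "https://api.thehive.ai/api/v2/task/sync"  # assumed endpoint
API_KEY = os.environ["HIVE_API_KEY"]

def flag_image(image_url: str, threshold: float = 0.9) -> list[str]:
    """Return the moderation classes that score above `threshold`."""
    resp = requests.post(
        HIVE_SYNC_URL,
        headers={"Authorization": f"Token {API_KEY}"},
        data={"url": image_url},
        timeout=30,
    )
    resp.raise_for_status()
    body = resp.json()
    # Response nesting is assumed; adjust to whatever the live API returns.
    classes = body["status"][0]["response"]["output"][0]["classes"]
    return [c["class"] for c in classes if c["score"] >= threshold]

if __name__ == "__main__":
    hits = flag_image("https://example.com/user_upload.jpg")
    print("blocked:" if hits else "clean", hits)
```

The design point is the same as in the Grok story above: the check sits in front of publication, not behind an apology.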
Rate This Edition
What did you think of today's email?





