Lights, Camera, AI
Plus: Tesla’s robot army, Microsoft’s new AI Copilot, and OpenAI’s latest Mac buy.
Here’s what’s on our plate today:
🎥 AI video tools reshape media, advertising, and entertainment.
🧠 OpenAI buys Sky, Tesla teases robots, Microsoft ships Mico.
💡 AI video: Time-saver, quality boost, or novelty trap?
🗳️ Should AI video generators be clearly labeled?
Let’s dive in. No floaties needed…

Lower your taxes and stack bitcoin with Blockware.
What if you could lower your tax bill and stack Bitcoin at the same time?
Well, by mining Bitcoin with Blockware, you can. Bitcoin miners qualify for 100% Bonus Depreciation, meaning every dollar you spend on mining hardware can be used to offset income in a single tax year.
Blockware's Mining-as-a-Service enables you to start mining Bitcoin without lifting a finger.
You get to stack Bitcoin at a discount while also saving big come tax season.
*This is sponsored content

The Laboratory
How AI video generators are transforming media, advertising, and creativity
In 2023, a global survey found that 85% of people were worried about the impact of online disinformation, and 87% believed it had already harmed their country’s politics. At the time, the United Nations expressed concern about false information and hate speech, stating that their amplification by social media platforms posed “major risks to social cohesion, peace and stability”.
AI video generators were making significant strides in output quality at the time, but trade-offs remained. The field had moved past the early experimental phase of the late 2010s and early 2020s, yet models still struggled to produce complex sequences involving body motion, object manipulation, or cinematic storytelling, showing visible artifacts such as inconsistent lighting, awkward gestures, or distorted backgrounds. While suitable for marketing and educational purposes, they had yet to achieve the fidelity required for professional film or television production.
However, just two years later, all this had changed. By July 2025, Netflix had used generative AI in one of its TV shows, a first for a mainstream production: a visual-effects shot of a building collapsing in Buenos Aires. That shot was produced about ten times faster than traditional methods would have allowed, and at a much lower cost.
Just months later, OpenAI released the Sora 2 video generator, and within five days of launch its app reached a million downloads, despite being invite-only.
So, how is it that AI models like Sora and Veo 3 can create short video clips that look almost real, with coherent frames, lighting, motion, and even lip-synced audio?
How do AI video generators work?
At its core, modern AI video generation stitches together two ideas: diffusion models and transformers. The two are bridged by what are known as latent representations.
To make video generation possible, diffusion models are trained to reverse a noising process: you take a clear image, add noise, add more, until it’s just a mess; the model learns to walk backward from mess to image. To generate a video, that inference must happen across a sequence of frames, but without letting things fall apart from frame to frame.
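For the technically curious, here is roughly what that training step looks like in PyTorch. This is a toy sketch under simplified assumptions: a made-up cosine noise schedule, raw pixel frames rather than latents, and a stand-in `model`, not any production system’s actual recipe.

```python
import torch
import torch.nn.functional as F

# Toy sketch of the diffusion training objective (hypothetical shapes and
# schedule). Real video models run this on compressed latents, not pixels.
def diffusion_training_step(model, clean_frames, num_steps=1000):
    # clean_frames: (batch, frames, channels, height, width)
    b = clean_frames.shape[0]
    # 1. Pick a random noise level t for each clip in the batch.
    t = torch.randint(0, num_steps, (b,))
    # Cosine schedule: alpha_bar ~1 means nearly clean, ~0 means pure noise.
    alpha_bar = torch.cos((t.float() / num_steps) * torch.pi / 2) ** 2
    a = alpha_bar.view(b, 1, 1, 1, 1)
    # 2. "Add noise, add more": corrupt the clean frames toward a mess.
    noise = torch.randn_like(clean_frames)
    noisy = a.sqrt() * clean_frames + (1 - a).sqrt() * noise
    # 3. The model learns to walk backward: predict the noise that was added.
    predicted_noise = model(noisy, t)
    return F.mse_loss(predicted_noise, noise)
```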
That’s where transformers lend their strength: they track long sequences and enforce consistency over time, so the subject you asked for in frame 1 still has the same attributes in frame 30.
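Here is a minimal sketch of that temporal bookkeeping, assuming the frames have already been turned into per-frame feature tokens; the class name and shapes are illustrative, not taken from any real model.

```python
import torch
import torch.nn as nn

# Hypothetical temporal attention block: every frame's tokens attend to
# every other frame's, which is what keeps the subject in frame 1
# consistent with the subject in frame 30.
class TemporalAttention(nn.Module):
    def __init__(self, dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        # x: (batch, frames, tokens_per_frame, dim)
        b, f, n, d = x.shape
        # Fold spatial tokens into the batch so attention runs across time only.
        x = x.transpose(1, 2).reshape(b * n, f, d)
        out, _ = self.attn(x, x, x)  # each frame attends to all frames
        return out.reshape(b, n, f, d).transpose(1, 2)
```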
Of course, the generator also needs a prompt to know what to draw, so it is paired with a text encoder, often a large language model. The pair is trained on a large dataset of captioned images and videos, so the system learns the relationship between words and visuals and keeps steering the diffusion model’s output into alignment with the text input.
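One common way to wire the text in is cross-attention, sketched hypothetically below: queries come from the video features, keys and values from the prompt embeddings, so the prompt keeps nudging the frames toward what was asked for. Real systems differ in where and how often such blocks appear.

```python
import torch
import torch.nn as nn

# Hypothetical cross-attention block for text conditioning.
class TextCrossAttention(nn.Module):
    def __init__(self, dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, video_tokens, prompt_tokens):
        # video_tokens: (batch, n_video, dim) from the denoiser
        # prompt_tokens: (batch, n_text, dim) from the text encoder / LLM
        out, _ = self.attn(video_tokens, prompt_tokens, prompt_tokens)
        return video_tokens + out  # residual: the text nudges the video features
```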
However, making AI videos takes a huge amount of computing power because each video frame contains millions of pixels. To make the process manageable, models don’t work directly with full-size video files. Instead, they first shrink the video into a simpler, compressed form called a ‘latent space’. It’s like creating a shorthand version of the video where only the essential details are kept: the model compresses the video frames into a mathematical code that captures just the essential features of the data and throws out the rest.
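As a toy illustration of how much that shorthand saves, here is a hypothetical per-frame autoencoder; the layer sizes are arbitrary, and production models use far larger, carefully trained versions.

```python
import torch
import torch.nn as nn

# Toy autoencoder: shrink each frame 8x per spatial dimension, keep only
# the essential features, and expand back later.
class FrameAutoencoder(nn.Module):
    def __init__(self, channels=3, latent_channels=4):
        super().__init__()
        self.encoder = nn.Sequential(  # pixels -> compact code
            nn.Conv2d(channels, 64, 4, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(128, latent_channels, 4, stride=2, padding=1),
        )
        self.decoder = nn.Sequential(  # compact code -> pixels
            nn.ConvTranspose2d(latent_channels, 128, 4, stride=2, padding=1), nn.SiLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.SiLU(),
            nn.ConvTranspose2d(64, channels, 4, stride=2, padding=1),
        )

ae = FrameAutoencoder()
frames = torch.randn(16, 3, 256, 256)  # 16 frames of 256x256 video
latents = ae.encoder(frames)           # (16, 4, 32, 32): ~48x fewer numbers
recon = ae.decoder(latents)            # expanded back to (16, 3, 256, 256)
```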
The AI then does all its editing, noise-removal, and generation work inside that smaller space, which saves time and energy. Once it’s done, the compressed version is expanded back into a normal, visible video and checked against the user’s prompt; if the model judges it a good match, the user gets the final output.
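Putting the pieces together, inference looks roughly like the hedged sketch below: start from pure noise in latent space, walk backward step by step, then decode. `denoiser` and `decoder` are stand-ins for trained networks, and the update is a plain DDIM step, not any particular product’s sampler.

```python
import torch

# Hypothetical generation loop: all the denoising work happens in the
# small latent space; only the final decode touches full-size pixels.
@torch.no_grad()
def generate_video(denoiser, decoder, prompt_emb, latent_shape, num_steps=50):
    # alpha_bar runs from 1 (clean) down toward 0 (pure noise); the schedule
    # stops just short of 0 for numerical stability.
    alpha_bar = torch.cos(torch.linspace(0, 0.98, num_steps + 1) * torch.pi / 2) ** 2
    z = torch.randn(latent_shape)          # start from pure noise in latent space
    for t in reversed(range(num_steps)):   # walk backward toward a clean latent
        eps = denoiser(z, t, prompt_emb)   # predict the noise, guided by the text
        a_t, a_prev = alpha_bar[t + 1], alpha_bar[t]
        z0 = (z - (1 - a_t).sqrt() * eps) / a_t.sqrt()      # guess the clean latent
        z = a_prev.sqrt() * z0 + (1 - a_prev).sqrt() * eps  # deterministic DDIM step
    return decoder(z)                      # expand latents into a visible video
```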
According to MIT Tech Review, DeepMind’s Veo 3 model takes this a step further by handling both video and audio together inside that compressed space. This means it can generate sound effects, speech, and visuals that match perfectly, avoiding problems like lip movements not lining up with dialogue. DeepMind calls this “the end of the silent era of AI video”. Still, even with these improvements, the technology isn’t flawless; videos can sometimes show strange textures, flickering, or awkward character movements that give away their synthetic origins.
Where do AI video generators fail?
Just as AI chatbots have their limits and can hallucinate, video generators are constrained by the data they are trained on. When asked to handle situations they haven’t encountered before, they struggle.
While they can convincingly mimic motions and physics from their training data, they often fail when confronted with new or unusual scenarios, suggesting they rely more on recalling similar examples than understanding universal physical laws. Another less-discussed issue is the trade-off between closed, highly optimized platforms like Sora and Veo, which deliver polished outputs, and open-source alternatives like Open-Sora, which share model weights with the community and encourage experimentation.
Even when video generators manage to perform their job, they can still be problematic for companies and users.
When OpenAI first launched the Sora 2 model, it allowed users to generate content featuring copyrighted characters unless rights holders explicitly opted out, a decision that triggered strong backlash from artists and Hollywood talent agencies.
The Creative Artists Agency (CAA), which represents thousands of actors, directors, and music artists, warned that the policy weakened creative ownership and posed a threat to artists.
In response, OpenAI reversed this stance, promising tighter control over copyrighted material.

Concerns about bias also persist. Since these models learn from large amounts of internet data, they tend to reproduce the same social and cultural biases embedded in that data. This can include showing certain races, genders, or beauty standards more often than others and underrepresenting marginalized groups.
The virtual worlds they create, therefore, mirror and sometimes magnify real-world inequalities.
Beyond the creative and social challenges, AI video generators are also a big drain on the environment. The cost of training and running these systems remains high. Even though diffusion models are more efficient than traditional transformer architectures, generating full-motion video still consumes enormous computing power and energy, making AI video production an expensive and resource-intensive process.
And though the integration of specialized AI chips and optimized software is paving the way for on-device AI video generation, it is still far from perfect.
On-device computing to the rescue
Google’s Veo 3 model exemplifies the shift towards on-device generators, offering users the ability to generate 8-second videos with native audio from text prompts. Accessible through platforms like Gemini and YouTube Shorts, Veo 3 is touted as capable of delivering cinematic-quality results directly on devices.
Similar features are also being rolled out by platforms like Snapchat, which has introduced AI Video Lenses and text-to-video generation tools. These capabilities, powered by Snap’s in-house generative video model, allow users to create dynamic content seamlessly within the app.
Beyond social media, AI video generators are also transforming the advertising industry, prompting both excitement and concern among executives. According to a CNBC report, Mark Read, outgoing CEO of WPP, described AI as “totally disrupting” the business, predicting it will revolutionize access to expertise at low cost, from legal and medical fields to creative marketing.
WPP is already integrating AI into its operations, with 50,000 employees using its AI-powered platform, WPP Open, to streamline tasks like campaign planning and content creation. Similarly, Maurice Levy, chairman emeritus of Publicis Groupe, emphasized that AI accelerates content production and enables unprecedented personalization, though he stressed that AI should remain a tool to augment human work.
As such, with more use cases and increasing adoption, the quality and performance of AI video generators are expected to improve in the coming years.
Where are video generators headed?
The evolution of AI video generators over the past few years illustrates a remarkable trajectory from experimental novelty to practical tools capable of reshaping multiple industries. By 2025, mainstream adoption, exemplified by Netflix’s use of AI to create complex visual effects, demonstrated that these systems could deliver high-quality results at a fraction of the time and cost of traditional production methods.
Yet, despite these advances, the technology remains imperfect. Models still struggle with unfamiliar scenarios, hallucinate content, and reveal artifacts such as flickering or inconsistent motion. Biases embedded in training data can also perpetuate social inequities, while the high computational cost of video generation raises environmental concerns.
On the other hand, the integration of AI into industries like advertising signals its growing economic and creative potential. AI can streamline content creation, personalize experiences at scale, and augment human expertise rather than replace it. Emerging developments in on-device AI generation promise to make these tools more accessible, reducing reliance on cloud infrastructure and expanding creative possibilities to a broader audience.
And, while challenges like disinformation and misinformation remain, with stronger guardrails, safeguards, copyright laws, and regulations, AI video generation is poised to become a foundational technology, redefining how media is produced, consumed, and experienced across both entertainment and commercial sectors.

Bite-Sized Brains
OpenAI buys Sky, bets on desktop UX: The Mac-native AI interface will help OpenAI push beyond browsers.
Elon hints at Tesla “robot army”: Tesla’s Q3 call centered less on earnings, more on AI ambitions.
Microsoft quietly ships Mico Copilot: A lightweight AI helper for quick tasks has just hit Windows.

Roko Pro Tip
💡 AI video is advancing fast. Before using it, ask: Is this saving time, improving quality, or just chasing novelty?

The context to prepare for tomorrow, today.
Memorandum distills the day’s most pressing tech stories into one concise, easy-to-digest bulletin, empowering you to make swift, informed decisions in a rapidly shifting landscape.
Stay current, save time, and enjoy expert insights delivered straight to your inbox.
Streamline your daily routine with the knowledge that helps you maintain a competitive edge.
*This is sponsored content

Monday Poll
🗳️ Should AI-generated video be clearly labeled across all platforms?
Meme Of The Day
this is like the same guy four times
— dax (@thdxr)
5:11 PM • Oct 21, 2025

Rate This Edition
What did you think of today's email?






