The Battle For AI’s Voice
Plus: AI abuse, NYC robotaxis, and developer squeeze.
Here’s what’s on our plate today:
🎙️ ElevenLabs and how voice becomes AI’s real moat.
🧠 Bite-Sized Brains: AI abuse, robotaxis, and App Store power.
💡 Roko’s Prompt: Pressure test your product for the voice era.
📊 Today’s poll: Where voice AI hits your work first.
Let’s dive in. No floaties needed…


The Laboratory
Why ElevenLabs is becoming the poster child for AI’s next competitive era
The discovery or development of a new resource or technology is often followed by a tumultuous period during which enterprises scramble to find ways to extract value from it.
Look at the example of electricity. Although experiments in the 18th and early 19th centuries revealed the basic principles of electric charge and electromagnetism, it would take a long time for these principles to translate into practical products for mass consumption.
Even then, in the early stages, electricity served limited roles, such as lighting, rather than transforming entire industries. Its true economic impact depended on the creation of scalable infrastructure, technical standards, and complementary technologies, including the eventual dominance of alternating current, which enabled long-distance transmission and large interconnected grids.
Today, many view artificial intelligence through a similar lens. If AI is the electricity, companies such as ElevenLabs are building the appliances that unlock its potential for users.
AI’s appliance layer
The idea behind companies working on this enterprise layer is that AI, though powerful in its own right, requires fine-tuning and further development to perform functions that can drive measurable economic growth. That fine-tuning and repurposing of AI models is being driven by companies such as ElevenLabs.
At its core, ElevenLabs’s goal was to develop the first human-like voice model that would transform how humans interact with machines. The company’s origins trace to the childhoods of its founders, Piotr Dabkowski and Mati Staniszewski, in Warsaw. At the time, foreign films on Polish television were voiced by a single male narrator who droned over the original actors in a flat monotone. Every character, every emotion, every dramatic beat, flattened into one bored voice. It was, as Staniszewski later told Sifted, a fundamental breakdown in how content crossed language barriers.
Today, the company has evolved from a small voice-synthesis startup into an $11B business supported by investors including Sequoia, Andreessen Horowitz, ICONIQ, and NVIDIA.
By the end of 2025, it reported more than $330M in annual recurring revenue and adoption inside over 60% of Fortune 500 organizations, while its leadership has signaled clear IPO ambitions. Yet the company’s significance extends beyond its valuation or growth. ElevenLabs illustrates a broader shift in the AI industry, where competitive advantage is increasingly tied to the interface rather than the underlying model.
From model wars to interface wars
For the past three years, the AI industry’s center of gravity has been the foundation model: the massive neural networks trained on oceans of data that power everything from chatbots to code generators. Billions have been spent on compute, talent, and research to develop slightly improved versions of these systems. However, a structural shift is now underway.
As IBM’s Chief AI Architect recently observed, models are becoming commoditized, and the central battle in AI is shifting away from who builds the most powerful one. Competitive advantage is increasingly determined by the systems, interfaces, and workflows that translate AI capabilities into usable human experiences.
ElevenLabs and the rise of voice as AI’s primary interface
ElevenLabs’ growth represents this shift. The company chose not to pursue a general-purpose language model. Instead, it focused on the interface layer for voice, wagering that as AI systems grow more capable, natural and expressive speech would become more valuable than the underlying models.
The company’s growth trajectory suggests that the strategy is working. ElevenLabs expanded from virtually no revenue in 2022 to tens of millions in recurring revenue within a year, then accelerated to roughly $90M by late 2024, about $200M in 2025, and more than $330M entering 2026. Much of this expansion occurred while maintaining profitability, culminating in a $500M Series D round in early February 2026 that tripled the company’s valuation within a year.
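To put that trajectory in numbers, here is a minimal Python sketch that computes the growth multiple between the revenue milestones quoted above. The figures are the rounded values from this piece; treating them as comparable checkpoints is an assumption, since the dates given are only "late 2024", "2025", and "entering 2026". The function name is hypothetical.

```python
# Approximate recurring-revenue milestones quoted in the article, in $M.
# The labels are the article's own wording; exact dates are not given.
milestones = [
    ("late 2024", 90),
    ("2025", 200),
    ("entering 2026", 330),
]


def growth_multiples(points):
    """Return (from_label, to_label, multiple) for each consecutive pair."""
    return [
        (a_label, b_label, b_value / a_value)
        for (a_label, a_value), (b_label, b_value) in zip(points, points[1:])
    ]


for start, end, multiple in growth_multiples(milestones):
    print(f"{start} -> {end}: {multiple:.2f}x")
```

On these rounded figures, revenue grew by a bit more than 2x between the first two checkpoints and by about 1.65x between the last two, so growth stayed fast but the multiple narrowed as the base got larger.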
The company’s continued growth is not just about the development of AI’s ability to generate a human-like voice; it reflects a broader shift in how voice is becoming the primary gateway through which people interact with AI systems.
The company’s prospects also look bright as more enterprises move towards implementing voice solutions in their workflows.
A 2025 Deepgram survey of 400 business leaders reported that 84% intend to increase investment in voice solutions. Further, the AI voice generator segment, valued at about $4.16B today, is projected to grow to $20.71B by 2031, and the broader voice AI agents market is projected to reach $47.5B by 2034.
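For a sense of what the AI voice generator projection implies per year, the sketch below computes the compound annual growth rate (CAGR) between the two figures quoted above. The base year is an assumption: the text only says "today", so the code tries both 2025 and 2026. The function name and constants are illustrative, not taken from any cited report.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.30 means 30% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


START_B = 4.16   # the article's "today" market size, in $B
END_B = 20.71    # the article's 2031 projection, in $B

# Assumption: the article does not state the base year, so try two candidates.
for base_year in (2025, 2026):
    years = 2031 - base_year
    print(f"base {base_year}: {cagr(START_B, END_B, years):.1%} per year")
```

Depending on the assumed base year, the projection works out to somewhere between roughly 31% and 38% growth per year, which is the kind of rate the forecast needs to sustain to hold up.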
For enterprises, these shifts are already translating into operational change. Contact centers, which account for a substantial share of global BPO spending, are emerging as one of the earliest and most heavily disrupted domains.
ElevenLabs’ ElevenAgents platform, for instance, has enabled the deployment of more than 2M conversational voice agents used by organizations such as Deutsche Telekom, Revolut, and Square.
As more enterprises opt to integrate systems developed by companies such as ElevenLabs, the economics are shifting accordingly.
Content localization and dubbing, once costing up to $100 per minute using traditional studio workflows, can now be delivered across dozens of languages at a fraction of that expense. From audiobooks and gaming dialogue to training systems and customer support, advances in voice AI are redefining how businesses produce, scale, and automate spoken interactions.
When synthetic voices collide with law, labor, and trust
However, as the company scales and the industry evolves, the shifting economics have become a point of contention among voice actors, regulators, and companies such as ElevenLabs.
One of ElevenLabs’ most strategically important moves came in late 2025 with the launch of its Iconic Voice Marketplace. The purpose of the platform is to allow brands to license AI-generated versions of well-known celebrity and historical voices, turning recognizable speech patterns into commercial assets. Early examples highlighted the model’s appeal, with public figures and investors using synthetic voices to reach new audiences and languages.
The platform’s ability to turn voice into a tradable media product, similar to stock images or music libraries, has not been well received by professional voice actors, who perceive it as introducing new competitive pressures.
Since synthetic alternatives can reduce demand and compress prices, many are resorting to legal measures to ensure adequate compensation. Some voice performers claim that AI systems were trained on their recordings without permission. Although ElevenLabs emphasizes consent-driven licensing within its marketplace, critics note that such safeguards tend to favor famous voices and established estates, while lesser-known professionals may lack comparable bargaining power.
Beyond copyright, there are also concerns about deepfakes. Soon after ElevenLabs’ early releases, users demonstrated how easily the technology could be misused, generating deepfake audio of public figures and triggering high-profile controversies, including a robocall that mimicked President Biden during the 2024 US election cycle. Independent forensic researchers later concluded that the audio was very likely produced using ElevenLabs’ tools, underscoring the difficulty of detection, even for the company’s own systems.
In response, ElevenLabs implemented stricter safeguards, including enhanced voice verification, watermarking, and AI-driven detection systems. Yet the underlying dilemma remains unchanged: the very improvements that make synthetic voices more realistic and commercially valuable also expand the potential for deception and misuse.
This conflict extends far beyond any single company. It captures a central governance challenge of the generative AI era: technological progress and risk mitigation proceed in parallel rather than sequentially. Because human speech carries an exceptional degree of trust and emotional authority, failures or abuses in voice technologies can produce consequences that are arguably more destabilizing than those associated with text or image generation.
Interface power and the next phase of AI competition
The story of ElevenLabs, at its core, showcases a structural shift in where AI value gets created and captured.
While the foundation model era rewarded scale, compute, and research budgets, the emerging era rewards product intuition, ecosystem design, and interface control.
ElevenLabs did not build the biggest or most versatile AI model; it focused on building the voice people prefer to interact with. In an AI landscape where intelligent systems are becoming ubiquitous, controlling how humans experience and interact with them could ultimately matter more than owning the underlying models.
At the same time, the challenges posed by ElevenLabs’ technology are representative of broader challenges facing the generative AI industry. The company thus reflects, all at once, the evolution of the market, the shifting legal landscape, and the potential of generative AI to transform how humans interact with machines.


Tuesday Poll
🗳️ How fast do you think AI voice will replace most human-led customer interactions?


Prompt Of The Day
Pick one customer touchpoint (support, sales, onboarding, etc.) and redesign it as a voice-first flow using AI agents. In bullets, define: what the agent says, when it hands off to a human, and the one metric you’d track to decide if voice stays or dies.

Bite-Sized Brains
Chatbots are turbocharging stalkers: Futurism details how ChatGPT-style bots are reinforcing users’ delusions, helping them justify harassment, doxxing, and stalking targets instead of challenging dangerous behavior.
New York freezes robotaxi rollout: New York regulators have paused robotaxi expansion, citing safety risks and labor concerns, forcing autonomous-vehicle companies into a slower, tightly supervised pilot phase.
App Store ‘top charts’ are a mirage: A Verge analysis finds many top apps are ad-stuffed, data-hungry reskins dominated by a handful of big players, leaving almost no oxygen for smaller, independent developers.





