The End Of Unlimited AI

Plus: surveillance marinara, Meta worker tracking, and the SpaceX-Cursor deal.

Here’s what’s on our plate today:

• 🧪 Why AI labs are killing unlimited pricing.
• 📰 Prego's creepy mics, Meta's keystroke harvest, and SpaceX-Cursor.
• 🛠️ Three tools worth trying: Dust, Krea, Lavender.
• 🗳️ Poll: Is metered AI fair or a budget killer?

Let’s dive in. No floaties needed…

In partnership with

Ship Docs Your Team Is Actually Proud Of

Mintlify helps you create fast, beautiful docs that developers actually enjoy using. Write in markdown, sync with your repo, and deploy in minutes. Built-in components handle search, navigation, API references, and interactive examples out of the box, so you can focus on clear content instead of custom infrastructure.

Automatic versioning, analytics, and AI-powered search make it easy to scale as your product grows. AI-powered workflows run on every pull request, keeping your docs accurate automatically.

Whether you're a developer, a technical writer, part of a devrel team, or something in between, Mintlify fits into the way you already work and helps your documentation keep pace with your product.

*This is sponsored content

The Laboratory

TL;DR

Flat fees are dying across AI: Anthropic, OpenAI, Windsurf, and GitHub have all moved away from unlimited pricing toward token or usage-based billing, signaling an industry-wide structural shift rather than a one-off decision.

Compute demand is outrunning supply: Agentic tools consume orders of magnitude more tokens than chat, GPU lead times stretch past a year, and spot prices for NVIDIA Blackwell chips rose 48%, making flat pricing mathematically unsustainable.

IPO math is accelerating the change: Anthropic’s annualized revenue hit $30B, but gross margins fell to 40% as inference costs surged, and usage-based billing lets revenue scale with compute costs instead of against them.

Enterprises face a new budgeting problem: AI is shifting from a fixed line item to a variable cost that grows with adoption, but the FinOps-style tooling needed to manage it is still in its infancy, leaving buyers caught between unpredictable spend and pressure to embed AI into core operations.

Why AI labs are killing unlimited pricing

The true strength of any business lies in its ability to generate meaningful returns for its investors, workers, and owners, and that ultimately comes down to how it manages the product’s costs and prices. Over the past few years, much of the talk around artificial intelligence has focused on what the technology can help its developers and users achieve, and, to a certain extent, on its impact on broader socio-economic structures.

Now, in 2026, the conversation is shifting toward what it costs to develop and run the technology, and how the companies that build and run it will manage those costs and prices.

From flat subscriptions to metered intelligence

Recently, Anthropic gave the world a glimpse of the pricing structure it, and others in the industry, believe will work for both developers and consumers, and it is a big departure from what many assumed relying on AI for automation would cost.

In April 2026, Anthropic restructured its enterprise pricing for Claude, replacing flat per-seat subscriptions of up to $200 per user per month with a lower base fee of roughly $20 per seat plus billing tied to actual token consumption. The change, first reported by The Information, applies across Claude, Claude Code, and Cowork deployments for business customers. To bring everyone onto the same cost structure, organizations on older plans with fixed usage allowances must migrate at their next contract renewal or lose the terms they have.
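For a rough sense of how the math works out for a buyer, here is a quick back-of-the-envelope sketch in Python. The $200 flat fee and roughly $20 base fee come from the reporting above; the per-token rate is a made-up placeholder, since Anthropic’s actual metered rates vary by model and deployment.

# Back-of-the-envelope comparison of the old flat seat price and the new
# base-fee-plus-usage structure. The $200 and $20 figures come from the
# article; the per-token rate below is purely hypothetical.
OLD_FLAT_FEE = 200.00            # USD per seat per month (old plan)
NEW_BASE_FEE = 20.00             # USD per seat per month (new plan)
HYPOTHETICAL_RATE_PER_M = 6.00   # USD per million tokens (illustrative only)

def metered_cost(tokens_per_month: int) -> float:
    """Monthly cost of one seat under the new usage-based structure."""
    return NEW_BASE_FEE + tokens_per_month / 1_000_000 * HYPOTHETICAL_RATE_PER_M

# How many tokens a seat can burn before the metered plan overtakes the old flat fee.
break_even = (OLD_FLAT_FEE - NEW_BASE_FEE) / HYPOTHETICAL_RATE_PER_M * 1_000_000

for tokens in (1_000_000, 10_000_000, 50_000_000):
    print(f"{tokens:>12,} tokens/month -> ${metered_cost(tokens):,.2f}")
print(f"Break-even at roughly {break_even:,.0f} tokens per seat per month")

Under these toy numbers, a light user pays far less than before, while the break-even lands around 30 million tokens per seat per month; real rates would move that threshold, but the shape of the trade-off is the same.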

The end of unlimited AI

What makes this shift important is that Anthropic is not alone; just days before it announced the new structure, OpenAI shifted its Codex coding tool from per-message pricing to token-based metering for business customers. Others within the industry have been doing something similar: Windsurf replaced its credit system with daily and weekly usage quotas in March, and GitHub tightened Copilot’s premium request limits in February.

The message across the industry is clear: the flat-fee subscription model for compute-intensive workloads is being dismantled, and three forces are converging to make this shift stick.

When compute becomes the bottleneck

The first major factor driving the disappearance of unlimited AI subscriptions is that the cost of serving heavy users now exceeds what flat fees can cover. When a software team runs Claude Code or Codex autonomously across entire code repositories, a single debugging session can consume hundreds of thousands of tokens, well beyond what a monthly subscription was designed to support. Agentic tools, software that operates autonomously on behalf of users across multi-step tasks, consume orders of magnitude more compute than a standard chat interaction. The effect is visible in the widening gap between the rising cost of serving these tools and the flat prices charged for them.

Growth that breaks the business model

Anthropic’s Claude Code reached $2.5B in annualized revenue by February 2026, just nine months after its public launch. That growth created a direct tension: the product driving the most revenue was also the product eroding margins fastest under flat pricing. To address this problem, Anthropic has already tightened session limits for Pro and Max users during weekday peak hours.

However, the problem extends beyond heavy users: the supply chain is struggling to keep up with demand for compute infrastructure. So while AI consumption continues to rise, the supporting infrastructure, the data centers themselves, is taking longer than anticipated to come online.

Gartner forecasts worldwide AI spending at $2.52T in 2026, a 44% year-over-year increase, with AI-optimized server spending alone up 49%. But lead times for data center GPUs now run 36 to 52 weeks, high-bandwidth memory suppliers have redirected capacity away from consumer products, and TSMC’s advanced fabrication nodes are fully booked through 2026. The Wall Street Journal reported that spot prices for NVIDIA’s latest Blackwell GPUs rose 48%, and that OpenAI’s API token usage jumped from 6B per minute in October to 15B per minute by the end of March.

OpenAI CFO Sarah Friar told the Journal she spends much of her time hunting for near-term compute capacity and that the company is making difficult decisions about which projects to shelve.

The disruption is structural and will persist because strengthening the underlying infrastructure takes time. New semiconductor plants take years and billions of dollars to build. And it does not stop there: the constraint extends beyond chips to electrical power, cooling infrastructure, and skilled labor. When demand grows exponentially and supply grows linearly, pricing becomes the demand-management mechanism.

For AI labs, it does not make sense to slow down customer acquisition and wait for infrastructure to catch up, especially given that most are working toward their IPOs. This brings us to the investor side of the problem and why the new cost structures have become more appealing for the industry.

Why IPO math favors usage-based pricing

For AI labs approaching their IPOs, consumption-based revenue presents a cleaner financial story than flat subscriptions with unpredictable margins.

Anthropic closed a $30B Series G in February 2026 at a $380B valuation and is reportedly evaluating an IPO as early as October 2026. Its annualized revenue crossed $30B in early April, up from $9B at the end of 2025. Still, the company plans to spend roughly $19B on training and inference in 2026, approximately matching its revenue. Gross margins reportedly fell to 40% as inference costs surged beyond projections.

Usage-based billing solves the core problem by tying revenue directly to consumption. The more tokens a customer uses, the more they pay, which means the company’s income rises alongside the compute and infrastructure costs required to serve that usage. That removes the imbalance you get with flat subscriptions, where a small group of heavy users can drive up costs without contributing additional revenue. Under a usage model, those same users become proportionally more valuable instead of a financial drag.

In effect, it turns scaling from a risk into an advantage: higher usage no longer erodes margins; it supports them.
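A toy calculation makes the point. All of the figures below are hypothetical, not any vendor’s real serving costs or prices; the sketch simply shows how gross margin on a single heavy seat collapses under a flat fee but holds steady under metered billing.

# Toy numbers only: hypothetical inference cost and prices, not real figures.
SERVE_COST_PER_M = 3.00      # USD to serve one million tokens (hypothetical)
FLAT_PRICE = 200.00          # flat monthly subscription
METERED_PRICE_PER_M = 6.00   # metered price per million tokens (hypothetical)

def gross_margin(revenue: float, cost: float) -> float:
    return (revenue - cost) / revenue

for tokens_m in (5, 20, 60, 100):            # millions of tokens used in a month
    cost = tokens_m * SERVE_COST_PER_M
    flat = gross_margin(FLAT_PRICE, cost)
    metered = gross_margin(tokens_m * METERED_PRICE_PER_M, cost)
    print(f"{tokens_m:>4}M tokens -> flat margin {flat:+.0%}, metered margin {metered:+.0%}")

With these placeholder numbers, the flat-fee margin swings from strongly positive to negative as usage grows, while the metered margin stays constant, which is exactly the property an IPO-bound company wants to show investors.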

OpenAI faces a similar dynamic: its Codex usage has grown sixfold since January 2026, with more than 2M developers using the tool weekly. The shift to token-based billing coincided with a base seat price reduction from $25 to $20, a structure designed to lower the barrier to adoption while ensuring heavy compute usage is paid for at the point of consumption.

This is not the first time the technology industry has reworked how it prices its products, and cloud computing is one of the clearest precedents.

In its early years, Amazon Web Services relied on simple pricing tiers because most usage was still experimental. As companies moved into production, usage became harder to predict, pushing the industry toward reserved instances, committed-use discounts, and eventually a discipline like FinOps to manage spending. AI now appears to be following the same path, only at a much faster pace.

AI becomes a variable cost for enterprises

However, while the shift in pricing strategy benefits AI labs, it creates problems for enterprise buyers. When AI was still a pilot project confined to a sandbox with a handful of curious employees, flat pricing felt manageable because usage was low and relatively predictable. But as companies begin embedding AI into core business functions such as code generation, customer support, document analysis, or financial modeling, usage patterns become far more uneven, varying significantly across teams and individuals.

What makes the situation tricky for enterprises is that the enterprise software industry is also experimenting with different approaches. Salesforce tried per-conversation pricing for its Agentforce AI agents but reversed course after customers demanded predictability, with CEO Marc Benioff acknowledging the pushback. Salesforce then introduced the Agentic Enterprise License Agreement, a flat-fee, multi-year commitment for unlimited AI agent usage.

ServiceNow took another approach, embedding AI across its entire product line at no additional charge while introducing consumption-based ‘Assist Packs’ for heavier workloads. No single model has won, but the old binary of flat subscription or pay-per-use is clearly giving way to hybrid structures.

For companies integrating AI into core operations, this pricing shift has consequences beyond the invoice. Analysis from NPI Financial suggests total costs could rise, as lower seat fees fail to offset the loss of API discounts and the absence of volume pricing. At the same time, Anthropic now expects customers to commit to estimated usage upfront.

Reliability adds another layer of friction: as David Hsu pointed out, outages give customers a reason to switch providers, highlighting the gap between usage-based pricing and enterprise-grade availability expectations. With significant customer overlap between OpenAI and Anthropic, many enterprises are already adopting dual-vendor strategies, using competition between providers to manage costs and risk.

The question enterprises are left with

Whether AI pricing follows the cloud computing playbook depends on how quickly companies learn to manage their AI spend. Some of that tooling is already taking shape, from model-routing systems that send tasks to the cheapest adequate model, to usage monitoring, prompt caching, and shifting routine work to lighter models while reserving advanced models for complex tasks.
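As a flavor of what that tooling looks like, here is a minimal sketch of “cheapest adequate model” routing. The model names, prices, and capability scores are hypothetical placeholders; a production router would also fold in prompt caching, fallbacks, and per-team budgets.

# Minimal sketch of cheapest-adequate-model routing. Catalog entries are
# invented for illustration, not real model names or prices.
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    price_per_m_tokens: float   # USD per million tokens (hypothetical)
    capability: int             # rough quality score, higher = more capable

CATALOG = [
    Model("small-fast", 0.50, 3),
    Model("mid-tier", 3.00, 6),
    Model("frontier", 15.00, 9),
]

def route(required_capability: int) -> Model:
    """Pick the cheapest model whose capability meets the task's requirement."""
    adequate = [m for m in CATALOG if m.capability >= required_capability]
    if not adequate:
        raise ValueError("No model in the catalog can handle this task")
    return min(adequate, key=lambda m: m.price_per_m_tokens)

print(route(required_capability=2).name)   # routine summarization -> small-fast
print(route(required_capability=8).name)   # complex refactor -> frontier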

But while cloud FinOps has taken years to mature, AI adoption is scaling much faster, and the tools to control costs are still in their early stages. For enterprise leaders, AI is no longer a fixed line item. It is a variable cost that grows with adoption, even as the systems to manage it are still catching up.

Quick Bits, No Fluff

Prego goes full dystopia: The pasta sauce brand is launching small microphones that listen in on family conversations, because apparently, nothing says marinara like surveillance.
Meta's keystroke harvest: Meta will track its workers' clicks and keystrokes to train AI, turning every internal task into training data.
SpaceX eyes Cursor: SpaceX is collaborating with Cursor and holds an option to acquire the AI coding startup for $60B.

Launch fast. Design beautifully. Build your company's website on Framer

Framer helps teams design, build, and launch their marketing sites lightning fast. With the ability to publish hundreds of CMS pages in a single click, operate at a global scale with seamless localization, and even host unified content across multiple domains, teams have never been able to ship faster. Trusted by companies like Miro, Bilt, and Perplexity.

*This is sponsored content

Thursday Poll

🗳️ Unlimited AI is dying. What's your take?


3 Things Worth Trying

OpenRouter: Single API that routes your prompts to the cheapest adequate model, perfect for the usage-billing era.
Helicone: Observability layer for LLM calls that tracks token spend per user, feature, and prompt in real time.
Portkey: AI gateway with prompt caching, fallbacks, and cost controls built for teams trying to keep their AI bill from exploding.

The Toolkit

Dust: No-code platform for building custom AI agents that connect to your company's tools and data, so teams can automate workflows without engineering.
Krea: Real-time AI image and video generator with a creative-first interface, great for designers who want to actually steer the output instead of fighting prompts.
Lavender: AI sales email coach that scores your drafts in real time and suggests rewrites to lift reply rates.

Rate This Edition

What did you think of today's email?
