The Compute Bill Cracks

Plus: AI is making us dumber, Tim Cook's heir apparent, and NFT mascots return.

Here’s what’s on our plate today:

• 🧪 ZYPHRA's $1B bet against NVIDIA-scale AI.
• 📰 Chatbot brain rot, Apple's next CEO, and weirdo revival.
• 🧠 Brain Snack: co-design beats brute force every time.
• 🗳️ Poll: scale, efficiency, or neither wins?

Let’s dive in. No floaties needed…

Launch fast. Design beautifully. Build your company's website on Framer

Framer helps teams design, build, and launch their marketing sites lightning fast. With the ability to publish hundreds of CMS pages in a single click, operate at a global scale with seamless localization, and even host unified content across multiple domains, teams have never been able to ship faster. Trusted by companies like Miro, Bilt, and Perplexity.

*This is sponsored content

The Laboratory

TL;DR

• $1B valuation, 62 staff: ZYPHRA closed a Series A in June 2025 at a $1B valuation with a headcount that barely registers next to OpenAI or Anthropic.
• The AMD bet: ZAYA1 is the first frontier-scale Mixture-of-Experts model trained entirely on AMD silicon, using MI300X chips and IBM Cloud. With 760M active parameters, it reportedly matches Qwen3-4B and beats Llama-3-8B.
• Co-design, not brute force: By engineering the model around the hardware, ZYPHRA achieves checkpoint speeds up to 10x faster than NVIDIA setups, cutting training costs and failure risk.
• Economics will decide it: OpenAI is projected to burn $85B by 2028, and Anthropic’s inference costs already exceed half of revenue. If compute keeps pace with demand, scale wins. If not, efficiency stops being a curiosity and becomes the only path.

Decoding ZYPHRA’s bet that AI can be built without hyperscaler capital

The idea that better AI comes from bigger models and more compute stems from scaling laws identified by OpenAI, DeepMind, and Google, which showed that performance improves predictably as model size, data, and compute increase. This turned AI progress into a largely engineering-driven process, where systems like GPT-3 improved simply by being scaled up, reinforcing the idea that more compute is the primary path forward.
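Those scaling laws take a simple power-law form. The sketch below uses the parameter-scaling constants from Kaplan et al. (2020) purely for illustration; they describe the general trend, not any specific model discussed here.

```python
# Kaplan et al. (2020) parameter scaling law: loss(N) ~ (N_c / N) ** alpha_N
# Constants are illustrative of the published fit, not of ZAYA1 or GPT-3.
N_C = 8.8e13      # critical parameter count from the paper's fit
ALPHA_N = 0.076   # fitted exponent for parameter scaling

def loss(n_params: float) -> float:
    """Predicted cross-entropy loss as a function of model size."""
    return (N_C / n_params) ** ALPHA_N

# Doubling parameters yields a small but predictable loss reduction --
# the regularity that made "just scale it up" a viable strategy.
for n in (1e9, 2e9, 4e9):
    print(f"{n:.0e} params -> predicted loss {loss(n):.3f}")
```

The shallow exponent is the whole story: each halving of loss demands vastly more parameters, which is why the strategy is so capital-hungry.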

However, that view of the technology tells only one part of the story of artificial intelligence; the other half comes from approaches that focus on efficiency rather than size. Companies such as Meta and Mistral AI are building smaller, more competitive models, while others, like Anthropic, emphasize smarter use of compute at inference time. Alongside this, modular systems and better data strategies suggest that future gains come less from scaling alone and more from the intelligent design and deployment of systems.

Betting on that second view, one company has secured a $1B valuation with a staff of just 62 people, a rounding error next to OpenAI’s and Anthropic’s headcounts.

A small company in a big-capital market

The company, ZYPHRA, was founded in 2021 by Krithik Puthalath, Beren Millidge, Tomas Figliolia, and Danny Martinelli. In June 2025, the company closed a Series A round that valued it at $1B, according to its own announcement and the joint press release from IBM and AMD that followed in October.

The company describes itself as a ‘full-stack, open-source superintelligence company.’ That phrase covers three products: a family of foundation models (Zamba, Zonos, ZR1, ZAYA1, and ZUNA), an inference platform called ZYPHRA Inference Cloud, and an enterprise agent called Maia.

The unusual part is how it links these products: models are designed around the hardware they will run on, the hardware is chosen for its economics rather than its brand, and the agent sits on top of models ZYPHRA trained itself.

The company is a test case for this approach, which became visible with the release of ZAYA1. ZYPHRA announced the model on 24 November 2025 and published its technical report on arXiv, positioning it as a direct challenge to the assumption that scale alone drives capability.

ZAYA1 has 8.3B total parameters, but only 760M are active at any given time, using a Mixture-of-Experts design that routes each query through a small set of specialized subnetworks rather than the full model. By the company’s own benchmarks, it matches Qwen3-4B and Gemma 3 12B on reasoning, math, and coding tasks, and outperforms Llama-3-8B. The comparison that ultimately matters, though, is not the scores, but the compute bill.
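To make the "active vs. total parameters" distinction concrete, here is a minimal sketch of how Mixture-of-Experts routing works in general. All sizes and names are illustrative; this is not ZYPHRA's implementation, and real MoE layers use learned routers trained jointly with the experts.

```python
import math
import random

random.seed(0)

# Illustrative sizes, far smaller than ZAYA1's real configuration.
d_model, n_experts, top_k = 8, 4, 2

def rand_matrix(rows, cols):
    return [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

# Each "expert" is a small subnetwork; here, just one weight matrix.
experts = [rand_matrix(d_model, d_model) for _ in range(n_experts)]
router = rand_matrix(n_experts, d_model)

def moe_layer(x):
    """Route one token through only top_k of n_experts subnetworks."""
    logits = matvec(router, x)                            # score each expert
    top = sorted(range(n_experts), key=lambda i: logits[i])[-top_k:]
    z = [math.exp(logits[i]) for i in top]
    weights = [v / sum(z) for v in z]                     # softmax over chosen experts
    # Only top_k experts execute, so active parameters << total parameters.
    out = [0.0] * d_model
    for w, i in zip(weights, top):
        for j, val in enumerate(matvec(experts[i], x)):
            out[j] += w * val
    return out

token = [random.gauss(0, 1) for _ in range(d_model)]
print(len(moe_layer(token)))
```

Here only 2 of 4 experts run per token, i.e. half the expert parameters are active; ZAYA1's ratio is far more aggressive, with roughly 760M of 8.3B parameters active per query.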

What it means to co-design

ZAYA1 was the first frontier-scale Mixture-of-Experts model trained entirely on AMD silicon. Every other leading model you have heard of, from GPT-5 to Claude to Gemini, has been trained primarily on NVIDIA GPUs, with Google being the partial exception through its in-house TPU program. NVIDIA’s dominance in the AI training market is structural: its CUDA software stack has a decade-long head start, and switching costs are high. Most AI companies have accepted this lock-in as a cost of doing business.

By opting for AMD hardware rather than the more conventional NVIDIA stack, ZYPHRA took a different route to train ZAYA1. According to an AMD technical blog post, the model was trained on a cluster of 128 compute nodes, each equipped with 8 AMD Instinct MI300X chips and 8 Pensando Pollara 400 NICs, with infrastructure provided by IBM Cloud.

The MI300X offers 192GB of high-bandwidth memory per chip, roughly 2.4 times the 80GB of NVIDIA’s H100. That additional memory allowed ZYPHRA to rely on a simpler parallelism strategy, avoiding some of the more complex engineering typically required in NVIDIA-based training setups.
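A back-of-envelope calculation shows why the extra memory simplifies parallelism. The byte counts below are generic assumptions (bf16 weights plus fp32 Adam optimizer state), not figures from ZYPHRA's report.

```python
# Rough training-memory estimate for an 8.3B-parameter model.
# Assumptions (illustrative, not ZYPHRA's actual setup):
#   - bf16 weights: 2 bytes per parameter
#   - Adam optimizer state in fp32: ~12 bytes per parameter extra
params = 8.3e9
bytes_per_param = 2 + 12
total_gb = params * bytes_per_param / 1e9

mi300x_gb, h100_gb = 192, 80
print(f"training state: ~{total_gb:.0f} GB")
print(f"MI300X chips to hold it: {total_gb / mi300x_gb:.1f}")
print(f"H100 chips to hold it:   {total_gb / h100_gb:.1f}")
```

Under these assumptions the full training state fits on a single MI300X but must be sharded across two H100s, and every such split adds communication code, synchronization points, and failure modes.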

ZYPHRA achieved this by engineering its model around the hardware, with its research describing architectural choices specifically designed to fit AMD silicon: Compressed Convolutional Attention (to reduce the memory burden of long-context reasoning), a more expressive routing system for the Mixture-of-Experts layer, and lightweight residual scaling.

These choices let ZYPHRA save checkpoints up to 10 times faster than comparable NVIDIA-based setups, reducing both training costs and the risk of failure during long runs. The broader aim of this design is to achieve comparable model quality using far less compute. If that holds at a larger scale, it materially shifts the economics of frontier AI.

The case for massive compute isn’t just industry hype, though. Leaders like Dario Amodei and Sam Altman argue that billion-dollar training runs produce fundamentally more capable models, making scale the most reliable path to advanced AI. So far, the financials support that view: Anthropic has crossed $30B in annualized revenue, while OpenAI is at around $24B and pursuing a massive IPO, with both companies spending heavily on compute in the belief that superior models will secure long-term demand.

For hyperscalers like Microsoft, Google, Amazon, and Meta, this strategy goes beyond building models; it underpins their entire cloud businesses, enabling them to profit from AI workloads regardless of who trains the models. That dual incentive has driven massive infrastructure spending, reinforcing the idea that if scale truly determines capability, smaller, efficiency-focused players like ZYPHRA may struggle to compete.

The efficiency thesis, in practice

The counter-argument is that compute is not free, and that the companies currently winning on compute are losing money at a rate that eventually has to stop. OpenAI’s internal projections, as reviewed by the Wall Street Journal, show the company burning $85B in cash in 2028 and not reaching breakeven until after 2030. Anthropic’s situation is structurally similar, with inference costs (the cost of actually running the models for customers) already exceeding half of revenue.

If these costs do not fall, the economics behind these companies begin to crack; if they do fall, the efficient companies get the benefit too, and arguably more of it.

This is precisely the bet companies like ZYPHRA are making by prioritizing efficiency over brute-force scale. ZYPHRA has raised about $111M, including backing from Jaan Tallinn and firms like Intel Capital, to build an alternative-architecture research pipeline and an inference platform serving models such as DeepSeek, Kimi, and Qwen. Rather than trying to outspend OpenAI, it is betting it can reach comparable capability with a fraction of the compute budget.

However, before reaching a conclusion on which side will write the future story of AI, it must be remembered that both theses remain unproven at the scale that matters.

ZYPHRA has shown its efficiency approach works at 8.3B total parameters; whether it holds at frontier scale is the open question.

Brain Snack (for Builders)

Co-design beats brute force. If you’re picking hardware last, you’re already paying too much.

Outperform the competition.

Business is hard. And sometimes you don’t have the tools you need to be great at your job. Well, Open Source CEO is here to change that.

  • Tools & resources: playbooks, databases, courses, and more.

  • Deep dives on famous visionary leaders.

  • Interviews with entrepreneurs and playbook breakdowns.

Are you ready to see what it’s all about?

*This is sponsored content

Quick Bits, No Fluff

  • AI chatbot brain rot: BBC Future reports that heavy use of chatbots is measurably weakening users' critical thinking and memory.

  • Apple's next act: TechCrunch profiles John Ternus, the hardware chief reportedly tapped to succeed Tim Cook as Apple CEO.

  • The weirdo renaissance: The Verge rounds up the internet's latest crop of NFT-metaverse-AI mascots making a chaotic comeback.

Wednesday Poll

🗳️ ZYPHRA's efficiency bet: which future plays out?

Login or Subscribe to participate in polls.

Meme Of The Day

The Toolkit

  • Deepgram: Speech-to-text API built for scale, handling real-time transcription and voice intelligence for production apps.

  • Descript: AI-powered audio and video editor that lets you edit recordings by editing the transcript like a doc.

  • Drift.ai: AI-powered conversational marketing platform that turns website visitors into qualified pipeline through automated chat.

Rate This Edition

What did you think of today's email?

Login or Subscribe to participate in polls.