AI Matures, Excuses Expire

Plus: Pichai on labor, AI glasses confusion, and ElevenLabs’ pivot.

Here’s what’s on our plate today:

  • 🧪 2025: AI shifts from hype to boring infrastructure.

  • 📰 Headlines: Pichai on AI labor, glasses confusion, and ElevenLabs’ pivot.

  • 📊 Poll: Where does your company really sit on the AI curve?

  • 🛠️ Weekend To-Do: kill vanity pilots, crown real workflows.

Let’s dive in. No floaties needed…

Outperform the competition.

Business is hard. And sometimes you don’t really have the tools you need to be great at your job. Well, Open Source CEO is here to change that.

  • Tools & resources, ranging from playbooks, databases, courses, and more.

  • Deep dives on famous visionary leaders.

  • Interviews with entrepreneurs and playbook breakdowns.

Are you ready to see what it’s all about?

*This is sponsored content

The Laboratory

The state of AI in 2025: breakthroughs that defined the year

Google released the Nano Banana Pro, a higher-fidelity Gemini 3 image model. Photo Credit: Google.

Every invention sets out to solve a problem and make life more efficient and comfortable. Even when an invention succeeds, people still need time to adjust and adapt to it.

Sometimes that happens within a couple of years; in other cases, it can take decades, maybe more.

During this phase, while a technology is still new and people have yet to adapt, it often goes through numerous changes so that its users can make the most of it. This is precisely what happened to artificial intelligence throughout 2025.

Throughout 2023 and 2024, AI went from a novel technology to one that could shift business models. Regulators scrutinized it even as it progressed from new frontier models to hardware breakthroughs and into real commercial use cases.

This trend of maturation continued in 2025, as the technology moved away from promise and into practice.

The core building blocks of AI, such as transformers, attention mechanisms, and reinforcement learning, stayed largely the same. What evolved was the way researchers combined these ingredients and trained models to behave in new and more useful ways.

The year also showed that AI was maturing inside enterprises. Google noted that companies were shifting from experimentation to optimization, focusing less on whether to adopt AI and more on how to extract real value from it.

This movement from pilot projects to full production, when successful, became one of the most meaningful business developments of the year, even as many organizations still struggled with basic implementation.

Driving this maturity were some key developments that shaped AI in 2025.

Deep Think in Gemini 3 achieves leading results on several advanced AI benchmarks. Photo Credit: Google.

The biggest breakthrough of 2025 was the rise of reasoning models, which work through problems step by step before committing to an answer instead of replying in a single pass.

OpenAI’s o1 series led this shift, using reinforcement learning and extra compute at inference time to dramatically improve accuracy.

On the AIME exam, o1 solved 83% of problems compared with GPT-4o’s 13%. It also reached the 89th percentile on Codeforces and showed PhD-level performance on science tasks.

Google followed with Gemini 2.5, which delivered state-of-the-art results across reasoning benchmarks. Its Deep Think system even achieved gold-medal performance at the 2025 International Mathematical Olympiad, solving five out of six problems entirely in natural language.

These gains quickly moved into real-world use. Companies used reasoning models for legal analysis, complex debugging, and multi-step research.

Microsoft reported major accuracy improvements after integrating o1 into Azure, and legal AI firm Harvey said the model could independently plan and complete difficult tasks.

Beyond reasoning, another development captured public imagination and redefined how users view AI.

Agents that actually use computers

OpenAI CEO Sam Altman unveiled AgentKit, a new toolkit for building and deploying AI agents. Photo Credit: TechCrunch.

Computer use became one of the most important AI advancements of 2025. After debuting in beta with Claude 3.5 Sonnet in late 2024, the technology matured quickly.

Claude Sonnet 4.5, released in September 2025, showed huge gains, scoring 77.2% on SWE-bench Verified and sustaining complex work for more than 30 hours. It also reached 61.4% on OSWorld, a major jump from the 42.2% achieved only months earlier.

Anthropic pushed further with Claude Opus 4.5, calling it its strongest model for coding, agents, and computer use. The company also turned computer use into practical products. Claude for Chrome allowed the AI to operate across browser tabs, while Claude for Excel let it read and edit spreadsheets directly.

Early testers said Opus handled ambiguity and complex decision-making far better than earlier versions.

Google advanced the field with SIMA 2, which uses Gemini 2.5 Flash-Lite to move beyond simple instruction following. It doubled the performance of SIMA 1 and can reportedly navigate new environments, learn behaviors through trial and error, and even follow emoji-based instructions.

Businesses quickly adopted these tools to automate workflows, test software, and support open-ended research.

Companies such as Replit, Asana, Canva, Cognition, DoorDash, and The Browser Company explored tasks requiring long, multi-step sequences that were previously difficult to automate.

Away from the improved capabilities front, AI also received validation from the scientific community for assisting researchers in their work.

Scientific AI’s validation moment

By late 2025, AlphaFold 3 had predicted over 240 million protein structures with far greater accuracy. Photo Credit: Nature.

The 2024 Nobel Prize in Chemistry, awarded in part for AlphaFold, confirmed AI’s transformative impact on science. Five years after AlphaFold 2, the system had become a standard research tool. Before its arrival, scientists had determined only about 180,000 protein structures in sixty years.

By late 2025, AI had predicted more than 240 million, covering every human protein and those tied to major diseases. AlphaFold 3 expanded these capabilities, predicting complex interactions involving DNA, RNA, ligands, and ions with more than 50% better accuracy than existing methods.

The technology drove real breakthroughs. DeepMind and Google Research worked with Yale on a model that uncovered a potential cancer therapy pathway.

DeepMind also applied AI to fusion research and genetic analysis. By 2025, more than 3 million researchers across 190 countries relied on AlphaFold. As Fortune put it, biochemists have already found AI’s killer app.

Another important development in 2025 was the consolidation of multimodal abilities.

File creation and multimodal expansion

AI moved well beyond text in 2025. Anthropic’s file creation tools allowed Claude to generate spreadsheets, presentations, PDFs, and working code inside private environments.

Teams could upload raw data and receive cleaned analysis, charts, and written insights automatically.

Google advanced multimodal capabilities with Nano Banana Pro for high-quality image generation and Veo 3.1 for filmmaking features.

Audio improved too with Google’s Live API, enabling natural dialogue that responded to tone, emotion, and context. These developments showed that AI’s future lies in systems that handle text, images, audio, and files as fluidly as people do.

But while AI was charming the world and grabbing headlines with advances at lightning speed, businesses had a hard time adapting to the technology.

The persisting deployment gap

An MIT study in 2025 found that 95% of AI pilots delivered no measurable financial impact. Photo Credit: National Review.

Despite technical progress, enterprise adoption lagged. MIT’s 2025 study found that 95% of AI pilots produced no measurable financial impact, and only 5% of custom tools reached production.

Companies spent billions but struggled because generic AI tools did not adapt to internal workflows, and budgets went to customer-facing features rather than back-office automation, where ROI was highest.

However, amidst the confusion, successful patterns also emerged.

Buying specialized tools worked twice as often as building them internally, and mid-sized companies that deployed in about 90 days performed better than large firms that took nine months.

The automotive sector mirrored this trend, with Gartner predicting that only 5% of automakers would sustain strong AI investment by 2029. Yet examples like Banco BV and Deloitte showed AI’s potential when tightly integrated into workflows and supported by strong vendor partnerships.

Another trend emerged in 2025: the growing importance of benchmarking. Because older benchmarks could no longer distinguish between the capabilities of AI models, new ones had to be developed.

The benchmark explosion

Benchmarks surged in importance in 2025 as Stanford reported dramatic gains across tasks like GPQA, MMMU, and SWE-bench. But rapid improvement created an arms race. Older benchmarks, including MATH and GSM8K, reached ceiling performance, and newer tests such as AIME and Humanity’s Last Exam quickly followed.

Global competition intensified. China narrowed the performance gap with the United States, while regions like the Middle East and Southeast Asia produced notable models. Yet critics questioned whether benchmarks reflected genuine reasoning.

Apple researchers suggested that models might be repeating learned patterns rather than truly reasoning, highlighting the gap between test performance and real-world capability.

The road to real impact

Artificial intelligence in 2025 proved that the technology has moved far beyond spectacle. The year showed that progress is no longer defined by single breakthroughs but by steady integration into products, workflows, and scientific discovery.

Reasoning models, computer-using agents, multimodal systems, and scientific AI all demonstrated what happens when innovation compounds rather than explodes. The foundations of AI did not change, yet the capabilities built on top of them expanded rapidly, reshaping expectations for businesses, researchers, and consumers.

But the year also made one lesson unmistakably clear: maturity brings friction. Enterprises struggled to turn technical leaps into operational value. Benchmarks soared while deployment rates stagnated. Companies learned that adopting AI is not the same as benefiting from it. The journey from pilot to production remains the industry’s most persistent bottleneck.

The advancement and availability of a technology do not necessarily mean it will start having an impact on the day-to-day functioning of enterprises or end-users. It took mobile phones decades to go from novel devices with limited use and reach to what they are today: universal companions more capable than the computers that took man to the moon.

For AI, 2025 was the year when the technology improved and started proving itself. How and when users and enterprises make the most of it remains to be seen.

TL;DR

  • AI stopped being a shiny demo and became infrastructure: the question shifted from “should we adopt this?” to “how do we make it actually pay for itself?”

  • The core ingredients (transformers, attention, RL) barely changed — the real breakthroughs came from how we combined, trained, and deployed them.

  • Enterprises moved from sandbox experiments to serious production pushes, but most still struggle with basics: data, governance, and integration into real workflows.

  • The next decade will be defined less by wild new model tricks and more by who can turn pilots into stable, boring, money-making systems.

Headlines You Actually Need

The context to prepare for tomorrow, today.

Memorandum merges global headlines, expert commentary, and startup innovations into a single, time-saving digest built for forward-thinking professionals.

Rather than sifting through an endless feed, you get curated content that captures the pulse of the tech world—from Silicon Valley to emerging international hubs. Track upcoming trends, significant funding rounds, and high-level shifts across key sectors, all in one place.

Keep your finger on tomorrow’s possibilities with Memorandum’s concise, impactful coverage.

*This is sponsored content

Friday Poll

🗳️ Looking at your own team/company, which best describes your AI reality in 2025?

Weekend To-Do

  • Kill or crown your pilots. List every AI pilot you touched this year. Mark each one as Kill (no clear value), Fix (promising but blocked), or Crown (real impact). If you can’t name a specific metric it moved, it’s a vanity project.

  • Pick one workflow to industrialize. Choose a single use case that showed real promise in 2025 (not ten). Define what “full production” means for it in 2026: owners, SLAs, guardrails, and how you’ll prove it pays its own bills.

  • Write your 3-line AI thesis for 2026. In three sentences, answer: What will we stop doing with AI? What will we double down on? And how will we know, by next December 26, that it wasn’t just another year of pilots?
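If your pilot list lives in a spreadsheet or a script, the Kill / Fix / Crown pass above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Pilot` fields, the `triage` function, and the example pilots are all hypothetical, and the rule "a named metric means Crown, promising but blocked means Fix, everything else is Kill" is one reading of the to-do, not a standard method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pilot:
    """One AI pilot from this year's list (hypothetical structure)."""
    name: str
    metric_moved: Optional[str] = None    # the specific metric it moved, if any
    promising_but_blocked: bool = False   # showed promise but is stuck


def triage(pilot: Pilot) -> str:
    """Label a pilot Crown, Fix, or Kill (assumed rule, see lead-in)."""
    if pilot.metric_moved:
        return "Crown"   # real impact: you can name the metric
    if pilot.promising_but_blocked:
        return "Fix"     # promising but blocked
    return "Kill"        # no clear value: a vanity project


# Made-up examples, purely for illustration.
pilots = [
    Pilot("support-ticket summarizer", metric_moved="average handle time"),
    Pilot("contract review assistant", promising_but_blocked=True),
    Pilot("internal chatbot demo"),
]

for p in pilots:
    print(f"{p.name}: {triage(p)}")
```

The point of keeping the rule this blunt is the same as in the to-do: if the `metric_moved` field stays empty, the pilot does not get crowned.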
