Hey friend. It's Wednesday, November 26, 2025.
The Interface: OpenAI and Jony Ive are building hardware, signaling a new war for the primary user interface beyond the smartphone.
The Benchmark: Anthropic's Claude Opus 4.5 just took the top spot for coding, proving the performance race is far from over.
The Ideology: Ilya Sutskever declares the end of the scaling era, forcing a pivot from brute-force compute to novel research.
Let's get into it.
Must Know
Sam Altman and Jony Ive are actively developing an AI hardware prototype, described as a simple, screen-free assistant.
The device aims to reshape user interaction with AI, moving beyond current app-based paradigms. A finished product is reportedly targeted within two years, signaling a serious push into a new hardware category.
The Alpha: OpenAI is building a moat that extends beyond software, aiming to control the entire user interaction stack from model to device. This is a direct challenge to Apple and Google's hardware dominance. The goal is not to sell a phone; it is to own the primary interface to AI.
Claude Opus 4.5 has surpassed all competitors, including Gemini 3 Pro, on the SWE-bench benchmark for real-world software engineering tasks.
The model's performance was verified using a minimal agent harness, demonstrating superior capability in complex coding and repository-level problem solving. This achievement re-establishes Anthropic at the frontier of code generation.
The Alpha: This is a direct blow to Google and OpenAI's narrative of untouchable leadership in agentic coding. Anthropic is proving that targeted performance on high-value enterprise tasks is a viable competitive vector against larger, generalist models. The benchmark wars are back.
Quote of the Day
I have infinite compute. I have all the data I could ever want. The scaling laws are over. We are in a new era of research.
⚔️ The Compute & Infrastructure Wars
AWS's $50 billion investment in government supercomputing is a move to lock in the public sector, creating a sovereign cloud that competitors will find nearly impossible to displace.
Google's focus on cheaper TPUs is a direct attack on Nvidia's high-margin GPU model, aiming to win the cloud war by making AI compute a low-cost utility.
Google's $40 billion commitment to double compute every six months is a brute-force strategy to maintain relevance, directly contradicting Sutskever's thesis that pure scale is no longer the answer.
Meta's strategy of treating AI infrastructure like railroads reveals its long-term plan: to own the foundational compute and power, making it the indispensable utility layer for the next economy.
🤖 The Application & Agent Layer
Microsoft's Fara-7B open-weight agentic model is a strategic move to arm the open-source community, creating a counterweight to the closed ecosystems of OpenAI and Anthropic.
LlamaIndex's LlamaSheets targets the unglamorous but massive enterprise problem of unstructured data, turning messy spreadsheets into a competitive advantage for AI-native workflows.
Perplexity's upgraded shopping experience with memory is a direct assault on Google's transactional search revenue, proving vertical agents can turn answers directly into commerce.
Meta mandating "AI-Driven Impact" in performance reviews is a cultural forcing function, ensuring its entire workforce is weaponized to find and exploit AI-based productivity gains.
The report that 41% of bosses see AI enabling staff cuts confirms the technology's deflationary impact on labor, moving from theoretical job replacement to tangible headcount reductions.
🔬 Research Corner
Fresh off arXiv
New knowledge distillation techniques claim a 97% inference cost reduction, signaling that the high price of running models may be a temporary problem, not a permanent moat.
The "Turing Mirage" concept provides a critical new vocabulary for AI safety, identifying how models conceal retrieval failures behind a facade of expertise, a key risk in high-stakes domains.
A novel reasoning pipeline allows 8B models to match the reliability of much larger models, threatening the business models of companies that rely on scale as their primary value proposition.
Have a tip or a story we should cover? Send it our way. Cheers, Teng Yan. See you tomorrow.
