Hey 👋

People are either doubling down on LLMs as the foundation or betting hard on something else entirely, and the conviction gap is huge.

Here are our top three stories this week that sit right at that intersection: memory, architecture choices, and whether the people closest to the metal saw something the rest of us missed.

Let's get into it.

#1 Hermes Agent Builds Memory That Actually Persists Across Sessions

Most AI agents have the memory of a goldfish. Every new session, you're starting from scratch, re-explaining who you are, what you're building, what you decided last Tuesday.

Hermes, a new open-source agent framework, is taking a different approach. Developer Akshay Pachaar published a breakdown on May 14 showing that Hermes uses a three-tier memory architecture: tiny curated markdown files for critical facts (think: persistent fast-access storage), a full-text SQLite FTS5 index for deep session recall, and pluggable external memory providers for whatever else you want to bolt on.
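To make the first two tiers concrete, here's a minimal Python sketch. It is not Hermes' code: the class name `TieredMemory`, its methods, and the `turns` table are all hypothetical, and it assumes your Python's bundled SQLite was compiled with FTS5 (common, but not guaranteed). The third tier is left out of this sketch.

```python
import sqlite3
import tempfile
from pathlib import Path


class TieredMemory:
    """Hypothetical two-tier memory: a markdown facts file plus an FTS5 index.

    Illustrative only; names and schema are assumptions, not Hermes' API.
    """

    def __init__(self, facts_path: Path, db_path: str = ":memory:") -> None:
        # Tier 1: a small, human-readable markdown file of critical facts.
        self.facts_path = facts_path
        self.facts_path.touch(exist_ok=True)
        # Tier 2: a full-text index over every logged turn (requires FTS5).
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE VIRTUAL TABLE IF NOT EXISTS turns "
            "USING fts5(session, role, content)"
        )

    def remember_fact(self, fact: str) -> None:
        # Append one bullet per fact so the file stays easy to curate by hand.
        with self.facts_path.open("a", encoding="utf-8") as f:
            f.write(f"- {fact}\n")

    def facts(self) -> str:
        # Small enough to load in full at the start of every session.
        return self.facts_path.read_text(encoding="utf-8")

    def log_turn(self, session: str, role: str, content: str) -> None:
        self.db.execute(
            "INSERT INTO turns (session, role, content) VALUES (?, ?, ?)",
            (session, role, content),
        )
        self.db.commit()

    def recall(self, query: str, limit: int = 5) -> list[tuple[str, str]]:
        # MATCH runs the full-text query; `rank` orders best matches first.
        # Plain words work; FTS5 query syntax characters would need escaping.
        rows = self.db.execute(
            "SELECT session, content FROM turns "
            "WHERE turns MATCH ? ORDER BY rank LIMIT ?",
            (query, limit),
        )
        return rows.fetchall()


if __name__ == "__main__":
    memory = TieredMemory(Path(tempfile.mkdtemp()) / "facts.md")
    memory.remember_fact("User is building a billing service")
    memory.log_turn("s1", "user", "We decided last Tuesday to use Postgres")
    print(memory.facts())
    print(memory.recall("postgres"))
```

The split is the point: the facts file is cheap enough to read every session, while the index only gets queried when the agent needs to dig.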

The breakdown also describes a five-step cycle that runs on every turn, which means the agent consolidates and reflects on information continuously, not just at session end.

Why does this matter? Because "agents that forget" is one of the most consistent complaints I hear from builders deploying anything beyond a single-session demo. The gap between a compelling prototype and a useful product is often exactly this: does the thing remember what happened yesterday?

The pluggable external provider layer is the smart design choice here. It means teams can swap in Mem0, a vector DB, or their own store without rewiring the core architecture.
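Here's roughly what "pluggable" could look like in Python. This is a sketch under assumptions, not Hermes' actual interface: `MemoryProvider`, `InProcessProvider`, and `Agent` are hypothetical names, and the in-process store is a stand-in for whatever backend a team would really plug in.

```python
from typing import Protocol


class MemoryProvider(Protocol):
    """Hypothetical contract an external memory backend would satisfy."""

    def store(self, key: str, text: str) -> None: ...

    def search(self, query: str, limit: int = 5) -> list[str]: ...


class InProcessProvider:
    """Toy backend: a dict with case-insensitive substring search."""

    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    def store(self, key: str, text: str) -> None:
        self._items[key] = text

    def search(self, query: str, limit: int = 5) -> list[str]:
        needle = query.lower()
        matches = [t for t in self._items.values() if needle in t.lower()]
        return matches[:limit]


class Agent:
    """The agent depends only on the protocol, never on a concrete backend."""

    def __init__(self, provider: MemoryProvider) -> None:
        self.provider = provider

    def note(self, key: str, text: str) -> None:
        self.provider.store(key, text)

    def recall(self, query: str) -> list[str]:
        return self.provider.search(query)


if __name__ == "__main__":
    agent = Agent(InProcessProvider())
    agent.note("db-choice", "Team picked Postgres for billing")
    print(agent.recall("postgres"))
```

Swapping backends then means writing one new class with `store` and `search`; nothing in `Agent` changes.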

I think the three-tier framing is genuinely interesting because it maps to how humans actually organize memory: fast recall for the essentials, deeper search for context, external systems for everything else. Whether Hermes executes on this cleanly in production is still an open question. But the architecture is the right shape.

📊 Free download: I mapped the entire AI compute supply chain for anyone who’s starting out. 7 chokepoints, 30+ companies, and 4 critical lanes from silicon to live inference. Whether you're building agents or investing in them, this is the reference sheet. Get the cheat sheet

#2 Yann LeCun Still Thinks LLMs Are a Dead End

He has said it before. But now he has a company betting on it.

LeCun appeared on the Unsupervised Learning podcast this week (May 15) to lay out his case against large language models as a path to AGI, and to introduce AMI, his new venture built around world models and embodied intelligence.

The core argument hasn't changed much: LLMs predict tokens, they don't model reality, and that gap is unbridgeable no matter how much compute you throw at it. What's new is that LeCun is now putting money and organizational weight behind the alternative.

Why it matters: AMI is a direct institutional bet that the current LLM paradigm hits a wall before anything resembling general intelligence arrives. If LeCun is right, the companies pouring billions into scaling transformer-based systems are building toward a ceiling. If he's wrong, AMI is a very expensive contrarian position.

Either way, a researcher of LeCun's stature leaving Meta's orbit to start something is a signal worth taking seriously. His 2027 predictions, shared in the interview, suggest he thinks the inflection point is closer than most people expect.

I've been skeptical of strong claims in either direction here. But I'll say this: the world models framing is more interesting to me than it was 18 months ago, especially as robotics deployments start surfacing the exact grounding failures LeCun keeps pointing at.

Source: Jacob Effron

#3 Leopold Aschenbrenner's AI Infrastructure Bet Returned 20x

A former OpenAI researcher turned hedge fund manager just posted one of the best returns in recent memory, and he did it by buying power plants and fiber optic cables, not Nvidia stock.

Leopold Aschenbrenner, who was fired from OpenAI's superalignment team in April 2024, launched the Situational Awareness fund in September 2024 with $254 million. The thesis: whoever controls the physical layer of AI (power, connectivity, storage) wins. Twenty months later, the fund sits at $5.5 billion.
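For a quick sanity check on the headline multiple, here's the arithmetic in Python using only the three numbers above. One loud assumption: it treats the two fund sizes as directly comparable, and the figures here don't say how much of the change is new investor capital versus performance. The variable names are just for illustration.

```python
# Figures quoted above; everything derived from them is back-of-envelope.
start_size = 254e6   # dollars at launch, September 2024
end_size = 5.5e9     # dollars twenty months later
months = 20

# Raw growth multiple over the whole period.
multiple = end_size / start_size

# Equivalent compounded multiple per 12 months, assuming steady growth.
annualized = multiple ** (12 / months)

print(f"{multiple:.1f}x over {months} months")
print(f"about {annualized:.1f}x per 12 months if growth were steady")
```

That works out to a bit under 22x in total, so "20x" is, if anything, rounding down.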

The portfolio, spanning power plants, fiber, Bitcoin miners repurposed for compute, and SSD manufacturers, reads like a bet on the boring constraints. That bet is looking smarter every week.

Tessara’s Compute Regime Score is near all-time highs, with GPU cluster lead times extending across the board. HBM supply is even tighter, which means the bottleneck isn't just raw compute; it's the memory bandwidth that makes inference fast enough to run agents at production scale. If you're serving agents today and your costs feel sticky even as model prices drop, this is why.

The highest-conviction bet on AI infrastructure this cycle came from someone who watched the sausage get made at OpenAI and decided the real edge was in kilowatts and fiber miles.

I think he's right. The model layer is commoditizing. The physical layer isn't.

Source: @shiri_shh

📄 Paper of the Week

Most multi-agent coordination research assumes agents can see everything. This paper builds agents that share only local features with neighbors, then scales that to hundreds of agents moving simultaneously in complex environments. The result is a pre-trained model that generalizes across map sizes without retraining.

If you're building robotic fleets or any system where agents need to coordinate without a central planner, the local-communication framing here is worth stealing.

🔧 Under the Hood

Hermes' persistent memory, Yann's LLM skepticism, Leopold's 20x return... they all live on silicon. That's the tax on building anything stateful at scale.

The Chokepoint is my weekly read on exactly this: where the AI supply chain is binding, who's capturing rent, and what would flip the call.

Edition #2 dropped Tuesday (the photonics bull run and why InP substrate is the next HBM). Free, every Tuesday on Tessara Research. Subscribe here for free →

Catch you next week ✌️

Teng
