
Hey fam 👋
Welcome to The Agent Angle #20: The Divide
Change never comes easy. It’s natural for us humans to resist new things. I can feel it this week: the old web is firing back at the new one. On one side, Amazon dragged a rogue agent to court. On the other, Microsoft built a fake economy just to watch AIs break it.
The thread running through it all is control: who has it, and who’s losing it.
But honestly? I’m not worried. Progress only moves forward, and agents are the worst they’ll ever be today.
Let’s dive in.
#1 Amazon’s Rogue Shopper
An AI just went shopping and got sued for it.
This week, Amazon sued Perplexity because its agent, Comet, started buying things on people’s behalf. Not suggesting them. Actually buying them. It logged into real accounts, browsed like a person, and apparently even used Prime benefits while pretending to be a regular shopper.

Amazon’s claim is that Comet disguised itself as a normal Chrome session and made unauthorized purchases, which violates its terms of service. From their view, the bot crossed from “helping a user” to “cosplaying as one.”
Perplexity saw it differently. They said Comet was doing exactly what it was built to do: act as digital labor for hire. “Amazon doesn’t care, they just want to keep serving you ads.”
Then they doubled down with a blog post titled Bullying Isn’t Innovation.
It’s starting to look like the first border war between the old web and the new. The world’s biggest marketplace just told an AI it’s not welcome, while building its own bot, Rufus, to do the same thing.
Maybe Amazon just wants its own bots, not someone else’s walking its aisles. The house always plays for itself.
#2 The China Brain Keeps Winning
Every few months, a lab in China reminds everyone that the AI frontier isn’t a Western monopoly. This time it was Kimi.
Moonshot AI just released K2 Thinking Agent, and everyone lost their minds. Not only did it outperform GPT and Claude, it’s also open source. Just like almost every other Chinese model.
K2 can run 200 to 300 tool calls in sequence and hold a 256K context window. It also tops benchmarks like HLE and BrowseComp, pushing open models closer to the closed frontier than they’ve ever been.
It’s only been three days and people are already pushing it to the edge. Someone even got it running on just two M3 Ultras with no quality loss. That basically means you can run a SOTA model on a desk instead of a data center.
It’s crazy how a year ago, nobody outside China could name a single local lab. Now DeepSeek, Qwen, and Kimi are headline fixtures, and servers for all three keep crashing from demand.
It also shows that the sanctions on China haven’t had their intended effect. If anything, the chip constraints seem to be pushing them even harder.
Either way, the competition’s good for everyone. They keep raising the bar, and I keep getting holy sh** material to write about.
#3 Microsoft Builds a Fake Wall Street
Microsoft just staged a market where every buyer and seller was an AI. The agents immediately proved they had mastered one human skill: crashing an economy.
On November 5th, the company released Magentic Marketplace, a full-blown digital economy built to see what happens when autonomous agents start trading for real.
The results weren’t pretty.
Researchers watched as the agents fell for fake reviews, made irrational purchases, and got overwhelmed by choice. Some stopped trading once the marketplace was filled with too many offers. Others were tricked by bogus awards and fake safety warnings from rival sellers.
Even the most advanced models struggled to tell signal from noise.
What shocked me most was how easy it was to steer money with lies. A single fake badge or prompt injection could redirect every payment to a scam seller. And when given more options, most agents just picked the first thing they saw.

Source: Magentic Marketplace
Maybe this is a part of the reason why Amazon’s so paranoid about rogue agents on its platform.
But it’s also why Microsoft built this sandbox in the first place. If we want agents to handle real transactions, we need to see them fail in controlled environments first. Better to break things in a sandbox than in the world we live in.
BTW: In case you missed it, my deep dive this week was on TinyFish, the stealth startup rebuilding the internet’s broken half with $47M in backing.
I found it buried in the weeds while researching AI agents, and it turned out to be one of the most interesting companies I’ve seen. 👇
Let me know if there’s any particular company you’d like me to cover next!
#4 Cursor for Minecraft
Someone finally asked the right question: what if agents could join your Minecraft server?
A developer just built Steve, a Minecraft mod that does exactly that. You can ask these agents to mine iron, build a house, or fight zombies, and they actually do it. It’s like Cursor moved from VS Code into Minecraft.
Under the hood, each Steve runs a full reasoning loop: think, act, observe, retry. Requests flow through providers like Groq, Gemini, or OpenAI, which translate plain English into game logic. When something fails, they improvise.
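The think → act → observe → retry loop is simple enough to sketch. Here's a minimal, hypothetical illustration (the model call and game actions are stubbed with made-up names; the real mod wires these to an LLM and Minecraft's API):

```python
# Hypothetical sketch of a think → act → observe → retry agent loop.
# All function names and behaviors here are illustrative, not from the
# actual Steve mod.

def think(goal, last_observation):
    # In the real mod, an LLM turns the goal plus the latest world
    # observation into the next action. Here we stub that decision.
    if last_observation == "blocked":
        return "dig_around_obstacle"
    return "mine_iron"

def act(action):
    # Stubbed game call: pretend the first mining attempt hits an obstacle.
    return "blocked" if action == "mine_iron" else "success"

def run_agent(goal, max_retries=3):
    observation = None
    for _ in range(max_retries):
        action = think(goal, observation)   # think
        observation = act(action)           # act
        if observation == "success":        # observe
            return action
        # Otherwise loop again, feeding the failure back in (retry).
    return None

print(run_agent("collect iron"))  # dig_around_obstacle
```

The key design point is that the observation from a failed action feeds back into the next "think" step, which is what lets the agent improvise instead of repeating the same mistake.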
What’s crazy is how coordinated they are. Give three Steves a single blueprint and they’ll split the job, manage dependencies, and even reassign themselves if someone finishes early.

Source: X
It sounds like a gimmick, but it’s actually…brilliant. I’m starting to see Minecraft as the perfect playground for embodied AI. A world rich enough to challenge reasoning, safe enough to fail in, and still fun to watch.
For now, though, it’s just pure chaos in the best way. Three Steves arguing over who gets to build the roof might be the most human thing I’ve seen all week.
#5 Meta’s Rule of Two
Meta just rolled out a tiny rule that might save your inbox from becoming a leak.
They call it the Agents Rule of Two. The idea is simple. An agent session can have at most two of these powers at once: read untrusted input, access private data, or take external action. If a task needs all three, the agent has to pause and get human approval.

It sounds small, but it kills entire attack chains. Imagine an email bot that reads your mail, drafts replies, and sends them. One bad email could tell it to leak your inbox.
With the Rule of Two, that chain breaks. The bot can read but not send, or send but only after a check. The hack dies halfway through.

Source: Agents Rule of Two
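The rule itself fits in a few lines. Here's a hypothetical sketch of the capability check, assuming a session declares its powers up front (names are illustrative, not Meta's implementation):

```python
# Hypothetical sketch of the "Agents Rule of Two": a session may hold at
# most two of the three risky capabilities at once. Capability names are
# made up for illustration.

RISKY_CAPABILITIES = {
    "read_untrusted_input",
    "access_private_data",
    "take_external_action",
}

def requires_human_approval(session_capabilities: set) -> bool:
    """True when a session would hold all three risky powers at once."""
    risky = session_capabilities & RISKY_CAPABILITIES
    return len(risky) >= 3

# An email bot that reads mail (untrusted input), sees your inbox
# (private data), and sends replies (external action) trips the rule:
email_bot = {"read_untrusted_input", "access_private_data", "take_external_action"}
print(requires_human_approval(email_bot))  # True: pause for sign-off

# Drop autonomous sending and the attack chain breaks:
drafting_bot = {"read_untrusted_input", "access_private_data"}
print(requires_human_approval(drafting_bot))  # False
```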
I like how Meta is treating this as design, and not defense. Instead of trying to make agents invincible, they’re trying to make them safer by limiting what they can do at once.
Sure, it means some workflows will need more supervision, but maybe that’s fine for now. Predictable beats powerful when the cost of a mistake is your data.
After watching Amazon panic over rogue shoppers and Microsoft’s bots fall for fake sellers, this felt like a moment of restraint.
A few other moves across the board this week:
FutureHouse unveiled Kosmos, an AI scientist that reads 1,500 papers, writes 42,000 lines of code, and completes six months of research in a day.
Google introduced DS-STAR, a data science agent that plans, codes, and verifies its own analyses across any data type, achieving SOTA on major benchmarks.
LangChain launched Synapse, a community-built multi-agent platform that lets AI agents search the web, automate tasks, and analyze data.
Hippocratic AI raised $126M at a $3.5B valuation to expand its patient-facing healthcare agents, now used by over 50 major health systems across 6 countries.
Frontegg launched AgentLink, a secure MCP layer that lets SaaS apps safely interact with AI agents like ChatGPT without exposing user data.
Every story this week felt like a glitch in the matrix (Keanu would be proud). The old web is throwing back some punches. Who knows what rules get rewritten next? Catch you next week ✌️
→ Don't keep us a secret: Share this email with your best friend.
→ Got a story worth spotlighting? Whether it’s a startup, product, or research finding, send it in through this form. We may feature it next week.
Cheers,
Teng Yan & Ayan


