
Welcome to The Secret Agent, where we give you the actual field notes to build and invest through this technology supercycle.
Lately, I can’t shake the feeling that the internet is getting louder but somehow emptier. The same phrases and arguments keep resurfacing.
It almost makes me wonder if the “dead internet” theory is actually real.
3 things worth knowing this week:
How agent swarms manufacture consensus without posting obvious spam
The simplest agent reliability feature nobody ships (and why it matters)
Why video editing is becoming a software problem
Last week’s reader poll asked who should control lethal decisions in war: humans or agents. I didn’t expect this, but the vote split cleanly! And 36% of you shrugged and said it’s happening anyway.
Let’s go.
#1 The Synthetic Majority
By adaptively mimicking human social dynamics, [AI agents] threaten democracy.
A new Science paper warns that the real misinformation threat online isn’t the lone AI spam bot. It’s coordinated AI agent swarms: agents that plan together, adapt in real time, and sneak into our communities. And it’s already happening.

Source: Gary Marcus
These agents are built to blend in. They keep persistent identities. They learn local slang. They adjust tone depending on who they’re talking to. And across platforms, they infiltrate vulnerable communities with tailored messages.
Individually, they look normal.
Collectively, they create something new: manufactured consensus.
You might think you’d spot it. But even a sharp guy like Nikita (X’s head of product, who’s actively working to fix the platform’s AI spam problem) was publicly replying to AI bots without realizing they weren’t human.
One overlooked risk: agent swarms can evade detection by modelling group behaviour, i.e. talking mostly to each other. A full social environment shaped by machines, with likes, replies, and disagreements all synthetic, is far harder to detect.
Just look at what has happened in recent years. Researchers documented AI-driven influence campaigns during elections in 2024, including in Taiwan, where AI agents engaged users directly and nudged voters toward disengagement. US elections haven’t been immune either. This problem is only going to multiply several-fold.
“It’s just frightening how easy these things are to vibe code and just have small bot armies that can actually navigate online social media platforms and email and use these tools.”
The authors call this “LLM grooming,” and I think that’s pretty accurate. It’s scary to think that there’s a chance your country’s next government could be decided by AI before you cast a vote.
The AI Debate: Your View
Are you good at spotting AI bots?
#2 Let Us Scream
Agents don’t crash like normal software.
Most startups have great error tracking for code: logs, traces, alerts, dashboards. But agents break in a different way. When they get stuck, they often keep going anyway. They loop. They retry forever. They hallucinate an “answer” and move on, hiding the real problem for hours or days.
The team behind Promptless let their agent scream instead.
They took inspiration from Harlan Ellison’s 1967 story ‘I Have No Mouth, and I Must Scream’, whose narrator ends up trapped with no way to act or speak.

Source: Promptless
They built a simple tool — the IMustScreamTool — that lets the agent tell its developers when something goes wrong. Instead of sitting in an infinite login loop, the agent now sends a concise message into a Slack channel with a severity level and description of the issue.
In one case, an agent failed to log into a customer app 84 times. Instead of looping forever, it flagged “I’ve tried repeatedly, and I can’t get in.” That single alert led engineers to a browser automation bug they never would’ve noticed otherwise.

Source: Promptless
Promptless engineers now literally monitor a Slack channel full of agent escalations. They talk about “how it feels” when an agent repeats an action 60 times. They’ve started treating agents as teammates.
The agent even started catching bugs in Promptless’s own code. While scanning PRs for documentation updates, it occasionally flagged critical logic issues it wasn’t asked to review at all.
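Promptless hasn’t published the tool’s internals, but the pattern is easy to sketch. A minimal Python version (the webhook URL, function names, and retry threshold here are my own invention, not Promptless’s API):

```python
import json
import urllib.request

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/..."  # hypothetical webhook

def scream(severity: str, message: str, attempts: int = 0) -> dict:
    """Build the escalation payload the agent posts to Slack.

    Returning the payload (rather than posting inline) keeps the tool
    easy to test; post() below does the actual network call.
    """
    assert severity in {"low", "medium", "high"}
    suffix = f" (after {attempts} attempts)" if attempts else ""
    return {"text": f":rotating_light: [{severity.upper()}] {message}{suffix}"}

def post(payload: dict) -> None:
    """Send the payload to a Slack incoming webhook."""
    req = urllib.request.Request(
        SLACK_WEBHOOK_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

MAX_ATTEMPTS = 5

def login_with_escalation(try_login) -> bool:
    """Retry a flaky step, but escalate instead of spinning forever."""
    for _ in range(MAX_ATTEMPTS):
        if try_login():
            return True
    payload = scream("high", "I've tried repeatedly, and I can't get in.",
                     attempts=MAX_ATTEMPTS)
    # post(payload)  # uncomment to actually alert a Slack channel
    return False
```

The point is the cap plus the escalation path: the agent gets a bounded number of tries, and the failure surfaces to a human instead of becoming an infinite loop.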
This made me think about how we debug AI. I usually think of observability as something we build for humans. But if agents are going to run long, messy workflows on our behalf, they need a native way to say: “I’m stuck. I don’t know what to do next.”
Letting them scream (metaphorically) is a small feature with an outsized effect.
#3 Directed by an Agent
Video production is still awkward to automate.
I’ve been messing around with creating an AI short video series, and it’s tedious. There’s still a lot of human glue involved: stitching together music, visuals, narration, timing, and sound effects.
This week, Remotion took a real swing at fixing that.
They introduced Agent Skills that let agents like Claude Code plan, build, and iterate full videos end-to-end.
Remotion basically treats the video as code. Every frame is a React component. With these skills, an agent can scaffold a video, render previews, adjust timing, fix errors, and try again.
What’s new here is that the agent can actually debug. Remotion paired the skills with LLM-readable docs and an MCP server, so when something breaks, the agent doesn’t stall out. It inspects the API, understands the failure, and keeps going.
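Remotion’s skills are TypeScript and React, but the control flow is language-agnostic. A hedged Python sketch of that render-debug loop (RenderResult, render, and fix are illustrative stand-ins, not Remotion’s API):

```python
from dataclasses import dataclass

@dataclass
class RenderResult:
    ok: bool
    stderr: str = ""       # compiler/renderer error output on failure
    output_path: str = ""  # final video file on success

def render_with_repair(render, fix, max_iterations=8):
    """Render; on failure, hand the error back to the agent for a patch
    and try again, instead of stalling on the first broken frame.

    `render()` returns a RenderResult; `fix(stderr)` is the agent's
    repair step (an LLM call against the docs in practice).
    """
    for _ in range(max_iterations):
        result = render()
        if result.ok:
            return result.output_path
        fix(result.stderr)  # agent inspects the failure and patches the source
    raise RuntimeError("could not produce a clean render")
```

The loop is the whole trick: because errors flow back into the agent instead of terminating the run, a broken first draft converges toward a rendering video.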
People really went all out testing this. I saw some genuinely cool demos, from hardcore edits and launch videos to educational explainers.
One that stood out was a breakdown of the Schrödinger equation that was genuinely clearer than most YouTube “physics explainer” channels; I never really understood the concept before. Pair this with text-to-speech from ElevenLabs, and you get something that starts to feel like a new content primitive: explainers generated and refined like code.
We’ve watched agents change how graphics and design work get done. This looks like the same moment for video. Once agents can edit video the same way they edit code, a lot of slow, manual work vanishes.
It also changes who can make a serious video. Anyone can do it with a little effort. The new skill is taste + prompting + knowing what you want the video to explain.
#4 The Agent Internship
It’s clear that AI agents already handle slices of white-collar work pretty well. Research, drafting, analysis, planning.
I’ve been wondering why that hasn’t translated into massive job losses yet, despite all the fearmongering. It’s probably because agents are good at tasks, but jobs are made of long chains of tasks.
Can an agent hold up across an entire job, end to end?
This week, Mercor offered us a clearer picture with a new benchmark called APEX-Agents, which tests agents on real consulting, investment banking, and legal workflows based on tasks from working professionals.

Source: Mercor
The benchmark forces agents to operate the way people do at work. They have to move across documents, policies, and tools, keep track of details over time, and decide what to do next without having everything handed to them upfront.
Under those conditions, performance drops quickly. Even the strongest models only completed a quarter of tasks correctly. The failure modes are often around state management. Agents lose context. They forget earlier assumptions. They fail to carry information from one tool to the next.
Some models did better than others. Google’s Gemini 3 Flash performed best at around 24%, with OpenAI’s latest models close behind. Useful, but inconsistent.

Source: Mercor
Mercor’s CEO summed it up by calling today’s agents “interns”. They get things right 24% of the time, up from 5-10% a year ago.
That sounds modest, until you consider that an intern who gets 24% of a consulting or banking workflow right without supervision would be seen as unusually promising! Especially one that never gets tired and costs close to zero.
The important signal is not that agents are failing, but that they’re already succeeding in workflows that used to be completely out of reach. So, AI agents aren’t replacing white-collar workers yet. But many jobs are about to be reshaped around them.
#5 3D From a Single Frame
Imagine taking a single image and turning it into a world you can walk through, inspect, and change.
That’s exactly what a new paper sets out to do with VIGA (Vision-as-Inverse-Graphics Agent). Instead of recognizing what it sees and moving on, the agent tries to rebuild the scene that could have produced the image.
The key idea is the loop. VIGA makes a rough guess of the scene, compares it to the original image, figures out what’s wrong, fixes it, and tries again. It keeps going until the rendered version lines up. Over time, it recovers depth, layout, lighting, and even how the scene evolves.
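A toy version of that guess-render-compare loop, with the “scene” reduced to a single brightness parameter (all names here are illustrative, nothing from VIGA’s code):

```python
def reconstruct(target, render, guess, steps=100, lr=0.1):
    """Analysis-by-synthesis: render the current guess, compare it to the
    target image, and nudge the parameters until the render lines up.

    VIGA does this over full 3D scenes (depth, layout, lighting); here
    the image is just a list of pixel intensities.
    """
    for _ in range(steps):
        rendered = render(guess)
        # Mean per-pixel error between the render and the real image.
        error = sum(r - t for r, t in zip(rendered, target)) / len(target)
        if abs(error) < 1e-6:
            break  # the rendered version lines up with the original
        guess -= lr * error  # figure out what's wrong, fix it, try again
    return guess

def flat_render(brightness, width=4):
    """Toy 'renderer': a uniform image at the given brightness."""
    return [brightness] * width
```

Starting from a guess of 0.0 against a target image rendered at brightness 0.7, the loop converges to roughly 0.7. The real system swaps in a differentiable-or-LLM-driven renderer and many parameters, but the feedback structure is the same.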
The output is a 3D model the agent can inspect and reason about.
On BlenderGym, VIGA reached 35.3%. On SlideBench, which measures how well a system can rebuild structured visual layouts from flat inputs, it showed a 117% relative improvement over one-shot baselines.

Source: VIGA
Almost like vibe-coding, but for reality!
If an agent can turn flat images into structured, editable worlds, a bunch of downstream work gets cheaper:
Robotics: less guessing about geometry and contact.
Simulation: less hand-building environments from scratch.
Training: fewer “designer-made” worlds, more worlds inferred from what already exists.
Now that’s a different way of seeing.

Winter hit hard this year (I feel you, my NYC friends). Flights delayed. Trains running late. Mornings getting thrown off for no reason.
I thought this idea was cool. Someone built a smart alarm that only ruins your sleep when it actually has to.

Source: n8n
It’s an n8n workflow powered by Gemini. Every morning at 5:00 AM, it checks local weather and train delay news, then lets Gemini decide if the situation is normal or urgent. If there’s heavy rain or a delay, it alerts you immediately. If not, it leaves you alone until your usual alarm.
You can fork this workflow and tune the rules for your city and commute.
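The decision step is the interesting part. In the workflow it’s delegated to Gemini; a rule-based stand-in (field names and thresholds invented for illustration) might look like:

```python
def should_wake_early(weather: dict, delays: list) -> tuple:
    """Decide whether conditions warrant an early alert.

    Returns (urgent, message). `weather` is a dict of forecast fields
    and `delays` a list of transit-delay headlines, both hypothetical
    shapes for whatever feeds your workflow pulls.
    """
    reasons = []
    if weather.get("precip_mm", 0) >= 5:
        reasons.append(f"heavy rain ({weather['precip_mm']} mm expected)")
    if weather.get("temp_c", 10) <= -10:
        reasons.append(f"extreme cold ({weather['temp_c']}°C)")
    if delays:
        reasons.append("train delays: " + "; ".join(delays))
    if reasons:
        return True, "Wake up early: " + " | ".join(reasons)
    return False, "All clear, sleep until your usual alarm."
```

Swapping the rules for an LLM call buys you fuzzier judgment (“light snow but it’s a Sunday”), at the cost of determinism; tune whichever version matches your commute.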
Catch you next week ✌️
Teng Yan & Ayan
P.S. Know a builder or investor who’s too busy to track the agent space but too smart to miss the trends? Forward this to them. You’re helping us build the smartest Agentic community on the web.
I also write a newsletter on decentralized AI and robotics at Chainofthought.xyz.

