😼 DeepMind mapped AI agent controls

PLUS: Claude robotics, Dean Ball at OpenAI, DeepSeek's raise, and sovereign models.

So apparently Americans are now spending more than twice as much time with AI companions as dating apps. That’s according to data from Sensor Tower’s State of AI 2026 report (page 57) shared by Justine Moore. U.S. users reportedly spent around 700M hours on apps like Character.AI and Talkie in Q1, while Tinder, Bumble, and friends held steady around 300-400M.

Okay, despite it being horrifying news, this actually makes sense… in the bleakest possible way: Dating apps ask you to be charming, attractive, emotionally available, geographically convenient, unreasonably comfortable with constant rejection, or the reverse, unreasonably comfortable with rejecting others, and willing to answer (or dodge) the “what are you looking for?” question for the 900th time. 

AI companions ask you to open the app… and chat (no strings attached). Whenever you want to stop the convo, you stop. And when you come back, it’ll be like you never left: AI has no sense of time (at least not yet).

Both stats show us the obvious: people want connection. Congrats to AI. It has discovered the killer app: replying! The uncomfortable part? People also want convenient, no-strings-attached connection: connection without ghosting, rejection, calendar coordination, awkward first dates, or any effort whatsoever. As X user Signüll calls it: frictionless intimacy.

Could this trend also correlate with the new paper that tracks the rise of smartphones against the fall of birth rates? Certainly not because of AI companions, as those are too new to be reflected in the data. But smartphone use, plus the drop in in-person socializing and the growing desire for (and access to) frictionless experiences that followed? Probably! Anyway, the whole Sensor Tower report is epic. Check it out.

Here’s what happened in AI today:

  • 😼 Google DeepMind published an AI Control Roadmap for securing powerful internal agents.

  • 📰 Anthropic said Claude completed shared robot tasks 18-37x faster than earlier human teams.

  • 📰 Dean Ball joined OpenAI to lead a Strategic Futures team focused on frontier AI policy.

  • 🌟 This week's best skill: make your agents prove their work before they touch the real workflow.

Hey: Want to reach 700,000+ AI-hungry readers? Advertise with us! 

😼 Google DeepMind is preparing for the AI agent that goes off-script

The next phase of AI safety looks less like philosophy and more like IT security. Passwords. Logs. Permissions. Monitors. Emergency brakes.

Google DeepMind published an AI Control Roadmap for securing powerful internal AI agents. The basic assumption is refreshingly practical: even if a model is trained to be helpful, the system around it still needs controls for the day it behaves badly.

Here's what happened:

  • DeepMind framed advanced agents as potential insider threats, similar to employees with access to sensitive systems.

  • The roadmap adds system-level security on top of alignment, meaning training the model to behave well is only one defense.

  • Trusted AI supervisors would monitor an agent's reasoning, actions, and plans.

  • Controls would scale by risk, from delayed review for low-risk actions to real-time blocking for dangerous ones.
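
For the builders in the room, here is a minimal Python sketch of what "controls that scale by risk" could look like. Everything in it is our assumption for illustration: the `Risk` levels, the `ControlGate` class, and the two-tier policy are hypothetical names, not anything from DeepMind's roadmap. Low-risk actions run and get queued for delayed review; high-risk actions are blocked in real time.

```python
from dataclasses import dataclass, field
from enum import Enum


class Risk(Enum):
    # Hypothetical two-tier scale; a real system would likely have more levels.
    LOW = 1
    HIGH = 2


@dataclass
class ControlGate:
    """Illustrative gate between an agent and the tools it calls."""
    review_queue: list = field(default_factory=list)  # checked later by a reviewer
    blocked: list = field(default_factory=list)       # stopped before running

    def handle(self, action: str, risk: Risk) -> bool:
        """Return True if the action may run now, False if it is blocked."""
        if risk is Risk.HIGH:
            # Real-time blocking for dangerous actions.
            self.blocked.append(action)
            return False
        # Delayed review for low-risk actions: allow now, audit afterwards.
        self.review_queue.append(action)
        return True


gate = ControlGate()
allowed_read = gate.handle("read docs/README.md", Risk.LOW)
allowed_wipe = gate.handle("delete prod database", Risk.HIGH)
print(allowed_read, allowed_wipe)
```

The design choice worth noticing: the agent never decides its own risk level here. Something outside the agent assigns it, which is the whole point of treating the agent like a potential insider threat.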

This matters because agents are no longer just answering questions. They are reading files, browsing the web, calling tools, writing code, and carrying out multi-step work. Anthropic's robotics update made the same shift concrete from another direction: Claude is starting to coordinate physical tasks, not just screen work. That makes agent safety less theoretical and much harder to fake.

Why this matters: The hard problem is moving from chatbot safety to workflow safety. A wrong answer is bad. A wrong action can delete data, leak a file, approve a payment, or quietly sabotage a project.

DeepMind's roadmap borrows from cybersecurity because cybersecurity already knows a painful truth: trusted systems can still be compromised. The future agent stack will probably look like a mix of model evals, access control, audit logs, sandboxing, and AI supervisors watching other AIs.

DeepMind also named the metrics that make this more than hand-waving: coverage (how much agent traffic is monitored), recall (how much bad behavior gets caught), and time-to-response (how fast the system can stop it). That is safety as operations, not a blog-post promise.
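
To make those three metrics concrete, here is a rough Python sketch that computes them from a toy action log. The `AgentAction` record and its fields are hypothetical, and the exact definitions (for example, averaging time-to-response over caught actions only) are our assumptions, not DeepMind's.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AgentAction:
    # Hypothetical log record for one agent action.
    monitored: bool                           # did a monitor see this action?
    harmful: bool                             # ground-truth label from later review
    flagged: bool                             # did the monitor catch it?
    seconds_to_stop: Optional[float] = None   # set only when the action was stopped


def control_metrics(
    actions: List[AgentAction],
) -> Tuple[float, Optional[float], Optional[float]]:
    """Return (coverage, recall, mean time-to-response in seconds)."""
    if not actions:
        raise ValueError("need at least one logged action")
    # Coverage: share of all agent traffic that was monitored.
    coverage = sum(a.monitored for a in actions) / len(actions)
    # Recall: share of harmful actions that were caught.
    harmful = [a for a in actions if a.harmful]
    caught = [a for a in harmful if a.flagged]
    recall = len(caught) / len(harmful) if harmful else None
    # Time-to-response: average time to stop the actions that were caught.
    times = [a.seconds_to_stop for a in caught if a.seconds_to_stop is not None]
    mean_ttr = sum(times) / len(times) if times else None
    return coverage, recall, mean_ttr


log = [
    AgentAction(monitored=True, harmful=False, flagged=False),
    AgentAction(monitored=True, harmful=True, flagged=True, seconds_to_stop=2.0),
    AgentAction(monitored=True, harmful=True, flagged=True, seconds_to_stop=4.0),
    AgentAction(monitored=False, harmful=True, flagged=False),
]
coverage, recall, mean_ttr = control_metrics(log)
print(coverage, recall, mean_ttr)
```

Note how the unmonitored harmful action drags down both coverage and recall: you cannot catch what you never watched.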

Our take: The companies winning agents may be the ones that make autonomy boring. The magic demo matters less than the permission system that keeps the magic from touching payroll at 2 a.m., or quietly emailing a customer before a human sees the plan.

Claude is currently the most powerful tool of 2026. It's been launching new features every week: Skills, Connectors, Cowork, vibe coding. Yet almost no one knows how to actually use them.

Our expert mentors have condensed 800+ hours of Claude research, articles, YouTube content and real-world practice into a focused 16-hour curriculum. Join the 2-Day Claude AI Mastery Workshop: a live, end-to-end deep dive into Claude plus 10+ AI tools, LLMs and workflows.

It would be silly not to SIGN UP

The professionals who master Claude right now will design carousels in seconds, ship code without being developers, and have AI working for them while they sleep.

You will learn how to:

  • Master Claude's three modes: Chat, Cowork and Code.

  • Set up Skills, Connectors and Plug-ins to automate your desktop, Notion and files.

  • Vibe code apps and dashboards without writing code, and use 10+ AI tools and workflows that pair with Claude.

🧠 Saturday & Sunday

🕜 10 AM – 7 PM EST

When using /goal on any task (knowledge work or coding), a smart tip comes from Matt Berman: use an LLM-as-a-judge loop, which means having the AI first define the “goal” you want to achieve as concrete success criteria, then judge each pass of its own work against them.

Try it like this: ask Codex to optimize a slow dashboard “until the page loads as fast as possible without changing what users see,” or clean up a messy internal process doc “until a teammate could follow it without Slacking you three questions.” Here’s the prompt:

/goal

Work toward this outcome:
[describe the task]

Before starting, define what “good enough” means for this task.

Create 3-5 success criteria Codex can keep checking while it works. Include any hard requirements, tests, files, constraints, style rules, performance targets, or user-facing behavior that must be preserved.

Then work in a loop:
1. Make the next useful improvement.
2. Judge the result against the success criteria.
3. Identify the biggest remaining gap.
4. Continue until the work meets the goal.

Return:
- The final result
- The success criteria you used
- The biggest change you made
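
If you want to see the shape of that loop outside a prompt, here is a toy Python sketch. It is an illustration under our own assumptions, not how /goal or Codex works internally: `goal_loop`, `toy_improve`, and the criteria are hypothetical, plain functions stand in for the LLM judge, "biggest remaining gap" is approximated as the first failing criterion, and we added a `max_rounds` cap so the loop cannot run forever.

```python
from typing import Callable, List, Tuple


def goal_loop(
    draft: str,
    criteria: List[Callable[[str], bool]],
    improve: Callable[[str, int], str],
    max_rounds: int = 10,
) -> Tuple[str, int]:
    """Improve `draft` until every criterion passes or rounds run out.

    Returns the final draft and the number of improvement rounds used.
    """
    for rounds_used in range(max_rounds):
        # Judge the result against the success criteria.
        failing = [i for i, check in enumerate(criteria) if not check(draft)]
        if not failing:
            return draft, rounds_used
        # Treat the first failing criterion as the biggest gap and fix it next.
        draft = improve(draft, failing[0])
    return draft, max_rounds


# Toy criteria for a process doc: it names an owner and ends with a period.
criteria = [
    lambda doc: "Owner:" in doc,
    lambda doc: doc.endswith("."),
]


def toy_improve(doc: str, gap: int) -> str:
    # Stand-in for an LLM edit that targets one failing criterion.
    return doc + " Owner: ops" if gap == 0 else doc + "."


result, rounds = goal_loop("Restart the service", criteria, toy_improve)
print(result, rounds)
```

The useful part is the order of operations: criteria are fixed before any work starts, so the judge is checking against a stable target instead of moving the goalposts each round.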

Have a specific skill you want to learn? Request it here. 

  • *Discover what the right AI implementation can do for a company like yours. → Let's build it together.

  • Poolside released Laguna M.1 as open weights under Apache 2.0 with 226B parameters and 256K context.

  • Viktor lives inside Slack and Microsoft Teams as a work agent that can run reports, build dashboards, write code, and connect to 3,200+ tools - free to start.

  • Adapt gives your company a shared brain in Slack and on the web so people and agents can answer questions and act across tools - $100 free credits.

  • Juno turns your Mac voice notes into clean text inside Mail, Slack, Notes, Cursor, and the apps you already use - free/open source.

  • VoiceOS lets you dictate, search, send messages, and automate desktop tasks by voice on Mac and Windows - no pricing details.

Andrew Dai, former Google Brain / DeepMind researcher and now CEO of Elorian, joined us to explain why today’s AI can write code, read docs, and build apps, but still misses visual problems a grade-schooler could spot in seconds.

In our latest episode, Andrew unpacks why image description is not the same as visual reasoning, why models may need a “visual scratchpad,” and why better vision could unlock stronger coding agents, robotics, design review, and physical-world engineering.

🎧 Watch / Listen: YouTube | Spotify | Apple Podcasts

Special shout-out to this episode’s sponsors: Dell Technologies, NVIDIA, and Outshift!

📰 Around the Horn

  • Anthropic said Claude completed shared robotics tasks 18-37x faster than the human teams in the original Project Fetch experiment.

  • Dean Ball joined OpenAI to lead a Strategic Futures team focused on frontier AI policy and governance.

  • Kobi Hackenburg reported that frontier AI nearly tripled donations versus professional canvassers in one persuasion campaign.

  • Google DeepMind explored four possible paths from AGI to superintelligence in a new research report.

  • Snap is spinning off an internal generative-AI video team into Dotmo, a separate company focused on AI models for interactive gaming experiences. Snap is licensing technology to Dotmo and taking a large equity stake, and it cited the high cost of doing the work internally as one reason.

  • Meta Platforms is under contract, according to Bloomberg, to buy roughly 1.6 GW of AI computing capacity from Crusoe across data centers in Childress, Texas, and Warrenton, Missouri; the spending amount and delivery timing are not yet clear.

  • Amazon Web Services is in early talks to sell Trainium AI chips for use in other companies’ data centers, per AWS AI chief Peter DeSantis. That follows Andy Jassy saying Amazon may sell chip racks to third parties and estimating a standalone chips run rate of about $50 billion if sold to AWS and outside buyers.

AWS Summit DC is bringing together cloud leaders, developers, and innovators for two days of expert-led sessions, keynotes, and public sector insights. Explore AI-driven workflows and modernization at scale.

🌟 Sunday Special: Top of the Week

  • OpenAI helped researchers reopen 376 unsolved rare childhood disease cases and find 18 new diagnoses, while Google pushed its AMIE from diagnosis into ongoing care. Medical AI had a rare week where the story was not “look at this benchmark,” but “look at the families who finally got an answer.”

  • SpaceX reportedly bought Anysphere, the company behind Cursor, for $60B. The deal turns one of the most important AI coding work surfaces into strategic infrastructure for SpaceX, xAI, and the broader developer-agent race.

  • Anthropic’s Fable and Mythos fight turned model access into a national-security, cyber-defense, and market story all at once. The messy part: cutting off powerful U.S. models may also make cheaper open models from China look a lot more attractive.

  • Z.ai launched GLM-5.2, an MIT-licensed open-weight model with a 1M-token context window and strong coding / agent results. Translation: the open-model race is no longer a side quest, and companies now have a serious reason to ask whether they want rented intelligence or something they can run themselves.

  • Salesforce signed a deal to acquire Fin for $3.6B, bringing Intercom’s customer-service agent platform deeper into the Salesforce machine. Enterprise agents are moving from “neat demo” to “which giant platform owns this workflow?”

That’s all for now.

P.S: Before you go… have you subscribed to our YouTube Channel? If not, can you?

P.P.S: Love the newsletter, but only want to get it once per week? Don’t unsubscribe—update your preferences here.