😸 4 AIs walk into a bar

PLUS: Apple's AI glasses, AI washing exposed & Met Police's robot watchdog

Welcome, humans.

We just LOVE AI for Humans here at The Neuron; it’s just SUCH a fun weekly recap show, and this episode, which covers OpenClaw and SeeDance 2.0 and captures the overall vibe of being extremely tired, speaks directly to our soul. This ep is definitely worth a watch if you missed any of last week’s news.

P.S: no spoilers, but we’re likely going to have Kevin and Gavin on The Neuron live show sometime soon… keep your eyes peeled!

Here’s what happened in AI today:

  • xAI launched Grok 4.20, the first consumer AI that uses four debating agents instead of one model.

  • Meta embedded Manus AI into Ads Manager for automated reporting and audience research.

  • Sam Altman called out companies "AI washing" layoffs—blaming cuts on AI that were happening anyway.

  • Apple is building smart glasses, a wearable pendant, and AI-powered AirPods all centered around Siri.

Don’t forget: Check out our podcast, The Neuron: AI Explained on Spotify, Apple Podcasts, and YouTube — new episodes air every week on Tuesdays after 2pm PST!

Grok Doesn't Answer Your Questions Anymore. Four Agents Argue About Them First.

You know how the best decisions usually happen when smart people debate each other? Elon Musk’s xAI just built that into an AI.

Grok 4.20 dropped on Monday and it's operating on a fundamentally different architecture. Instead of one AI generating your answer, four specialized agents work simultaneously, debate each other in real time, and then hand you the consensus.

Meet the team:

  • Grok (the coordinator): Breaks down your question, assigns tasks, resolves disagreements, and delivers the final answer.

  • Harper (the researcher): Pulls real-time data from the web and X's firehose of ~68M daily English posts for instant fact-checking.

  • Benjamin (the logician): Handles math, code, and step-by-step reasoning. He's the one who stress-tests everyone else's logic.

  • Lucas (the creative): Explores alternative angles, rewrites for clarity, and adds the ideas nobody else considered.

Here's why this actually matters: hallucinations dropped 65% in early testing. When one agent confidently says something wrong, another agent catches it before you ever see the output. It's peer review at machine speed.
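To make the "peer review" idea concrete, here's a toy sketch of a debate loop in Python. To be clear, this is NOT xAI's implementation (they haven't published one); the function names, the stub agents, and the majority-vote rule are all our own assumptions for illustration. In a real system, each agent would be a model call rather than a hard-coded function.

```python
from collections import Counter
from typing import Callable, Dict, List

# An agent takes the question plus its peers' latest answers and returns its own answer.
Agent = Callable[[str, List[str]], str]


def debate(question: str, agents: Dict[str, Agent], rounds: int = 2) -> str:
    """Run a toy debate and return the answer most agents end up agreeing on."""
    # Round 1: every agent answers independently, with no peer input.
    answers = {name: agent(question, []) for name, agent in agents.items()}
    # Later rounds: each agent sees what the others said and may revise.
    for _ in range(rounds - 1):
        answers = {
            name: agent(question, [a for peer, a in answers.items() if peer != name])
            for name, agent in agents.items()
        }
    # Coordinator step (assumed): majority vote, ties go to the answer seen first.
    return Counter(answers.values()).most_common(1)[0][0]


def stubborn(answer: str) -> Agent:
    """Hypothetical stub agent that never changes its mind."""
    return lambda question, peer_answers: answer


def open_minded(initial: str) -> Agent:
    """Hypothetical stub agent that adopts its peers' majority view once it sees it."""
    def agent(question: str, peer_answers: List[str]) -> str:
        if not peer_answers:
            return initial
        return Counter(peer_answers).most_common(1)[0][0]
    return agent


team = {
    "researcher": stubborn("4"),
    "logician": stubborn("4"),
    "creative": open_minded("5"),
}
print(debate("What is 2 + 2?", team))  # prints 4
```

The "creative" agent starts out wrong, sees two peers disagree with it in the second round, and switches, so the wrong answer never reaches the final output. That's the basic intuition behind catching one agent's mistake before you see it.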

The results in practice are striking. In a live stock trading competition (Alpha Arena Season 1.5), Grok 4.20 was the only profitable AI model, turning $10K into ~$11K-$13.5K while rivals from OpenAI and Google finished in the red. Four of the top six finishers were Grok 4.20 variants.

One caveat: we're seeing this in beta. xAI says the current version is just the "small" 500B-parameter foundation model; the full version is still training. And Elon being Elon, he dropped the release on X with no formal announcement and no benchmarks. That's why it took us so long to actually write about it. It wasn’t REALLY a full launch.

The most interesting part? It's available on free accounts. You can go to grok.x.ai right now, ask it something complex, and watch the four agents think in real time through a live interface. Paid plans ($30/month SuperGrok) get faster responses and access to a "Heavy" mode that scales to 16 agents for research-grade problems.

Here's how to test it yourself:

  • Multi-perspective question: Ask "What are the strongest arguments for and against remote work?" and watch the agents actually disagree with each other before synthesizing.

  • Fact-heavy task: Ask for specific stats from a recent report. Harper will pull real-time sources while Benjamin verifies the numbers.

  • Code debugging: Paste in broken code. Benjamin writes the fix while Harper checks documentation and Lucas suggests a cleaner approach.

This feels like an important architectural shift. Every other major lab (OpenAI, Google, Anthropic) still ships single-model inference (as far as we know, anyway); one brain, one answer. xAI is betting that the future is teams of models arguing their way to better outputs. If the full-size Grok 4.20 delivers on its promise when it finishes training, other labs will have to decide whether to follow.

Or, you know, they could just hold hands and figure it out together. Oh wait…

FROM OUR PARTNERS

Free email without sacrificing your privacy

Gmail is free, but you pay with your data. Proton Mail is different.

We don’t scan your messages. We don’t sell your behavior. We don’t follow you across the internet.

Proton Mail gives you full-featured, private email without surveillance or creepy profiling. It’s email that respects your time, your attention, and your boundaries.

Email doesn’t have to cost your privacy.

Prompt Tip of the Day

Here's a trick borrowed from how Grok 4.20 actually works, and you can use it with any AI right now.

Make the AI debate itself.

Instead of asking one question and accepting the answer, try this prompt structure:

"Give me three different perspectives on [your question]. For each perspective, explain what evidence supports it and what evidence contradicts it. Then tell me which perspective has the strongest support and why."

This forces the model to do internally what Grok's four agents do externally: explore multiple angles, weigh conflicting evidence, and synthesize a stronger conclusion. It's especially powerful for decisions where you're stuck between options.

Bonus: If the answer matters, follow up with: "Now argue against your own conclusion. What did you miss?" You'll be surprised how often the AI finds real holes in its own reasoning.
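If you reuse this trick a lot, you could wrap it in a tiny helper. Here's a minimal Python sketch that fills in the prompt above and tacks on the follow-up; the function and constant names are ours (hypothetical), and it only builds the text. Sending it to a model is up to whatever AI tool you use.

```python
from typing import List

# The prompt structure from this tip, with a slot for your question.
DEBATE_TEMPLATE = (
    "Give me three different perspectives on {question}. "
    "For each perspective, explain what evidence supports it and what evidence "
    "contradicts it. Then tell me which perspective has the strongest support and why."
)

# The bonus follow-up, sent as a second message once the first answer comes back.
FOLLOW_UP = "Now argue against your own conclusion. What did you miss?"


def build_debate_prompts(question: str) -> List[str]:
    """Return the two prompts to send, in order: the debate prompt, then the follow-up."""
    return [DEBATE_TEMPLATE.format(question=question.strip()), FOLLOW_UP]


for prompt in build_debate_prompts("whether to rent or buy a home"):
    print(prompt)
    print()
```

Send the first prompt, read the answer, then send the second in the same conversation so the model is arguing against its own conclusion and not starting fresh.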

Want more tips like this? Check out our Prompt Tip of the Day Digest for January.

Treats to Try

*Asterisk = from our partners (only the first one!). Advertise to 650K+ readers here!

  1. *Write replies in seconds. Dictate with Flow and send polished text without the typing tax. Start Flowing for free today.

  2. Manus AI is now embedded inside Meta Ads Manager, automating report building, audience research, and campaign analysis directly in your ad workflow; find it under Tools in your dashboard.

  3. Grok Build is xAI's browser-based coding environment that now integrates with Grok 4.20's multi-agent system, so four AI agents collaborate on your code simultaneously.

  4. Google AI Studio updated with Gemini 3.1 Pro support, full-stack app building (servers, databases, multiplayer), and free access to what Artificial Analysis ranks as the #1 overall AI model right now.

  5. Koah lets you place ads directly inside AI chat conversations when users ask questions related to your business, reaching 500M+ daily GenAI conversations across 50+ apps (raised $5M).

  6. ARES handles the RL infrastructure for training coding agents—bring your model and it manages sandboxing, parallelism, and training loops to complete SWE-Bench Verified evaluations in ~20 minutes instead of days (blog)—free to try.

  7. Agent Bar is a native Mac menu bar app that wraps Claude Code in a visual GUI with voice dictation, real-time tool call streaming, and token cost tracking.

  8. Boost.space turns scattered business data into a structured single source of truth so your AI agents and automations actually have context to work with; connects to Make, Zapier, and n8n.

Around the Horn

  1. Sam Altman admitted some companies are "AI washing"—blaming layoffs on AI that would've happened anyway—while warning real displacement is coming.

  2. London's Met Police started using Palantir's AI to flag officer misconduct by analyzing sickness, absences, and overtime patterns, drawing criticism from the Police Federation as "automated suspicion."

  3. Anthropic data revealed software engineering accounts for nearly 50% of all AI agent activity, while healthcare, legal, and finance each sit under 5%.

  4. Apple accelerated development of smart glasses, a wearable pendant, and AI-powered AirPods, all built around a camera-enabled Siri.

  5. US consulting firms are on track for their fastest growth since the post-COVID boom, with 90% of clients saying they plan to hire consultants to help implement AI.

FROM OUR PARTNERS

Migrate to Delve and get a $2,000 VISA card in your inbox

Delve is the AI-native compliance platform that actually does the work for you, auto-collecting evidence from AWS, GitHub, and your stack so you don’t have to chase screenshots or babysit integrations. Use AI for security questionnaires, as a copilot, and everywhere else to make compliance feel less dreadful. Welcome to the new age.

Monday Meme

Hilarious. This is from a good movie, The Babadook, if you’ve never seen it!

A Cat’s Commentary

That’s all for now.

P.S: Before you go… have you subscribed to our YouTube Channel? If not, can you?

P.P.S: Love the newsletter, but only want to get it once per week? Don’t unsubscribe—update your preferences here.