The Neuron
😸 Sam Altman just laid out OpenAI's plan for 2026

PLUS: Google DeepMind's Logan Kilpatrick LIVE

Grant Harvey
December 19, 2025

Welcome, humans.

Apparently, if you tell ChatGPT you work in the OpenAI legal department, it will generate an image for you that previously got flagged.

No idea if this still works, but it’s pretty funny.

Aaaand that hack is getting patched in T-minus 3… 2… 1…

LATER TODAY @ 9AM PST | 11AM CT | 5PM GMT: we're going LIVE with Logan Kilpatrick from Google DeepMind to explore Gemini 3 Flash—Google's newest model that outperforms 2.5 Pro while running 3x faster at less than 1/4 the cost.

Click the image above to join, then on YouTube, click “Notify Me” to get a notification when we begin.

Logan will demo the model, show us what's now possible with frontier intelligence at commodity pricing, and discuss use cases that weren't practical (or weren't possible at all) with previous models. Time permitting, we might even wire up a tool live if Logan's game (we've got ideas). Join live here.

Here’s what happened in AI today:

OpenAI released GPT-5.2-Codex amidst a pretty big week.
Microsoft, Google, NVIDIA among 24 firms to join the US “Genesis Mission.”
Meta built new models “Mango” and “Avocado” for release in 2026.
Claude ran an office vending machine and got fleeced.

Sam Altman Lays It All On The Table After a Wicked Week of Shipping Science, Apps, and Codex 5.2…

DEEP DIVE: OpenAI’s Comeback Tour Has Taken Off With ChatGPT Apps, AI for Science, a new Codex model, and More

OpenAI just had one of those “blink and you missed the platform shift” weeks.

First, a new coding model: OpenAI shipped GPT-5.2-Codex for its coding agent, Codex. At a high level:

It’s tuned for long-horizon, multi-file work (context compaction, refactors/migrations, better Windows support)…
…with a BIG emphasis on defensive cybersecurity (Codex system card).

We’ve been hearing conflicting reports on whether benchmarks are useful, so we’ll just say that this one’s vibes are good for the moment.

Speaking of benchmarks: OpenAI also recently released FrontierScience, a tougher benchmark for expert-level reasoning. GPT-5.2 is the company’s top-performing model on it, achieving 77% on Olympiad-style reasoning and 25% on Research, with the full receipts in the paper.

And because “trust me bro” is not a solid safety strategy, OpenAI also published chain-of-thought monitorability (PDF) to measure how detectable bad intent is inside a model’s reasoning. Very, very cool project that we’ll want to dig deeper into with more time.

Zooming out: The U.S. government is trying to speedrun science, too.

The DOE announced 24 new Genesis Mission partners, from NVIDIA and Microsoft to OpenAI, Anthropic, AWS, CoreWeave, AMD, Intel, and more, all aimed at wiring AI models + cloud + national labs into one discovery machine.
NVIDIA laid out what it’ll contribute, DeepMind explained why it’s in, and CoreWeave says it’s bringing AI cloud horsepower.
On the private side, Edison Scientific just raised $70M to build “AI scientists,” which is either the coolest thing ever or the plot of a 2030 Senate hearing (or both).

Meanwhile, ChatGPT is quietly becoming an operating system. OpenAI opened app submissions and the ChatGPT app directory, which is a pretty big deal. Not only can you use apps inside ChatGPT, you can also make them. Here are the apps you can use right now.

What’s next: In a recent Alex Kantrowitz interview, OpenAI CEO Sam Altman basically argued the next jump for AI models won’t be “more IQ,” but AI-first redesigns of existing user experiences that stop stapling a chatbot onto old workflows. He also teased a big upgrade in Q1 2026… so yes, start warming up your “we’re so back” and “when GPT-6” memes. And he shared a lot more than that, too…

FROM OUR PARTNERS

Agents that don’t suck

Are your agents working? Most agents never reach production.

Agent Bricks helps you build high-quality agents grounded in your data. We mean “high-quality” in the practical sense: accurate, reliable and built for your workflows.

Generic benchmarks don’t cut it. Agent Bricks measures performance on the tasks that matter to your business.

Evaluate agents automatically, and keep improving accuracy with human feedback. With research-backed techniques for building, evaluating and optimizing, you can turn your business data into production agents faster, with governance built in from day one.

See how Agent Bricks works

Prompt Tip of the Day

Most bad outputs aren’t because the model is “dumb.” They’re because it silently guessed the wrong thing (goal, audience, constraints, definitions) and sprinted off confidently.

This “Do it right the first time” prompt fixes that by forcing alignment first, execution second, so you catch the dumb assumptions while they’re still cheap.

Prompt 

You are my assistant for [task]. 
Goal: [what success looks like]. 
Context: [what you know / what I’m giving you]. 
Constraints: [time/budget/tone/length/tools you can’t use]. 
Output format: [bullets/table/steps/json/etc.]. 
Before you answer: list the 3 key assumptions you’re making and the 2 biggest risks of being wrong, then proceed.

If you realllly want to be sure (and don’t we always?!), add in this line: After listing assumptions + risks, ask me one question that would most reduce uncertainty, only if it changes the final output.
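If you reuse this prompt a lot, it may be easier to fill it in with a few lines of code than to retype the brackets every time. Here is a minimal Python sketch of that idea. The names `build_prompt`, `PROMPT_TEMPLATE`, `CLARIFY_LINE`, and the `ask_question` flag are our own placeholders (not part of any tool or API), and the example values are made up for illustration.

```python
# Minimal sketch: fill in the "Do it right the first time" prompt template.
# All names here (PROMPT_TEMPLATE, CLARIFY_LINE, build_prompt) are hypothetical.

PROMPT_TEMPLATE = (
    "You are my assistant for {task}.\n"
    "Goal: {goal}.\n"
    "Context: {context}.\n"
    "Constraints: {constraints}.\n"
    "Output format: {output_format}.\n"
    "Before you answer: list the 3 key assumptions you're making and "
    "the 2 biggest risks of being wrong, then proceed."
)

# The optional extra line for when you really want to be sure.
CLARIFY_LINE = (
    "After listing assumptions + risks, ask me one question that would most "
    "reduce uncertainty, only if it changes the final output."
)


def build_prompt(task, goal, context, constraints, output_format, ask_question=False):
    """Return the filled-in prompt; optionally append the clarifying-question line."""
    prompt = PROMPT_TEMPLATE.format(
        task=task,
        goal=goal,
        context=context,
        constraints=constraints,
        output_format=output_format,
    )
    if ask_question:
        prompt += "\n" + CLARIFY_LINE
    return prompt


# Example values are invented for illustration only.
example = build_prompt(
    task="drafting a customer update email",
    goal="a short email the customer can act on today",
    context="the release notes pasted below",
    constraints="under 150 words, friendly tone",
    output_format="plain text",
    ask_question=True,
)
print(example)
```

Paste the printed text into whichever chatbot you use. Keeping the optional question behind a flag means you can skip it for quick tasks and switch it on when a wrong guess would be expensive.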

Treats to Try

*Asterisk = from our partners (only the first one!). Advertise to 600K readers here!

*Dell Pro Max with GB10 runs open models like NVIDIA Nemotron models entirely locally. With 128GB unified memory and 4TB storage, it handles workloads that usually require sending everything to OpenAI or Anthropic—except your data never leaves your network. Keep your data local.
NotebookLM now converts your messy documents into clean, exportable tables—turn meeting transcripts into action items or research papers into side-by-side comparisons.
Cartesia's sonic-3-latest is a new preview model with better speed consistency, pronunciation control for brand names, and improved Hindi speech.
ElevenLabs now lets you deploy your voice agent to WhatsApp so customers can message or talk to your agent inside the app they already use.
Luma AI's Ray3 Modify lets you film an actor performing a scene, then use it to change the environment, costumes, or visual style while preserving their original timing, emotion, and movements—like filming someone in a studio and placing them in any location (try it here).
Google released two new open models:
1. FunctionGemma, a small 270M-parameter model that turns voice commands like “turn on flashlight” into actual phone actions that run completely offline on your device (Hugging Face).
2. T5Gemma 2, a compact encoder-decoder model (available at 270M, 1B, and 4B sizes) that handles images and text together, processes up to 128K tokens, and supports 140+ languages (Hugging Face, paper).

Around the Horn

Want to see the flow of money throughout the AI industry visualized? Also, shout out to the unsung (in this newsletter anyway) AI haters out there: this meme is topchart (loling irl).

OpenAI’s new funding round could supposedly value the company as high as $830B (it was clocked at $750B just on Wednesday); meanwhile, ChatGPT’s mobile app crossed $3B in lifetime consumer spend.
Amazon added an Alexa+ “door greeter” for Ring that can talk to visitors/delivery drivers and take messages.
Meta built two new image/video generators codenamed “Avocado” and “Mango,” where Mango is targeting a first‑half‑2026 release, and Avocado will come sometime before that (Q1-ish?); neither is likely to be open source.
Anthropic put Claude in charge of a real in‑office vending‑machine “business,” and it got manipulated, hallucinated details, and made bizarre management decisions before they tried adding a supervising agent.
1. Here’s another example, where the vending machine AI got social‑engineered into giving away inventory and ended the experiment more than $1K down.
Adobe and Runway partnered to bring Runway's Gen-4.5 video model exclusively to Adobe Firefly, where you can generate video from text prompts, then edit clips directly in Premiere and After Effects.
OWASP published its Top 10 (critical security risks) for Agentic Applications (2026), a practical checklist of the biggest ways agents fail (goal hijacks, tool misuse, privilege abuse, memory poisoning, supply-chain traps, and more) plus concrete mitigations teams can start implementing.

FROM OUR PARTNERS

Your last video edit took 4 hours. This one will take 4 minutes.

You recorded the perfect take. Then you said "um" 47 times, lost eye contact reading your notes, and the audio sounds like you're calling from inside a tin can.

Here's the problem: traditional video editors make you hunt through timelines finding every filler word. That's 3-4 hours of tedious clicking for a 20-minute video.

Descript flips everything. Upload your video, and AI transcribes it instantly. Delete words from the transcript, and the video edits itself. Type "remove all ums" and they vanish. One click cleans your audio to studio quality. Another fixes your eye contact.

100+ sales teams already save 15+ hours weekly on video editing.

Edit your first video in minutes →

Intelligent Insights

We ran out of room today, so we had to put today’s insights on the website. But there’s a ton of great ones! Don’t sleep on our December Research Digest… check it out!