• The Neuron
  • Posts
  • 😺 OpenAI leaked GPT-5.4 three times

😺 OpenAI leaked GPT-5.4 three times

PLUS: 10 things to know before vibecoding your next app

Welcome, humans.

OpenAI can't stop accidentally leaking GPT-5.4.

On Monday evening, our very own Corey Noles got a cybersecurity block in Codex that referenced a model called gpt-5.4-ab-arm1-1020-1p-codexswic-ev3. That's a mouthful, but the important part is those first three characters after "gpt": 5.4

GPT-5.3-Codex only launched three weeks ago, and it was already OpenAI's first model classified as "High Cybersecurity Capability." Now its successor is showing up in error messages. Not exactly the stealth rollout they probably had in mind.

This wasn't a one-off, either. In the past week, two separate pull requests in OpenAI's public Codex GitHub repo referenced GPT-5.4 by name, one setting a minimum model version to (5, 4) and another including a slash command described as "toggle Fast mode for GPT-5.4." Both were scrubbed within hours via force pushes. 

And OpenAI's own Codex employee Tibo accidentally posted (then deleted) a screenshot showing GPT-5.4 in the model selector. If this is a secret, somebody forgot to tell... basically everyone on the Codex team.

For his part, Corey had his own fairly entertaining exchange with Tibo (see below) Read the full story here.

Also, in case you’re counting: five major GPT-5 models in seven months?! At this rate, GPT-5.9 will ship before most of us have time to fully learn the quirks of the last three models.

Here’s what happened in AI today:

  • We share the top questions every vibe coder should ask before shipping anything.

  • Apple's AI servers are reportedly going unused because nobody’s using Apple Intelligence.

  • Iran strikes hit an Amazon data center in the UAE, testing the Gulf's trillion-dollar AI bet.

  • Scientists turned human brain cells into a computer that can play DOOM.

How to Vibe Code Your First App (and Make Sure It Actually Works)

ā€œVibe codingā€ is the… (let’s call it divisive) term for building software only by describing what you want to an AI, then letting it write the code for you. You don't need to know JavaScript. You don't need a computer science degree. You describe the app in plain English, and an AI tool generates it.

It sounds too good to be true. It's not, but there's a catch.

Say you build an app with AI. The button buttons. The demo demoes. You're feeling dangerous. Then someone types something weird into a form field… and your app crashes so hard it might qualify as performance art.

That's the gap Vercel is trying to close. The company recently rebuilt v0 from the ground up, specifically because vibe coding has outgrown its demo phase.

The new v0 lets anyone on a team (not just engineers) work on real codebases, open pull requests, and deploy through proper workflows. That's a big deal.

Here’s how you can use it:

  • Go to v0.dev and describe what you want to build. "A personal budget tracker that shows spending by category" works great. Be specific.

  • Iterate in conversation. Don't try to get it perfect in one prompt. Say "make the chart interactive" or "add a dark mode toggle." Each request refines the app.

    • Pro tip: request the app includes editable toggles to let you customize the design so you don’t have to waste prompts making minor UI tweaks.

  • Connect your tools. The new v0 lets you import GitHub repos (other ppl’s code), connect databases (like Snowflake, AWS), and deploy through proper workflows, not just generate throwaway demos.

  • Ship through pull requests. v0 now opens PRs (pull requests, basically a "review my changes before they go live" system) so your team can review changes before they go live.

Here's the catch with all this vibin’: Veracode just found AI-generated code introduced security flaws in 45% of tests. A Stanford study found people with AI assistants wrote less secure code and were more confident it was safe. Like the Dunning-Kruger effect, but for software.

That's why our editor Corey Noles wrote a checklist of 10 questions every vibe coder should ask before shipping anything. The TL;DR:

  • "Explain this code like I'm the person who'll get paged at 2am if it breaks."

  • "What happens when someone enters something unexpected?"

  • "Where does user data go, who can access it, and is any of it logged?"

  • "Show me how to deploy, monitor, and roll this back if something goes wrong."

Copy those four prompts into any AI coding tool after it generates your app. They work in v0, AI Studio, Claude Code, Codex… anything. The full 10-question guide goes deeper, but these four alone will catch most of the problems that turn weekend projects into Monday morning incidents. 

Want to try it? Vercel is giving Neuron readers $25 in free v0 credits. Use code NEURON-V0 at v0.dev (expires one month after you redeem it; not spon-con, just a free promo code they offered you, so thank them!). Now go build something!

FROM OUR PARTNERS

AI automated alert triage, not decisions. Now your SOC makes 6Ɨ more judgment calls. Learn which agentic architectures shift your SecOps program from monitoring coverage to execution coverage.

Download the 2026 SecOps Trends report to move from alert management to risk reduction.

NEW SECTION: AI Skill of the Day!

The people have spoken: Prompt tip of the Day is now AI Skill of the Day!

What does this mean? As some suggested, we will use this section to rotate between the top workflows, new skills, prompt advice, and general tips for learning AI in 2026; as we said over the weekend, using AI is no longer JUST about prompting, it’s about building agentic systems. But don’t worry, we’ll still cover prompts here, too.

Here’s today’s: We were inspired by today’s Claude outages to write a piece on Openrouter. Openrouter is basically a single source where you can easily switch between multiple models and model cloud providers. This is helpful for when new models come out and you want to easily switch to try it out without totally remaking your whole workflow.

You can chat with various models on their website, or use it as a gateway for your API calls. If you don’t know, an API is how two software programs talk to each other; in this case, how you use AI inside other tools.

Think of it like a universal remote for AI models. Instead of five remotes for five streaming services, you get one that controls everything, plus a private key to authenticate your access, kind of like a password, for security.

The setup to use OpenRouter anywhere takes five minutes:

  • Create an account at openrouter.ai

  • Add credits (prepaid; they deduct as you go)

  • Generate an API key (or bring your own; more on that in the full article).

  • Change two lines of code if you already use the OpenAI SDK: swap the base_url to https://openrouter.ai/api/v1 and use your OpenRouter key

Treats to Try

*Asterisk = from our partners (only the first one!). Advertise to 650K readers here!

  1. *Explore the AI-driven SDLC at Sonar Summit, a free virtual event on March 3rd. Learn to verify LLM output, automate code quality, & more. Register now. 

  2. Alibaba's Qwen 3.5 Small Series can run multimodal AI models (text, image, video) locally on your phone (0.8B) or laptop (9B on 6GB RAM), with the 9B beating OpenAI's gpt-oss-120B despite being 13.5x smaller (GGUFs, Ollama); as for how to use it, watch this; it’s for the larger version, but it’ll still work.

  3. OpenPencil lets you edit designs offline with AI chat, Figma file compatibility, and a scriptable CLI for inspections.

  4. Field Theory runs portable commands across Claude, ChatGPT, and Cursor, transcribes voice locally, and auto-improves text (free Basic, then $14/month); mac only rn.

  5. NotebookLM rolled out custom styles for infographics with 10 presets (editorial, clay, brick, kawaii) plus custom creation.

  6. Nozomio indexes your code, docs, PDFs, and Slack to reduce hallucinations in AI agents with semantic search and dependency tracking (free tier, then credit-based).

  7. Martini Art lets you generate videos and images with AI models like Kling 3.0 and Sora on an infinite canvas with team collaboration (free to try, then $240/year).

Around the Horn

This is wild.

  1. ARC Prize tested major Chinese AI models on its ARC-AGI-2 benchmark and found Kimi K2.5 (12%), Minimax M2.5 (5%), GLM-5 (5%), and DeepSeek V3.2 (4%) all scored below where frontier labs were back in July 2025. Wharton professor Ethan Mollick called it "good empirical evidence" that Chinese open-weight models are "quite fragile, good at some narrow areas but much less capable in general tasks or out-of-distribution work."

  2. OpenAI and the Pentagon added surveillance protections to their AI deal, prohibiting intentional tracking of U.S. persons including via commercial data.

  3. Iran's retaliatory strikes set an Amazon data center in the UAE on fire, knocking it offline for 24+ hours and testing the Gulf's pitch as a safe harbor for $2 trillion in AI investment.

  4. Anthropic's standoff with the White House now puts $60B in venture capital at risk after Trump blacklisted the company and Hegseth moved to designate it a supply chain threat (a penalty usually reserved for foreign adversaries like Huawei).

  5. Apple's Private Cloud Compute servers are reportedly sitting unused on warehouse shelves because Apple Intelligence usage is so low, with the company now in talks with Google to host the new Siri instead.

  6. The US Supreme Court declined to hear an AI copyright case, upholding that creative works require human authorship.

  7. Scientists trained human brain cells on a microchip to play Doom.

  8. Math, Inc. formally verified optimal sphere packings in dimensions 8 and 24 using AI.

  9. Researchers found that many "Instruct" LLMs secretly generate thousands of reasoning tokens even when thinking mode is turned off, meaning models marketed as cheap can become expensive in the wild.

  10. Svenska Dagbladet and Gƶteborgs-Posten investgated Meta’s AI glasses, and it turns out they are a privacy nightmare due to user confusion about when data is and isn’t being recorded, and the result of where that data ends up (basically, anywhere Meta wants to send it), per the terms and conditions.

FROM OUR PARTNERS

1,000 comments. 0 spreadsheets.

SurveyMonkey Enterprise has built-in AI to give your team the ultimate decision-making hack. It reads every single comment, spots the sentiment, and summarizes what matters into digestible bullet points before you’ve even finished your coffee.

You get instant, ready-to-share reports that make it easy to know what to do next.

Want to see how their AI handles the grunt work so your team can spend their time solving problems instead of just finding them? See SurveyMonkey Enterprise

Tuesday Tweets

Good article linked to the image from Shakeel

It literally does this ā€œ4 weekā€ plan to me so often; bro, you KNOW you’re going to write this for me, don’t pretend it’s gonna take you 4 weeks haha

Gary Marcus so loved this tweet that he shared it in his own article on the same topic

Why are we sharing this last one? Because this part: ā€œaccess to vast swaths of information does not equate to reasoning or computational intelligence.ā€

This is one reason why you might not want to trust today’s large language models with anything that requires strict adherence to details across massive swaths of data. You just can’t trust it to be 100% accurate, for all the flaws of transformer-based models (the architecture of today’s language models that run ChatGPT / Claude / Google).  

So if AGI = doing 50% of productive workforce tasks, like how Microsoft and OpenAI want to define it, then it’s conceivable we could reach it (or something indistinguishable from it) this year or early next. But if AGI = an AI system that can think, plan, and generalize across domains on its own with no harness and no steering or pre-training or post-training or reinforcement learning or human feedback of any kind… then please stay calm and adjust your expectations, because we’re still faaaar off from actual AGI.

A Cat’s Commentary

That’s all for now.

What'd you think of today's email?

Login or Subscribe to participate in polls.

P.P.S: Love the newsletter, but only want to get it once per week? Don’t unsubscribe—update your preferences here.