The Neuron
Posts
😺 GLM 5.2 brings 1M context

😺 GLM 5.2 brings 1M context

PLUS: A Chinese open model just made the closed-model default less obvious.

Matthew Robinson & Grant Harvey
June 22, 2026

Welcome, humans.

So last week, the open-model crowd (people who like AI they can run and modify themselves) got a very loud new toy: GLM 5.2, a new Chinese open-weights model (an AI model where the weights, a.k.a the huge set of numbers that determine how the AI runs, are published so anyone can run the model themselves). People are already running it on their own cloud instances, testing it in coding tasks, and comparing it against frontier APIs (top paid model services).

The fun part is that this is not any old benchmark flex. You can hit it through an API (call it from your own software), download the weights (copy those numbers onto your own machine), quantize it (make those numbers smaller and simpler so the model uses less memory), fine-tune it (retrain the model a bit for a specific job), or run it locally if your desk has enough GPU chips to qualify as a space heater. It's a very big deal… more below!

Here’s what happened in AI today:

😺 Z.ai released GLM 5.2, an open-weights model with 1M-token context and strong long-horizon coding results.
📰 Cursor made it easier to move local coding agents into isolated cloud VMs.
🍪 HumanLayer launched an agentic IDE and collaboration platform for engineering teams.
🎓️ Record a Task Once, Have Codex Solve it

…and a whole lot more that you can read about here.

Hey: Want to reach 700,000+ AI-hungry readers? Advertise with us!

P.S: Love robots? We’re starting a new robotics newsletter! Sign up early here.

😺 GLM 5.2 is the open model that made frontier AI feel less closed

Open models usually get described like thrift-store frontier AI: cheaper, useful, and maybe a little behind. GLM 5.2 is testing that assumption.

Z.ai's GLM 5.2 is an open-weights model (the weights are downloadable, so builders can run, modify, quantize, or fine-tune it instead of only renting an API). The release has 1M-token context, which means it can take in huge codebases, long research files, or whole project histories in one prompt.

Here's what happened:

Z.ai released GLM 5.2 through its blog, Hugging Face, docs, and OpenRouter access.
The model is being tested as a long-horizon coding and agent model, especially for tasks that need lots of context.
Developers have already shown it running locally with MLX on two M3 Ultra Mac Studios.
Early comparisons put it in the conversation with much pricier closed models on coding, physics-simulation, and reasoning tasks.
A LocalLLaMA thread also circulated a claimed GLM-Fable roadmap tease before year-end, which is worth treating as community chatter unless Z.ai posts it directly.

How to try it:

API route: try it on OpenRouter before touching local setup.
Builder route: download the weights on Hugging Face if you want to quantize, distill, or fine-tune it.
Docs route: read Z.ai's guide for context-window and deployment details.

Why this matters:

The real story is optionality. Closed models are convenient, but they can change price, access, policy, or performance without warning. Open weights let teams keep more control over where the model runs, what data touches it, and how deeply they customize it.

GLM 5.2 also changes the cost conversation. Scaling01 highlighted GLM 5.2 at roughly $4.40 per million output tokens, far below many frontier flagship prices. If the quality is close enough, developers start asking a dangerous question: which tasks actually need the expensive model?

That is why the local demos matter. A model you can run, inspect, and route around gives teams leverage. It turns model choice from a vendor dependency into an architecture decision: expensive flagship for the hardest calls, cheaper open model for repeatable work, and local deployment when data cannot leave the building.

Our take:

GLM 5.2 probably will not make people cancel every closed-model subscription. It does make the default less obvious. The next AI stack may use frontier models for the hardest work and open models for everything else, especially when privacy, cost, or customization matters.

FROM OUR PARTNERS

Ready to move beyond single-point tools?

The 2026 ‘Future of AI: Perspectives on generative media for startups’ report launched at Google Cloud Next and reveals the strategies startups need to navigate the next era of generative media.

Dive into end-to-end agent workflows, post-keyboard interfaces, and deeply personalized content. Leverage your authentic human taste as the ultimate defensible moat.

Get the report

🎓 AI Skill of the Day: Record a Task Once, Have Codex Solve it

You know that one annoying work task you keep explaining to AI like it has short-term memory loss? OpenAI’s new Record & Replay for Codex is built for exactly that.

The skill: show Codex a recurring workflow once, then turn that demo into a reusable skill, basically a saved set of instructions Codex can run again later. Think filing an expense report, submitting PTO, creating a correctly configured issue, publishing a video, or downloading the same report every Monday.

Here’s how to use it, if you have access on macOS with Computer Use enabled:

Open Plugins in the Codex app.
Hit the + menu and select Record a skill.
Tell Codex your goal and what inputs may change later.
Approve recording, perform the workflow, then stop recording when the task is complete.
Ask Codex to refine the skill with your naming rules, defaults, and “please don’t click that cursed dropdown” preferences.

Favorite detail: the final skill is inspectable and editable, so you get a reusable workflow, not a mystery macro hiding in the walls.

I’m about to record a reusable Codex skill.


Goal: [describe the recurring task]
Use this skill when: [when Codex should run it]
Inputs that may change each time: [dates, files, names, links, report ranges, etc.]
Success criteria: [how Codex should know the workflow is complete]
Hidden preferences to preserve: [naming rules, default fields, formatting choices, decision points]

Do not record or reuse: [passwords, secrets, private data, unrelated cleanup steps]
After the recording, draft the skill and ask me what needs to be refined before I reuse it.

Total AI beginner? Start here (goes with this video).

Have a specific skill you want to learn? Request it here.

🍪 Treats to Try

*Asterisk = from our partners (only the first one!). Advertise to 700K+ readers here!

*Your AI roadmap needs a test course. The Dell Pro Max with GB10 helps teams experiment before making bigger bets.
HumanLayer gives engineering teams task management, versioned artifacts, and human-agent collaboration for implementation work - free for small teams, then $100/user/month Pro.
pool gives developers a terminal and editor coding agent with ACP editor support, slash commands, MCP tools, and rewind - open source.
ML Intern automates the post-training research loop across papers, datasets, GPU sandbox training, evals, and iteration - no pricing details.
Open Design gives designers a local-first canvas for BYOK editing, design-system plugins, and code handoffs - open source.
Lore gives binary-heavy teams a content-addressed version-control system with deduplication and sparse workspace hydration - open source.
Retool lets teams build internal apps in Claude Code, Codex, Replit, Lovable, or Retool, then ship them through one governed runtime - free app hosting through July 1.
LM Studio previewed private frontier-scale inference streamed from four Mac Studios to a MacBook and iPhone - no pricing details.
xAI Grok TTS topped Vapi's blind voice-model humanness leaderboard at 96 out of 100 - no pricing details.
Cua runs background Linux computer-use agents that can operate desktop apps through CLI or MCP - open source.
Hyperagent turns video, dashboards, and daily briefs into generated agentic demos - no pricing details.

Trending: FOUR popular Neuron podcast eps…

Did you know we have a podcast (The Neuron: AI Explained) where we talk to fascinating people in the industry who teach us how it actually works? Check it out:

Click to view these episodes on YouTube!

New episodes air every week on: Spotify | Apple Podcasts | YouTube

📰 Around the Horn

Cursor made local agents easier to move into the cloud so coding work can continue after a laptop closes.
Sen. Mark Warner said the NSA’s director told him Anthropic’s Mythos model broke into almost all of the agency’s classified systems in hours during an authorized red-team test, reframing the US export ban around offensive capability rather than a single jailbreak.
Google’s Gemini 3.5 Pro — promised for June with a 2M-token context window and a Deep Think mode — still hadn’t shipped with about 10 days left, keeping prediction markets near 50-55% on a pre-July release.
Amazon shelved Luca Guadagnino’s nearly finished Sam Altman biopic “Artificial,” months after its $50 billion OpenAI partnership.
The Reuters Institute reported that weekly use of AI chatbots for news climbed from 7% to 10% globally, even as only about 4% of users clicked through to the original source.
Kimmonismus argued that access-cutoff risk is pushing companies and governments toward sovereign open models.
Mahi Shafiullah introduced a robot-learning method that maps chaotic human videos to dexterous robot actions.
L’Oréal teamed up with OpenAI to put Maybelline’s virtual makeup try-on tool inside ChatGPT, unveiled at VivaTech 2026.

FROM OUR PARTNERS

Generic LLMs displace jobs. The future still belongs to augmented humans. WethosAI makes you irreplaceable. Join CEO Stuart McClure this Thursday, June 25, to watch live how System 3 Thinking and Cognitive Twins will upskill your workforce, protect your career, and safeguard your business. All demo. No filler.

📖 Monday Meme

A Cat’s Commentary

That’s all for now.

What'd you think of today's email?

P.S: Before you go… have you subscribed to our YouTube Channel? If not, can you?

P.P.S: Love the newsletter, but only want to get it once per week? Don’t unsubscribe—update your preferences here.