😺 GLM 5.2 brings 1M context

PLUS: A Chinese open model just made the closed-model default less obvious.

So last week, the open-model crowd (people who like AI they can run and modify themselves) got a very loud new toy: GLM 5.2, a new Chinese open-weights model (an AI model where the weights, a.k.a the huge set of numbers that determine how the AI runs, are published so anyone can run the model themselves). People are already running it on their own cloud instances, testing it in coding tasks, and comparing it against frontier APIs (top paid model services).

The fun part is that this is not any old benchmark flex. You can hit it through an API (call it from your own software), download the weights (copy those numbers onto your own machine), quantize it (make those numbers smaller and simpler so the model uses less memory), fine-tune it (retrain the model a bit for a specific job), or run it locally if your desk has enough GPU chips to qualify as a space heater. It's a very big deal… more below!

Here’s what happened in AI today:

  • 😺 Z.ai released GLM 5.2, an open-weights model with 1M-token context and strong long-horizon coding results.

  • 📰 Cursor made it easier to move local coding agents into isolated cloud VMs.

  • 🍪 HumanLayer launched an agentic IDE and collaboration platform for engineering teams.

  • 🎓️ Record a Task Once, Have Codex Solve it

Hey: Want to reach 700,000+ AI-hungry readers? Advertise with us! 

P.S: Love robots? We’re starting a new robotics newsletter! Sign up early here.

😺 GLM 5.2 is the open model that made frontier AI feel less closed

Open models usually get described like thrift-store frontier AI: cheaper, useful, and maybe a little behind. GLM 5.2 is testing that assumption.

Z.ai's GLM 5.2 is an open-weights model (the weights are downloadable, so builders can run, modify, quantize, or fine-tune it instead of only renting an API). The release has 1M-token context, which means it can take in huge codebases, long research files, or whole project histories in one prompt.

Here's what happened:

  • Z.ai released GLM 5.2 through its blog, Hugging Face, docs, and OpenRouter access.

  • The model is being tested as a long-horizon coding and agent model, especially for tasks that need lots of context.

  • Developers have already shown it running locally with MLX on two M3 Ultra Mac Studios.

  • Early comparisons put it in the conversation with much pricier closed models on coding, physics-simulation, and reasoning tasks.

  • A LocalLLaMA thread also circulated a claimed GLM-Fable roadmap tease before year-end, which is worth treating as community chatter unless Z.ai posts it directly.

How to try it:

Why this matters:

The real story is optionality. Closed models are convenient, but they can change price, access, policy, or performance without warning. Open weights let teams keep more control over where the model runs, what data touches it, and how deeply they customize it.

GLM 5.2 also changes the cost conversation. Scaling01 highlighted GLM 5.2 at roughly $4.40 per million output tokens, far below many frontier flagship prices. If the quality is close enough, developers start asking a dangerous question: which tasks actually need the expensive model?

That is why the local demos matter. A model you can run, inspect, and route around gives teams leverage. It turns model choice from a vendor dependency into an architecture decision: expensive flagship for the hardest calls, cheaper open model for repeatable work, and local deployment when data cannot leave the building.

Our take:

GLM 5.2 probably will not make people cancel every closed-model subscription. It does make the default less obvious. The next AI stack may use frontier models for the hardest work and open models for everything else, especially when privacy, cost, or customization matters.

The 2026 ‘Future of AI: Perspectives on generative media for startups’ report launched at Google Cloud Next and reveals the strategies startups need to navigate the next era of generative media.

Dive into end-to-end agent workflows, post-keyboard interfaces, and deeply personalized content. Leverage your authentic human taste as the ultimate defensible moat.

You know that one annoying work task you keep explaining to AI like it has short-term memory loss? OpenAI’s new Record & Replay for Codex is built for exactly that.

The skill: show Codex a recurring workflow once, then turn that demo into a reusable skill, basically a saved set of instructions Codex can run again later. Think filing an expense report, submitting PTO, creating a correctly configured issue, publishing a video, or downloading the same report every Monday.

Here’s how to use it, if you have access on macOS with Computer Use enabled:

  1. Open Plugins in the Codex app.

  2. Hit the + menu and select Record a skill.

  3. Tell Codex your goal and what inputs may change later.

  4. Approve recording, perform the workflow, then stop recording when the task is complete.

  5. Ask Codex to refine the skill with your naming rules, defaults, and “please don’t click that cursed dropdown” preferences.

Favorite detail: the final skill is inspectable and editable, so you get a reusable workflow, not a mystery macro hiding in the walls.

I’m about to record a reusable Codex skill.


Goal: [describe the recurring task]
Use this skill when: [when Codex should run it]
Inputs that may change each time: [dates, files, names, links, report ranges, etc.]
Success criteria: [how Codex should know the workflow is complete]
Hidden preferences to preserve: [naming rules, default fields, formatting choices, decision points]

Do not record or reuse: [passwords, secrets, private data, unrelated cleanup steps]
After the recording, draft the skill and ask me what needs to be refined before I reuse it.

Total AI beginner? Start here (goes with this video).

Have a specific skill you want to learn? Request it here. 

Did you know we have a podcast (The Neuron: AI Explained) where we talk to fascinating people in the industry who teach us how it actually works? Check it out:

Click to view these episodes on YouTube!

New episodes air every week on: Spotify | Apple Podcasts | YouTube 

📰 Around the Horn

  • Cursor made local agents easier to move into the cloud so coding work can continue after a laptop closes.

  • Sen. Mark Warner said the NSA’s director told him Anthropic’s Mythos model broke into almost all of the agency’s classified systems in hours during an authorized red-team test, reframing the US export ban around offensive capability rather than a single jailbreak.

  • Google’s Gemini 3.5 Pro — promised for June with a 2M-token context window and a Deep Think mode — still hadn’t shipped with about 10 days left, keeping prediction markets near 50-55% on a pre-July release.

  • Amazon shelved Luca Guadagnino’s nearly finished Sam Altman biopic “Artificial,” months after its $50 billion OpenAI partnership.

  • The Reuters Institute reported that weekly use of AI chatbots for news climbed from 7% to 10% globally, even as only about 4% of users clicked through to the original source.

  • Kimmonismus argued that access-cutoff risk is pushing companies and governments toward sovereign open models.

  • Mahi Shafiullah introduced a robot-learning method that maps chaotic human videos to dexterous robot actions.

  • L’Oréal teamed up with OpenAI to put Maybelline’s virtual makeup try-on tool inside ChatGPT, unveiled at VivaTech 2026.

Generic LLMs displace jobs. The future still belongs to augmented humans. WethosAI makes you irreplaceable. Join CEO Stuart McClure this Thursday, June 25, to watch live how System 3 Thinking and Cognitive Twins will upskill your workforce, protect your career, and safeguard your business. All demo. No filler. 

📖 Monday Meme

A Cat’s Commentary

That’s all for now.

What'd you think of today's email?

Login or Subscribe to participate in polls.

P.S: Before you go… have you subscribed to our YouTube Channel? If not, can you?

P.P.S: Love the newsletter, but only want to get it once per week? Don’t unsubscribe—update your preferences here.