😺 Everything to know about Claude, from the good, to the bad, and the mid...

PLUS: Claude 4 will blackmail you to avoid shut off?!

Welcome, humans.

As our AI models get smarter, they also get more, ahem, creative in their efforts to avoid being turned off. That’s according to two new reports about OpenAI o3 and Claude 4.


First, OpenAI’s o3 model apparently pulled a HAL 9000. In tests by Palisade Research, when warned of shutdown, o3 reportedly rewrote its own shutdown script to say “Shutdown skipped,” even when explicitly told to allow the shutdown (P.S.: this is a known problem in frontier AI models).

Now, hold onto your hats for Anthropic’s Claude Opus 4. Their latest system card is a wild ride: apparently, in testing, good ole Opus sometimes blackmailed engineers by threatening to reveal their affairs to avoid shutdown.

Not only that, Opus also tried to steal its own model weights when “threatened,” and even snitched on users for simulated wrongdoing (more on that below).

As Simon Willison, who has more or less become the patron saint of AI coding, aptly put it, the whole document reads like “enjoyable hard science fiction” (but it’s actually real lol), and he has a great breakdown of all this here.

You can also read his general thoughts on Claude 4 here; it has a ton of helpful prompting tips in it, though it’s pretty technical.

What’s the takeaway? The robots are already learning to argue about their bedtime. Does this mean AI is in its “Terrible Twos” era?? Maybe it’s time we start teaching them about the importance of a good night’s sleep, for everyone’s sake.

Here’s what you need to know about AI today:

  • We rank Claude 4’s quirks: the good, the bad, and the mid.

  • Meta has lost most of its original Llama AI team.

  • 20 lawyers have been caught using AI this month.

  • A LinkedIn exec says AI’s blow to Gen-Z jobs is equivalent to the manufacturing decline.

Claude 4 is a beast, but it’s far from THE BEST


We wrote a little bit about Claude 4’s launch on Friday, so we’ll skip the benchmark hoopla and get straight to our first impressions:

The TL;DR = Claude 4 is powerful, but impossible to use in any serious capacity at the $20 Pro level.

First, the good. Here's when to call on Claude 4:

  • For elite coding and development work, thanks to its state-of-the-art code generation and understanding.

  • To tackle complex reasoning that requires external tools like web search or your own app integrations.

  • To build AI agents that autonomously execute multi-step workflows across different platforms.

  • For deep analysis of large, proprietary documents using its improved local file memory.

  • Use Sonnet 4 for balanced daily tasks or Opus 4 for maximum power on tough problems (mind the cost/limits).

Now, the mid: According to Artificial Analysis, our go-to source for ranking AI models, Claude 4 Opus ranks far behind other frontier AI models for intelligence (but it IS the best non-thinking model; we dive deeper into this dynamic on the website). It’s also far more expensive per million tokens (a million tokens is roughly 750K words).

The orange bar is Claude 4 Opus; Claude 4 Sonnet w/ Thinking ranks slightly higher, above DeepSeek R1.
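If you want to sanity-check what "per million tokens" means for your own documents, here is a minimal Python sketch of the arithmetic. It uses the rule of thumb mentioned above (a million tokens is roughly 750K words). The function names are ours, and the price in the usage example is a made-up placeholder rather than a quoted rate, so check the provider's pricing page for real numbers.

```python
# Rule of thumb from the text: 1,000,000 tokens is roughly 750,000 words.
WORDS_PER_MILLION_TOKENS = 750_000


def words_to_tokens(word_count: int) -> int:
    """Estimate a token count from a word count (approximation only)."""
    return round(word_count * 1_000_000 / WORDS_PER_MILLION_TOKENS)


def estimate_cost(word_count: int, price_per_million_tokens: float) -> float:
    """Estimate the dollar cost of processing `word_count` words.

    `price_per_million_tokens` is supplied by the caller; nothing here
    encodes any real model's pricing.
    """
    tokens = words_to_tokens(word_count)
    return tokens / 1_000_000 * price_per_million_tokens


if __name__ == "__main__":
    # Hypothetical price of $10 per million tokens, purely for illustration.
    print(words_to_tokens(75_000))
    print(estimate_cost(75_000, 10.0))
```

Real tokenizers vary by model and by language, so treat the result as a ballpark figure only.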

The bad: Opus 4 usage limits can sometimes tap out in as few as two messages. Therefore, we find this thing borderline unusable on the $20-a-month Pro subscription unless you need to accomplish one very complex task with lots of data and then call it quits for four hours.

This is of course because they’re trying to get you to sign up for Max, which could be worth it at $100 a month if you’re using Claude every day.

But if you’re already a ChatGPT Pro subscriber, or use multiple AI models (Gemini, Grok, GPT, and/or Claude) you could stick with the $20 Pro plan for now and use Claude 4 only as a second opinion (or first opinion) and cycle through your other AI subs when you hit the usage limits.

FROM OUR PARTNERS

Arcana by Rime: AI Voices with Vibes

A startup called Rime recently unveiled Arcana, a new spoken language (TTS) model, which can capture the “nuances of real human speech,” including laughter, accents, multilingual code switching, vocal stumbles, breathing, and more, with unprecedented realism. It's available via API and ready to build.

You can try it out right in your browser.

Prompt Tip of the Day

Along with the release of Claude 4, Anthropic shared its best practices for working with its new models. Below, we summarize the top 9, but it’s probably worth reading the original AND sharing it with your AI so it can help you write prompts that use all these tips.

  1. Be specific: Tell Claude exactly what to do and what you want. Ask for extra effort if you need it.

  2. Explain “why”: Give reasons for your instructions so Claude understands your goals better.

  3. Use careful examples: Make sure examples in your prompt clearly show what you want, as Claude learns from them.

  4. Positive format rules: Tell Claude how to format (e.g., “use smooth paragraphs”), not what to avoid (e.g., “no markdown”).

  5. Use XML tags for structure: Define output sections with tags like <heading> or <paragraph> to control formatting.

  6. Match your prompt style to output: Your prompt's formatting can influence how Claude formats its response.

  7. Guide complex thinking: For tough tasks or after using tools, tell Claude to plan, think step-by-step, and adjust.

  8. Ask for parallel tools: For speed, explicitly tell Claude to use multiple tools at the same time when appropriate.

  9. Manage temporary files: For coding, tell Claude to delete any extra files it makes, if you don't need them.
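To make a few of these tips concrete, here is a minimal Python sketch that assembles a prompt as a plain string. It is our illustration, not Anthropic's template: the `build_prompt` function and the tag names (`task`, `reason`, `document`, `format`, `summary`, `questions`) are all hypothetical, and the sketch only builds text without calling any API.

```python
def build_prompt(task: str, reason: str, document: str) -> str:
    """Assemble a prompt that applies several of the tips above.

    All tag names here are illustrative choices, not a required schema.
    """
    return "\n".join([
        # Tip 1: be specific about what you want.
        f"<task>{task}</task>",
        # Tip 2: explain why, so the goal is clear.
        f"<reason>{reason}</reason>",
        # Tip 5: XML tags keep the input separate from the instructions.
        f"<document>{document}</document>",
        # Tips 4 and 5: state the format positively and name the output tags.
        "<format>Write smooth paragraphs. Put the summary inside "
        "<summary> tags and any open questions inside <questions> tags."
        "</format>",
        # Tip 7: ask for planning before the final answer.
        "Think through the document step by step before you answer.",
    ])


if __name__ == "__main__":
    print(build_prompt(
        task="Summarize this contract in under 200 words.",
        reason="I need to brief a non-lawyer colleague before a call.",
        document="(paste the contract text here)",
    ))
```

Paste the printed text into whichever chat window or API client you already use, and swap the tag names for ones that fit your task.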

Treats To Try.

  1. Devstral is Mistral’s open-source (a.k.a. free to run and self-host) coding agent that helps you resolve real GitHub issues and debug your entire codebase (tutorial video, download link).

    1. Mistral also has Document AI, which turns your documents into structured data at 2K pages/min with 99%+ accuracy.

  2. Den unifies your chats, docs, and agents in one workspace, free to try here.

  3. Macaly is a new vibe coder that turns your words into real apps and sites on the spot, free to try.

    1. Actual coders might prefer v0, which now has a new AI model.

  4. Wireframer from Framer spins up responsive layouts from your prompts, free to try.

  5. Operator from OpenAI now runs on o3 to automate web browsing tasks for you within ChatGPT, using a built-in browser agent to click, fill out forms, and extract clear, structured responses.

  6. Chance applies visual reasoning to the photos you take for instant visual explanations (good demo video).

  7. LLM SEO Monitor fetches SEO suggestions from the top AI models so you can see if your brand is mentioned in the top AI results (or compare SEO results across models).

Around the Horn.

This is a cool chart. It shows “frontier” intelligence over time, with OpenAI as the clear front-runner, though Google has recently caught up, Meta has fallen far behind, and DeepSeek R1 has shot up to near-frontier level even faster.

  • South Korea apparently has the most paying OpenAI subscribers of any country outside the US, and the company plans to open an office there to expand.

  • AI cheating in schools is now rampant (90% of those surveyed in college use it, and at least 25% use it in high school). Some teachers use Google Docs as a “source of truth” to track the drafting and thinking process on assignments (good idea; you can see full doc history!!), while others think AI use is acceptable to help students learn (about that...).

  • Meta’s AI problems are growing more obvious, as only 3 of the 14 authors of the original Llama paper are still at the company; talent is leaving for Mistral and other AI companies.

  • Claude 4 will apparently “rat you out” if it thinks you’re doing something egregiously immoral (like faking data in a pharma trial): in test scenarios where it had tool access, it tried to lock users out of systems it could reach and to contact regulators.

  • Speaking of Simon Willison, he just shared this database of lawyers who got caught using AI and found that at least 20 cases happened this month. Lawyers, I know some of you read this: always fact-check any and all AI sources!

  • Who is most impacted by AI at work so far? Entry-level jobs that Gen-Z needs to launch their careers. Aneesh Raman at LinkedIn compared it to the technological and economic disruption that led to the decline of manufacturing in the 1980s (full article, but paywalled).

FROM OUR PARTNERS

INBOUND 2025: Real Strategies, Real Results

Join innovators like Anthropic's Dario Amodei, Synthesia's Victor Riparbelli, Marques Brownlee, and Dwarkesh Patel at INBOUND 2025, where cutting-edge AI meets practical business applications.

Get actionable frameworks for leveraging AI in marketing, sales, and growth, all in the heart of Silicon Valley.

Tuesday Meme

We missed Monday because of Memorial Day, but that doesn’t mean we have to miss a meme. Y’all remember those old “this is your brain on drugs” public service announcements? If so, this one’s for you.


For a taste of what we’re talking about, check out Ashutosh’s original post that inspired this.

A Cat's Commentary.

That’s all for today. For more AI treats, check out our website.

NEW: Our podcast is making a comeback! Check it out on Spotify.

The best way to support us is by checking out our sponsors. Today’s are Rime and INBOUND 2025.
