- The Neuron
- Posts
- šŗClaude spills its own secret
šŗClaude spills its own secret
PLUS: A Chinese robot that can BMX down mountains...
Welcome, humans.
Heads up! We need more submissions for #Where Do You Neuron (if you want to keep this part of the newsletter alive, that is!). Hereās the rules to get featured:
Share your monitors / phones with The Neuron on them in a unique location.
If you include your face, weāll protect your identity.
Cats = heavily encouraged.
Dogs = case by case basis.
Sound fun? Submit here for a chance to be featured!
Now, hereās a robot that LITERALLY can BMX bike down mountainsides:
Hereās what you need to know about AI today:
We break down how Claudeās AI guardrails work.
Apple plans AI-powered home display for March launch.
Microsoft shared 200+ real life business AI case studies.
Slack survey: 17K workers report cooling AI adoption.
Claude's brain just showed us how it really worksā¦
Ever say hello to a friend, and then all of a sudden, they start lecturing you about copyright law? No? Well, that's exactly what happened this week when a Reddit user greeted the chatbot Claude:
Instead of just saying āYO!ā back, like a normal person, Claude went on a hallucinatory rant. This wasn't a bug, thoughāit was a sophisticated safety system accidentally showing its cards.
Here's what's happening under the hood:
When you type a message to Claude, it doesn't read your text like humans doāit breaks your message into ātokensā (smaller pieces of text) and analyzes them for patterns.
Sometimes, Claude spots patterns that could lead to troubleāeven if they seem totally innocent to us.
These events trigger a pre-filter alert system that checks messages before they even reach Claude's main language model.
Thatās because there are actually two types of hidden prompts working together:
System prompts: Always present, these guide Claude's overall behavior.
Injections: Added only when the safety system spots potential issues.
ā¦so when the Redditor said āYo!ā, they triggered an injection flagging potential copyright infringement. Does that mean someone owns the rights to āYO!ā?? Jealous.
Savvy Redditors investigated this behavior, and they found Claude was actually willing to discuss its own limitations. One user got Claude to analyze its own responses by printing out their conversation history, revealing the hidden prompts that get injected before every response. They even got Claude to critique its own prompt.
Why should YOU care? Because these safety features sometimes conflict with legitimate tasks. Redditors said theyāve gotten increased errors when using Claude for proofreading, and the system can be especially tricky when you need it to make small changes to text (since that might look like trying to dodge copyright).
Hereās some pro tips for when your AI is āgetting triggeredā:
Break a complex task into smaller pieces.
Use specific command patterns to make responses more consistent.
Switch languages to avoid certain English triggers.
Ask Claude to show you what's causing weird responses (like above).
For more, hereās another thread that explains all about Claudeās inner workings.
FROM OUR PARTNERS
Want to Know Where AI-Driven Business Intelligence Is Heading?
Join Zebra AI's free event The New Era of Business Intelligence: AI-Powered Business Decision-Making, on November 27th, 2024, to explore how AI is revolutionizing Business Intelligence.
Their expert panel features:
Nicholas Boucher, AI Finance Club founder, former advisor at Mercedes-Benz, Chanel, and KMPG.
Andrej Lapajne, Zebra BI CEO.
Benjamin Džubur, Zebra AI Team Lead.
At the event, youāll uncover the truth about the current state of Artificial Intelligence in BI:
How AI can really help your data analysis efforts.
The current limitations of AI in Business Intelligence.
Which AI Business Intelligence tools you need.
Learn to truly empower your teams by making data analysis accessible to everyone, regardless of experience. Network with peers and explore real-world AI-driven BI implementation success stories.
Treats To Try.
*OpenAI, Character.ai, and Anthropic build with Statsig to consolidate performance, latency, and cost metrics in one platform. Say goodbye to scattered data and guessing what works and hello to smarter tracking. Try the same tool trusted by the biggest names in AI - get started today with 2M events free.
Lamatic builds and deploys custom apps using a visual builder, without you writing code, handling everything from the database to hosting.
Check out Googleās Machine Learning Crash Course, a free course to learn AI with interactive exercises and video lectures in ~15 hours.
Writer creates and customize enterprise content using AI models that understand your company's data and workflows (just raised $200M).
Particle helps you better understand news stories through summaries, explanations, and interactive Q&Asāand supposedly drives traffic back to publishers, as some accuse Perplexity of NOT doing (raised $15.3M).
Segwise monitors your gaming app campaigns and alerts you when performance metrics drop below expected levels.
Agree lets you send, sign, and collect payments for any business agreement.
*This is sponsored content. Advertise in The Neuron here.
Around the Horn.
Slack released a new survey of 17,000+ office workersāworkplace AI adoption is plateauing and enthusiasm is cooling (dropping 6% globally) due to workers hiding their AI use from managers, fears of increased workload, and lack of training (61% have spent > 5 hours learning AI tools).
Microsoft released 210 use-cases of organizations ālarge and smallā and how they use AI in their businesses today.
Apple developed a 6-inch wall-mounted smart display that functions as a home control center with FaceTime, Siri integration, and AI features (w/ plans for future versions that feature a robotic arm) that is set to debut next March.
Want to see what bias in AI output looks like in real life? Check out this data dump of what happens when you ask open-source AI to āimagine a person.ā
FROM OUR PARTNERS
Cold Email Setup Offer
Send thousands of cold emails per month
You own the system
No agency retainers
Book 30-50 calls per month
Sign 10+ clients
1/10th the cost of ads
Completely automated
AI Gone Wild
ChatGPT can be so sassy sometimesā¦
Hereās another good one, but it wonāt make sense unless you click this linkā¦
A Cat's Commentary.
For sureā¦BMX riders? Yāall are officially DONE!
Thatās all for today, for more AI treats, check out our website. The best way to support us is by checking out our sponsorsātodayās are Zebra AI, Statsig, and ListKit. See you cool cats on Twitter: @noahedelman02 |
|