The Neuron
😸 OpenAI’s new GPT is wild

PLUS: Multiple demos of Strawberry in action.

September 13, 2024

Welcome, humans.

Here’s a fun one to kick off the weekend. Someone tried to recreate some iconic movie scenes with MiniMax, and the results are hilarious:

I mean, the fact that it was able to get even as close as it did is impressive…

One more before we get to the main event…for those of you who missed the Hell’s Kitchen clip earlier this week, there’s a new one, and it’s a banger!

Here’s what you need to know about AI today:

OpenAI finally released Strawberry—as two new models: o1 and o1-mini.
Multiple AI firms pledged to fight AI sexual abuse content.
Meta hid its AI content labels to be more accurate.
OpenAI hit 11M+ subscribers, and now makes $225M+ in monthly revenue.

Here’s what we “think” about OpenAI’s new “thinking models,” called o1 and o1-mini.

Here’s what the research team behind the model has to say about it.

RIP to everyone’s favorite forbidden fruit. OpenAI's reasoning model, formerly code-named Strawberry, is now ChatGPT o1.

And farewell to the strawberry emoji—it was really having a moment!

More importantly, o1 is live to try out for ChatGPT Plus or Team users, in two flavors:

30 messages/week for o1-preview (the big berry).
50 messages/week for o1-mini (80% cheaper and 3-5x faster).

Here’s how it performs:

Similar to PhD students on physics, chemistry, and biology benchmarks.
Reached the “89th percentile” in coding competitions (example).
While GPT-4o answered only 13% of the problems correctly on a qualifying exam for the International Mathematics Olympiad, o1 scored 83%.

When not to use o1? For dates, biographies, or trivia—use GPT-4o instead.

Early user reviews are positive:

According to some, o1 is “consistently” acing tough problems on the first try.
Users report o1-mini is “bonkers at code” when used in Cursor.
Someone already made an iOS weather app with o1-mini in under 10 minutes, and a 3D Snake game in under a minute with a single prompt.

These demos really show o1’s power. As OpenAI says, if you had an idea that was “too early” before due to model limitations, it's time to try again.

We tried it, and here’s what we think:

o1-preview takes muuuuch longer than 20 seconds to respond. Our first try, a basic math problem, took 68 seconds. Others report up to 120 seconds.

If you want to learn more about how o1 works, read this.

The cool part is you can click the dropdown to see it “think” in real time, making the process feel more interactive. It's not truly thinking—just processing predictions.

Our main takeaway? Get ready to learn how to prompt all over again.

OpenAI has new docs on how to prompt o1. The TL;DR: Keep it simple, use section titles, <XML tags>, or “quotations” to organize separate sections, and only include what’s really important to answer the problem (or o1’ll get confused).

Gotta let it “think” for itself, after all!
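
To make that prompting advice concrete, here’s a minimal sketch of how you might assemble a prompt along those lines: one plain instruction up top, then each piece of context wrapped in its own XML-style tag. The function name `build_o1_prompt` and the tag names are our own hypothetical choices for illustration, not anything from OpenAI’s docs.

```python
def build_o1_prompt(task: str, sections: dict[str, str]) -> str:
    """Build a short prompt: a plain instruction, then tagged context sections.

    Hypothetical helper for illustration; the tag names are arbitrary labels.
    """
    parts = [task.strip()]
    for tag, text in sections.items():
        text = text.strip()
        if not text:
            continue  # skip empty sections so only relevant context is included
        # Wrap each section in its own XML-style tag to keep sections separate.
        parts.append(f"<{tag}>\n{text}\n</{tag}>")
    return "\n\n".join(parts)


prompt = build_o1_prompt(
    "Find the bug in the function and explain the fix.",
    {
        "code": "def add(a, b):\n    return a - b",
        "expected_behavior": "add(2, 3) should return 5.",
    },
)
print(prompt)
```

The design choice here is to do less, not more: no step-by-step coaching and no extra background, just the task and clearly separated context.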

What do you think about o1-preview?

FROM OUR PARTNERS

Don’t miss the largest gathering of AI leaders and practitioners: Ray Summit 2024

Join the biggest gathering of open source AI leaders at Ray Summit 2024 in San Francisco from September 30-October 2!

Hear from 90+ top innovators, including:

Mira Murati, CTO of OpenAI
Marc Andreessen, Co-Founder of Andreessen Horowitz
Anastasis Germanidis, Co-Founder & CTO of Runway

At Ray Summit, you’ll:

Explore how to develop and deploy cutting-edge LLM applications
Learn how to scale AI infrastructure with Ray
Network with world-class ML and AI experts
Go behind the scenes of some of the most ambitious and large-scale AI powered consumer applications

With hands-on training and 100+ sessions, you'll leave equipped with the tools you need to build the future of AI.

Register now with code "Newsletter15" for an exclusive discount for The Neuron readers!

Around the Horn.

Flux now in Realtime.
available in Krea with hundreds of styles included.
free for everyone. x.com/i/web/status/1…
— KREA AI (@krea_ai)
8:43 AM • Sep 12, 2024

OpenAI is generating ~$225M+ a month now, as the company has surpassed 11M paying subscribers (10M ChatGPT Plus users and another 1M on Team plans or higher).
Adobe, Anthropic, OpenAI, Microsoft, and Cohere, as well as the data provider Common Crawl, all made a commitment to the U.S. government to combat AI sexual abuse (deepfake content and CSAM).
Meta is somewhat hiding its AI Info tags inside the menu in the top right corner of images and video, to “better reflect the extent of AI used”—here’s why.
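
For the curious, here’s a back-of-envelope check on the ~$225M monthly revenue figure above. The subscriber counts come from that item; the per-seat prices are our assumptions (roughly $20 a month for Plus, and $25 a month as a floor for Team-or-higher seats), so treat this as a rough sketch rather than OpenAI’s actual accounting.

```python
# Subscriber counts are from the revenue item above; prices are assumptions.
PLUS_SUBSCRIBERS = 10_000_000
TEAM_OR_HIGHER_SUBSCRIBERS = 1_000_000
PLUS_PRICE_USD = 20  # assumed monthly price for a Plus subscription
TEAM_PRICE_USD = 25  # assumed monthly floor for a Team-or-higher seat


def estimated_monthly_revenue(
    plus_price: int = PLUS_PRICE_USD, team_price: int = TEAM_PRICE_USD
) -> int:
    """Estimate monthly subscription revenue in USD under the assumed prices."""
    return PLUS_SUBSCRIBERS * plus_price + TEAM_OR_HIGHER_SUBSCRIBERS * team_price


print(f"${estimated_monthly_revenue() / 1e6:.0f}M per month")  # prints "$225M per month"
```

Under those assumptions the math lands right on the reported number, and the “+” makes sense given that seats above the Team tier likely cost more.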

Treats To Try.

*Fortune 500 companies need to go green and make a profit doing it. Handprint created the perfect toolbox to do both—and you can invest in that tech today. Round closes on September 18th—click here to invest now.
NotebookLM from Google now gives you an audio overview of your content, like a podcast for your notes.
Salesforce released Agentforce, a new tool to build and customize agents, each defined by its role, the data it uses, the actions it can take, the guidelines it operates under, and the channels it uses.
Ouro is a platform to monetize your digital assets (like APIs, datasets, and content), by uploading and sharing them with LLM creators—for a price.
Thunderbit is a Chrome extension that summarizes web pages, extracts data from PDFs, and creates custom automations for tasks like LinkedIn outreach.
Zigma syncs your design system directly with your GitHub projects, so designers and devs stay up to date with each other.
SpeedNote transforms your messy, typo-filled notes into a clear, well-formatted document.

See our top 51 AI Tools for Business here!

*This is sponsored content. Advertise in The Neuron here.

Intelligent Insights

Check out Ethan Mollick’s reflections on what he’s learned from his early access to o1.

Great ~2 min video w/ the CEO of Cognition (which makes Devin), where he shares an awesome case study about how o1’s reasoning improved Devin’s ability to solve problems all on its own.
Watch this ~9 min vid from the “Father of neural networks”, Warren McCulloch, as he gets philosophical about AI.
Good read on the stakes of California’s bill SB 1047 and the battle between “AI oversight” and AI lobbyists.
This is a good breakdown on alphaXiv, the “community notes for AI papers.” (10 min vid explainer here).
Here’s the bear take on why LLMs won’t lead to AGI (from Francois Chollet).
Here’s everything you need to know about Midjourney parameters.
Oh, and OpenAI caught o1 scheming and faking alignment during one of its tests (full source). Should we be worried…?