😺 Google vs OpenAI smackdown
PLUS: AI housing tool busted for discrimination...
Welcome, humans.
Here's a fun little rumor to start your weekend: Apparently, many of the frontier model experiments are "failing" because "the models are fighting back and refusing instruction tuning"...
According to Aidan McLau, more than one technician at major AI labs has taken this to mean the models of this size are becoming "sentient." Yup. Some AI researchers think these AI are developing free will... No big deal!
Okay, so there's plenty of OTHER explanations for why this might be happening besides AI gaining free will. Plus, you could look at the pace of the model race between Google and OpenAI below as a counterpoint against this claim.
Then again, the current models can roleplay as sentient a little TOO well...
Here's what you need to know about AI today:
Google and OpenAI traded blows with even newer AI models.
OpenAI plans a Samsung deal and Chrome-rival browser.
Apple plans a more ChatGPT-like Siri for 2026.
An AI company had to pay millions for rental discrimination.
Google and OpenAI are dueling each other with MULTIPLE new AI models right now...
Remember earlier this week when we tried Google's new Gemini model? We warned at the time that OpenAI was already planning to counter with a new GPT-4o that would retake the top spot... and it did.
This new version of 4o is supposed to be much better at creative writing. And so far, it seems like it is (here's the new GPT-4o writing an Eminem-style rap to prove it).
Well, the plot has thickened. Before we could even test the new 4o, it turned out there is ANOTHER new Gemini model (Exp-1121), and it's already overtaken the new 4o as the #1 model overall:
To quote Logan Kilpatrick, head of product for Google AI Studio: "Yeah, gemini-exp-1121 is pretty good :)"
According to LM Arena, this new Gemini is now leading "across almost all domains," with particularly strong performance in coding and vision tasks.
On Livebench, a rival benchmark platform, it's a much different story, with Exp-1121 lagging behind 4o and Claude's latest models.
It's "a tale of two benchmarks"... It was the best of models, it was the worst of models...
To us, the only benchmark that matters is the Minecraft Benchmark: how well can your AI bot follow a prompt to create something mind-blowing in Minecraft? For example, here's Gemini Exp 1121 versus the new 4o as they compete to build a church of themselves:
Also, for those who deploy AI in your applications, Artificial Analysis is recommending developers don't switch to the new Nov 20th 4o model yet because it's likely smaller and less powerful than the current 4o (and not cheaper).
However, it's the timing of the releases that has everyone talking:
X users are debating whether Google was playing "4D chess" by deliberately releasing a weaker initial model to bait OpenAI into showing their hand with the new GPT-4o.
Others countered that maybe OpenAI anticipated this all along, playing "n+1 dimensional 5D chess," waiting to drop "gpt-4o-latest-fr-bro-trust-me-this-is-the-latest-one."
And then there's this guy, who said maybe Google was playing "6D underwater mahjong" all along, and perhaps there's ANOTHER ultra Gemini model ready to be released next week...
Google and OpenAI rn...
Public service announcement: if you're planning on dropping a state-of-the-art AI model anytime soon, just assume OpenAI will be right around the corner with its own drop to beat you, and act accordingly.
Why don't we put both models to the test, head to head? Pick your fave prompts across multiple topics (creative, reasoning, math, etc), and try it out:
New Gemini: Go to Google AI Studio and under model, select "Gemini Experimental 1121"
New 4o: If you don't have paid ChatGPT, go to AnyChat and select "ChatGPT" at the top (GPT 4o-2024-11-20 will be the default). If you have paid, ChatGPT 4o is it.
Which is better: OpenAI or Gemini? After you test both models, share which you prefer (and leave a comment as to why!)
FROM OUR PARTNERS
Your free guide to AI trends and best practices from webAI
Want to understand how businesses are succeeding with AI? Featuring insights from 300 industry leaders, this data-driven report covers:
Who's leading the pack: What early adopters are doing differently to maximize outcomes
Local AI advantages: The rising preference for local deployments over cloud-based options
AI privacy and security: What's working, and what's not, in privacy protection
Scaling success: How company-wide integration unlocks AI's full potential
Explore stats, facts, and insights to shape your strategy and improve AI's impact within your organization.
Treats To Try.
*Here's the answer to all your customer service problems: with JustAnswer's API, your customers can get their questions answered directly within your platform via a 24/7 network of human experts. Boost conversions, save costs, and enhance customer experiences. Sign up for a free demo.
Lovable turns your words into complete web apps: just describe what you want and it builds everything from login screens to databases instantly.
NotClass helps you search for specific topics within YouTube videos and podcasts to find exact answers without watching entire videos.
Taurin is an "AI native" email tool that organizes your inbox, auto-summarizes messages, suggests responses, and turns emails into trackable tasks.
Papira helps you write better by correcting grammar, improving flow, and suggesting changes while letting you keep your writing style.
Voilà lets you highlight text on any website to get instant summaries, translations, and replies across all your browser tabs.
HumanLayer lets your AI agents pause and ask humans for approval before taking important actions (GitHub here).
YouTube just released Dream Screen in the U.S., Canada, Australia and New Zealand, which lets you generate new backgrounds for your YouTube Shorts.
*This is sponsored content. Advertise in The Neuron here.
Around the Horn.
OpenAI might make a deal with Samsung to offer ChatGPT on Samsung devices (like its deal w/ Apple) and may soon launch a browser to compete with Google (The Information says it just hired 2 O.G. Chrome devs).
Google launched AI Agent Space, a marketplace to help you discover and deploy pre-built AI agents to help you automate tasks like customer support or sales analysis, and it'll probably promote any new agents you build, too.
Apple plans to launch a new, LLM-powered version of Siri (similar to ChatGPT's Advanced Voice Mode) in Spring 2026, and it'll have the same access to info + apps that today's Siri has, potentially solving the "context" problem of today's AI (let's hope they figure privacy out tho!).
An AI tool called SafeRent had to pay a $2.3M settlement after its AI scores for evaluating potential renters were found to be discriminatory.
Intelligent Insights
Here's a fun little deep dive into how Zuckerberg transformed Meta through open-source AI.
Cool piece on robots and how they can now learn complex tasks like flipping pancakes and tying shoelaces in hours instead of weeks.
There's been a lot of hype over using AI to design chips. Well, Intel researchers discovered that traditional algorithms actually work better for complex chip layout problems, leading them to develop a hybrid approach that's dramatically faster than either AI or human designers working alone.
Here's a good story on how, after the last AI winter of the 1990s, AI research became more pluralistic and probabilistic, with the field's shift away from rigid rule-based systems to statistical methods and neural networks enabling today's tech (so a scaling winter might actually be good for the industry).
This economic analysis maps out how the transition to AGI could unfold: it shows that wages follow an "inverse U-shape" pattern during automation (first rising, then falling), but the long-term outcome depends on whether there's an upper limit to how complex human work can get. If human work can always get more complex, wages can grow forever as we tackle harder tasks; but if there's a limit to human capabilities that AI can reach, wages will eventually collapse.
A Cat's Commentary.
That's all for today. For more AI treats, check out our website. The best way to support us is by checking out our sponsors: today's are webAI and JustAnswer. See you cool cats on Twitter: @noahedelman02