The Neuron
😺 Gemini Ultra vs. ChatGPT-4

PLUS: Meta resists the "AI boys club".

Noah Edelman & Pete Huang
February 09, 2024

Welcome, humans.

Unlike most AI research labs, Meta’s is composed primarily of women, who also make up 60% of its leadership team. Exciting times!

Among other benefits, this could mean that Meta's AI models end up excelling in EQ and multitasking. :)

Here’s what you need to know about AI today:

Bard is now Gemini, and Google launched its ChatGPT-4 competitor.
ChatGPT-4 outperforms Gemini Ultra in most work-related tasks.
OpenAI's next move might be GPT-4.5...or something entirely new.
Tech execs say they have to spend money to make money on AI.

Introducing Gemini Ultra—how it stacks up against ChatGPT-4. 👇

Gemini

After 10 months of anticipation, 1 faked demo, and 100s of hype-y TechCrunch pieces, Google’s rival to ChatGPT-4 is here: Gemini Ultra.

Also, Bard has gotten a name upgrade to “Gemini”.

Here’s a breakdown for the name-challenged (like us):

Image source: Hacker News

Gemini Pro 1.0 (previously Bard) is akin to ChatGPT-3.5, which makes Claude 2.1 the best free chatbot on the market. Yes, you heard that right.

Now, Google claims that Gemini Ultra is the best available chatbot, outperforming ChatGPT-4 on benchmarks. So, we put the two head-to-head across various work-related tasks to see what’s what.

*TLDR: Gemini Ultra is a big upgrade from Bard, but it's still mildly worse than ChatGPT-4.

Where ChatGPT-4 wins:

Faster responses.
Superior logical reasoning, creativity, & handling of complex verbal tasks.
Produces code with fewer errors.
Maintains context better across long convos.
Hallucinates and gets off topic less.
Can handle PDFs, data analysis, & math tasks better.
Better at image generation & understanding.
Less censored—early reports suggest Gemini avoids controversial topics entirely.

Gemini ALSO mixed up the man/woman categories

Where Gemini wins:

Better at translating languages.
Tries harder to answer your question & initiates follow-ups.
Enhanced search capabilities, integrating Google search results.
Better integrations: YouTube, Maps, & soon Workspace apps (Gmail, Docs, Sheets, etc.).
You can test Ultra for 2 months at no charge here.

Where they tie:

Both cost $20/month.
Both have mobile apps (download Gemini on Android and iOS).

Which should you use?

Our guidance mirrors our previous Bard vs. ChatGPT piece: For work tasks, ChatGPT-4 is preferable, but turn to Gemini (formerly Bard) for fact-based questions.

(Perplexity *seems* to edge out Gemini in search capabilities…stay tuned for a detailed comparison.)

If your operations heavily rely on Google Workspace, consider trying Gemini Ultra during the two-month free trial to assess its utility in everyday work tasks.

*We'll update our advice as we continue exploring Gemini Ultra.

Slow & steady doesn’t win the race, at least for now…

Why it matters: Yes—Gemini Ultra means Google is closing the gap with OpenAI, but it’s still behind by almost 12 months.

We don’t see much changing. Maybe Google can convince some Workspace users to pay for Ultra, but ChatGPT Plus power users aren’t going anywhere.

What’s next for AI?

Probably an upgrade to ChatGPT-4.5 sometime in 2024.

In the long term, though, we expect all these chatbots to converge on the forthcoming wave of AI: agents.

agents visualization from Abacus

“Agents” is the kind of language Microsoft has been using for Copilot, Amazon for Bedrock, and Google for Workspace.

The Information reported on Wednesday that OpenAI is developing two types of AI agents that can automate entire workflows, not just individual tasks.

Here’s what this could look like:

An AI agent takes over your device → migrates data from documents to spreadsheets → prepares the expense report → inputs it into accounting software.
(Currently, ChatGPT can assist with pieces of this workflow, provided it's prompted well.)

We have a lottttt more to say on agents. Stay tuned!

Intelligent Insights.

Feast your eyes on our top picks of the week!

This AI learnt language by seeing the world through a baby’s eyes (link).
Tech execs are telling investors they have to spend money to make money on AI (link).
Mamoon Hamid and Ilya Fushman of Kleiner Perkins: ‘More than 80%’ of pitches now involve AI (link).
OpenAI’s Secret Weapon Is Sam Altman’s 33-Year-Old Lieutenant (link).