Yesterday, a sun-kissed and fluffy-haired Zuck introduced his ChatGPT killer: Llama 3.1. The latest open source model from Meta beats ChatGPT 4o on most benchmarks. But how would it perform in the real world?
Today, I did a head-to-head test with 3 identical questions to find out.
Question #1: Help Me Shop
Last evening, I was thinking about buying a webcam so I can look a little snazzier in my Zoom meetings. But which one should I get?
Instead of spending half the night browsing product listings, I figured I'd ask my new AI friends…
Here's ChatGPT...
And here's Llama 3.1:
ChatGPT did great, giving me a list of well-known brands at great prices. Llama's answer, however, wasn't helpful. It didn't tell me a key piece of data: each camera's cost!
ChatGPT takes this round.
Question #2: Scouting Startups
Next, I used the AIs to help me solve the biggest business problem I have: finding great startups. Specifically, I wanted to know which startups I should check out in the current YC batch.
First, let's try ChatGPT:
Now, we'll try Llama 3.1:
ChatGPT was totally wrong. It pulled companies from the prior batch, W24. That's not useful to me because they've already raised seed funding, and I like to invest at seed.
Llama 3.1 made the same mistake, also giving me companies from the prior batch.
This one is a tie.
Question #2 shows how hard it is to use LLMs for mission-critical tasks. They're just not reliable enough yet.
Question #3: Market Research
One of the main things I use LLMs for is researching unfamiliar markets.
One minute, I'm looking at a consumer social company. The next, it's a SaaS platform for oil refineries.
LLMs are super helpful in getting me up to speed on those unfamiliar corners of the economy. So I asked the dueling AIs about manufacturing SaaS.
First, we'll go to ChatGPT:
Now, let's try Llama:
ChatGPT gave a good and thorough response, calling out 11 different platforms. It wisely listed Excel first. Never underestimate how widely Excel is used to pull off complex tasks!
Llama 3.1 gave almost exactly the same response. It mentioned many of the same tools, like Oracle and SAP, along with Excel.
I'm calling this one a tie as well.
Wrap-Up
Overall, ChatGPT is still the reigning champion, winning one question and tying the other two. Nice try, Meta, but you've still got a ways to go!
What do you think of the new Llama?