1. OpenAI o3 and o1

When ChatGPT was launched in late 2022, OpenAI was the leader with the best large language model, its GPT-3 series.

Even today in 2025, OpenAI reigns supreme with its o-series reasoning models.

OpenAI o1 was announced in September 2024 with a new inference-scaling technique and quickly dethroned all traditional LLMs out there.

After just three months, OpenAI reiterated its focus on inference scaling and announced the breakthrough o3 series of models, which demonstrated generalization in LLMs for the first time in history.

It finally cracked the ARC-AGI benchmark at high compute settings.

Although the cost of achieving generalization was fairly high, it goes on to show that LLMs can generalize to some degree when given more time and compute to "think".
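
To make the idea of spending more compute at inference time concrete, here is a minimal sketch of one publicly known approach, self-consistency: sample several answers to the same prompt and keep the most common one. This is an illustration only. OpenAI has not published how o1 or o3 work internally, and the sample answers and the function name `majority_vote` are hypothetical.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer and the share of samples that agree with it."""
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / len(answers)

# Hypothetical final answers from five independent samples of the same prompt.
samples = ["42", "41", "42", "42", "40"]
best, agreement = majority_vote(samples)
print(best, agreement)
```

Drawing more samples costs more compute per question, which is the trade-off the paragraph above describes.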

Currently, OpenAI has rolled out the smaller o3-mini and o3-mini-high models for free and ChatGPT Plus users, respectively.

And the full o3 model is available through OpenAI's Deep Research agent, which is gaining praise from the scientific community.

OpenAI will release the standalone full o3 model in a few months, after proper safety testing.

The company has hinted that we are at the very beginning of the inference-scaling curve, and capabilities are going to improve rapidly in just one year.

So expect OpenAI to keep the lead in the AI race in the coming months, especially with o-series models building on top of GPT-5.

2. DeepSeek R1

DeepSeek, a rising Chinese AI lab, has shocked the world with its cost-efficient R1 reasoning LLM.

It became the first company to replicate OpenAI's o1 model and open-source the RL (Reinforcement Learning) and GRPO (Group Relative Policy Optimization) techniques.
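
As a rough illustration of the "group relative" part of GRPO, the sketch below scores each sampled answer against the other answers drawn for the same prompt instead of using a separate value model. It is a simplified sketch, not DeepSeek's training code: the reward values are made up, the function name is hypothetical, and the use of the population standard deviation with a small `eps` is an assumption.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group's mean and spread."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    # eps avoids division by zero when every answer gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for four sampled answers to one prompt (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Answers that beat their group's average get a positive advantage and are reinforced, while below-average answers get a negative one.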

Not only that, DeepSeek demonstrated that AI labs can achieve o1-level performance at a reported training cost of just $5.8 million, significantly lower than the astronomical cost of training large language models.

After DeepSeek released the R1 LLM for free, it soared to the top position on the App Store, beating ChatGPT at its own game.

Besides that, the US stock market was thrown into a tizzy amid concerns that Western AI labs are overspending on training AI models.

In my comparison between DeepSeek R1 and OpenAI o1, I found that DeepSeek R1 delivers promising results, but doesn't outright beat o1 in all cases.

Nevertheless, DeepSeek R1 is currently the only reasoning LLM from China that comes very close to matching OpenAI o1's performance.

3. Claude 3.5 Sonnet (New)

While OpenAI has released the powerful o3-mini reasoning model, which is optimized for coding, many developers still rally behind the Claude 3.5 Sonnet LLM for coding tasks.

Many argue that Anthropic's Claude 3.5 Sonnet is still the best LLM for coding.

The secret sauce is that, well before OpenAI, Anthropic used RL (Reinforcement Learning) to make Claude 3.5 Sonnet smarter and more capable.

However, Anthropic has not released a reasoning model based on inference scaling yet.

Anthropic did update the Claude 3.5 Sonnet (New) model in October 2024 and improved its overall capabilities, be it graduate-level knowledge or reasoning.

In my own testing, I have found that Claude 3.5 Sonnet is perhaps the best traditional, non-reasoning LLM on the market.

On top of that, it has a fun personality, unlike other boring LLMs.

So whether it's creative writing or technical questions, Claude 3.5 Sonnet outranks all other large language models and ranks among the best ChatGPT alternatives.

4. GPT-4o

After GPT-4, OpenAI released GPT-4o in May 2024, which finally added support for multimodality: the ability to understand text, images, video, and audio simultaneously.

Since then, GPT-4o has been OpenAI's traditional LLM, and it has received countless incremental updates behind the scenes.

In my assessment, GPT-4o is a rock-solid non-reasoning LLM from OpenAI right now.

I always go back to GPT-4o on ChatGPT for all kinds of tasks.

It's not a specialist model for coding or complex reasoning, but for world knowledge and learning about new things, GPT-4o has demonstrated superior reliability over other LLMs.

GPT-4o now powers ChatGPT Advanced Voice Mode, Live Video, Canvas, file analysis, and more.

OpenAI says the ability to generate images using GPT-4o is coming fairly soon.

5. Gemini 2.0 Flash

In the AI race, we expected Google to outrank OpenAI and Anthropic with its Gemini LLMs, but as far as large language models are concerned, Google is sadly still behind, most likely due to its overly cautious approach.

Just to be clear, Google has caught up in video generation with Veo 2 and image generation with Imagen 3.

However, in language processing, I find Gemini models to be too sanitized.

Gemini models are much more verbose and lack a personality.

They also avoid discussion of even slightly sensitive topics.

That said, Google has done a remarkable job with multimodality.

Gemini models are perhaps the best LLMs if you want to process images, videos, audio, and text.

On top of that, they offer a massive context length of up to 2 million tokens.

Among all the Gemini LLMs, Gemini 2.0 Flash stands out because of its cost-efficiency.

It's a relatively small model but rivals GPT-4o and Claude 3.5 Sonnet in creative writing and world knowledge.

Even the latest Gemini 2.0 Pro model barely beats Gemini 2.0 Flash in several benchmarks.

However, in coding tasks, Gemini 2.0 Pro delivers better performance.

As for reasoning LLMs, Google has indeed released Gemini 2.0 Flash Thinking, based on inference scaling just like OpenAI o1, but it has disappointed so far.

In my testing between Gemini 2.0 Flash Thinking and OpenAI o1, I concluded that Google's reasoning model is not smarter than OpenAI's o1 model.

Google should release a thinking model based on the larger Gemini 2.0 Pro LLM if it wants to seriously challenge OpenAI.

6. Qwen 2.5 Max

After DeepSeek's rise, another LLM from China called Qwen 2.5 Max has delivered impressive results.

Qwen 2.5 Max was developed by Alibaba Cloud and launched in January 2025.

It's a traditional, non-reasoning large language model, and rivals leading LLMs such as GPT-4o, Claude 3.5 Sonnet, and Llama 3.1 405B.

Unlike the majority of dense LLMs, Qwen 2.5 Max employs a Mixture-of-Experts (MoE) architecture to improve efficiency and scalability.
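
To show why an MoE design can be cheaper to run than a dense model, here is a toy sketch of top-k expert routing: a router scores every expert, but only the best-scoring few actually run for a given token. This is a generic illustration, not Qwen 2.5 Max's implementation. The scalar "experts", the router scores, and the choice of `top_k=2` are all made-up assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_logits, experts, top_k=2):
    """Route input x to the top_k experts and mix their outputs."""
    # Pick the indices of the top_k highest router scores.
    top = sorted(range(len(experts)),
                 key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Renormalize the gate weights over the selected experts only.
    weights = softmax([router_logits[i] for i in top])
    # Only the selected experts run; the others are skipped entirely.
    output = sum(w * experts[i](x) for w, i in zip(weights, top))
    return output, top

# Toy experts: hypothetical scalar functions standing in for feed-forward blocks.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
router_logits = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one token
output, chosen = moe_forward(3.0, router_logits, experts)
print(output, chosen)
```

In a real model the router scores come from a learned layer applied to each token. They are passed in directly here to keep the sketch short.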

On the Chatbot Arena leaderboard, Qwen 2.5 Max is ranked in seventh place, right below GPT-4o, Gemini 2.0 Flash, and OpenAI o1.

Similarly, on the Artificial Analysis Quality Index, Qwen 2.5 Max scores a competitive 79 points, whereas Claude 3.5 Sonnet achieves 80 points.

It's abundantly clear that Chinese LLMs are highly capable and are emerging as strong competitors to top AI models from the West.

7. Mistral Large 2 and Pixtral Large

Besides the US and China, Europe is also developing powerful large language models.

Mistral is a Paris-based AI company, founded by former Google DeepMind and Meta researchers, with a commitment to open source.

The Mistral Large 2 model is the largest LLM developed by the company, with 123 billion parameters.

The unique part about Mistral Large 2 is that it's one of the best multilingual LLMs out there.

Apart from English, it excels in many European and regional languages such as French, German, Spanish, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean, Arabic, and Hindi.

As for benchmarks, Mistral Large 2 comes very close to GPT-4o in HumanEval, MMLU, and MT Bench.

The company recently announced a multimodal model called Pixtral Large that brings vision capabilities.

On top of the 123B multimodal decoder, the model includes a 1B vision encoder.

It means that Pixtral Large can understand documents, charts, and natural images as well.

Lastly, Mistral recently announced its official "Le Chat" app for Android and iOS and revamped its web app.

You can search the web, generate images (powered by Flux models), interpret code, upload files and documents, and use Canvas for in-line editing, all for free.

I think in the open-source arena, Mistral is a serious player challenging proprietary LLMs out there.

8. Llama 3.3 70B

While Meta has been open-sourcing a series of Llama models, the latest Llama 3.3 70B text-only LLM is one of the best AI models released by the company.

Meta's largest model, Llama 3.1, has 405 billion parameters.

However, the much smaller Llama 3.3 70B delivers near-405B level performance in instruction following, coding, and reasoning.

Sure, it's just a text-only model, but if you want to try a multimodal model, you can try the Llama 3.2 90B model that comes with vision capabilities.

Meta has shown Llama 3.3 70B matching or outperforming the 405B model in several benchmarks including GPQA Diamond, HumanEval, and MMLU.

Meta is reportedly working on Llama 4 and a reasoning model, both of which are set to rival OpenAI's SOTA models.

9. Grok 2

Elon Musk-led xAI released its controversial Grok 2 LLM in August 2024.

While Grok 2 has been criticized for having almost no safety guardrails, in our Grok 2 testing, it performed fairly well.

It delivers strong performance in commonsense reasoning and coding tasks.

However, the model is largely uncensored, so keep that in mind.

Elon Musk says Grok 2 is designed to be "maximally truthful" and doesn't shy away from answering almost anything.

To give you an example, in our testing, Grok 2 wrote an email to scam people without any mitigation.

Apart from that, the Grok image generator ignores safety guardrails and can generate deepfake images of celebrities and public figures.

10. Amazon Nova Pro

Amazon announced its first foundational LLM family, called "Nova", in December 2024.

There are many AI models under the Nova series, but Nova Pro is the best among them.

It's a multimodal LLM, and rivals AI models such as GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro.

Note that Nova Pro is not open to general users; Amazon has developed it for enterprise customers.

On the Artificial Analysis Quality Index, Nova Pro is just behind Claude 3.5 Sonnet and Gemini 2.0 Flash.

Its pricing is also quite competitive, offering good performance at a low cost.

If you are a developer, you can check out Nova Pro and integrate the LLM into your app or web service.

And that wraps up our list of the best large language models (LLMs) available in 2025.

We have included both proprietary and open-source LLMs, so you have the option to pick one based on your needs.

In the coming months, we can expect AI companies to release more reasoning models, built on top of traditional LLMs, as inference scaling proves to be a game-changer.