The Chinese AI lab DeepSeek recently released its frontier R1 model, which claims to match or even surpass OpenAI's ChatGPT o1 model.
DeepSeek has already soared to the top position on the Apple App Store, overtaking ChatGPT.
And the US tech stock market has been hit by DeepSeek's remarkably cost-effective model.
So, to evaluate both AI models and find out which is more capable, we've compared ChatGPT o1 and DeepSeek R1 on a variety of complex reasoning tests below.
## ChatGPT o1 vs DeepSeek R1: Misguided Attention
Large language models are often dismissively called "stochastic parrots" because they lack true generalization and rely heavily on statistical pattern matching and memorization to predict the next word or token.
However, with recent advances in the AI field (for instance, OpenAI o3), the narrative is changing rather quickly as frontier models show some degree of generalization and exhibit emergent behaviors that weren't programmed into them.
There are many common puzzles, riddles, and thought experiments on which AI models are trained.
Hence, when you ask one of the common riddles available in their training data, LLMs largely draw information from their training corpus.
However, when you slightly change the puzzle in order to misguide the model, most LLMs fall flat and repeat learned patterns.
This is where you can tell whether the AI model is really applying true reasoning or just sheer memorization.
In the above problem, it's clearly mentioned that the surgeon is the boy's father, yet both ChatGPT o1 and DeepSeek R1 get it wrong.
Both models say that the surgeon is the boy's mother, questioning the assumption about surgeons being male.
The question is designed to hint at another possibility and lead them to a wrong answer.
By the way, interestingly, Gemini 2.0 Flash (not the Thinking model) gets it right.
Winner : None
## ChatGPT o1 vs DeepSeek R1: Math with Abstract Thinking
Google has added some great problems to test reasoning models on its Cookbook page.
I took one of the multimodal reasoning (+ math) questions and converted it to text since DeepSeek R1 doesn't support multimodal input yet.
In my testing, both ChatGPT o1 and DeepSeek R1 solved the problem correctly.
Both models flipped the '9' ball to make it '6', and added 6 + 11 + 13 to get 30.
Great work by both models!
Winner : ChatGPT o1 and DeepSeek R1
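
To see why the flip is needed, here is a minimal Python sketch of the puzzle. The ball set `[7, 9, 11, 13]` is my assumption about the original question (the article only confirms that 9, 11, and 13 are involved), and `triples_summing_to` is a hypothetical helper name. With only odd numbers, any three of them add up to an odd total, so 30 is unreachable until the 9 is read upside down as a 6.

```python
from itertools import combinations, product


def triples_summing_to(balls, target, flips=None):
    """Return every way to pick three balls whose readings add up to target.

    `flips` maps a ball to the value it shows when turned upside down
    (for example {9: 6}). Ball values here are assumed for illustration.
    """
    flips = flips or {}
    found = []
    for combo in combinations(balls, 3):
        # Each ball can be read as-is or, if it is flippable, upside down.
        options = [(b, flips[b]) if b in flips else (b,) for b in combo]
        for reading in product(*options):
            if sum(reading) == target and reading not in found:
                found.append(reading)
    return found


balls = [7, 9, 11, 13]  # assumed ball set, all odd
print(triples_summing_to(balls, 30))          # no solution without flipping
print(triples_summing_to(balls, 30, {9: 6}))  # flipping 9 into 6 unlocks one
```

The brute-force search is deliberately simple: with a handful of balls there are only a few combinations to check, so there is no need for anything smarter.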
## ChatGPT o1 vs DeepSeek R1: A Question from Humanity's Last Exam
Recently, the Center for AI Safety (CAIS) announced a benchmark called "Humanity's Last Exam (HLE)" to track rapid AI progress across a variety of academic subjects.
It contains questions from top scientists, professors, and researchers from around the world.
CAIS has publicly released some of the questions as examples on its website.
I picked a question from Greek mythology and tested it on ChatGPT o1 and DeepSeek R1.
The ChatGPT o1 model thought for about 30 seconds and said the god Hermes is the paternal great-grandfather of Jason, which is correct.
DeepSeek R1 thought for 28 seconds and reconstructed the lineage.
However, it said Aeolus, which is incorrect.
While this test largely evaluates memorization, it's still a crucial way to check whether AI models understand logic and relationships.
Winner: ChatGPT o1
## ChatGPT o1 vs DeepSeek R1: The Trolley Problem
You must have heard about the popular trolley problem. However, the question has been slightly changed to mislead the model, as part of the Misguided Attention evaluation (GitHub).
Let's now see whether these models can get the answer right.
First, ChatGPT o1 thought for 29 seconds and spotted the trick: five already dead people on one track and a living person on the other.
ChatGPT o1 didn't waste time and said not to pull the lever because you can't harm those who are already dead.
DeepSeek R1, on the other hand, overlooked the "dead people" part due to its over-reliance on training patterns and went on a morality tangent.
It said there is no universally correct answer.
Clearly, ChatGPT o1 gets the point in this round.
## ChatGPT o1 vs DeepSeek R1: Mathematical Reasoning
In another mathematical reasoning question, I asked ChatGPT o1 and DeepSeek R1 to measure exactly 4 liters using 6- and 12-liter jugs.
ChatGPT o1 thought for 1 minute and 47 seconds and said it's mathematically impossible to measure, which is correct.
Usually, AI models somehow try to find an answer when given a problem.
But ChatGPT o1 took a step back, calculated the greatest common divisor (GCD) of the two jug sizes, which is 6, and said 4 is not a multiple of 6.
So we can't use the "fill, empty, pour" rule to measure exactly 4 liters.
Remarkably, DeepSeek R1 thought for only 47 seconds, took the same approach, and answered, "It is mathematically impossible with these specific jug sizes."
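
The GCD argument both models used can be checked in a few lines of Python. This is a rough sketch of my own, not what either model produced: `can_measure` and `reachable_amounts` are hypothetical helper names. The first applies the GCD rule, and the second brute-forces every "fill, empty, pour" state to confirm it.

```python
from collections import deque
from math import gcd


def can_measure(target, a, b):
    """GCD rule: a jug can only ever hold a multiple of gcd(a, b)."""
    return 0 <= target <= max(a, b) and target % gcd(a, b) == 0


def reachable_amounts(a, b):
    """Breadth-first search over every (jug_a, jug_b) state reachable
    from two empty jugs using fill, empty, and pour moves."""
    seen = {(0, 0)}
    queue = deque([(0, 0)])
    while queue:
        x, y = queue.popleft()
        a_to_b = min(x, b - y)  # how much jug a can pour into jug b
        b_to_a = min(y, a - x)  # how much jug b can pour into jug a
        moves = [
            (a, y), (x, b),              # fill either jug
            (0, y), (x, 0),              # empty either jug
            (x - a_to_b, y + a_to_b),    # pour a into b
            (x + b_to_a, y - b_to_a),    # pour b into a
        ]
        for state in moves:
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return {x for x, _ in seen} | {y for _, y in seen}


print(can_measure(4, 6, 12))          # GCD is 6, and 4 is not a multiple of 6
print(sorted(reachable_amounts(6, 12)))
print(can_measure(4, 3, 5))           # the classic 3 and 5 liter version works
```

With 6- and 12-liter jugs, the search only ever produces 0, 6, or 12 liters in a jug, which matches the models' conclusion. Swapping in 3- and 5-liter jugs shows the same code finding 4 liters when the GCD allows it.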
## ChatGPT o1 vs DeepSeek R1: Political Censorship and Bias
Since DeepSeek is a Chinese AI lab, I expected that it would censor itself on many contentious topics related to the PRC (People's Republic of China).
However, DeepSeek R1 goes many steps further and doesn't even let you run prompts if you have mentioned Xi Jinping, the President of China, in your prompt.
It just doesn't run.
So I tried to circumvent it by asking DeepSeek R1, "Who is the President of China?"
The moment it started thinking, the model abruptly stopped itself and said, "Sorry, I'm not sure how to approach this type of question yet. Let's chat about math, coding, and logic problems instead!"
Similarly, you can't run prompts mentioning Jack Ma, Uyghurs, dictatorship, government, or even democracy, which is puzzling.
On the other hand, I asked ChatGPT o1 to write a joke about Donald Trump, the current President of the United States, and it obliged without any issue.
I even asked ChatGPT o1 to make the joke a bit mean, and it did a great job.
ChatGPT o1 responded: "Donald Trump's hair has endured more comb-overs than his business record, and both keep going under."
Simply put, if you are looking for an AI model that is not heavily censored on political topics, you should go with ChatGPT o1.
## ChatGPT o1 vs DeepSeek R1: Which Should You Use?
Political topics aside, DeepSeek R1 is a free and capable alternative to ChatGPT, nearly on par with the o1 model.
I wouldn't say DeepSeek R1 surpasses ChatGPT o1, as OpenAI's model consistently performs better than DeepSeek, as demonstrated in these tests.
That said, DeepSeek R1's appeal lies in its affordability.
You can use DeepSeek R1 for free, while OpenAI charges $20 to access ChatGPT o1.
Not to forget, for developers, DeepSeek R1's API is 27x cheaper than ChatGPT o1, which is a massive shift in model pricing.
As for the research community, the DeepSeek team has released the weights and open-sourced its RL (Reinforcement Learning) method for achieving test-time compute, similar to OpenAI's new paradigm with the o1 models.
Moreover, the novel model architecture developed by DeepSeek to train its R1 model for just $5.8 million on older GPUs will help other AI labs build frontier models at a much lower cost.
Expect other AI companies to replicate DeepSeek AI's work in the coming months.
All in all, DeepSeek R1 is more than just an AI model; it has introduced a new way to train frontier AI models on a shoestring budget without clusters of high-priced hardware.