The Chinese tech giant Alibaba has launched eight new open-weight AI models under the Qwen 3 series.
The new Qwen 3 lineup includes two MoE (Mixture of Experts) models: Qwen3-235B-A22B and Qwen3-30B-A3B.
Qwen3-235B-A22B is the larger, flagship model, with a total of 235 billion parameters and 22 billion activated parameters.
Qwen3-30B-A3B is a smaller MoE model with a total of 30 billion parameters and 3 billion activated parameters.
Apart from that, there are six dense models under the Qwen 3 series: Qwen3-32B, Qwen3-14B, Qwen3-8B, Qwen3-4B, Qwen3-1.7B, and Qwen3-0.6B.
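To make the "total vs. activated parameters" distinction concrete, here is a minimal toy sketch of top-k expert routing in PyTorch. It is only an illustration of the general MoE idea, not Qwen 3's actual architecture; all class and variable names are my own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Toy Mixture-of-Experts layer: many experts exist (total parameters),
    but each token is routed to only top_k of them (activated parameters)."""
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.router = nn.Linear(dim, num_experts)  # scores each expert per token
        self.top_k = top_k

    def forward(self, x):                          # x: (tokens, dim)
        scores = self.router(x)                    # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):             # only top_k experts run per token
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

layer = ToyMoELayer()
_ = layer(torch.randn(4, 64))  # run a few tokens through the layer
total = sum(p.numel() for p in layer.experts.parameters())
active = total * layer.top_k // len(layer.experts)
print(f"expert params: {total}, active per token: ~{active}")  # ~1/4 of the total here
```

Qwen3-235B-A22B works on the same principle at a much larger scale: all 235 billion parameters are stored, but only about 22 billion participate in any single forward pass.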
Dive Into Qwen 3
All Qwen 3 models support hybrid thinking modes, which means they are both reasoning AI models and traditional LLMs.
In the thinking mode, the model reasons step by step, and in the non-thinking mode, it provides a quick response.
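For example, with the Hugging Face Transformers library, the Qwen 3 chat template exposes an enable_thinking flag to switch between the two modes. The sketch below follows the pattern in the Qwen3 model cards; the repo id and the flag behavior are assumptions worth verifying against the official documentation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"  # assumed Hugging Face repo id; check the official model card
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]

# enable_thinking=True -> step-by-step reasoning; False -> quick, direct reply
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```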
In addition, Qwen 3 models support over 119 languages and dialects from all around the world.
It's one of the most versatile multilingual model families out there.
Next, Alibaba has worked to improve MCP support for Qwen 3 models, further unlocking agentic capabilities.
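The Qwen team's Qwen-Agent framework is the usual way to wire Qwen 3 up to MCP tool servers. The sketch below follows the pattern shown in Qwen-Agent's examples; the model name, endpoint, and MCP server entries are placeholders rather than confirmed settings.

```python
from qwen_agent.agents import Assistant

# Point the agent at a locally served Qwen 3 model (e.g., via an OpenAI-compatible server).
llm_cfg = {
    "model": "Qwen3-30B-A3B",
    "model_server": "http://localhost:8000/v1",  # assumed local endpoint
    "api_key": "EMPTY",
}

# Register MCP tool servers the agent may call, plus a built-in code interpreter.
tools = [
    {
        "mcpServers": {
            "time": {"command": "uvx", "args": ["mcp-server-time"]},
            "fetch": {"command": "uvx", "args": ["mcp-server-fetch"]},
        }
    },
    "code_interpreter",
]

bot = Assistant(llm=llm_cfg, function_list=tools)
messages = [{"role": "user", "content": "What time is it in Hangzhou right now?"}]
for responses in bot.run(messages=messages):
    pass  # responses streams incrementally; keep the final state
print(responses[-1]["content"])
```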
As for performance, the large Qwen3-235B-A22B model delivers competitive results along the lines of DeepSeek R1, Grok 3 Beta, Gemini 2.5 Pro, and OpenAI o1.
What I find interesting is that the smaller Qwen3-30B-A3B model, with only 3 billion activated parameters, outperforms DeepSeek V3 and OpenAI's GPT-4o models.
Dive Into R2
Alibaba says Qwen 3 models offer great performance in coding, math, science, and general capabilities.
Overall, Qwen 3 represents a family of highly capable, frontier AI models from China.
Now, with the upcoming DeepSeek R2, China is well-positioned to match Western AI labs.