The Chinese tech giant Alibaba has launched eight new open-weight AI models under the Qwen 3 series.
The new Qwen 3 lineup includes two MoE (Mixture of Experts) models: Qwen3-235B-A22B and Qwen3-30B-A3B.
Qwen3-235B-A22B is the larger, flagship model, with a total of 235 billion parameters and 22 billion activated parameters.
Qwen3-30B-A3B is a smaller MoE model with a total of 30 billion parameters and 3 billion activated parameters.
Apart from that, there are six dense models under the Qwen 3 series: Qwen3-32B, Qwen3-14B, Qwen3-8B, Qwen3-4B, Qwen3-1.7B, and Qwen3-0.6B.
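To make the "total vs. activated parameters" distinction concrete, here is a minimal toy sketch of top-k expert routing in PyTorch. It is only an illustration of the general MoE idea, not Qwen 3's actual architecture; all class and variable names are my own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Toy Mixture-of-Experts layer: many experts exist (total parameters),
    but each token is routed to only top_k of them (activated parameters)."""
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.router = nn.Linear(dim, num_experts)  # scores each expert per token
        self.top_k = top_k

    def forward(self, x):                          # x: (tokens, dim)
        scores = self.router(x)                    # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):             # only top_k experts run per token
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

layer = ToyMoELayer()
_ = layer(torch.randn(4, 64))  # run a few tokens through the layer
total = sum(p.numel() for p in layer.experts.parameters())
active = total * layer.top_k // len(layer.experts)
print(f"expert params: {total}, active per token: ~{active}")  # ~1/4 of the total here
```

Qwen3-235B-A22B works on the same principle at a much larger scale: all 235 billion parameters are stored, but only about 22 billion participate in any single forward pass.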
Dive Into Qwen 3
All Qwen 3 models support hybrid thinking modes, which means they are both reasoning AI models and traditional LLMs.
In the thinking mode, the model reasons step by step, and in the non-thinking mode, it provides a quick response.
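For example, with the Hugging Face Transformers library, the Qwen 3 chat template exposes an enable_thinking flag to switch between the two modes. The sketch below follows the pattern in the Qwen3 model cards; the repo id and the flag behavior are assumptions worth verifying against the official documentation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"  # assumed Hugging Face repo id; check the official model card
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]

# enable_thinking=True -> step-by-step reasoning; False -> quick, direct reply
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```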
In addition, Qwen 3 models support over 119 languages and dialects from all around the world.
It's one of the most versatile multilingual model families out there.
Next, Alibaba has worked to improve MCP support for Qwen 3 models, further unlocking agentic capabilities.
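The Qwen team's Qwen-Agent framework is the usual way to wire Qwen 3 up to MCP tool servers. The sketch below follows the pattern shown in Qwen-Agent's examples; the model name, endpoint, and MCP server entries are placeholders rather than confirmed settings.

```python
from qwen_agent.agents import Assistant

# Point the agent at a locally served Qwen 3 model (e.g., via an OpenAI-compatible server).
llm_cfg = {
    "model": "Qwen3-30B-A3B",
    "model_server": "http://localhost:8000/v1",  # assumed local endpoint
    "api_key": "EMPTY",
}

# Register MCP tool servers the agent may call, plus a built-in code interpreter.
tools = [
    {
        "mcpServers": {
            "time": {"command": "uvx", "args": ["mcp-server-time"]},
            "fetch": {"command": "uvx", "args": ["mcp-server-fetch"]},
        }
    },
    "code_interpreter",
]

bot = Assistant(llm=llm_cfg, function_list=tools)
messages = [{"role": "user", "content": "What time is it in Hangzhou right now?"}]
for responses in bot.run(messages=messages):
    pass  # responses streams incrementally; keep the final state
print(responses[-1]["content"])
```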
As for performance, the large Qwen3-235B-A22B model delivers competitive results along the lines of DeepSeek R1, Grok 3 Beta, Gemini 2.5 Pro, and OpenAI o1.
What I find interesting is that the smaller Qwen3-30B-A3B model, with only 3 billion activated parameters, outperforms DeepSeek V3 and OpenAI's GPT-4o models.
Dive Into R2
Alibaba says Qwen 3 models offer great performance in coding, math, science, and general capabilities.
Overall, Qwen 3 represents a family of highly capable, frontier AI models from China.
Now, with the upcoming DeepSeek R2, China is well-positioned to match Western AI labs.