Cerebras has finally opened access to its Wafer-Scale Engine (WSE), and it's achieving 1,800 tokens per second while inferencing the Llama 3.1 8B model.

As for the larger Llama 3.1 70B model, Cerebras clocks up to 450 tokens per second.

Until now, Groq was the fastest AI inference provider, but Cerebras has now claimed that crown.

Cerebras has developed its own wafer-scale processor that integrates close to 900,000 AI-optimized cores and packs 44 GB of on-chip memory (SRAM).

As a result, the AI model is stored directly on the chip itself, unlocking groundbreaking bandwidth.

Not to mention, Cerebras is running Meta's full 16-bit precision weights, meaning there is no compromise on accuracy.
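To put the 16-bit weights and the 44 GB of SRAM side by side, here is a rough back-of-the-envelope sketch. It assumes nominal parameter counts of 8 and 70 billion and counts only the weights (no KV cache or activations). The function name `weight_footprint_gb` is made up for this illustration.

```python
# Rough estimate of model weight size at a given precision.
# Assumption: nominal parameter counts (8B, 70B); weights only.

SRAM_GB = 44  # on-chip SRAM figure quoted for the wafer-scale processor


def weight_footprint_gb(params_billion: float, bits_per_weight: int = 16) -> float:
    """Return the approximate weight size in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


for name, params in [("Llama 3.1 8B", 8), ("Llama 3.1 70B", 70)]:
    size = weight_footprint_gb(params)
    verdict = "fits within" if size <= SRAM_GB else "exceeds"
    print(f"{name}: ~{size:.0f} GB of 16-bit weights, {verdict} {SRAM_GB} GB")
```

By this estimate, the 8B model's weights sit comfortably inside a single wafer's SRAM, while the 70B model's weights are larger than 44 GB and would have to be spread across more than one chip. How Cerebras actually splits the model is not covered here.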

I did test Cerebras' claim, and it generated a response at a breakneck pace.

While running the smaller Llama 3.1 8B model, it achieved a speed of 1,830 tokens per second.

And on the 70B model, Cerebras managed 446 tokens per second.

In comparison, Groq pulled 750 T/s and 250 T/s while running the 8B and 70B models, respectively.
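For a quick sense of the gap, the figures above can be turned into speedup ratios. This is only a sketch using the numbers quoted in this test. The dictionary layout and the `speedup` helper are illustrative, not part of any official tooling.

```python
# Tokens-per-second figures quoted in the test above.
measured = {
    "8B": {"cerebras": 1830, "groq": 750},
    "70B": {"cerebras": 446, "groq": 250},
}


def speedup(fast: float, slow: float) -> float:
    """Return how many times faster `fast` is than `slow`."""
    return fast / slow


for model, tps in measured.items():
    ratio = speedup(tps["cerebras"], tps["groq"])
    print(f"Llama 3.1 {model}: Cerebras is about {ratio:.2f}x faster than Groq")
```

On these numbers, Cerebras comes out roughly 2.4 times faster on the 8B model and about 1.8 times faster on the 70B model.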

Artificial Analysis independently reviewed Cerebras's WSE engine and found that it does deliver unparalleled speed at AI inference.

You can click here to check out Cerebras Inference by yourself.
