NVIDIA Speeds Up Meta's Llama 4 AI Models With Blackwell GPUs, Hitting 40,000 Tokens Per Second

By Yahoo! Finance | 1 month ago

NVIDIA offers the Llama 4 Scout and Llama 4 Maverick models as NIM microservices for multilingual and multimodal tasks. Scout, with 109 billion parameters, is aimed at document summarization, while Maverick, with 400 billion parameters, is designed for tasks that combine images and text.
