NVIDIA Speeds Up Meta's Llama 4 AI Models With Blackwell GPUs, Hitting 40,000 Tokens Per Second

NVIDIA offers Llama 4 Scout and Llama 4 Maverick models as NIM microservices, catering to multilingual and multimodal tasks. Scout has 109 billion parameters for document summarization, while Maverick with 400 billion parameters is designed for tasks involving images and text.