NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

nvidia/llama-3.3-nemotron-super-49b-v1.5

125th smartest of 178ReasoningToolsJSON

Use via OpenRouter ↗

Intelligence

18.7

125th of 178

Design Elo

—

Speed

167th fastest

Latency

168ms

first token

Input price

$0.400

147th cheapest

Context

131K

16K max out

How it compares

Smarter than30%

of all ranked models

Faster than43%

of all ranked models

Cheaper than51%

of all ranked models

Overview

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...

Benchmarks

independent · via OpenRouter

Artificial Analysis29th percentile

Intelligence Index

18.7

Coding Index

15.1

Agentic Index

9.4

GPQA Diamond

75%

Humanity's Last Exam

SciCode

35%

Tau²-Bench (agentic)

28%

Providers & pricing (1)

Provider	In $/M	Out $/M	Context	Uptime
DeepInfrafp8	$0.400	$0.400	131K	—

Specifications

Context window131K

Max output16K

Knowledge cutoffMar 2024

Input modalitiestext

Output modalitiestext

Prompt caching—

Cache read price—

ModeratedNo

Open weightsnvidia/Llama-3_3-Nemotron-Super-49B-v1_5 ↗

Llama 3.3 Nemotron Super 49B V1.5 FAQ

How much does Llama 3.3 Nemotron Super 49B V1.5 cost?

Llama 3.3 Nemotron Super 49B V1.5 costs $0.400 per million input tokens and $0.400 per million output tokens via OpenRouter, making it 147th cheapest of 298 paid models.

How smart is Llama 3.3 Nemotron Super 49B V1.5?

Llama 3.3 Nemotron Super 49B V1.5 scores 18.7 on the Artificial Analysis Intelligence Index, ranking 125th of 178 benchmarked models, with a GPQA Diamond score of 75%.