modelgrep

Meta: Llama 3.3 70B Instruct

meta-llama/llama-3.3-70b-instruct

148th smartest of 178 · Cheaper than 79% of paid · Tools · JSON
Intelligence: 14.5 (148th of 178)
Design Elo: not listed
Speed: 115 tokens/s (55th fastest)
Latency: 244ms to first token
Input price: $0.100 per million tokens (62nd cheapest)
Context: 131K (16K max out)

How it compares

Smarter than 17% of all ranked models
Faster than 81% of all ranked models
Cheaper than 79% of all ranked models

Overview

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model...

Benchmarks

independent · via OpenRouter
Artificial Analysis: 18th percentile
Intelligence Index: 14.5
Coding Index: 10.7
Agentic Index: 9.1
GPQA Diamond: 50%
Humanity's Last Exam: 4%
SciCode: 26%
Tau²-Bench (agentic): 27%

Providers & pricing (13)

Provider    Quantization  In $/M  Out $/M  Uptime
DeepInfra   fp8           $0.100  $0.320   95.5%
Nebius      fp8           $0.130  $0.400   97%
AkashML     fp8           $0.130  $0.400   99.1%
Novita      bf16          $0.135  $0.400   99.9%
Parasail    fp8           $0.220  $0.500   97.2%
Cloudflare  fp8           $0.293  $2.25    100%
SambaNova   bf16          $0.450  $0.900   99.6%
Groq        -             $0.590  $0.790   99.9%
SambaNova   bf16          $0.600  $1.20    99.3%
WandB       fp16          $0.710  $0.710   100%
Google      -             $0.720  $0.720   95%
Google      -             $0.720  $0.720   100%
Together    fp8           $1.04   $1.04    100%

Specifications

Context window: 131K
Max output: 16K
Knowledge cutoff: Dec 2023
Input modalities: text
Output modalities: text
Prompt caching: not listed
Cache read price: not listed
Moderated: No

Llama 3.3 70B Instruct FAQ

How much does Llama 3.3 70B Instruct cost?

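Llama 3.3 70B Instruct costs from $0.100 per million input tokens and $0.320 per million output tokens via OpenRouter (the lowest listed provider rate; the most expensive listed provider charges $1.04 for each), making it the 62nd cheapest of 298 paid models.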
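
As a rough illustration of what these rates mean per request, the sketch below multiplies token counts by the listed per-million prices. The function name and the example token counts are hypothetical, and it assumes the lowest listed rates apply and that billing is strictly per token with no other fees.

```python
# Listed rates in USD per million tokens (lowest-priced provider in the listing).
INPUT_PRICE_PER_M = 0.100
OUTPUT_PRICE_PER_M = 0.320


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD.

    Assumes simple per-token billing at the rates above; real invoices
    may differ by provider.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens and 1,000 generated tokens.
    cost = estimate_cost(input_tokens=10_000, output_tokens=1_000)
    print(f"Estimated cost: ${cost:.6f}")
```

Swapping in another provider's input and output rates from the pricing table gives the equivalent estimate for that provider.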

How smart is Llama 3.3 70B Instruct?

Llama 3.3 70B Instruct scores 14.5 on the Artificial Analysis Intelligence Index, ranking 148th of 178 benchmarked models, with a GPQA Diamond score of 50%.

How fast is Llama 3.3 70B Instruct?

Llama 3.3 70B Instruct generates around 115 tokens per second with a 244ms time-to-first-token (p50), making it the 55th fastest tracked model.
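
To put these two numbers together, a response time can be approximated as the time-to-first-token plus the output length divided by throughput. The sketch below is a back-of-the-envelope estimate only: the function name is hypothetical, and it assumes constant throughput at the listed median figures and ignores network and queueing delays.

```python
# Listed median figures for this model.
TOKENS_PER_SECOND = 115.0
TIME_TO_FIRST_TOKEN_S = 0.244


def estimate_response_seconds(output_tokens: int) -> float:
    """Approximate wall-clock time to receive a full response.

    Assumes generation proceeds at a constant rate after the first token;
    actual speed varies by provider and load.
    """
    return TIME_TO_FIRST_TOKEN_S + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    for n in (100, 1_000, 16_000):
        print(f"{n:>6} tokens: about {estimate_response_seconds(n):.1f} s")
```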

What is Llama 3.3 70B Instruct's context window?

Llama 3.3 70B Instruct supports a 131K-token context window and can output up to 16K tokens. It accepts text input and produces text output.
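
A simple pre-flight check against these limits might look like the sketch below. It rests on two assumptions that the listing does not confirm: that "131K" and "16K" can be treated as 131,000 and 16,000 tokens, and that prompt and output tokens share the same context window. The function name is hypothetical.

```python
# Limits as listed; the exact token counts behind "131K" and "16K" are assumed.
CONTEXT_WINDOW_TOKENS = 131_000
MAX_OUTPUT_TOKENS = 16_000


def fits_limits(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if a request stays within both listed limits.

    Assumes the prompt and the generated output count against the same
    context window.
    """
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    print(fits_limits(100_000, 16_000))  # within both limits
    print(fits_limits(120_000, 16_000))  # prompt plus output exceeds the window
    print(fits_limits(1_000, 20_000))    # output request exceeds max output
```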
