meta-llama/llama-3.1-8b-instruct
Meta's latest class of models (Llama 3.1) launched in a variety of sizes and flavors. This 8B instruction-tuned version is fast and efficient. It has demonstrated strong performance compared to...
| Provider | Input ($/M tokens) | Output ($/M tokens) | Context (tokens) | Uptime |
|---|---|---|---|---|
| DeepInfra (fp8) | $0.020 | $0.030 | 131K | 99.8% |
| Novita (fp8) | $0.020 | $0.050 | 16K | 99.9% |
| DeepInfra (bf16) | $0.020 | $0.050 | 131K | 100% |
| Groq | $0.050 | $0.080 | 131K | 99.9% |
| Cloudflare (fp8) | $0.152 | $0.287 | 32K | 99.9% |
| WandB (bf16) | $0.220 | $0.220 | 128K | 100% |
Llama 3.1 8B Instruct costs $0.020 per million input tokens and $0.030 per million output tokens via OpenRouter at its lowest-priced provider, making it the 3rd cheapest of 298 paid models.
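The per-million-token prices translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: `estimate_cost_usd` is a hypothetical helper, not part of any OpenRouter SDK, and its defaults are the $0.020 / $0.030 figures quoted above. Actual billing may differ depending on which provider serves the request.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price_per_m: float = 0.020,
                      output_price_per_m: float = 0.030) -> float:
    """Estimate the cost in USD of one request.

    Prices are given in dollars per million tokens, so the weighted
    token counts are divided by 1,000,000.
    """
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Example: a 2,000-token prompt with a 500-token reply.
cost = estimate_cost_usd(2_000, 500)
print(f"${cost:.6f}")  # prints $0.000055
```

To compare providers, pass the prices from another table row, for example `estimate_cost_usd(2_000, 500, 0.050, 0.080)` for the Groq row.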
Llama 3.1 8B Instruct scores 11.8 on the Artificial Analysis Intelligence Index, ranking 161st of 178 benchmarked models, with a GPQA Diamond score of 26%.
Llama 3.1 8B Instruct generates around 145 tokens per second with a 143 ms time-to-first-token (p50), making it the 31st fastest tracked model.
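Those two figures give a rough estimate of end-to-end response time: the time-to-first-token plus the output length divided by throughput. The following is a back-of-the-envelope sketch, assuming the p50 numbers hold and throughput stays constant for the whole response. `estimate_latency_s` is a hypothetical helper name.

```python
def estimate_latency_s(output_tokens: int,
                       ttft_s: float = 0.143,
                       tokens_per_s: float = 145.0) -> float:
    """Rough response time in seconds: first-token delay plus generation time."""
    return ttft_s + output_tokens / tokens_per_s


# Example: a 500-token reply.
print(f"{estimate_latency_s(500):.2f} s")  # prints 3.59 s
```

Real latency varies by provider, load, and prompt length, so treat the result as an order-of-magnitude guide rather than a guarantee.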
Llama 3.1 8B Instruct supports a 131K-token context window and can output up to 16K tokens. It accepts text input.
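When budgeting a request, both limits apply at once. The sketch below assumes that prompt and completion tokens share the same context window, and it treats "131K" and "16K" as the round figures 131,000 and 16,000, since the listing does not give exact values. `max_completion_tokens` is a hypothetical helper.

```python
CONTEXT_WINDOW = 131_000  # "131K" from the listing, rounded (assumption)
MAX_OUTPUT = 16_000       # "16K" from the listing, rounded (assumption)


def max_completion_tokens(prompt_tokens: int,
                          context_window: int = CONTEXT_WINDOW,
                          max_output: int = MAX_OUTPUT) -> int:
    """Largest completion that still fits, or 0 if the prompt alone is too long."""
    remaining = context_window - prompt_tokens
    return max(0, min(max_output, remaining))


print(max_completion_tokens(100_000))  # prints 16000 (output cap applies)
print(max_completion_tokens(120_000))  # prints 11000 (context window applies)
print(max_completion_tokens(140_000))  # prints 0 (prompt does not fit)
```

Some providers in the table advertise a smaller context, so pass their figure as `context_window` (for example 32,000 for the 32K row) when routing to them.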