| Model / Service | Type | Input | Output | Unit |
|---|---|---|---|---|
| A100 40GB | gpu-hour | - | - | /GPU-hour |
| A100 80GB 8x | gpu-hour | - | - | /GPU-hour |
| DeepSeek llama3.3-70b | tokens | $0.200 | $0.600 | /1M tokens |
| DeepSeek r1-671b | tokens | $0.800 | $0.800 | /1M tokens |
| H100 SXM 80GB | gpu-hour | - | - | /GPU-hour |
| H100 SXM 8x | gpu-hour | - | - | /GPU-hour |
| hermes3-405b | tokens | $0.800 | $0.800 | /1M tokens |
| hermes3-70b | tokens | $0.120 | $0.300 | /1M tokens |
| hermes3-8b | tokens | $0.025 | $0.040 | /1M tokens |
| lfm-40b | tokens | $0.100 | $0.200 | /1M tokens |
| lfm-7b | tokens | $0.025 | $0.040 | /1M tokens |
| llama3.1-405b-instruct-fp8 | tokens | $0.800 | $0.800 | /1M tokens |
| llama3.1-70b-instruct-fp8 | tokens | $0.120 | $0.300 | /1M tokens |
| llama3.1-8b Instruct | tokens | $0.025 | $0.040 | /1M tokens |
| llama3.1-nemotron-70b-instruct-fp8 | tokens | $0.120 | $0.300 | /1M tokens |
| llama3.2-11b-vision Instruct | tokens | $0.015 | $0.025 | /1M tokens |
| llama3.2-3b Instruct | tokens | $0.015 | $0.025 | /1M tokens |
| llama3.3-70b-instruct-fp8 | tokens | $0.120 | $0.300 | /1M tokens |
| Llama 4-maverick-17b-128e-instruct-fp8 | tokens | $0.050 | $0.100 | /1M tokens |
| Llama 4-scout-17b-16e Instruct | tokens | $0.050 | $0.100 | /1M tokens |
| qwen25-coder-32b Instruct | tokens | $0.050 | $0.100 | /1M tokens |
| qwen3-32b-fp8 | tokens | $0.050 | $0.100 | /1M tokens |
| RTX A6000 | gpu-hour | - | - | /GPU-hour |
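
The token rates above are quoted per 1M tokens, so the cost of a request is each token count multiplied by its rate and divided by 1,000,000. The following is a minimal sketch of that arithmetic, assuming input and output tokens are billed separately at the listed rates with no rounding or minimum charge. The `RATES` dictionary and the `request_cost` helper are hypothetical names introduced for illustration; only the two rate pairs are copied from the table.

```python
from decimal import Decimal

# Rates copied from the table, in USD per 1M tokens: (input, output).
# RATES and request_cost are illustrative names, not part of any provider API.
RATES = {
    "hermes3-8b": (Decimal("0.025"), Decimal("0.040")),
    "hermes3-70b": (Decimal("0.120"), Decimal("0.300")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for one request.

    Assumes linear per-token billing with no rounding or minimum charge.
    """
    in_rate, out_rate = RATES[model]
    return (in_rate * input_tokens + out_rate * output_tokens) / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 2M input tokens and 500k output tokens on hermes3-70b.
    print(request_cost("hermes3-70b", 2_000_000, 500_000))
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary floating-point error. The GPU rows are billed by the hour and have no listed price here, so they are left out of the sketch.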