Active filters: 8-bit
mlx-community/DeepSeek-R1-Distill-Llama-8B-8bit
2B • Updated • 121 • 4

mlx-community/Qwen2.5-7B-Instruct-1M-8bit
Text Generation • Updated • 49 • 4

mlx-community/Qwen2.5-14B-Instruct-1M-8bit
Text Generation • 4B • Updated • 202 • 10

Text Generation • 397B • Updated • 4.4k • 277

MaziyarPanahi/Phi-4-mini-instruct-GGUF
Text Generation • 4B • Updated • 92.1k • 12

nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4
56B • Updated • 60.3k • 30

Text Generation • 0.9B • Updated • 305 • 14

tiiuae/Falcon-E-3B-Instruct
Text Generation • 0.9B • Updated • 952 • 38

nvidia/DeepSeek-V3-0324-NVFP4
Text Generation • 397B • Updated • 39.4k • 17

lmstudio-community/DeepSeek-R1-0528-Qwen3-8B-MLX-8bit
Text Generation • 2B • Updated • 324k • 16

nvidia/DeepSeek-R1-0528-NVFP4
Text Generation • 397B • Updated • 6.63k • 44

Text Generation • 19B • Updated • 31k • 8

mlx-community/DiffuCoder-7B-cpGRPO-8bit
Text Generation • 8B • Updated • 85 • 9

nvidia/Qwen3-235B-A22B-NVFP4
Text Generation • 133B • Updated • 9.71k • 16

mlx-community/SmolLM3-3B-8bit
Text Generation • Updated • 94 • 9

LGAI-EXAONE/EXAONE-4.0-1.2B-GPTQ-Int8
Text Generation • 1B • Updated • 160 • 11

nvidia/DeepSeek-R1-NVFP4-v2
Text Generation • 394B • Updated • 5.91k • 7

lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-8bit
Text Generation • 31B • Updated • 200k • 15

ramblingpolymath/Qwen3-Coder-30B-A3B-Instruct-W8A8
Text Generation • 31B • Updated • 430 • 3

Text Generation • 120B • Updated • 54.2k • 21

lmstudio-community/gpt-oss-20b-MLX-8bit
Text Generation • 21B • Updated • 6.19k • 52

huizimao/gpt-oss-20b-helpful-MXFP4-QAT
21B • Updated • 2

nvidia/Phi-4-reasoning-plus-NVFP4
8B • Updated • 1.35k • 9

nvidia/Llama-3.1-8B-Instruct-NVFP4
5B • Updated • 116k • 9

Text Generation • 5B • Updated • 29.5k • 17

Text Generation • 8B • Updated • 55.5k • 8

Text Generation • 17B • Updated • 122k • 15

nvidia/Qwen2.5-VL-7B-Instruct-NVFP4
Text Generation • 5B • Updated • 24k • 15

xxrjun/gpt-oss-120b-mxfp4
120B • Updated • 11 • 1

Text Generation • 5B • Updated • 1.6k • 2