Models: 102

Active filters: cpu-inference

tatvaAi/gemma-4-E4B-it-IQ4_NL

Text Generation • 8B • Updated Apr 14 • 41

tatvaAi/gemma-4-E4B-it-Q5_K_M

Text Generation • 8B • Updated Apr 14 • 25

ai4all8/VoxCPM2-ONNX

Text-to-Speech • Updated Apr 12 • 3

parrishcorcoran/MedusaBitNet-2B-4T

Text Generation • Updated Apr 13 • 9

sunil-pathak/Qwen3.5-9B-Q6_K

Text Generation • 9B • Updated Apr 19 • 17

sunil-pathak/Qwen3.5-9B-Q5_K_M

Text Generation • 9B • Updated Apr 19 • 18

sunil-pathak/Qwen3.5-9B-IQ4_NL

Text Generation • 9B • Updated Apr 19 • 47

sunil-pathak/gemma-4-E4B-it-Q4_K_M

Text Generation • 8B • Updated Apr 19 • 48

sunil-pathak/gemma-4-E2B-it-Q4_K_M

Text Generation • 5B • Updated Apr 17 • 18

sunil-pathak/gemma-4-E2B-it-Q5_K_M

Text Generation • 5B • Updated Apr 17 • 24

sunil-pathak/gemma-4-E2B-it-Q6_K

Text Generation • 5B • Updated Apr 17 • 9

sunil-pathak/gemma-4-E2B-it-IQ4_NL

Text Generation • 5B • Updated Apr 17 • 56

bdatdo0601/slanet-1m-onnx

Image-to-Text • Updated Apr 16

sunil-pathak/gemma-3n-E2B-it-IQ4_NL

Text Generation • 4B • Updated Apr 19 • 11

sunil-pathak/gemma-3n-E2B-it-Q4_K_M

Text Generation • 4B • Updated Apr 19 • 16

sunil-pathak/gemma-3n-E2B-it-Q5_K_M

Text Generation • 4B • Updated Apr 19 • 7

sunil-pathak/gemma-3n-E2B-it-Q6_K

Text Generation • 4B • Updated Apr 19 • 6

sunil-pathak/Mistral-7B-Instruct-v0.3-Q4_K_M

Text Generation • 7B • Updated Apr 19 • 17

sunil-pathak/Mistral-7B-Instruct-v0.3-Q5_K_M

Text Generation • 7B • Updated Apr 19 • 21

sunil-pathak/Mistral-7B-Instruct-v0.3-Q6_K

Text Generation • 7B • Updated Apr 19 • 7

sunil-pathak/Mistral-7B-Instruct-v0.3-IQ4_NL

Text Generation • 7B • Updated Apr 19 • 9

jasonzhang76/VoxCPM2-ONNX

Text-to-Speech • Updated Apr 20

jasonzhang76/Qwen3-ASR-0.6B-ONNX-CPU

Automatic Speech Recognition • Updated Apr 20 • 2

chfm/VoxCPM2-ONNX

Text-to-Speech • Updated Apr 22

amd/Llama-3.1-8B-Instruct-da8w8-torchao-v0.16.0

Text Generation • Updated 28 days ago • 2.17k • 1

priyankapathak/gemma-4-E4B-it-Q5_K_M

Text Generation • 8B • Updated 29 days ago • 64

priyankapathak/gemma-4-E4B-it-Q6_K

Text Generation • 8B • Updated 29 days ago • 48

OkeyMetaLtd/Reframr-RFM-v1-Base

Text Generation • 44.3M • Updated 25 days ago • 19

amd/Qwen2.5-VL-7B-Instruct-da8w8-torchao-v0.16.0

Image-Text-to-Text • Updated 24 days ago • 94

amd/Phi-4-da8w8-torchao-v0.16.0

Text Generation • Updated 24 days ago • 1.15k