🤗 LLM-Perf Leaderboard 🏋️

Backends 🏭

☑️ Select the backends

Precision 📥

☑️ Select the load data types

Attentions 👁️

☑️ Select the optimization

Quantizations 🗜️

☑️ Select the quantization schemes

Kernels ⚛️

☑️ Select the custom kernels

Columns 📊

☑️ Select the columns to display

| Model 🤗 | Experiment 🧪 | Prefill (s) | Decode (tokens/s) | Memory (MB) | Energy (tokens/kWh) | Open LLM Score (%) |
|---|---|---|---|---|---|---|
| | 4bit-gptq-exllama-v2-eager | 0.087 | 108.613 | 15763.828 | 1063016.711 | 27.20* |
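
The row above mixes a latency (prefill, in seconds), a throughput (decode, in tokens/s), and an efficiency figure (tokens/kWh), so comparing entries often means converting them to per-token quantities. The following is a minimal sketch of those conversions using only the values shown in that row. The `BenchmarkRow` class and its method names are hypothetical and not part of the leaderboard. The end-to-end estimate assumes that total generation time is roughly the prefill time plus the number of new tokens divided by the decode throughput, and that the energy column can be read as a plain tokens-per-kWh ratio.

```python
from dataclasses import dataclass

JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 MJ


@dataclass(frozen=True)
class BenchmarkRow:
    """Hypothetical container for one leaderboard row (names are illustrative)."""

    experiment: str
    prefill_s: float              # "Prefill (s)" column
    decode_tokens_per_s: float    # "Decode (tokens/s)" column
    memory_mb: float              # "Memory (MB)" column
    energy_tokens_per_kwh: float  # "Energy (tokens/kWh)" column

    def seconds_per_decoded_token(self) -> float:
        # Throughput inverted into a per-token decode latency.
        return 1.0 / self.decode_tokens_per_s

    def joules_per_token(self) -> float:
        # Assumption: the energy column is a simple tokens-per-kWh ratio.
        return JOULES_PER_KWH / self.energy_tokens_per_kwh

    def estimated_generation_seconds(self, new_tokens: int) -> float:
        # Assumption: total time is about prefill + new_tokens / decode throughput.
        return self.prefill_s + new_tokens / self.decode_tokens_per_s


# Values copied from the row above; the model name is not available in the source.
row = BenchmarkRow(
    experiment="4bit-gptq-exllama-v2-eager",
    prefill_s=0.087,
    decode_tokens_per_s=108.613,
    memory_mb=15763.828,
    energy_tokens_per_kwh=1063016.711,
)

if __name__ == "__main__":
    print(f"decode latency: {row.seconds_per_decoded_token() * 1000:.2f} ms/token")
    print(f"energy: {row.joules_per_token():.2f} J/token")
    print(f"estimated time for 256 new tokens: {row.estimated_generation_seconds(256):.2f} s")
```

These derived figures are only as reliable as the assumptions above. In particular, the estimate ignores any dependence of decode speed on sequence length or batch size.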