"Not all quantized models perform well." Serving frameworks: Ollama uses the NVIDIA GPU; llama.cpp uses the CPU with AVX & AMX.
v1k
xbruce22
AI & ML interests
None yet
Recent Activity
liked a model about 3 hours ago
EnsueAI/DeepSeek-V4-Flash-Base-INT4
liked a model 10 days ago
google/magenta-realtime-2
liked a model about 1 month ago
antirez/deepseek-v4-gguf