FP8-dynamic, FP8-block, NVFP4, and INT4 versions of nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B
Inference Optimization
community
AI & ML interests
None defined yet.
FP8-dynamic, FP8-block, NVFP4, INT4, INT8 versions of Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking Models
inference-optimization/Qwen3-Next-80B-A3B-Instruct
Text Generation • 81B • Updated • 2
inference-optimization/Qwen3-Next-80B-A3B-Instruct-FP8
Text Generation • 81B • Updated • 4
inference-optimization/Qwen3-Next-80B-A3B-Instruct-FP8-block
Text Generation • 80B • Updated • 101
inference-optimization/Qwen3-Next-80B-A3B-Instruct-FP8-dynamic
Text Generation • 80B • Updated • 126
models
47
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8
Text Generation • 32B • Updated • 6
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
Text Generation • 32B • Updated • 6
inference-optimization/Qwen3-Next-80B-A3B-Thinking-FP8
Text Generation • 81B • Updated • 5
inference-optimization/Qwen3-Next-80B-A3B-Thinking
Text Generation • 81B • Updated • 2
inference-optimization/Qwen3-Next-80B-A3B-Instruct-FP8
Text Generation • 81B • Updated • 4
inference-optimization/Qwen3-Next-80B-A3B-Instruct
Text Generation • 81B • Updated • 2
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-quantized.w4a16
6B • Updated • 13
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4
18B • Updated • 46
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8-dynamic
32B • Updated • 75
inference-optimization/Qwen3-Next-80B-A3B-Thinking-FP8-block
Text Generation • 80B • Updated • 61
datasets
0
None public yet