Quantization-Aware Distillation for NVFP4 Inference Accuracy Recovery [pdf]

(research.nvidia.com)

2 points | by gmays 8 hours ago

No comments yet.