FairyFuse: Multiplication-Free LLM Inference on CPUs via Fused Ternary Kernels

(arxiv.org)

9 points | by PaulHoule 6 hours ago

No comments yet.