Pollux – a natively vector quantized LLM with 0.76 bits per parameter

(github.com)

1 point | by pollux_llm 4 hours ago

1 comment