Dispersion loss counteracts embedding condensation in small language models

(chenliu-1996.github.io)

37 points | by E-Reverance 13 hours ago

8 comments