Adaptive speculative decoding: picking draft lengths at runtime

(fergusfinn.com)

2 points | by hasheddan 8 hours ago

No comments yet.