LocateAnything: Fast Vision-Language Grounding with Parallel Box Decoding

(research.nvidia.com)

2 points | by gmays 7 hours ago

No comments yet.