SynthVision: Building a 110K Synthetic Medical VQA Dataset

(huggingface.co)

3 points | by maziyar 11 hours ago

1 comment

maziyar 11 hours ago
We annotated 119K medical images with two frontier VLMs (Qwen 3.5 and Kimi K2.5), cross-validated their annotations at 93% agreement, and produced 110K training records, all for under $500. Fine-tuning three small models (2-3B parameters) improved results on every benchmark; the best model gains +15.0% in average exact match. Everything is open-sourced: datasets, adapters, and code.
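
For readers wondering what cross-validating two annotators could look like in practice, here is a minimal sketch in Python. It is not the pipeline from the post: the record fields (`answer_a`, `answer_b`), the helper names, and the choice of normalized exact match as the agreement criterion are all assumptions made for illustration.

```python
import string


def normalize(answer: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(answer.lower().translate(table).split())


def cross_validate(records):
    """Keep records where both annotators agree after normalization.

    Returns the kept records and the agreement rate over all records.
    The field names 'answer_a' and 'answer_b' are hypothetical.
    """
    kept = [
        r for r in records
        if normalize(r["answer_a"]) == normalize(r["answer_b"])
    ]
    agreement = len(kept) / len(records) if records else 0.0
    return kept, agreement


# Toy data, invented for this example only.
records = [
    {"image_id": "img-001", "answer_a": "Pneumonia.", "answer_b": "pneumonia"},
    {"image_id": "img-002", "answer_a": "Left lung", "answer_b": "left  lung"},
    {"image_id": "img-003", "answer_a": "CT", "answer_b": "MRI"},
    {"image_id": "img-004", "answer_a": "Yes", "answer_b": "yes."},
]

kept, agreement = cross_validate(records)
print(f"kept {len(kept)} of {len(records)} records, agreement {agreement:.0%}")
```

If the 93% figure is a per-image agreement rate, then 119K x 0.93 is roughly 110.7K, which is in line with the 110K records reported, though the post does not say that this is how the number arises.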