Nice! Interesting to see LlamaParse topping the leaderboard. We've been evaluating LiteParse for document parsing in Completeflow.ai.
I hope LiteParse is working well for you! Funnily enough, since it doesn't parse to markdown, it can't be benchmarked with ParseBench (that tool targets a slightly different use case, with real-time agents etc.)
Built a benchmark to evaluate how well document parsers work on a dataset of 2,000 manually annotated PDFs, evaluating across multiple dimensions: charts, tables, text styling, text correctness, and attribution.
The benchmark evaluates performance on full pages (not selected parts of pages), and compares different OSS, frontier-model, and commercial approaches.
For transparency, it is available as an HF leaderboard.
Paper: https://arxiv.org/abs/2604.08538
This is cool. Does it consider frontier VLMs?
Yes, it evaluates parsing with frontier models from all 3 major providers (Google, Anthropic, and OpenAI). It is also easy to extend to evaluate new models (the code and dataset are available).
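For readers curious what "extend to evaluate new models" could look like in practice, here is a minimal hypothetical sketch; it is not ParseBench's actual API (the function names, dimension keys, and the exact-match scoring are all assumptions for illustration), just the general shape of averaging a parser's per-page scores across the five annotated dimensions.

```python
# Hypothetical sketch of per-dimension benchmark scoring.
# NOT ParseBench's real API: names, dict schema, and the placeholder
# exact-match metric are assumptions for illustration only.
from statistics import mean

DIMENSIONS = ["charts", "tables", "text_styling", "text_correctness", "attribution"]

def score_page(prediction: dict, annotation: dict) -> dict:
    """Compare one parsed page against its manual annotation.

    Placeholder metric: 1.0 on exact match per dimension, else 0.0.
    A real benchmark would use richer per-dimension metrics
    (e.g. table-structure similarity, text edit distance).
    """
    return {
        dim: 1.0 if prediction.get(dim) == annotation.get(dim) else 0.0
        for dim in DIMENSIONS
    }

def evaluate(parser, dataset) -> dict:
    """Run a parser over (page, annotation) pairs; average each dimension."""
    per_dim = {dim: [] for dim in DIMENSIONS}
    for page, annotation in dataset:
        scores = score_page(parser(page), annotation)
        for dim, value in scores.items():
            per_dim[dim].append(value)
    return {dim: mean(values) for dim, values in per_dim.items()}
```

Plugging in a new model then just means wrapping it as a `parser(page) -> dict` callable and rerunning `evaluate` over the annotated dataset.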