A clearly LLM-written piece about how frontier models are struggling to get past 76% accuracy on their benchmarks (they call it a "wall") in OCR tasks. That is, feeding the model a picture of a document and asking it to extract the text.
The benchmark site is here: https://www.idp-leaderboard.org/
They say some specialist models get better results on their benchmarks (Nanonets OCR-3 at 85.9%).
I linked your board already. You are right.
Do you know a benchmark that tries to measure business accuracy?
Most benchmarks focus on the character level.
IDP software typically uses metadata to fill in information that is either not readable or missing in the document, e.g. extracting the VAT ID and using it to map the street, house number, ZIP code and city.
I think there are many models and many providers. However, it's really difficult to measure accuracy at the process level rather than just at the character level.
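To make the difference concrete, here is a rough sketch of the two ways of scoring. None of this comes from the leaderboard or any vendor: the function names, the field names and the exact-match rule are all my own assumptions, and a real process-level metric would likely need fuzzier rules per field.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_accuracy(truth: str, predicted: str) -> float:
    """Character-level score: 1 minus edit distance over truth length, floored at 0."""
    if not truth:
        return 1.0 if not predicted else 0.0
    return max(0.0, 1.0 - levenshtein(truth, predicted) / len(truth))


def field_accuracy(truth: dict, predicted: dict) -> float:
    """Field-level score (assumed rule): a field counts only if it matches exactly."""
    if not truth:
        return 1.0
    hits = sum(1 for key, value in truth.items() if predicted.get(key) == value)
    return hits / len(truth)
```

With a made-up record of VAT ID "DE123456789", ZIP "10115" and city "Berlin", getting a single digit of the VAT ID wrong still scores about 96% on the character level (1 edit over 24 characters when the fields are joined with spaces), but only 2 of 3 fields are usable, and the wrong VAT ID is exactly the field the downstream lookup depends on.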
https://idp-software.com/vendors/nanonets/
I saw that the leaderboard is hosted by Nanonets. Totally fine with me. So you might be the expert on Nanonets: let me know if you want to update your post on my site.
I mean this is for handwritten OCR.. do humans do better?
I've been using Qwen3.6 to OCR stuff, primarily receipts, and it frequently reads mangled/faded/folded documents accurately that I have a hard time with... including handwritten stuff (though that's not flawless).