AI & ML interests
VLMs and long context, document processing and understanding, confidence, calibration, alignment, and decision making.
Recent Activity
Papers
GutenOCR: A Grounded Vision-Language Front-End for Documents
PubMed-OCR: PMC Open Access OCR Annotations
Organization Card
Data and models for optical character recognition
- PubMed-OCR: PMC Open Access OCR Annotations
  Paper • 2601.11425 • Published • 12
- GutenOCR: A Grounded Vision-Language Front-End for Documents
  Paper • 2601.14490 • Published • 37
- rootsautomation/TABMEpp
  Viewer • Updated • 122k • 93 • 5
- rootsautomation/pubmed-ocr
  Viewer • Updated • 1.55M • 3.51k • 70
datasets 13
rootsautomation/pubmed-ocr
Viewer • Updated • 1.55M • 3.51k • 70
rootsautomation/TABMEpp
Viewer • Updated • 122k • 93 • 5
rootsautomation/websrc-test
Viewer • Updated • 40.4k • 35
rootsautomation/websrc
Viewer • Updated • 360k • 945 • 7
rootsautomation/RICO-ScreenAnnotation
Viewer • Updated • 22.1k • 59 • 12
rootsautomation/RICO-ScreenAnnotation-f
Viewer • Updated • 22.1k • 56 • 7
rootsautomation/RICO-ScreenQA-Complex
Viewer • Updated • 11.8k • 330 • 16
rootsautomation/RICO-ScreenQA-Short
Viewer • Updated • 86k • 556 • 4
rootsautomation/RICO-ScreenQA
Viewer • Updated • 86k • 177 • 11
rootsautomation/RICO-Screen2Words
Viewer • Updated • 22.4k • 135 • 9