The early versions were basically OCR + prompting, but I kept running into the same structural issues. Most of the work ended up being in post-processing and block detection rather than the OCR layer itself.
I'm still working on better handling of diagrams and more complex chemistry-style layouts.