I’m looking for recommendations for OCR or HTR software that does not use AI or, if it does, has strong privacy protections.
If you’re technically inclined, try running Surya, which was incubated by Mozilla Builders in 2024.[1] I’ve seen folks recommend Mitta OCR, OpenParse, and PDFPlumber, but I haven’t used those myself.
If you’re okay with an “okay” solution, try Simon Willison’s Tesseract OCR in Browser.
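If you’d rather keep everything on your own machine, Tesseract also has a command-line program you can run offline. Here’s a rough sketch, assuming the `tesseract` binary is already installed and on your PATH; the function names and the `scan.png` filename are just placeholders I made up, not part of any of the tools above.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argv list for a local Tesseract run (nothing is uploaded)."""
    # Standard CLI form: tesseract <image> <output base> -l <language>
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_locally(image_path, lang="eng"):
    """Run Tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    image_path = Path(image_path)
    # Tesseract appends ".txt" to the output base on its own.
    output_base = image_path.with_suffix("")
    subprocess.run(
        build_tesseract_command(image_path, output_base, lang), check=True
    )
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

Calling `ocr_locally("scan.png")` would write `scan.txt` next to the image and return its contents. Expect mixed results on handwriting, since Tesseract is mostly tuned for printed text.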
[1] Disclaimer: A FOSS project I co-develop was also once incubated by Mozilla Builders in 2020, but other than that, no real affiliation with Vik & his startup.