Offline Handwriting Recognition

Can Paperwork be used to recognize handwritten text from an input image file or PDF?


Basically, it’s also the wrong “approach”.

Handwriting recognition on scanned documents is inherently more difficult than on vector-based input, e.g. from devices with a stylus, because these devices DO record the path along which the handwriting was created. That is very much context that is missing from a bitmap scan.
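To make the difference concrete, here is a minimal sketch in Python. The stroke format (ordered lists of `(x, y)` points) and the `rasterize` helper are assumptions made up for illustration, not any real device's or library's API. Two different ways of drawing a "+" end up as the same bitmap, so the order and direction of the pen movement cannot be recovered from the scan.

```python
# Hypothetical stroke format: each stroke is an ordered list of (x, y) pen positions.

def rasterize(strokes, width, height):
    """Flatten strokes into a bitmap; point order and stroke order are discarded."""
    grid = [[0] * width for _ in range(height)]
    for stroke in strokes:
        for x, y in stroke:
            grid[y][x] = 1
    return grid

# A "+" drawn left-to-right, then top-to-bottom.
plus_a = [[(0, 1), (1, 1), (2, 1)], [(1, 0), (1, 1), (1, 2)]]
# The same shape drawn bottom-to-top first, then right-to-left.
plus_b = [[(1, 2), (1, 1), (1, 0)], [(2, 1), (1, 1), (0, 1)]]

bitmap_a = rasterize(plus_a, 3, 3)
bitmap_b = rasterize(plus_b, 3, 3)

same_strokes = plus_a == plus_b      # the stylus data differs
same_bitmap = bitmap_a == bitmap_b   # the "scans" are identical
```

A recognizer working on the stylus data can use stroke order and direction as extra features; one working on the scan only ever sees `bitmap_a`.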


Short answer: As @yacc143 said, it can’t.

Long answer: Paperwork uses Tesseract 4 for the OCR, and Tesseract doesn’t support handwritten text.

@yacc143 : I don’t think having the pathway or not matters anymore. With deep learning, it looks like the MNIST challenge (single handwritten digits) is pretty much a solved problem nowadays. My guess is that the problem currently is more about segmenting the words/characters. I’m not sure how well that works at the moment.
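To illustrate why segmentation is the hard part, here is a deliberately naive sketch in Python: it splits a binary bitmap into characters wherever a column contains no ink. The `segment_columns` function and the tiny bitmaps are assumptions for illustration only; this is not how Tesseract or Paperwork work.

```python
def segment_columns(bitmap):
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    width = len(bitmap[0])
    # A column has ink if any row has a non-zero pixel in it.
    ink = [any(row[x] for row in bitmap) for x in range(width)]
    segments = []
    start = None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # a new glyph begins
        elif not has_ink and start is not None:
            segments.append((start, x))    # a blank column ends the glyph
            start = None
    if start is not None:
        segments.append((start, width))    # glyph runs to the right edge
    return segments

# Two glyphs separated by a blank column, as in printed text.
separated = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0],
]
# The same glyphs joined by a connecting stroke, as in cursive handwriting.
touching = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 0],
]

print(segment_columns(separated))  # [(0, 2), (3, 4)]
print(segment_columns(touching))   # [(0, 4)]
```

A single connecting stroke is enough to merge two characters into one segment, which is the usual situation in cursive writing.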

Ah, the certainties of my childhood, when I played with (and coded for) an 8-bit computer, about what a computer can and cannot do are being struck down. Where will this end? With self-driving cars? Computers calling businesses on the phone and inquiring about changed opening times on their own? Ah, I’m living in a science fiction dystopia ;(