Offline Handwriting Recognition

Can Paperwork be used to recognize handwritten text from an input image file or PDF?


Basically, it’s also the wrong “approach”.

Handwriting recognition on scanned documents is inherently more difficult than on vector-based input, e.g. from devices with a stylus, because those devices DO record the path along which the handwriting was created. That path is very much context that is missing from a bitmap scan.
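To illustrate the point about the missing path, here is a minimal sketch. `PenSample`, `stroke`, and `rasterize` are hypothetical names made up for this example and do not come from Paperwork or any stylus API. It draws the same "+" in two different stroke orders and shows that the difference disappears once the strokes are flattened into a bitmap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PenSample:
    """One stylus sample: position plus a (hypothetical) timestamp in ms."""
    x: int
    y: int
    t_ms: int


def stroke(points, t0):
    """Build one stroke from (x, y) points, sampled every 10 ms from t0."""
    return [PenSample(x, y, t0 + 10 * i) for i, (x, y) in enumerate(points)]


def rasterize(strokes, width, height):
    """Flatten strokes into a bitmap; order, direction and timing are lost."""
    bitmap = [[0] * width for _ in range(height)]
    for one_stroke in strokes:
        for sample in one_stroke:
            bitmap[sample.y][sample.x] = 1
    return bitmap


h_points = [(x, 2) for x in range(5)]  # horizontal bar of a "+"
v_points = [(2, y) for y in range(5)]  # vertical bar of a "+"

# Same shape, drawn in a different stroke order and direction.
plus_a = [stroke(h_points, 0), stroke(v_points, 100)]
plus_b = [stroke(v_points[::-1], 0), stroke(h_points[::-1], 100)]

bitmap_a = rasterize(plus_a, 5, 5)
bitmap_b = rasterize(plus_b, 5, 5)

print(plus_a == plus_b)      # False: the stylus data still differs
print(bitmap_a == bitmap_b)  # True: the "scan" is identical
```

A recognizer working on the stylus data can use stroke order and direction as extra features; a recognizer working on the scan only ever sees the final pixels.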


Short answer: As @yacc143 said, it can’t.

Long answer: Paperwork uses Tesseract 4 for the OCR, and Tesseract doesn’t support handwritten text.

@yacc143: I don’t think having the pathway or not matters anymore. With deep learning, it looks like the MNIST challenge (single handwritten digits) is pretty much a solved problem nowadays. My guess is that the problem currently is more about segmenting the words/characters. I’m not sure how well that works at the moment.
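To make the segmentation problem concrete, here is a rough sketch of a classical baseline: splitting a line of text wherever a pixel column contains no ink. This is not what Tesseract or Paperwork does, and `segment_columns` and `to_bitmap` are names made up for this example. It only shows why the approach works for separated glyphs and breaks down as soon as characters touch, as they do in cursive handwriting.

```python
def segment_columns(bitmap):
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    width = len(bitmap[0]) if bitmap else 0
    has_ink = [any(row[c] for row in bitmap) for c in range(width)]

    segments = []
    start = None
    for c, ink in enumerate(has_ink):
        if ink and start is None:
            start = c                      # a new glyph begins
        elif not ink and start is not None:
            segments.append((start, c))    # blank column ends the glyph
            start = None
    if start is not None:
        segments.append((start, width))    # glyph runs to the right edge
    return segments


def to_bitmap(rows):
    """Turn rows of '#' (ink) and '.' (background) into a 0/1 bitmap."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


# Three separated glyphs: blank columns between them.
printed = to_bitmap([
    "#.#.##",
    "#.#..#",
    "#.#.##",
])

# Same glyphs joined by a connecting stroke along the baseline.
cursive = to_bitmap([
    "#.#.##",
    "#.#..#",
    "######",
])

print(segment_columns(printed))  # [(0, 1), (2, 3), (4, 6)]
print(segment_columns(cursive))  # [(0, 6)]
```

With the connecting stroke there is no blank column left, so the whole word comes back as a single segment, and a per-character classifier of the MNIST kind has nothing usable to work on.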