Accuracy rates can be measured in several ways, and how they are measured can greatly affect the reported accuracy rate.
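
As a minimal sketch of why the choice of measure matters, the following Python scores the same output at the character level and at the word level using edit distance. The function names and sample strings are illustrative assumptions, not taken from any particular OCR evaluation tool.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def accuracy(truth, output):
    """1 minus the error rate, where errors are edits relative to the truth."""
    if not truth:
        return 1.0 if not output else 0.0
    return max(0.0, 1.0 - edit_distance(truth, output) / len(truth))


def char_accuracy(truth, output):
    return accuracy(truth, output)


def word_accuracy(truth, output):
    return accuracy(truth.split(), output.split())


# Hypothetical sample: the letter "w" was misread as "vv".
TRUTH = "the quick brown fox"
OCR_OUTPUT = "the quick brovvn fox"
print(round(char_accuracy(TRUTH, OCR_OUTPUT), 3))
print(word_accuracy(TRUTH, OCR_OUTPUT))
```

On this sample the single misread letter costs roughly 11% at the character level but 25% at the word level, so the same output can be reported with two quite different accuracy figures.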

This technique can be problematic if the document contains words not in the lexicon, like proper nouns.
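
The problem with out-of-lexicon words can be illustrated with a small sketch. It assumes a hypothetical lexicon of number words and uses `difflib.get_close_matches` from the Python standard library as a stand-in for whatever matching a real OCR engine performs; the function name `constrain_to_lexicon` is likewise an assumption.

```python
import difflib

# Hypothetical lexicon: only number words are considered valid output.
LEXICON = ["one", "two", "three", "four", "five", "six", "seven",
           "eight", "nine", "ten", "hundred", "thousand"]


def constrain_to_lexicon(token, lexicon=LEXICON, cutoff=0.6):
    """Return the closest lexicon word, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(token.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(constrain_to_lexicon("thousanc"))  # a misread word is repaired
print(constrain_to_lexicon("Seville"))   # a proper noun is wrongly "corrected"
```

With the default cutoff, the misread "thousanc" is repaired to "thousand", but the correctly read proper noun "Seville" is snapped to "seven". Raising the cutoff avoids the false correction at the price of rejecting the word outright.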

This relies on the input glyph being correctly isolated from the rest of the image, and on the stored glyph being in a similar font and at the same scale. Reading the amount line of a cheque, which is always a written-out number, is an example where using a smaller dictionary can greatly increase recognition rates.
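
A pixel-by-pixel comparison against stored glyphs (often called matrix matching) can be sketched as follows. The 3x3 templates and the function names are toy assumptions chosen for readability; real glyph images are far larger.

```python
# Hypothetical stored glyphs, one string per pixel row ("1" = ink).
TEMPLATES = {
    "I": ["010",
          "010",
          "010"],
    "L": ["100",
          "100",
          "111"],
    "T": ["111",
          "010",
          "010"],
}


def pixel_agreement(glyph, template):
    """Fraction of pixels that agree; both images must have the same size."""
    same_shape = len(glyph) == len(template) and all(
        len(g) == len(t) for g, t in zip(glyph, template))
    if not same_shape:
        raise ValueError("glyph and template must be at the same scale")
    total = sum(len(row) for row in glyph)
    same = sum(g == t
               for grow, trow in zip(glyph, template)
               for g, t in zip(grow, trow))
    return same / total


def match_glyph(glyph, templates=TEMPLATES):
    """Return the label of the stored glyph that agrees on the most pixels."""
    return max(templates, key=lambda label: pixel_agreement(glyph, templates[label]))


# An isolated "L" with one pixel of noise in the bottom-right corner.
noisy_l = ["100",
           "100",
           "110"]
print(match_glyph(noisy_l))
```

The size check makes the dependence on scale explicit: a glyph that was cropped badly or scanned at a different size cannot be compared at all without first being resampled.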

Early optical character recognition may be traced to technologies involving telegraphy and creating reading devices for the blind. For example, recognizing entire words from a dictionary is easier than trying to parse individual characters from script. Once a printed document has been scanned and converted, its text can be extracted and pasted into another application or exported to Microsoft Office for editing as a text file.

This means that if the software does not achieve the desired level of accuracy, a user can be notified for manual review. Software such as Cuneiform and Tesseract uses a two-pass approach to character recognition.
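
The manual-review step can be sketched as a simple threshold, assuming the engine reports a confidence score between 0 and 1 for each recognized word. The scores, the threshold, and the function name below are hypothetical.

```python
def flag_for_review(words, threshold=0.90):
    """Split (text, confidence) pairs into accepted words and words to review."""
    accepted, review = [], []
    for text, confidence in words:
        if confidence >= threshold:
            accepted.append(text)
        else:
            review.append(text)
    return accepted, review


# Hypothetical engine output: (recognized text, confidence).
recognized = [("Invoice", 0.99), ("t0tal", 0.62), ("amount", 0.95)]
accepted, review = flag_for_review(recognized)
print(accepted)
print(review)
```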

Feature extraction reduces the dimensionality of the representation and makes the recognition process computationally efficient.
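
One simple way to see the reduction in dimensionality is zoning, where a glyph image is divided into a grid of cells and each cell is summarized by its ink density. This is only one of many possible feature sets and is shown here as an assumption, not as the method of any specific engine; the function name and the toy glyph are illustrative.

```python
def zoning_features(image, zones=2):
    """Summarize a square binary image as ink density per zone."""
    size = len(image)
    if size % zones or any(len(row) != size for row in image):
        raise ValueError("image must be square and divisible into zones")
    cell = size // zones
    features = []
    for zr in range(zones):
        for zc in range(zones):
            ink = sum(image[r][c]
                      for r in range(zr * cell, (zr + 1) * cell)
                      for c in range(zc * cell, (zc + 1) * cell))
            features.append(ink / (cell * cell))
    return features


# A toy 4x4 "L": 16 pixel values are reduced to 4 features.
GLYPH_L = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
print(zoning_features(GLYPH_L))  # [0.5, 0.0, 0.75, 0.5]
```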

Advanced systems capable of producing a high degree of recognition accuracy for most fonts are now common, and they support a variety of digital image file formats as input.


Tesseract uses its dictionary to influence the character segmentation step, for improved accuracy. Users would need to learn how to write these special glyphs. These features are compared with an abstract vector-like representation of a character, which might reduce to one or more glyph prototypes. This additional information can make the end-to-end process more accurate.
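
Comparing extracted features with stored prototypes can be sketched as a nearest-prototype search. The four-element feature vectors, the labels, and the function name are hypothetical, and Euclidean distance is just one plausible choice of comparison.

```python
import math

# Hypothetical prototypes: each character has one or more feature vectors.
PROTOTYPES = {
    "L": [[0.5, 0.0, 0.75, 0.5]],
    "T": [[0.75, 0.75, 0.25, 0.25],
          [1.0, 1.0, 0.25, 0.25]],  # a second prototype, e.g. a bolder font
}


def classify(features, prototypes=PROTOTYPES):
    """Return the label of the prototype nearest to the feature vector."""
    best_label, best_distance = None, math.inf
    for label, vectors in prototypes.items():
        for vector in vectors:
            distance = math.dist(features, vector)
            if distance < best_distance:
                best_label, best_distance = label, distance
    return best_label


print(classify([0.5, 0.1, 0.7, 0.4]))
```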

Optical character recognition

This technique works best with typewritten text and does not work well when new fonts are encountered.

Computer recognition of visual text.

Handwriting movement analysis can be used as input to handwriting recognition. Recognition of cursive text is an active area of research, with recognition rates even lower than those for hand-printed text. LexisNexis was one of the first customers, and bought the program to upload legal paper and news documents onto its nascent online databases.

Kurzweil decided that the best application of this technology would be to create a reading machine for the blind, which would allow blind people to have a computer read text to them out loud.

Early versions needed to be trained with images of each character, and worked on one font at a time. Higher rates of recognition of general cursive script will likely not be possible without the use of contextual or grammatical information.

Segmentation of fixed-pitch fonts is accomplished relatively simply by aligning the image to a uniform grid based on where vertical grid lines will least often intersect black areas. Knowledge of the grammar of the language being scanned can also help determine if a word is likely to be a verb or a noun, for example, allowing greater accuracy.
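
The grid alignment for fixed-pitch fonts can be sketched as a search over horizontal offsets. This sketch assumes the character pitch (cell width in pixels) is already known and that the image is a list of pixel rows with 1 for black; the function names and the tiny test image are illustrative.

```python
def best_offset(image, pitch):
    """Return the grid offset whose vertical lines cross the least ink."""
    ink = [sum(column) for column in zip(*image)]  # black pixels per column

    def cost(offset):
        return sum(ink[x] for x in range(offset, len(ink), pitch))

    return min(range(pitch), key=cost)


def split_cells(image, pitch):
    """Cut the image into character cells along the best-aligned grid."""
    offset = best_offset(image, pitch)
    width = len(image[0])
    # Pixels to the left of the first grid line are treated as margin.
    return [[row[x:x + pitch] for row in image]
            for x in range(offset, width, pitch)]


# Two pixel rows holding three 3-pixel-wide glyphs, each preceded by a blank column.
IMAGE = [[int(ch) for ch in row] for row in (
    "0011101110111",
    "0010100100110",
)]
print(best_offset(IMAGE, 4))      # 1
print(len(split_cells(IMAGE, 4))) # 3
```

If the pitch is not known in advance, the same cost could be evaluated over a range of candidate pitches, although multiples of the true pitch would then need to be ruled out.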