Office Lens: Teaching the World’s Computers to Read

Cosmoso takes a look at the Office Lens app, announced earlier this year.

Office Lens turns your phone into an ultra-productive scanner – but how? Optical character recognition (OCR) is the field of developing software that can read printed and handwritten text. OCR is used in everything from handwriting analysis on your cellphone to the mail-sorting workhorses at the USPS.
How does the human mind understand handwriting? Some jobs require humans to recognize and decipher handwriting, like US Postal Service workers, who process roughly 30 million handwritten envelopes a day. Some of this process is automated, and the day is coming when a computer might be even faster than humans at decoding nigh-illegible scratch work. Most of us are used to learning the limitations of a computer system and talking to devices in whatever language they require. Here the roles are reversed: computers have to learn the language humans use reflexively, and it’s tough to teach a computer to see what we do in handwritten messages!


“I do not know where family doctors acquired illegibly perplexing handwriting; nevertheless, extraordinary pharmaceutical intellectuality, counterbalancing indecipherability, transcendentalizes intercommunications’ incomprehensibleness.”

Linguist Dmitri Borgmann wrote the above sentence about doctors’ handwriting, which is notoriously difficult to read. You might notice a pattern: each word is a single letter longer than the previous word, all the way up from a single character to twenty.

How does OCR work?

Even if you only take it one letter at a time, the demands of OCR are high because everyone has a different way of writing each letter. Even with printed text, there is still a vast assortment of typefaces and fonts, so each letter can appear in a huge variety of forms.

There are two approaches to solving this problem. The computer can try to recognize characters as a whole, using pattern recognition, or it can detect and identify the individual lines and strokes the letters are made from, through a process called feature detection.

Pattern Recognition

The pattern recognition approach has been around since the 1960s, when a special font called OCR-A was invented for high-volume processing of bank checks and IRS paperwork. OCR-A is a monospace font, meaning every character has exactly the same width. Check-printing machines all used that font, and in the days before personal computers, mechanical OCR equipment was able to recognize it. The OCR of the ’60s solved the problem by standardization. Of course, this technology is pretty useless for recognizing other fonts, and it’s even less useful for handwriting! The next development in pattern recognition was to expand the database of recognized fonts to include Times, Helvetica, Courier, and so on. By the ’90s computers could recognize printed fonts, but odds were slim they would recognize one not in their canon, even if it was very similar.
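To make the idea concrete, here is a minimal sketch of whole-character pattern matching in Python. It is only an illustration: the 5x5 bitmaps, the `TEMPLATES` table and the function names are invented for this example and have nothing to do with the real OCR-A glyphs or any actual check-reading machine.

```python
# Toy pattern recognition: compare an unknown glyph against stored templates,
# pixel by pixel, and pick the template that agrees on the most pixels.
# The 5x5 bitmaps are made up for illustration; they are not OCR-A.

TEMPLATES = {
    "I": [
        "#####",
        "..#..",
        "..#..",
        "..#..",
        "#####",
    ],
    "L": [
        "#....",
        "#....",
        "#....",
        "#....",
        "#####",
    ],
    "T": [
        "#####",
        "..#..",
        "..#..",
        "..#..",
        "..#..",
    ],
}


def match_score(glyph, template):
    """Fraction of pixels on which the glyph and the template agree."""
    total = 0
    same = 0
    for glyph_row, template_row in zip(glyph, template):
        for g, t in zip(glyph_row, template_row):
            total += 1
            same += (g == t)
    return same / total


def recognize(glyph):
    """Return (best_letter, score) for the closest stored template."""
    scores = {letter: match_score(glyph, tpl) for letter, tpl in TEMPLATES.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# A "T" with one stray pixel still matches the stored T better than I or L.
noisy_t = [
    "#####",
    "..#..",
    "..#..",
    ".##..",
    "..#..",
]
print(recognize(noisy_t))
```

The sketch also shows why this approach depends on standardization: the comparison only means something when the unknown glyph has the same size, position and overall shape as a stored template, which is exactly what a single mandated font guarantees.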

Feature detection

Intelligent character recognition works a lot better than pattern recognition, on a wider range of fonts and handwriting styles. Feature detection breaks down the character into angles and connections between strokes. For example, almost every capital A has the same number of strokes and intersections between those strokes, so most of the time feature detection can recognize a capital A in any style. The program doesn’t look for the complete pattern of an A. Instead, it analyzes component features, which takes a faster computer but is far more effective at recognizing unfamiliar text. The omnifont OCR code used in Office Lens uses feature detection rather than pattern recognition, incorporating a neural-network-style design that imitates the human brain’s decentralized pattern-detecting ability.
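As a rough illustration of what a "feature" can be, the Python sketch below counts the enclosed regions (holes) in a glyph. This is a deliberately simple stand-in chosen for the example, not the stroke-and-intersection analysis described above and certainly not Office Lens's actual code; the bitmaps and the `count_holes` helper are hypothetical.

```python
# Toy feature detection: count the enclosed regions ("holes") in a glyph.
# '#' is ink, '.' is background. A background region counts as a hole when
# it never reaches the edge of the bitmap.

def count_holes(glyph):
    """Count background regions that are fully enclosed by ink."""
    rows, cols = len(glyph), len(glyph[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(r, c):
        """Mark one connected background region; report if it touches the edge."""
        stack = [(r, c)]
        seen[r][c] = True
        touches_edge = False
        while stack:
            y, x = stack.pop()
            if y in (0, rows - 1) or x in (0, cols - 1):
                touches_edge = True
            for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                if 0 <= ny < rows and 0 <= nx < cols:
                    if glyph[ny][nx] == "." and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
        return touches_edge

    holes = 0
    for r in range(rows):
        for c in range(cols):
            if glyph[r][c] == "." and not seen[r][c]:
                if not flood(r, c):
                    holes += 1
    return holes


# Two capital As drawn in different styles and at different sizes.
BLOCK_A = [
    ".###.",
    "#...#",
    "#####",
    "#...#",
    "#...#",
]
POINTED_A = [
    "...#...",
    "..#.#..",
    ".#...#.",
    ".#####.",
    "#.....#",
    "#.....#",
]
LETTER_B = [
    "####.",
    "#...#",
    "####.",
    "#...#",
    "####.",
]

print(count_holes(BLOCK_A), count_holes(POINTED_A), count_holes(LETTER_B))
```

Both As yield the same feature value even though their pixels and dimensions differ, which a pixel-by-pixel comparison could not handle. A hole count alone cannot tell an A from a D or an O, so a real recognizer would combine many such features.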

Handwriting Recognition: The Next Level

Laser-printed fonts are pretty easy to decode compared to hand-scrawled writing. Human brains outperform even the best CPU, and how they do it is still not completely understood. Chances are, you would be able to figure out what my note says no matter how terrible my writing looks. Humans use a combination of pattern recognition and feature extraction. Neurologists also believe, quite understandably, that knowledge about the author and the intended meaning of the message come into play, too. A computer might not see why a note from a spouse asking someone to “watch the kids” should definitely not say “whack to bits”, but hopefully we can all trust a spouse to figure it out.

Computers actually do have context for the tasks we ask them to do, though. A mail-sorting computer knows there is a format to the way our addresses are written. By operating only within a relatively small set of possibilities, a computer can see what’s on each line and make some educated guesses about numbers, characters and abbreviations. The Post Office strongly requests we all write zip codes legibly, too, to give the computers a chance.
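Here is a small Python sketch of that kind of educated guess, assuming the recognizer already knows it is looking at a five-digit ZIP code. The look-alike table and the `read_zip` function are assumptions made up for this example; they are not the Postal Service's actual rules.

```python
# Toy use of context: a ZIP code must be five digits, so characters that
# look like digits can safely be read as digits in that position.
# The look-alike table is an illustrative assumption, not a USPS rule set.

LOOKALIKE_DIGITS = {
    "O": "0", "o": "0",
    "I": "1", "l": "1",
    "Z": "2",
    "S": "5",
    "B": "8",
}


def read_zip(raw):
    """Interpret a 5-character OCR guess as a ZIP code, or return None."""
    if len(raw) != 5:
        return None
    digits = "".join(LOOKALIKE_DIGITS.get(ch, ch) for ch in raw)
    if all(ch in "0123456789" for ch in digits):
        return digits
    return None


print(read_zip("1O0l2"))  # the letters O and l are reinterpreted as 0 and 1
print(read_zip("Hello"))  # cannot be a ZIP code, so nothing is returned
```

The same trick would be wrong on the name line of an envelope, where an O is almost certainly a letter. The narrow context is what makes the guess safe.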


Recognize the Recognition Tech?

You may have noticed that older forms read by OCR, like the Iowa tests back in grade school, used boxes for students to write each letter in. Those boxes are called comb fields. The idea is to encourage people to write legibly without overlapping letters.


Smartphones and tablets with handwriting recognition recognize letters as they are written by using feature recognition. A touchscreen senses the shape you draw, feature by feature, using the common order in which letters are drawn to help interpret it. The computer recognizes the character as you form it, and Office Lens actually boasts a recognition ability that rivals the human mind.
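One simple way to picture recognition from drawing order is to reduce each pen stroke to the sequence of directions the finger moved in. The Python sketch below does that for a single stroke. The `stroke_directions` function, the `KNOWN_STROKES` table and the sample coordinates are all hypothetical, and real handwriting engines use far richer features.

```python
# Toy online handwriting feature: turn a pen stroke into a list of directions.

def stroke_directions(points):
    """Reduce a stroke (list of (x, y) touch points) to U/D/L/R moves.

    Screen coordinates are assumed: x grows rightward, y grows downward.
    Consecutive repeats are collapsed, so only changes of direction remain.
    """
    directions = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # the finger did not move between samples
        if abs(dx) >= abs(dy):
            move = "R" if dx > 0 else "L"
        else:
            move = "D" if dy > 0 else "U"
        if not directions or directions[-1] != move:
            directions.append(move)
    return directions


# Hypothetical single-stroke shapes, keyed by their direction sequence.
KNOWN_STROKES = {
    ("D", "R"): "L",
    ("R", "D"): "7",
    ("D", "R", "U"): "U",
}

# A slightly wobbly capital L: mostly down, then mostly right.
wobbly_l = [(10, 10), (11, 30), (10, 50), (12, 70), (30, 71), (50, 70)]
moves = stroke_directions(wobbly_l)
print(moves, KNOWN_STROKES.get(tuple(moves)))
```

Because small wobbles do not change the dominant direction of each segment, the shaky stroke reduces to the same down-then-right sequence a neat L would produce.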

Jonathan Howard
Jonathan is a freelance writer living in Brooklyn, NY