Accuracy: 9 out of 10
Speed: 10 out of 10
Ease of use: 10 out of 10
Pros: Best-in-category accuracy and speed. Supports most input file types and offers many output file types. All well-known scanner brands are supported. Being able to edit processed pages while the rest of the document is still processing is a very efficient feature.
Cons: Text embedded within drawn images is usually not recognized. Similarly, hand-drawn or decorated text, such as that found on maps or the front pages of books, won't be recognized.
Ever wondered how big government offices digitize their decades- and centuries-old records? Scanned and handwritten documents still form a large part of the official records that governments of all countries hold. It is not just government offices; some private institutions also have to deal with huge stacks of paper documents that they need to scan and convert into editable text. This is where Optical Character Recognition (OCR) software comes into play. Run a web search for OCR software and you are almost certain to see ABBYY's FineReader Professional in the results. ABBYY has been a front runner in the field of OCR software since 1989 and came out with the first version of this software in 1993. This time they are out with ABBYY FineReader 12 Professional, and we are going to see what they have in store for us.
With ABBYY FineReader 12 they have added a great new capability: you can start fine-tuning the document immediately after hitting the upload scanned copy button. While the rest of the document is still processing, you can start working on the opening pages, for which processing is already complete.
You can convert scanned images or screenshots to editable Word documents, PDF or EPUB files, and even HTML web pages. Written in C/C++, the software loads and runs quickly. You can select one or more folders containing scanned images and FineReader 12 will convert them in batches.
The installation package is a little on the heavy side at 351 MB, and the installation itself took up 1.35 GB on our system. However, since the software does not install on the fly, as many applications do nowadays, it is easy enough to download and install. The user interface of ABBYY FineReader 12 is simple and light. The UI is divided primarily into four windows and the upper task bar. It is not as intuitive as we expected, but once you have spent some time with it you will find it quite simple to get the task done.
For our testing we decided to start with a few single-page image files using different types of fonts, all in English. One of the pages was in Old English, which we expected to give ABBYY FineReader 12 a lot of trouble, but we were happy to see that even though many of the words on the page were probably not in its dictionary, that did not prevent it from doing a good conversion of the text. As you can see below, there are only a few mistakes in the conversion. FineReader has a verification tool (Ctrl+F7) that highlights uncertainly recognized characters and words that are absent from its dictionaries, and offers alternatives for substitution. For Asian languages, it provides alternatives for ideographic characters. These words, as you can see, are highlighted in blue, and this is where you have to look for possible misses by the software.
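To make the idea of dictionary-based verification concrete, here is a minimal sketch. It is not how FineReader works internally (ABBYY does not document that here); it only illustrates the general approach of flagging words missing from a dictionary and proposing close alternatives. The function name `flag_uncertain_words`, the tiny word list and the sample text are all made up for illustration.

```python
import difflib
import re


def flag_uncertain_words(text, dictionary, max_suggestions=3):
    """Return (word, suggestions) pairs for words not found in `dictionary`.

    `dictionary` is a set of lowercase words. This is a toy stand-in for
    an OCR verification pass, not FineReader's actual algorithm.
    """
    candidates = sorted(dictionary)  # stable order for the matcher
    seen = set()
    flagged = []
    for word in re.findall(r"[A-Za-z']+", text):
        key = word.lower()
        if key in dictionary or key in seen:
            continue
        seen.add(key)
        # difflib ranks dictionary entries by string similarity
        suggestions = difflib.get_close_matches(
            key, candidates, n=max_suggestions, cutoff=0.6
        )
        flagged.append((word, suggestions))
    return flagged


if __name__ == "__main__":
    # Hypothetical dictionary and OCR output, for illustration only
    words = {"the", "knight", "rode", "to", "castle"}
    ocr_text = "The knyght rode to the castel"
    for word, suggestions in flag_uncertain_words(ocr_text, words):
        print(word, "->", suggestions)
```

A real verification tool would also use per-character recognition confidence, which a pure dictionary check like this cannot see.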
After that we decided to give it a 371-page PDF file with its pages in image form. The PDF has very clear text, and with it we wanted to measure how quickly FineReader could process the file. We started the conversion at 7:57:53 PM and it completed at 8:14:23 PM, which is 16 minutes and 30 seconds, or 990 seconds. That averages out to about 2.7 seconds per page. Each page had on average 450 words, which translates to a cool 169 words per second, or 10,000+ words per minute. Since we are crunching numbers here, let me tell you that the best recorded speed by a human stands at 170 wpm. According to some data, most OCR software performs at a rate of 2,400 wpm. Waiting a good 16 minutes while FineReader did its thing at 2-3 seconds per page felt long, but when you compare the performance of ABBYY FineReader 12 with other OCR software you are bound to be impressed.
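The throughput arithmetic is easy to check with a short script. The sketch below takes the start and end timestamps, the page count and the rough 450-words-per-page average from the test above; the function name `ocr_throughput` is just an illustrative choice. Note that 16 minutes 30 seconds is 990 seconds.

```python
from datetime import datetime


def ocr_throughput(start, end, pages, words_per_page):
    """Compute elapsed time and throughput for an OCR run.

    `start` and `end` are "HH:MM:SS" strings on the same day (24-hour clock).
    """
    fmt = "%H:%M:%S"
    elapsed_s = (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()
    total_words = pages * words_per_page
    return {
        "elapsed_s": elapsed_s,
        "seconds_per_page": elapsed_s / pages,
        "words_per_second": total_words / elapsed_s,
        "words_per_minute": total_words / elapsed_s * 60,
    }


if __name__ == "__main__":
    # 7:57:53 PM to 8:14:23 PM, 371 pages, about 450 words per page
    stats = ocr_throughput("19:57:53", "20:14:23", pages=371, words_per_page=450)
    for name, value in stats.items():
        print(f"{name}: {value:.2f}")
```

With these inputs the run works out to roughly 2.7 seconds per page and a little over 10,000 words per minute. The words-per-page figure is only an estimate, so the word rates are approximate.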
Though the performance was good, there were a lot of spelling mistakes, which we were expecting since we had supplied images at 150 dpi, half of what ABBYY suggests. One area where we are a little disappointed with ABBYY FineReader 12's performance is pages with text embedded within drawings. As you can see, it failed to recognize any text on three pages that actually contained computer-generated text, such as the author name on the first page. Maybe we just went a little overboard with our expectations!
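To see why scanning at half the suggested resolution hurts, it helps to count pixels. The sketch below is plain arithmetic, not anything from ABBYY: it assumes a US Letter page (8.5 x 11 inches) and 10-point body text purely for illustration, since the test file's actual page size and font size are not known. The helper names are made up as well.

```python
def pixel_dimensions(width_in, height_in, dpi):
    """Pixel width and height of a page scanned at `dpi` dots per inch."""
    return round(width_in * dpi), round(height_in * dpi)


def em_height_px(point_size, dpi):
    """Height in pixels of a font's em square (1 point = 1/72 inch).

    Actual letter shapes are smaller than the em square, so this is an
    upper bound on the pixels available per character row.
    """
    return point_size / 72 * dpi


if __name__ == "__main__":
    # Assumed page size and font size, for illustration only
    for dpi in (150, 300):
        w, h = pixel_dimensions(8.5, 11, dpi)
        print(f"{dpi} dpi: {w} x {h} px, 10 pt text about {em_height_px(10, dpi):.1f} px tall")
```

Halving the resolution halves the height of every character in pixels and leaves only a quarter of the total pixels, so fine strokes have far less detail for the recognizer to work with.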
A few areas where it falls short are recognizing headings and font sizes. However, it does recognize bold, italic and underline formatting.
What impressed us the most is its ability to recognize tables, inverted text, bullet lists and bar codes across a wide variety of input files, and to reproduce them accurately in MS Word, MS Excel, HTML pages, OpenOffice and many more formats. You can even convert the content of scanned files into ebook formats like EPUB.
Oh, and yeah, did we mention that it supports more than 190 Latin and non-Latin languages? I did miss having Hindi as a supported language, since that is my mother tongue and I would have loved to try converting a page of Hindi text. Anyhow, I suppose that once there is demand, ABBYY will be happy to consider that too.
While the simplest way to do a conversion is to have images already on the system from a smartphone, camera or other device, you can also convert images directly from the source, even from a scanner. Most well-known scanner brands, including Avision, Brother, Canon, Epson, Fujitsu, HP, Kodak, Lexmark, Microtek, Mustek, Oki, Panasonic, Plustek, Ricoh, Visioneer, VuPoint and Xerox, are recognized by ABBYY FineReader, so you can get the editable text directly on your screen.
Help and Support
ABBYY offers almost every form of support to help you use their software. Both phone and email support are available. They also have an FAQ section on their website, along with video tutorials. They have turned their long experience in the OCR business into a vast knowledge base, which will most of the time be able to answer any questions you might have.
Let me just say it out clearly: if you want the best OCR software and your work is important enough that you are willing to spend some money on premium tools, ABBYY FineReader 12 is the way to go. The return you get in terms of saved time, energy and money is much more than you will pay for this software. I personally expected a little more in terms of accuracy and speed, and hope to see it get even better in later versions.