TopOCR - Bringing Enhanced Tesseract OCR to Document Cameras
Tesseract OCR is a super accurate multi-lingual OCR classifier. It can be used in Accessible Mode with a Document Camera, or in GUI Mode with any of the following inputs: a UVC Video Interface device such as a Document Camera or WebCam, the File Interface, the clipboard, or a TWAIN compatible scanner.
We have made many modifications to the standard Tesseract OCR system. Among the many enhancements is a new document layout analysis front-end that is based on our own high speed connected component analysis system that is also responsible for providing automatic text orientation correction. We have also added a new back-end to the classification function in Tesseract that lets you switch between two different OCR engines (LSTM OCR and TAO OCR) at run-time for classification.
Tesseract LSTM OCR is a super accurate multi-lingual OCR classifier that has been greatly optimized by TopOCR.
Tesseract TAO OCR is derived from the same OCR engine used in Microsoft's "Seeing AI" application! However, instead of an Android or iPhone app that executes through a cloud interface, TAO OCR executes MUCH more quickly directly on your PC, producing super accurate OCR that runs at warp speed! TAO OCR has been fully integrated into the Tesseract OCR system at the classifier level, allowing the user to select either the LSTM OCR engine or the TAO OCR engine as Tesseract's main recognition engine for document camera scanning and for image file reading.
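The idea of selecting a recognition engine at run-time can be sketched roughly as follows. This is only an illustration of the general pattern, not TopOCR's actual code: the `Engine`, `Recognizer`, `lstm_backend` and `tao_backend` names are hypothetical, and the two back-ends are placeholders that do no real recognition.

```python
from enum import Enum
from typing import Callable, Dict


class Engine(Enum):
    LSTM = "LSTM OCR"
    TAO = "TAO OCR"


# Placeholder back-ends (hypothetical); real classifiers would analyze the image.
def lstm_backend(image: bytes) -> str:
    return f"[LSTM OCR read {len(image)} bytes]"


def tao_backend(image: bytes) -> str:
    return f"[TAO OCR read {len(image)} bytes]"


class Recognizer:
    """Front-end that forwards each page to the currently selected engine."""

    def __init__(self, engine: Engine = Engine.LSTM) -> None:
        self._backends: Dict[Engine, Callable[[bytes], str]] = {
            Engine.LSTM: lstm_backend,
            Engine.TAO: tao_backend,
        }
        self.engine = engine

    def select(self, engine: Engine) -> None:
        # Switching engines only changes which back-end is looked up next.
        self.engine = engine

    def read(self, image: bytes) -> str:
        return self._backends[self.engine](image)


recognizer = Recognizer()
page = b"\x00" * 16              # stand-in for image data
print(recognizer.read(page))     # handled by the LSTM placeholder
recognizer.select(Engine.TAO)
print(recognizer.read(page))     # handled by the TAO placeholder
```

Because the layout analysis and image processing sit in front of the classifier, a design like this lets both engines share the same front-end while only the final recognition step changes.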
Whichever OCR engine you select, you can rely on the fact that the accuracy of each individual OCR engine is greatly enhanced by TopOCR's advanced multi-core image processing functions.
Tesseract LSTM OCR (LSTM Recurrent Neural Network + Static Classifier Architecture)
Tesseract LSTM OCR can read eleven different languages (English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish, Swedish).
The primary character classifier function in Tesseract OCR is based on an implementation of a Long Short-Term Memory neural network or LSTM network.
LSTM neural networks outperform other neural network architectures for this type of pattern recognition and also outperform the more "classical" character recognition algorithms used by the top selling commercial OCR products.
For example, an LSTM network achieved the best known results in unsegmented connected handwriting recognition, and in 2009 won the ICDAR handwriting competition.
The accuracy of an LSTM network is heavily dependent on the training data.
The training data used in the new Tesseract LSTM included a significant amount of degraded images produced by cameras.
If Tesseract's LSTM recognizer fails on a particular character sequence, it can "fall-back" to its generic static shape classifier to make the determination.
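The fall-back behavior can be pictured with a small sketch. It is not Tesseract's implementation: we assume here that "failure" means a low confidence score, and the threshold, the function names and the two toy classifiers are all hypothetical.

```python
from typing import Callable, Tuple

# A classifier maps an image fragment (a string stand-in here)
# to (recognized_text, confidence between 0 and 1).
Classifier = Callable[[str], Tuple[str, float]]


def recognize_with_fallback(
    fragment: str,
    primary: Classifier,
    fallback: Classifier,
    min_confidence: float = 0.5,  # hypothetical threshold
) -> Tuple[str, str]:
    """Return (text, engine_used); use the fallback only when the
    primary result is not confident enough."""
    text, confidence = primary(fragment)
    if confidence >= min_confidence:
        return text, "primary"
    text, _ = fallback(fragment)
    return text, "fallback"


# Toy stand-ins for the two recognizers (assumptions for illustration only).
def toy_lstm(fragment: str) -> Tuple[str, float]:
    # Pretend the network is unsure about fragments containing '?'.
    confidence = 0.2 if "?" in fragment else 0.9
    return fragment.replace("?", ""), confidence


def toy_static(fragment: str) -> Tuple[str, float]:
    return fragment.replace("?", "e"), 0.6


print(recognize_with_fallback("hello", toy_lstm, toy_static))
print(recognize_with_fallback("h?llo", toy_lstm, toy_static))
```

In this toy run the first fragment is accepted from the primary recognizer, while the second falls through to the static stand-in.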
The amount of computation required for LSTM network character recognition is about 50 times greater than for character recognition performed using a static classifier. To help speed up the processing, we use SSE2 instructions for the inner neural network calculations. We have also achieved a significant performance increase by making extensive use of multi-threading (running on multiple CPU cores) in the most CPU-intensive portions of the OCR and image processing functions. To optimize multi-threading, TopOCR automatically scales the number of threads it uses based on the number of processors or "cores" on your PC.
On a low-end desktop PC with a 4-core Intel 3.4GHz i7-6700 CPU, our implementation of Tesseract's LSTM neural network OCR engine takes about 6 seconds to read a 5.0 MP image, and TopOCR's image pre-processing (binarization and column straightening) adds about another second. Because of the enormous performance improvement achieved by multi-core processing, we recommend running TopOCR ONLY on a 4-core or better CPU! As 8-core and even 16-core CPUs become more mainstream, TopOCR will be able to automatically maximize performance for these CPUs!
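Scaling the number of threads to the number of cores can be sketched as follows. This is a generic illustration rather than TopOCR's code: splitting the image into horizontal strips, and the `worker_count`, `split_into_strips`, `process_strip` and `process_image` names, are assumptions made for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import List


def worker_count() -> int:
    """One worker per logical processor reported by the OS (at least 1)."""
    return max(1, os.cpu_count() or 1)


def split_into_strips(height: int, parts: int) -> List[range]:
    """Divide image rows 0..height-1 into `parts` contiguous strips."""
    parts = max(1, min(parts, height))
    bounds = [height * i // parts for i in range(parts + 1)]
    return [range(bounds[i], bounds[i + 1]) for i in range(parts)]


def process_strip(rows: range) -> int:
    # Placeholder for per-strip work such as binarization;
    # here it just reports how many rows it handled.
    return len(rows)


def process_image(height: int) -> int:
    """Process all rows using one worker per core; return rows processed."""
    n = worker_count()
    strips = split_into_strips(height, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        return sum(pool.map(process_strip, strips))


print(worker_count(), process_image(1944))
```

Note that in CPython, threads only speed up CPU-bound work when the heavy lifting happens in native code that releases the interpreter lock; the sketch is meant to show the scaling logic, not to reproduce the timings quoted above.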
TAO OCR - Tesseract Accelerated OCR
TAO OCR is a high performance multi-threaded, multilingual recognition engine that has been integrated into the Tesseract OCR System at the classifier level.
TAO OCR takes document camera OCR to a whole new level by achieving scanner level accuracy at up to 10 times the speed of a scanner!
It relies upon Tesseract's low-level document layout analysis functions to collect fundamental page information that helps it perform operations like column straightening and automatic text orientation correction.
Compared to Tesseract's standard LSTM classifier, TAO OCR is significantly faster and almost as accurate, especially on lower quality camera images.
If you are using Windows 10, you can select either the TAO OCR classifier or the LSTM OCR classifier in the DocCam dialog or the OCR Settings dialog.
With a 4-core Intel 3.4GHz i7-6700 CPU, TAO OCR has an average reading speed of about 1.8 seconds per page for a 5.0 MP image.
This figure includes all image pre-processing and low-level document layout analysis.
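To put the quoted timings side by side, here is a small back-of-the-envelope calculation. It uses only the approximate figures given on this page for the 4-core i7-6700 (about 6 seconds plus about 1 second of pre-processing for LSTM OCR, about 1.8 seconds all-in for TAO OCR), so the results are rough estimates rather than benchmarks.

```python
LSTM_OCR_SECONDS = 6.0     # LSTM recognition of a 5.0 MP image (approximate)
PREPROCESS_SECONDS = 1.0   # binarization and column straightening (approximate)
TAO_TOTAL_SECONDS = 1.8    # TAO OCR, pre-processing and layout analysis included


def pages_per_minute(seconds_per_page: float) -> float:
    return 60.0 / seconds_per_page


lstm_total = LSTM_OCR_SECONDS + PREPROCESS_SECONDS
speedup = lstm_total / TAO_TOTAL_SECONDS

print(f"LSTM OCR: {lstm_total:.1f} s/page, {pages_per_minute(lstm_total):.1f} pages/min")
print(f"TAO OCR:  {TAO_TOTAL_SECONDS:.1f} s/page, {pages_per_minute(TAO_TOTAL_SECONDS):.1f} pages/min")
print(f"TAO OCR is roughly {speedup:.1f}x faster per page")
```

On these figures, TAO OCR works out to a little under four times the page throughput of LSTM OCR on the same machine.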
The current version of TAO OCR has a skew tolerance of plus or minus 12 degrees and may reject pages that have skew angles greater than that.
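A skew check of this kind might look like the sketch below. How TAO OCR actually measures skew is not described here, so estimating the angle from two points on a text baseline, and the `skew_from_baseline` and `within_skew_tolerance` names, are assumptions for illustration; only the 12 degree limit comes from the text above.

```python
import math
from typing import Tuple

MAX_SKEW_DEGREES = 12.0  # tolerance quoted above

Point = Tuple[float, float]


def skew_from_baseline(start: Point, end: Point) -> float:
    """Angle in degrees of the line from `start` to `end` relative to
    horizontal (hypothetical way of estimating page skew)."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    return math.degrees(math.atan2(dy, dx))


def within_skew_tolerance(skew_degrees: float,
                          limit: float = MAX_SKEW_DEGREES) -> bool:
    """True when the skew is within plus or minus `limit` degrees."""
    return abs(skew_degrees) <= limit


# A baseline that rises 100 pixels over 1000 pixels is about 5.7 degrees off.
angle = skew_from_baseline((0.0, 0.0), (1000.0, 100.0))
print(round(angle, 1), within_skew_tolerance(angle))
```

A page that fails such a check could be straightened or re-photographed before being read.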
TAO OCR can read curled book pages as well as pages that have poor lighting or poor contrast.
TAO OCR requires software libraries that are only available with Windows 10, so TAO OCR will not run on earlier versions of Windows.
TAO OCR supports all eleven TopOCR supported languages (English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish, Swedish). However, it will generally support only the system language used by your OS at first. To add additional languages to TAO OCR, please see how to install a language pack for Windows 10.
TopOCR OCR (Shape Analysis Static Classifier Architecture)
TopOCR OCR is the third OCR engine in TopOCR!
It is our own ultra-high speed fixed-function OCR engine that is used ONLY for reading images from traditional TWAIN flatbed image scanners and from multi-page PDF files.
Please note that these features are only available in the GUI mode.
TopOCR OCR can read eleven different languages (English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish, Swedish) and is the fastest OCR engine on the planet!
It works on the principle of analyzing the shape of characters and using a high speed decision tree for classification.
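The general principle can be shown with a toy example. This is not the TopOCR OCR engine: the three shape features, the hand-written tree and the five-character alphabet are hypothetical and chosen only to show how a decision tree turns shape measurements into a character with a handful of comparisons.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShapeFeatures:
    holes: int            # enclosed regions: 1 for 'o', 2 for '8'
    aspect_ratio: float   # bounding-box width divided by height
    has_ascender: bool    # rises above the x-height, as in 'b' or 'l'


def classify(shape: ShapeFeatures) -> str:
    """Toy decision tree covering only the characters 8, b, o, l and c."""
    if shape.holes >= 2:
        return "8"
    if shape.holes == 1:
        return "b" if shape.has_ascender else "o"
    # No enclosed regions: separate a tall narrow stroke from an open curve.
    if shape.has_ascender and shape.aspect_ratio < 0.3:
        return "l"
    return "c"


samples = [
    ShapeFeatures(holes=2, aspect_ratio=0.6, has_ascender=True),
    ShapeFeatures(holes=1, aspect_ratio=0.6, has_ascender=True),
    ShapeFeatures(holes=1, aspect_ratio=0.9, has_ascender=False),
    ShapeFeatures(holes=0, aspect_ratio=0.2, has_ascender=True),
    ShapeFeatures(holes=0, aspect_ratio=0.8, has_ascender=False),
]
print("".join(classify(s) for s in samples))
```

Each character is classified with at most three comparisons, which hints at why this style of classifier can be so much cheaper per character than a neural network.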
TopOCR OCR is automatically used whenever you scan a document using a TWAIN scanner, or whenever you want to automatically read a multi-page PDF file. The latest release of TopOCR OCR can extract text from PDF files at the rate of up to 10+ PAGES PER SECOND on a high-end PC!
Why not try our Demo and see for yourself the impressive performance that TAO OCR has to offer!