Automatic training

Automatic training, called IntelliTrain, takes its input from the corrections a user makes during proofing. The program remembers the character shape and the corrected solution for that shape, then searches the document for other similar character shapes.

To have training data generated, Enable IntelliTrain must be selected in the Proofing panel of the Options dialog box. You must also make changes as you proof text, either through the OCR Proofreader dialog box or by using the shortcut menu on a suspect word. Other editing does not generate training data. Even proofing changes do not always generate training data: IntelliTrain decides which changes to keep.

Here is an example of how IntelliTrain works.

[Image: bitmap of the scanned word]

OmniPage might interpret this bitmap as ‘bcnefit’. During proofing you change ‘bcnefit’ to ‘benefit’. IntelliTrain remembers the shape of the problem character and the rule: this is not a ‘c’, it is an ‘e’. IntelliTrain then searches the document for other similar character shapes and considers changing them:
 

 

Similar character shapes for ‘e’ in the same document | Recognized words | Words changed by IntelliTrain
[Image: character shape]                              | thcrc            | there
[Image: character shape]                              | Whcncvcr         | Whenever
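
The ‘c’ to ‘e’ rule from the example can be mimicked at the text level with a short sketch. This is only an illustration under stated assumptions, not how IntelliTrain is implemented: the real feature compares character shapes (bitmaps), while this sketch compares letters. The names learn_rule, candidates and apply_rule, and the small KNOWN_WORDS list, are hypothetical. The "considers changing them" step is modeled here as a simple word-list lookup.

```python
from itertools import product

# Hypothetical stand-in for a spelling dictionary; not part of OmniPage.
KNOWN_WORDS = {"benefit", "there", "whenever", "cat", "once"}


def learn_rule(recognized, corrected):
    """Derive one (wrong, right) character rule from a proofing correction."""
    if len(recognized) != len(corrected):
        raise ValueError("this sketch handles same-length corrections only")
    diffs = {(r, c) for r, c in zip(recognized, corrected) if r != c}
    if len(diffs) != 1:
        raise ValueError("this sketch expects exactly one kind of substitution")
    return diffs.pop()


def candidates(word, wrong, right):
    """Yield each variant of word with any subset of `wrong` replaced by `right`."""
    options = [(ch, right) if ch == wrong else (ch,) for ch in word]
    for combo in product(*options):
        yield "".join(combo)


def apply_rule(word, rule):
    """Change word only if a variant produced by the rule is a known word."""
    wrong, right = rule
    if word.lower() in KNOWN_WORDS:
        return word  # already a known word, so leave it untouched
    for cand in candidates(word, wrong, right):
        if cand != word and cand.lower() in KNOWN_WORDS:
            return cand  # first matching variant wins in this sketch
    return word  # no convincing change found


rule = learn_rule("bcnefit", "benefit")
for w in ["thcrc", "Whcncvcr", "once", "xcz"]:
    print(w, "->", apply_rule(w, rule))
```

Note that ‘once’ is left alone even though it contains a ‘c’. In the sketch a change is accepted only when it turns an unknown word into a known one, which loosely mirrors the statement that IntelliTrain decides which changes to keep.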

 

To have training data generated by IntelliTrain:

  1. Open the Options dialog box at the Proofing panel and enable IntelliTrain.

  2. Select three or four pages from the start of a long document whose typeface and quality are typical of the whole document.

  3. Recognize and then proof those pages. Make corrections as necessary.

  4. Open the Edit Training dialog box and examine the character shapes and the OCR solutions assigned to them. See Training files for more information on editing training data.

 
