Training files

A training file contains a set of character shapes, each associated with an OCR solution. Training data, whether generated by IntelliTrain or created by manual training, can be saved to a training file for future use to improve accuracy. When a training file is loaded, these stored shapes are compared with problem shapes found on the pages being recognized, and the assigned solutions are applied when appropriate. Any number of training files can be saved, but only one can be loaded at a time. Training files can be edited: you can delete unsuitably trained shapes or change the character(s) linked to a given character shape.

  • OmniPage is a powerful, pre-trained OCR product. For most recognition work, a training file is unnecessary. Training files are most useful for a set of long documents with text printed in an unusual or stylized typeface. Before saving training data to a file, inspect it and delete malformed character shapes that are unlikely to recur in the documents.

Saving training data to a file and loading, editing, and unloading training files are all done in the Training Files dialog box. This dialog box lists all training files created by the program or added from elsewhere. When you save a training file, it is saved to the Training subfolder by default, but you can select a different location, including a network drive. This allows you to make your training files available to other users. When a training file is selected, its path is displayed.

You can add a previously removed training file or import a training file, for instance from a network location. You can remove a training file from the list.

To display the Training Files dialog box:

  • Choose Training File... from the Tools menu.

To save training data to a training file:

  1. Select the appropriate item under ‘File name’ in the Training Files dialog box.

The item [unsaved] [current] appears once training has been done.

  2. Click Save and enter a name in the Save Training File dialog box that appears.

To add training data to an existing training file:

  1. Load a training file before training.

  2. Do either automatic or manual training.

  3. Click Save in the Training Files dialog box.

To edit a training file:

  1. Select a training file in the Training Files dialog box.

  2. Click Edit…

The Edit Training dialog box appears. An asterisk in the title bar indicates that there are unsaved training data.

  3. Examine the character shapes and the OCR solutions assigned to them.

    • Double-click a rectangle to change its assigned character value. Key in the new character value and press Enter. Changed values appear red.

    • Click a rectangle to select a character.

    • Press Delete to delete it; it turns gray.

    • Press Delete again to restore it.

  4. Click OK to confirm all changes.

Characters marked as deleted are actually deleted at this point.
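
The editing behavior described above (changed values are flagged, Delete toggles a mark, and marked shapes are only removed when you click OK) can be pictured with a small model. This is purely an illustrative sketch: the names `TrainedShape`, `EditSession`, `toggle_delete` and `confirm` are hypothetical and say nothing about how OmniPage stores or processes training data internally.

```python
from dataclasses import dataclass


@dataclass
class TrainedShape:
    """Hypothetical stand-in for one trained character shape."""
    shape_id: int                 # placeholder for the stored character image
    solution: str                 # character(s) assigned to the shape
    changed: bool = False         # corresponds to a value shown in red
    marked_deleted: bool = False  # corresponds to a shape shown in gray


class EditSession:
    """Illustrative model of the Edit Training dialog box workflow."""

    def __init__(self, shapes):
        self.shapes = list(shapes)

    def change_solution(self, index, new_value):
        # Double-click, key in a new value, press Enter.
        shape = self.shapes[index]
        if new_value != shape.solution:
            shape.solution = new_value
            shape.changed = True

    def toggle_delete(self, index):
        # Pressing Delete marks the shape; pressing it again restores it.
        shape = self.shapes[index]
        shape.marked_deleted = not shape.marked_deleted

    def confirm(self):
        # Clicking OK: only now are the marked shapes actually dropped.
        return [s for s in self.shapes if not s.marked_deleted]
```

In this model, nothing is lost until `confirm()` is called, which mirrors why a shape marked for deletion can still be restored before you click OK.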

To add a training file to the list:

  1. Click Add… in the Training Files dialog box.

  2. Accept the default folder or select the folder where your training file is located.

  3. Select the training file and click OK.

The training file will appear in the Training Files dialog box.

  4. Click Close.

To remove a training file from the list:

  1. Select a training file in the Training Files dialog box.

  2. Click Remove.

The selected training file will be removed from the list but it will not be deleted from the disk.

To load a training file:

  1. Select a training file in the Training Files dialog box.

  2. Click Set As Current.

or

  • Select a training file in the Training File selection box in the Proofing panel of the Options dialog box.

To unload a training file:

  1. Select [none] in the Training Files dialog box.

  2. Click Set As Current.

or

  • Select [none] in the Training File selection box in the Proofing panel of the Options dialog box.

To delete a training file:

  1. In your operating system (outside OmniPage), select the training file to be deleted. The default location is:

under Vista and Windows 7:
Users\<username>\AppData\Roaming\ScanSoft\OmniPage18\Training

under Windows XP:
Documents and Settings\<username>\Application Data\ScanSoft\OmniPage18\Training

Training files have an .otn extension.

  2. Click Delete. Click Yes to confirm file deletion.

  • If a training file is moved to a new location outside OmniPage, the program will no longer be able to offer it.
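
Before deleting files by hand, it can help to see which training files are present in the default folder. The following is a minimal sketch, not part of OmniPage: the function names are hypothetical, and it assumes the `APPDATA` environment variable points to the per-user folder named in the paths above (normally `AppData\Roaming` under Vista and Windows 7, and `Application Data` under Windows XP).

```python
import os
from pathlib import Path


def default_training_dir(env=None):
    """Build the default Training folder path from the locations given above.

    Assumption: APPDATA resolves to the per-user application data folder.
    Returns None when APPDATA is not set (for example, on a non-Windows system).
    """
    env = os.environ if env is None else env
    appdata = env.get("APPDATA")
    if not appdata:
        return None
    return Path(appdata) / "ScanSoft" / "OmniPage18" / "Training"


def list_training_files(folder):
    """Return the .otn files found directly in the given folder."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".otn"
    )


if __name__ == "__main__":
    training_dir = default_training_dir()
    if training_dir is None:
        print("APPDATA is not set; cannot locate the default Training folder.")
    else:
        for path in list_training_files(training_dir):
            print(path)
```

The script only lists files. It does not move or delete anything, so the list in the Training Files dialog box is unaffected.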

To embed a training file in an OmniPage Document:

  1. Load or create a training file.

  2. In the Training Files dialog box, click the Embed… button.

    The Embed in OPD dialog box appears.

  3. Enter a name, then click OK.

  4. Close the Training Files dialog box.

  5. Process your document and save it to the file type OmniPage Document.

When you load an OPD with an embedded training file, the training file appears in the list of training files as [embedded]. You can edit it, or extract and save it.

  • Only one training file can be loaded at a time, but more than one can be embedded in an OPD file. This makes it easier to move training files between computers.
