Image enhancement for OCR
When your primary images are grayscale or color, the program generates black-and-white (B/W) OCR images from them for recognition. You can view and modify these. Although the OCR process tolerates low-quality images, they should preferably contain well-formed characters without “noise” (e.g. spots, smudges or marginal shadow lines).
You can use the following three tools on the SET toolbar to enhance an image for OCR purposes: Despeckle, OCR Brightness and Dropout color. Changes will be applied to the whole image unless any areas are selected.
Use the Despeckle tool on B/W images to remove dots or spots 1 or 2 pixels large. Move the slider for the best result. Be careful with noise removal: if it is too strong, it can also destroy the character shapes themselves. Choose from Normal, Halftone and Salt & Pepper despeckling.
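The effect of despeckling, and the risk of setting it too strong, can be illustrated with a minimal sketch. This is not the program's own algorithm: the function name `despeckle`, the parameter `max_speck_size` and the rule "remove every 4-connected group of black pixels at or below that size" are assumptions made for illustration only.

```python
from collections import deque


def despeckle(image, max_speck_size=2):
    """Return a copy of a B/W image with small black specks removed.

    image is a list of rows; each pixel is 0 (white) or 1 (black).
    A speck is assumed to be a 4-connected group of black pixels
    containing at most max_speck_size pixels (hypothetical rule).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Collect the connected group of black pixels starting here.
            group = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                group.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Erase the group only if it is small enough to count as noise.
            if len(group) <= max_speck_size:
                for cy, cx in group:
                    result[cy][cx] = 0
    return result


page = [
    [1, 0, 0, 0, 0],  # isolated dot
    [0, 0, 1, 1, 1],  # three-pixel stroke (part of a character)
    [0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],  # two-pixel spot
]
cleaned = despeckle(page, max_speck_size=2)
too_strong = despeckle(page, max_speck_size=3)
```

With a size limit of 2 the dot and the two-pixel spot disappear while the stroke survives; raising the limit to 3 also erases the stroke, which is the character damage the warning above refers to.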
The OCR Brightness tool affects the B/W image, but is useful only when the primary image is color or grayscale, because the program generates a new OCR image from the primary image using your changed setting. It cannot improve quality when the primary image is already B/W. In that case, you should rescan the document.
Brightness plays an important role in OCR accuracy. After loading an image, check its appearance. If characters are thick and touching, lighten the image. If characters are thin and broken, darken it. Use the OCR Brightness tool to optimize the image. The diagram illustrates an optimum brightness.
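Why brightness changes stroke thickness can be seen in a small sketch of how a B/W image might be derived from a grayscale one. It assumes a simple fixed threshold, which may differ from what the program actually does; `to_ocr_image` and `threshold` are hypothetical names.

```python
def to_ocr_image(gray, threshold=128):
    """Convert grayscale rows (0 = black, 255 = white) to B/W rows.

    A pixel becomes black (1) when it is darker than the threshold.
    Lowering the threshold lightens the result, raising it darkens it.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray]


gray = [
    [200, 120, 90],
    [60, 130, 250],
]
normal = to_ocr_image(gray, threshold=128)
lighter = to_ocr_image(gray, threshold=100)  # fewer black pixels: thinner strokes
darker = to_ocr_image(gray, threshold=140)   # more black pixels: thicker strokes
```

In this model, the mid-gray pixels on the edges of a stroke are the ones that flip between black and white, so lightening separates touching characters and darkening reconnects broken ones.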
You can also use the OCR Brightness tool on selected image areas, so brightness can be adjusted differently on different parts of an image. Adjusting brightness affects both the characters and the background. Image margins are generally darker. In this case, select the darker image area, click the OCR Brightness tool and move the OCR brightness scale to the left to lighten it.
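Area-specific brightness can be sketched as a per-area override of a page-wide setting. This is an illustrative assumption rather than the program's method: the function `to_ocr_image_with_areas`, the rectangle format and the fixed-threshold model are all hypothetical.

```python
def to_ocr_image_with_areas(gray, threshold, areas=()):
    """Binarize a grayscale image, using a different threshold in some areas.

    gray: rows of values from 0 (black) to 255 (white).
    areas: tuples (top, left, bottom, right, area_threshold); bottom and
    right are exclusive. Pixels darker than the threshold become black (1).
    """
    bw = [[1 if value < threshold else 0 for value in row] for row in gray]
    for top, left, bottom, right, area_threshold in areas:
        for y in range(top, bottom):
            for x in range(left, right):
                # Re-evaluate pixels in the selected area with its own setting.
                bw[y][x] = 1 if gray[y][x] < area_threshold else 0
    return bw


# Column 0 is a dark margin shadow; the values of 40 are real character pixels.
scan = [
    [90, 220, 40, 220],
    [100, 220, 220, 40],
]
whole_page = to_ocr_image_with_areas(scan, threshold=128)
margin_lightened = to_ocr_image_with_areas(
    scan, threshold=128, areas=[(0, 0, 2, 1, 60)]
)
```

Lightening only the margin area removes the shadow without thinning the characters elsewhere on the page.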
Dropout color is used for preprinted colored forms where the fixed texts are printed in a different color. This allows just the respondent data to be recognized, without the form instructions, item names, boxes and other shapes.
You can select a predefined color (red, green or blue), or a colored area in the image. Use the Select area tool to draw a rectangle including the page background color and the color to be dropped. The selected color will become invisible in the OCR image.
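The idea of dropping a color can be sketched as follows. This is a simplified illustration under stated assumptions, not the program's implementation: `drop_color`, the per-channel `tolerance` and the luminance threshold are hypothetical choices.

```python
def drop_color(rgb, dropout, tolerance=60, threshold=128):
    """Build a B/W image from RGB rows, making one color invisible.

    Pixels within `tolerance` of the dropout color on every channel
    become white (0). All other pixels become black (1) if their
    luminance is below `threshold`, otherwise white.
    """
    def channel_distance(a, b):
        return max(abs(a[i] - b[i]) for i in range(3))

    result = []
    for row in rgb:
        out_row = []
        for pixel in row:
            if channel_distance(pixel, dropout) <= tolerance:
                out_row.append(0)  # form color: dropped
            else:
                r, g, b = pixel
                luminance = (299 * r + 587 * g + 114 * b) // 1000
                out_row.append(1 if luminance < threshold else 0)
        result.append(out_row)
    return result


# White paper, a red form line, and a dark pen stroke.
form = [[(255, 255, 255), (220, 30, 30), (20, 20, 20)]]
with_dropout = drop_color(form, dropout=(255, 0, 0))
without_dropout = drop_color(form, dropout=(255, 0, 0), tolerance=-1)
```

With red dropped, only the pen stroke remains black; with the dropout effectively disabled (negative tolerance), the red form line is dark enough to turn black and would be recognized along with the respondent data.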
Other tools impact both the primary and the OCR image and may also improve OCR accuracy.