PaperVision Capture Admin Guide R74.2

Table of Contents
Chapter 1 - Introduction
- PaperVision Capture Terminology
- Supported Users in the Administration Console
- System Requirements
- Supported Scanners
- Logging In
- Logging Out
- Obtaining Help in PaperVision Capture
Chapter 2 - Global Administration
- Automation Service Status
- Global Administrators
- Licensing
- Maintenance Queues
- Maintenance Logs
- Process Locks
- System Settings
- Automation Service Scheduling
Chapter 3 - Entity Administration
- General Security
- Encryption Keys
- Security Policy
- System Groups
- System Users
- Current Sessions
Chapter 4 - Capture Job Configuration
- Job Definitions
- Job Steps Grid
- Job Menu
- Detail Sets
- Job Steps
- General Properties
Chapter 5 – Capture Step Configuration
- Auto Document Break
- Capture Step Settings
- Custom Code Events (Step Level)
- General Properties
- Indexes
- Manual Barcode and OCR Indexing
- Manual QC
- Operator Permissions
- Scanner Requirements
Chapter 6 - Indexing Configuration
- Custom Code Events (Step Level)
- General Properties
- Indexes
- General (Step Level)
- Index Zones
- Predefined Index Values (Job Level)
- Scanner Setup Settings
- Manual Barcode and OCR Indexing
- Manual QC
- Operator Permissions
Chapter 7 - Barcode Configuration
- Auto Document Break
- General Properties
- Indexes
- Barcode Parsing
- Barcode Zones
- Barcode Explorer
Chapter 8 – Zonal OCR
- Auto Document Break
- General Properties
- Indexes
- OCR Parsing
- OCR Zones
- General OCR Properties
- Nuance OCR Page Properties
- Nuance Zonal OCR Properties
- Nuance OCR Recognition Modules
- Open Text Zonal OCR
Chapter 9 – Nuance Full-Text OCR
- Converter Output Properties
- OCR Page Properties
- Converter Output Formats
Chapter 10 - Open Text Full-Text OCR
- Supported Output File Types
Chapter 11 – Image Processing
- General Properties
- Image Processing Properties
- Configuring Image Processing Filters
- Drawing and Configuring IP Zones
- Image Processing Filters
Chapter 12 – Quality Control (QC)
- Automated QC Step
- Automated QC – Order of Operations
- Automated Batch and Document QC
- Automated Image QC
- Indexes
- Manual QC Step
- Custom Code Events (Step Level)
- General Properties
- Indexes
- Manual QC - General Properties
- Operator Permissions
Chapter 13 - Custom Code
- General Properties
- Custom Code Generators
- Digitech Systems' API
- Debugging Custom Code
- Script Editor
- Match and Merge Wizard
- Exports
- Content Types
Chapter 14 – Capture Batches
- Batch Management
- Batch Statistics
- QC Batch Statistics
Appendix A – Additional Help Resources
Appendix B – Supported Nuance OCR Spelling Languages
Appendix C – Modifying the Process Batch Operation
Appendix D – Maximum Image Sizes
Appendix E – Terminal Services Configuration
Appendix F - Supported Open Text Countries and Languages
Index

PaperVision Capture Administration Guide

PaperVision Capture Release 74

January 2012

Information in this document is subject to change without notice and does not represent a commitment on the part of Digitech Systems, Inc. The software described in this document is furnished under a license agreement or nondisclosure agreement. The software may be used or copied only in accordance with the terms of the agreement. It is against the law to copy the software on any medium except as specifically allowed in the license or nondisclosure agreement. No part of this manual may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, for any purpose without the express written permission of Digitech Systems, Inc.

Printed in the United States of America.

PaperVision Capture and the Digitech Systems, Inc. logo are trademarks of Digitech Systems, Inc.

PaperVision Enterprise is a registered trademark of Digitech Systems, Inc.

Microsoft, Windows, Windows XP, and Vista are registered trademarks of Microsoft Corporation.

All other trademarks and registered trademarks are the property of their respective owners.

PaperVision Capture contains portions of OCR code owned and copyrighted
PaperVision Capture contains portions of OCR code owned and copyrighted
PaperVision Capture contains portions of imaging code owned and copyrighted

Digitech Systems, Inc.
8400 E. Crescent Parkway, Suite 500
Greenwood Village, CO 80111
Phone: 303.493.6900 Fax: 303.493.6979
www.digitechsystems.com

Chapter 1 - Introduction

The PaperVision Capture Administration Console provides a single location for global, system, and job administration. The console helps you manage Capture jobs, batches, statistics, user and group profiles, and automation service settings. The Job Definitions screen gives you fine-grained control over image-capture settings when you define PaperVision Capture jobs and job steps, as well as the users and groups assigned to those steps.

PaperVision Capture Terminology

Batch

A batch is a collection of documents and their associated index name-value pairs and statistics that are moved as a logical unit of work through a job.

Batch Priority

Batch priority refers to the order in which (1) batches awaiting ownership are displayed in the PaperVision Capture Operator Console and (2) batches are processed by the PaperVision Capture Automation Service. Administrators assign four values that are used to calculate the overall batch priority:

• Job age priority is a number associated with the job and is multiplied by the number of elapsed minutes since the batch was created.
• Job step age priority is a value associated with the current job step and is multiplied by the number of elapsed minutes the batch has been waiting in the current step.
• Job step priority is a value associated with the current job step and assigned by an administrator.
• Administrative priority is a value associated with each specific batch. So that it can have a significant impact on the overall calculation, administrators can assign this priority a wider range of values (0-999,999).

Administrators assign numbers to indicate batch urgency and assist with scheduling and resource allocation. The system uses these numbers, which range from 0 (not urgent) to 100 (urgent), to schedule system resources and assign higher-priority batches to users. Batch priority helps administrators efficiently manage job loads and enables the system to automatically assign prioritized batches to operators in a round-robin fashion.

The overall batch priority is calculated as follows:

(job age priority x elapsed minutes since batch was created) + (job step age priority x elapsed minutes batch has been waiting in current step) + job step priority + administrative priority

Note: If all priority values are set to zero, the overall calculated priority in the PaperVision Capture Operator Console’s batch creation screen will remain at zero (regardless of how long batches await ownership in the Batches Waiting list).
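The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from PaperVision Capture: the class name, field names, and sample values are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class BatchPriorityInputs:
    """Hypothetical container for the four administrator-assigned values."""
    job_age_priority: int         # multiplied by minutes since the batch was created
    step_age_priority: int        # multiplied by minutes waiting in the current step
    job_step_priority: int        # fixed value for the current job step
    administrative_priority: int  # per-batch value


def overall_batch_priority(p, minutes_since_created, minutes_in_step):
    """Apply the overall batch priority formula given in this guide."""
    return (
        p.job_age_priority * minutes_since_created
        + p.step_age_priority * minutes_in_step
        + p.job_step_priority
        + p.administrative_priority
    )


# Sample values (made up): 2*30 + 1*10 + 50 + 1000 = 1120
urgent = BatchPriorityInputs(2, 1, 50, 1000)
print(overall_batch_priority(urgent, minutes_since_created=30, minutes_in_step=10))

# With every value set to zero, the result stays at zero however long the batch waits.
idle = BatchPriorityInputs(0, 0, 0, 0)
print(overall_batch_priority(idle, minutes_since_created=600, minutes_in_step=600))
```

Because the two age priorities are multiplied by elapsed minutes, a batch with nonzero age priorities rises in priority the longer it waits, while the job step and administrative priorities contribute a fixed amount.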

Detail Sets

Detail sets expand on the capabilities of standard index fields because they define "many-to-one" relationships, which allow multiple sets of field data to reference a single document. In a many-to-one relationship, an index field contains a value that references another field or set of fields that contain unique values.
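A minimal sketch of the many-to-one idea, using plain Python dictionaries. The field names (InvoiceNumber, Vendor, Item, Qty) and the doc_id key are hypothetical and are not taken from PaperVision Capture; the sketch only shows several rows of detail data referring to one document.

```python
# One document with ordinary index fields (one value per field).
document = {
    "doc_id": 1,
    "indexes": {"InvoiceNumber": "INV-1001", "Vendor": "Acme"},
}

# A hypothetical detail set: many rows of field data, each pointing
# back to the same document through doc_id.
line_items = [
    {"doc_id": 1, "Item": "Paper", "Qty": 10},
    {"doc_id": 1, "Item": "Toner", "Qty": 2},
    {"doc_id": 1, "Item": "Staples", "Qty": 5},
]


def detail_rows_for(doc, rows):
    """Return every detail row that references the given document."""
    return [row for row in rows if row["doc_id"] == doc["doc_id"]]


rows = detail_rows_for(document, line_items)
print(len(rows))  # number of detail rows that reference the single document
```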

Document

A document is the equivalent of a file folder within a filing cabinet. A document holds all of the pages for a given set of index values.

Image

An image is a visual representation of a picture or graphic, such as an electronic file with the extension .bmp, .jpg, or .tif.

Index

An index is a value that users apply to a document for reference and retrieval.

Job

A job is a defined process composed of one or more job steps through which batches are processed. At a minimum, each job must contain a start step. Each job name is unique within an entity.

Job Step

A job step is an automated or manual operation that is performed on a batch. Manual job steps are performed by assigned users through the PaperVision Capture Operator Console; automated job steps are completed by the PaperVision Capture Automation Service and require no user intervention.

Master Batch Repository

The Master Batch Repository is the centralized storage area where PaperVision Capture stores all captured images. When installing PaperVision Capture in an environment containing multiple PaperVision Capture Gateways or PaperVision Capture Automation Servers, this location should be a network-accessible location (e.g., \\SERVER\SHARE).

Page

One or more images (files with extensions .bmp, .jpg, or .tif) comprise a single page within a document. For example, a page can include the originally captured image and a manipulated version of the image after noise removal.

PaperVision Capture Administration Console

The PaperVision Capture Administration Console provides administration and job configuration capabilities.

PaperVision Capture Automation Service

The PaperVision Capture Automation Service is a Microsoft® Windows service that performs automated tasks and batch processing at specified time intervals. Examples of work performed by the PaperVision Capture Automation Service include the consumption of statistics when an operator completes a batch and the processing of automated job steps. Multiple Automation Services can be installed on distinct machines, or multiple PaperVision Capture Automation Service processes may be configured to run on the same machine.

PaperVision Capture Data Transfer Agent Service

The PaperVision Capture Data Transfer Agent Service is a Microsoft® Windows service that moves batches in local temporary batch repositories to/from the Master Batch Repository.

PaperVision Capture Gateway Server

The PaperVision Capture Gateway Server is an application server that enables communication between PaperVision Capture modules and provides access to databases and the Master Batch Repository in distributed deployment scenarios.

PaperVision Capture Operator Console

The PaperVision Capture Operator Console provides scanning, indexing, and batch processing capabilities.
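The batch, document, page, and image terms defined in this section nest inside one another: a batch holds documents, a document holds index values and pages, and a page holds one or more images. The following Python sketch illustrates that nesting. The class names, attribute names, and file names are illustrative assumptions and do not describe PaperVision Capture's internal data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Page:
    # One or more image files, e.g. the original scan and a cleaned-up copy.
    images: List[str]


@dataclass
class Document:
    # Index name-value pairs applied for reference and retrieval.
    indexes: Dict[str, str]
    pages: List[Page] = field(default_factory=list)


@dataclass
class Batch:
    name: str
    documents: List[Document] = field(default_factory=list)

    def image_count(self) -> int:
        """Total number of image files across all documents and pages."""
        return sum(len(p.images) for d in self.documents for p in d.pages)


batch = Batch(
    name="example-batch",
    documents=[
        Document(
            indexes={"InvoiceNumber": "INV-1001"},
            pages=[
                Page(images=["0001.tif", "0001_cleaned.tif"]),
                Page(images=["0002.tif"]),
            ],
        )
    ],
)
print(batch.image_count())  # two images on the first page plus one on the second
```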

Supported Users in the Administration Console

The PaperVision Capture Administration Console supports the following types of users:

• Global administrators can configure all settings for all entities.
• System administrators can administer all settings for a particular entity.
• Capture administrators can administer an entity's job settings, including the configuration of jobs and job steps within the entity.
• Workflow administrators can log in to the PaperVision Capture Administration Console but cannot perform any functions. In PaperVision Enterprise, workflow administrators are able to design and configure workflows within an entity. They can configure workflow definitions for any project and view workflow history and workflow status reports, but they have no access to documents or functions in any projects unless a system administrator explicitly grants them access. If they do have access to view documents within a project, workflow administrators can create workflow instances for a particular document and view its workflow status.
• Users, also known as operators, work in the PaperVision Capture Operator Console. If you assign a user to a job step, that user has access to every function configured for that job step. You assign job steps to users so they are able to perform scanning, indexing, and batch processing functions. Users created in PaperVision Capture can be viewed in PaperVision Enterprise and vice versa.

System Requirements

The following lists outline the minimum software requirements and recommended hardware requirements for the PaperVision Capture application.

Minimum Software Requirements

• Operating Systems: Windows XP Pro SP3 or later (both 32- and 64-bit operating systems supported)
• .NET Framework: Version 3.5 SP1 or later (included on installation media)
• Windows Installer: Version 3.1 or later (included on installation media)
• Microsoft SQL Server: SQL Server 2005 or later. Note: SQL Server 2008 R2 Express Edition is included on installation media.

Recommended Hardware Requirements

• Processor: Current processor technology is recommended (typically, not older than four years).
• RAM: 2 GB
• Hard Disk Space: 1750 MB
• Minimum Screen Resolution: 1024 x 768

Supported Scanners

PaperVision Capture supports more than 300 ISIS-compatible scanners. If you need additional scanner drivers, please contact Digitech Systems’ Technical Support at support@digitechsystems.com or by phone at (877) 374-3569. If the driver is available, our support personnel will assist you in obtaining the driver.

PaperVision Capture also offers the ability to use TWAIN scanners. TWAIN support is generally intended for extremely low-volume scanning, as ISIS drivers are available for most scanners on the market.

Logging In

When you log in to the PaperVision Capture Administration Console, the system authenticates you based on the information you provide. When you launch the PaperVision Capture Administration Console for the first time, you will be prompted to log in to the system. If this is your first time logging in, the user name is ADMIN and the password is ADMIN.

Note: Passwords are case-sensitive.

You can configure the PaperVision Capture Operator Console to support a terminal services environment so that multiple users can log in to a single instance of the PaperVision Capture Operator Console. For information on how to configure PaperVision Capture for a terminal services environment, see Appendix E – Terminal Services Configuration.

Logging Out

To log out of the PaperVision Capture Administration Console, select File > Exit. If you have any unsaved changes, you will be prompted to save those changes before you are logged out of the system.

Obtaining Help in PaperVision Capture

To obtain Help from any page within the PaperVision Capture Administration Console, click the Help button or press the F1 key to open a topic related to the screen you are currently viewing. Additionally, every screen in PaperVision Capture contains the Help menu, which contains the following items:

• Help > Help Topics opens the Online Help file.
• Help > User's Manual opens a PDF of the PaperVision Capture Administration Guide.
• Help > About PaperVision Capture Administration Console displays a splash screen with the copyright and version information for your version of PaperVision Capture.

Chapter 2 – Global Administration

Global administration encompasses the overall functionality of PaperVision Capture that affects all entities. To access global administration settings, log in to the PaperVision Capture Administration Console with the appropriate global administrator credentials, and select the Global check box. Once logged in as a global administrator, you can access global administration settings for all entities.

Global Administration Settings

• Automation Service Status displays the current status of all automation servers connected to the PaperVision Capture database.
• Global Administrators lists PaperVision Capture's global administrators.
• Licensing allows global administrators to manage PaperVision Capture licenses for each entity.
• Maintenance lists maintenance items to be processed by the PaperVision Capture Automation Service and logs of completed maintenance items.
• Process Locks contains a list of operations currently locked by the system in order to prevent attempts to run the same operation simultaneously.
• System Settings contains PaperVision Capture's Automation Service Scheduling, which automates the execution of certain operations at timed intervals. System Settings also contains the Maximum Global Session Idle Time and Maximum Maintenance Log Age settings for all entities.

Automation Service Status

Automation Service Status displays the current status of all automation servers connected to

the PaperVision Capture database. More than one automation server process may be running

on a single computer. You can start and stop automation service operations for any process.

To access this screen, open Global Administration > Automation Service Status.

Automation Service Status

Starting an Automation Service Process

To start a service process:

1. Highlight the process in the list.

2. Click the Start icon.

Stopping an Automation Service Process

Stopping the service operations does not stop the process itself; rather, the process receives a

command to perform no further processing after it finishes its current operation.
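
The stop behavior described above can be pictured as a worker loop that checks a stop flag only between operations. The Python sketch below is illustrative only: `ServiceProcess`, `request_stop`, and the operation names are hypothetical and are not part of PaperVision Capture.

```python
from collections import deque


class ServiceProcess:
    """Hypothetical model of an automation service process."""

    def __init__(self, operations):
        self.pending = deque(operations)  # operations waiting to run
        self.completed = []               # operations that have finished
        self.stop_requested = False       # set by the Stop command

    def request_stop(self):
        # The Stop command only sets a flag; it does not end the process.
        self.stop_requested = True

    def run(self, on_operation_start=None):
        # The flag is checked between operations, so an operation that has
        # already started always runs to completion.
        while self.pending and not self.stop_requested:
            operation = self.pending.popleft()
            if on_operation_start:
                on_operation_start(self, operation)
            self.completed.append(operation)


def stop_during_second(proc, operation):
    # Simulate a Stop command arriving while "batch-2" is being processed.
    if operation == "batch-2":
        proc.request_stop()


process = ServiceProcess(["batch-1", "batch-2", "batch-3"])
process.run(on_operation_start=stop_during_second)
print(process.completed)      # ['batch-1', 'batch-2']
print(list(process.pending))  # ['batch-3']
```

In this sketch the operation in progress when Stop arrives still completes, and the remaining work stays queued until the process is started again.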

To stop a service process:

1. Highlight the process in the list.

2. Click the Stop icon.

Deleting an Automation Service Process

This command does not delete the process itself; rather, the status of the process is deleted

from the database.

To delete a service process:

1. Highlight the process in the list.

2. Click the Delete icon.

Global Administrators

As a global administrator, you can configure any system setting for all PaperVision Capture

entities. You can also access the settings for each job and job step for all entities. To access

this screen and see the list of global administrators, open Global Administration > Global

Administrators.

Global Administrators

Creating a New Global Administrator

To create a new global administrator:

1. Click the Create New Global Administrator icon.

New Global Administrator

2. Enter the User Name that will be used to log into PaperVision Capture.

3. Enter the user's Full Name (optional). The full name is used for PaperVision Capture

reporting capabilities.

4. Enter the user's Email Address (optional). This is used to send notifications via email

to the global administrator.

5. Enter the initial Password to access the system.

6. Enter the password again to confirm it.

7. Click OK.

Setting the Global Administrator’s Password

To set a global administrator's password:

1. Highlight the global administrator in the list.

2. Click the Set Password icon.

Set Password

3. Enter the password in the New Password field.

4. Enter the password once again in the Confirm Password field.

5. Click OK.

Deleting a Global Administrator

To delete a global administrator:

1. Highlight the account to delete.

2. Click the Delete icon.

3. Click Yes to proceed with the deletion.

Editing Properties of a Global Administrator

To edit the properties of a global administrator:

1. Double-click the global administrator in the list.

2. Make the necessary modifications to the account.

3. Click OK.

Note:

Modifications take effect the next time the global administrator logs into the

PaperVision Capture Administration Console.

Exiting Global Administration

The File menu allows you to exit the PaperVision Capture Administration Console.

Select File > Exit to close the PaperVision Capture Administration Console and log out of the

system.

Licensing

PaperVision Capture provides Concurrent and Named licenses. Concurrent licenses are

assigned to a specific entity and are available to any user for that entity. Concurrent licenses

provide the greatest flexibility since a license is only consumed when a user is logged into the

PaperVision Capture Operator Console. If no licenses have been added in the Administration

Console, the user will be prompted that none are available for the session in the Operator

Console.

Named licenses are assigned per machine or per process, not to individual users. Named

licenses may be consumed only by the machine or process to which they are assigned. To

ensure that a specific machine is always available to process automated jobs, a named license

could be assigned to your automation server. In this case, a named license would be required

for each instance of an automation server.

When an automation service process is executing custom code that adds new documents to a

batch, then the process requires the appropriate licenses based on job configuration. You can

configure multiple automation service processes to run on a single physical machine. When

named licenses are used, each automation server process consumes a license. For example, if

three automation service processes were running on a machine named WINXP, you would

need three named licenses as follows:

1. WINXP_0

2. WINXP_1

3. WINXP_2

Conversely, for concurrent licensing, each automation service process still requires a license,

but the naming scheme is not relevant.
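
As a rough illustration of the named-license example above, the following Python sketch lists the license names needed for a given machine. The function name is hypothetical, and the `MACHINE_index` pattern with indexes starting at 0 is inferred only from the WINXP example in this section.

```python
def named_license_names(machine_name, process_count):
    """Return one license name per automation service process.

    Assumes the MACHINE_index pattern from the WINXP example, with
    indexes starting at 0.
    """
    return [f"{machine_name}_{index}" for index in range(process_count)]


print(named_license_names("WINXP", 3))  # ['WINXP_0', 'WINXP_1', 'WINXP_2']
```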

In most scenarios, a license is consumed when a user works on a manual step in the Operator

Console. A license is released once a user logs out of the Operator Console. Additionally, a

license is released when a user session has timed out or when a user session is “killed” via

Current Sessions in the Administration Console.

To access the Licensing screen, expand Global Administration > Licensing.

Licensing

Demo Licenses

If you want to run PaperVision Capture in demonstration mode, please contact Digitech

Systems’ Technical Support to obtain a Demo license key. The Demo license includes all

functionality within PaperVision Capture, including global administration features. The

Demo license cannot be combined with the Concurrent or Named license types.

If you add the Demo license, a watermark will be applied on all images during the batch

submittal process in the PaperVision Capture Operator Console. Since the application writes

a watermark onto each captured image, non-repudiation is not supported in demo mode.

PaperVision Capture’s Demo license is designed specifically to demonstrate the features and

functionality of the product, and is not designed for high-volume performance testing. To

access non-repudiation technology and remove watermarks or to perform high-volume

testing, you must purchase a license of PaperVision Capture.

WARNING!

Removing the watermark is a violation of the PaperVision Capture End User License

Agreement (EULA).

Creating a New License

If you are integrating with PaperVision Enterprise, a global administrator can also add

licenses in the “thick” PaperVision Enterprise Administration Console.

To create a new license:

1. Click the Create New License icon in the toolbar, and the New License dialog

box appears.

New License

2. Enter the License Code that was included with your product documentation and

media.

3. Click the Web Authorization button to obtain the license key online.

4. Or, click the Phone Authorization button and contact Digitech Systems' Technical

Support toll-free at (877)374-3569 or direct at (402)484-7777 to obtain your license

key.

Note:

You must enter the Serial Number and Identifier Code before the license key

will be provided to you.

5. Enter the license key; then click OK. The new license will appear in the Licensing

screen.

6. To assign an entity to the license, double-click the license to open its properties.

7. Select the entity from the Assigned-To drop-down list.

8. Click OK.

Deleting a License

To delete a license:

1. Highlight the license in the list. You can also delete multiple licenses at one time.

2. Click the Delete icon.

3. Click Yes to confirm the deletion.

Editing License Properties

To edit the properties of a license:

1. Highlight the license.

2. Click the Properties icon. Licensing properties include the following

information:

• Product Name

• Version

• Quantity

• Serial Number

• License Date

• License Code

• Authorization Code

• Assigned To

• Named System

3. To assign a license to an entity, click the Assigned To drop-down menu to select

another entity.

4. To assign a license to a specific computer, enter the machine name in the Named

System field. Or, click the Browse button to locate the machine name.

5. Click OK.

Maintenance Queues

The Maintenance Queue lists batch submittals and other tasks that have been queued to be

processed by the PaperVision Capture Automation Service. Once a task has been completed,

it is automatically removed from the queue. To access maintenance queue items, open

Global Administration > Maintenance > Maintenance Queue.

Maintenance Queue

Deleting Maintenance Queue Items

Only use this command after you have viewed the Maintenance Logs and Windows Event

Viewer to identify and troubleshoot any processing errors.

If you delete a Submit Batch queue item, the batch will remain waiting for automated

processing. To remedy this, access Batch Management to change the status of the batch to

'Not Owned'. Changing the batch status allows another operator to assume ownership of the

batch and to repeat the current job step. For more information, see the section on Batch

Management in Chapter 11.

Note:

When a job step is repeated for a batch, some changes made by the previous

operator may be retained, but batch statistics for the previous operator’s work will

be deleted.

To delete a Maintenance Queue item:

1. Highlight the item(s).

2. Click the Delete icon.

WARNING:

Deleting a maintenance queue item can have unexpected effects on data integrity

and should be done only as a last resort. Before proceeding, you may want to

consult with Digitech Systems' Technical Support.

3. To proceed with the deletion, click Yes.

Maintenance Logs

Maintenance Logs provide a recorded history of maintenance jobs performed by the

PaperVision Capture Automation Service.

Viewing a Maintenance Log Entry

To view a log entry:

1. Open Global Administration > Maintenance > Maintenance Logs.

Maintenance Logs

2. In the Maintenance Logs list, double-click the maintenance log entry to view. The

Maintenance Log Properties screen opens.

Maintenance Log Properties

3. Click Close.

Filtering Maintenance Logs

The Filter command allows you to specify the maximum number of maintenance log records

to display per page.

To filter maintenance logs:

1. Click the Filter icon. The Maintenance Log Filter dialog box appears.

Maintenance Log Filter

2. Enter the maximum number of log entries to display in the screen.

3. Click OK.

Exporting Maintenance Logs

Maintenance logs can be exported to an XML file.

To export maintenance log(s):

1. Highlight the log(s) to export.

2. Click the Export icon.

3. Locate the export directory.

4. Enter the file name.

5. Click Open.

Deleting Maintenance Logs

To delete a maintenance log:

1. Highlight the log(s) in the list.

2. Click the Delete icon.

3. Click Yes to proceed with the deletion.

Process Locks

Process locks prevent multiple systems from simultaneously processing the same task. When

a system attempts to run a process, it creates a "lock" that prevents any other system from

starting the same work. For example, when System A attempts to run a task that System B is

currently processing, System A first verifies that no process lock has been placed before it sets

its own lock; because System B already holds the lock, System A does not start the task.

If a system encounters a failure during processing (e.g. power failure), the process lock may

not be released. In this case, you may have to manually release or delete the lock.
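
The locking behavior described above can be sketched in a few lines of Python. This is a simplified in-memory model, not PaperVision Capture's implementation: the class and method names are hypothetical, and the real locks are stored in the database, where the check and the set would need to happen atomically.

```python
class ProcessLockTable:
    """Hypothetical in-memory stand-in for the Process Locks list."""

    def __init__(self):
        self._locks = {}  # task name -> name of the system holding the lock

    def try_acquire(self, task, system):
        # Refuse the lock if any system already holds one for this task.
        if task in self._locks:
            return False
        self._locks[task] = system
        return True

    def release(self, task, system):
        # During normal operation, only the holder releases its own lock.
        if self._locks.get(task) == system:
            del self._locks[task]

    def delete(self, task):
        # Administrative removal of a stale lock, e.g. after a power failure.
        self._locks.pop(task, None)


locks = ProcessLockTable()
b_started = locks.try_acquire("Process Batch", "System B")
a_started = locks.try_acquire("Process Batch", "System A")

# System B fails without releasing; an administrator deletes the stale lock.
locks.delete("Process Batch")
a_retry = locks.try_acquire("Process Batch", "System A")
print(b_started, a_started, a_retry)  # True False True
```

The `delete` method corresponds to the manual deletion procedure that follows: until the stale lock is removed, no other system can start the task.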

To delete a process lock:

1. Expand Global Administration > Process Locks.

2. In the Process Locks list, highlight the lock to delete.

3. Click the Delete icon.

4. Click Yes to confirm the deletion.

System Settings

System Settings allows you to configure the Max Global Session Idle Time (in minutes) and

the Max Maintenance Log Age (in minutes). The Max Global Session Idle Time specifies the

number of minutes that a user can remain idle before the PaperVision Capture Automation

Service automatically terminates the user session (logs the user out of the system). The Max

Maintenance Log Age specifies the number of minutes that maintenance logs can

remain in the system before the PaperVision Capture Automation Service automatically

deletes them (provided that the Maintenance Log Cleanup operation has been scheduled for

completion). For sessions, each entity can have a customized setting that is specified in the

entity’s security policy. However, the global value found in System Settings determines the

maximum value that can be configured for each entity.
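
The relationship between the two settings and the values they govern can be sketched as follows. The function names are hypothetical, and the guide does not say whether the console rejects an entity value above the global maximum or silently caps it; this sketch assumes capping purely for illustration.

```python
from datetime import datetime, timedelta


def effective_idle_limit(entity_minutes, global_max_minutes):
    """Cap an entity's idle-time setting at the global maximum (assumed behavior)."""
    return min(entity_minutes, global_max_minutes)


def is_log_expired(log_time, now, max_log_age_minutes):
    """Return True when a maintenance log is older than the allowed age."""
    return now - log_time > timedelta(minutes=max_log_age_minutes)


now = datetime(2024, 1, 1, 12, 0)
# Entity asks for 90 minutes, but the global maximum is 60.
print(effective_idle_limit(90, 60))  # 60
# A log written at 09:00 is 180 minutes old, which exceeds a 120-minute limit.
print(is_log_expired(datetime(2024, 1, 1, 9, 0), now, 120))  # True
```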

To configure the general system settings:

1. Expand Global Administration > System Settings.

2. Double-click the Configure System Settings icon. The System Settings screen

appears.

System Settings

3. Enter the Max Global Session Idle Time (in minutes).

4. Enter the Max Maintenance Log Age (in minutes).

5. Click OK.

Automation Service Scheduling

PaperVision Capture provides automation services that automate the execution of a number of

operations. Unless an automation service is started, no automated processes will run and

backend work, such as processing submitted batches, will not be completed.

To open the Automation Service Scheduling Settings:

1. Expand Global Administration > System Settings.

2. Double-click Configure Automation Service Scheduling. For the selected

automation server, each scheduled operation is listed in the grid along with its

schedule, next/last run time, and status.

Automation Service Scheduling

Note:

More than one automation server can be configured to run on a single PC. The

number of automation servers is configured in the PaperVision Capture Setup Tool,

(Start > Programs > Digitech Systems > PaperVision Capture Setup Tool).

Automation servers on the same PC are distinguished by a trailing index (0, 1, 2,

etc.) in the automation server name.

To add a new automation service schedule:

1. Select the Automation Server from the drop-down list, and click the Add button,

which opens the New Automation Service Schedule dialog box.

New Automation Service Schedule

2. Select the Operation type from the drop-down menu. PaperVision Capture provides

automation services that automate the execution of the following operations:

• Maintenance Queue processes any maintenance items listed in the queue.

Maintenance queue items involve one-time operations such as processing

completed batches on the server or updating a specific job step’s list of predefined

index values.

• Maintenance Log Cleanup automatically deletes maintenance logs older than the

entity's specified Max Maintenance Log age setting.

• Process Batch executes automated PaperVision Capture job steps. By default, this

operation executes all associated functions. For information on configuring the

Process Batch operation to perform only specific functions, see Appendix C –

Modifying the Process Batch Operation.

• Destroy Batch automatically deletes batches that have been scheduled for

destruction.

• Session Grant Cleanup removes sessions that have remained idle as specified in

the entity's Max Session Idle Time setting.

3. Enter the Start Time when the operation will commence.

4. Select the Schedule, which is the time interval at which the service will run.

5. Enter the Repetition Schedule, which is the time interval at which the process will repeat.

You can schedule these operations to run at any of the following time intervals:

• Every x minutes

• Every x hours

• Every x days

• Every x weeks on specific days of the week

• On specific days of the month

6. Click OK.

7. In the Automation Service Scheduling dialog box, click Save.
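
To make the Start Time and repetition settings in the procedure above more concrete, the following Python sketch computes the next run time for the simple "every x minutes, hours, or days" intervals. It is an assumption about how such a schedule could be evaluated, not a description of the Automation Service's internals; `next_run` and its parameters are hypothetical, and the weekly and monthly options are not covered.

```python
from datetime import datetime, timedelta


def next_run(start, every, unit, now):
    """Return the first scheduled run time strictly after `now`.

    `unit` is one of "minutes", "hours", or "days".
    """
    if unit not in ("minutes", "hours", "days"):
        raise ValueError(f"unsupported unit: {unit}")
    if every < 1:
        raise ValueError("interval must be at least 1")
    interval = timedelta(**{unit: every})
    if now < start:
        return start
    # Whole intervals elapsed since the start time, plus one more.
    elapsed_intervals = (now - start) // interval
    return start + (elapsed_intervals + 1) * interval


start = datetime(2024, 1, 1, 8, 0)
print(next_run(start, 15, "minutes", datetime(2024, 1, 1, 8, 40)))  # 2024-01-01 08:45:00
print(next_run(start, 2, "hours", datetime(2024, 1, 1, 13, 30)))    # 2024-01-01 14:00:00
```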

To edit an automation service operation:

1. Highlight the operation in the Automation Service Scheduling list.

2. Click the Edit button.

3. Make changes to the operation.

4. Click OK.

To remove an automation service operation:

1. Highlight the operation in the Automation Service Scheduling list.

2. Click the Remove button.

3. Click Yes to confirm the removal.

Chapter 3 – Entity Administration

An entity is a body (e.g. a corporation or organization) that provides its own

administration. Only global and system administrators can configure an entity's

properties. Each entity contains its own users, groups, and jobs that are not shared among

entities. Entity administration can be performed either remotely or from a direct database

connection.

In general, most PaperVision Capture installations, including large enterprise installations,

will not need more than one entity. However, two entities can be configured for a distributed,

multi-user installation scenario. For example, one office (entity) can be located in Denver,

Colorado, and the other located in Lincoln, Nebraska. Each entity has a separate database, and

manages jobs, users, and batches solely for that entity. Both locations are monitored by a

single global administrator. This scenario can alleviate network congestion since each

location is a separate entity. If the Denver office becomes inundated with work and needs

assistance from Lincoln, Lincoln user accounts can be created for the Denver entity so users

can be assigned to Denver jobs. As a result, Lincoln users can simply log into the Denver

entity and process jobs for Denver.

To open an entity's properties, expand the Entities directory.

Entity Administration

The need for multiple entities can arise in specific circumstances:

• In a hosting environment where an on-demand provider is hosting data for multiple

companies and each company wants to be able to administer itself and its users

• In a large enterprise that has different departments or cost centers that want the ability

to administer themselves (separately from other departments) without having to

involve a central IT organization

Creating a New Entity

Entity properties dictate how the server will handle system-level functions relating to that

entity. Configuring entity properties, as well as creating, editing, and deleting entities, can be

performed by global and system administrators.

To create a new entity:

1. After logging into PaperVision Capture as a global administrator, highlight the

Entities directory, and click the New Entity icon. The New Entity screen

appears.

New Entity

2. Enter the Entity Name, which is the name of your company or organization.

3. In the Database Settings section, click the Configure button to assign the SQL

database information. Database settings include configuration settings for the

database where the entity resides. Only under special circumstances (e.g., moving the

database to a different server) should these settings ever be changed once the entity is

created. Changing these settings to another database or server for an existing entity

will NOT create new entity tables. The server will expect them to already exist.

SQL Data Source Information

4. In the SQL Data Source Information dialog box, enter the following information:

• Server IP/Name

• Database Name

• User Name

• Password

• Connection Type (select from the drop-down list)

• TCP/IP Port

5. Click OK in the SQL Data Source Information dialog box.

6. In the New Entity dialog, click the ellipsis button next to each entity path to enter its

location.

The following paths are also used by PaperVision Capture:

• Data Group Path specifies the location where data groups are to be copied. As

PaperVision Enterprise imports data groups, it can optionally copy the data

groups from their source location to a new location. This path also specifies where

new (attached) documents and new document versions are written to.

• Migration Path specifies the path where migration jobs or backup packages are

processed.

• Full-Text Path specifies the path where full-text database indexes are stored.

• Batch Path specifies the path where batches created by PaperVision Capture are

stored.

7. Select the Disable Entity check box to prevent all users, including administrators,

from logging into the system.

8. Click OK in the New Entity screen to save the properties.

Deleting an Entity

Deleting an entity removes it from the database. Additionally, deleting an entity removes any

full-text databases and data groups from PaperVision Enterprise (depending on global system

settings).

To delete an entity:

1. After logging into PaperVision Capture as a global administrator, highlight the

Entities directory, and then select one or more entities in the right pane.

2. Click the Delete icon.

3. Click OK to confirm the deletion.

Editing the Properties of an Entity

Global administrators can edit the properties of all entities; system administrators can edit the

properties of one entity at a time.

To edit the properties of an entity:

1. Select the Entities directory, and then highlight the appropriate entity in the right

pane.

2. Click the Properties icon.

3. Make the modifications in the Entity Properties dialog.

4. Click OK to save the changes.

Note:

Changing database settings to a new or different database does not create entity

tables in the new database. However, creating a new entity creates new entity tables

in the database.

General Security

The General Security screen allows you to manage PaperVision Capture’s encryption keys,

security policy, system groups, and system users.

To view the General Security settings:

1. Select Entity > Company > General Security. The General Security screen

appears.

General Security

2. To create encryption keys, double-click the Encryption Keys icon.

3. To assign users and groups who will have access to PaperVision Capture,

double-click the System Users or System Groups icon.

4. To assign the entity’s security settings, double-click the Security Policy icon.

Encryption Keys

PaperVision Capture provides the ability to configure and manage encryption keys in order to

protect your data while it resides inside the application. Once configured, an encryption key

can then be used for the encryption of batches, images, indices, and full-text OCR data. Once

a batch is encrypted, its data will be accessible from within PaperVision Capture (even when

the encryption key is modified or deleted), but you will not be able to open batch images with

any viewer. When encryption is enabled, images, indices, and full-text OCR data that are

exported from PaperVision Capture are decrypted during the export. Generally, encrypting

batches reduces overall system performance.

Note:

Encryption keys created in PaperVision Capture can be used in PaperVision

Enterprise and vice versa.

PaperVision Capture’s encryption process utilizes the following design:

• Algorithm: Rijndael – AES (256-bit)

• Encryption Mode: CBC (Cipher Block Chaining)

• Padding Method: FIPS81 (Federal Information Processing Standards 81) scheme

(ISO10126)

• Secret Key Generation: User-defined pass phrase is passed through the SHA-2

algorithm (Secure Hashing Algorithm) to generate a 256-bit hash
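
The secret key generation step can be illustrated with Python's standard `hashlib` module. This sketch assumes that the SHA-2 variant is SHA-256 (the only detail given above is that the hash is 256 bits long) and that the pass phrase is encoded as UTF-8; neither detail is confirmed by this guide, and `derive_key` is a hypothetical name.

```python
import hashlib


def derive_key(pass_phrase):
    """Hash a pass phrase with SHA-256 to obtain 32 bytes (256 bits).

    Assumptions: SHA-256 as the SHA-2 variant, UTF-8 text encoding.
    """
    return hashlib.sha256(pass_phrase.encode("utf-8")).digest()


key = derive_key("example pass phrase")
print(len(key) * 8)  # 256
```

A 256-bit digest matches the key size required by AES-256, which is consistent with the algorithm listed above. Because the same pass phrase always yields the same key under this scheme, the strength of the key depends entirely on the strength of the pass phrase.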

To view all encryption keys for an entity, double-click the Encryption Keys icon in the

General Security screen. The Encryption Keys screen appears.

Encryption Keys

Adding Encryption Keys

Once you add a new encryption key, only its description can be edited.

To add a new encryption key:

1. In the Encryption Keys screen, click the Add Key icon. The Add Encryption

Key dialog box appears.

New Encryption Key

2. Enter the Key Name that will be used to identify the key.

3. Select the Key Type, which identifies the type of encryption that will be used for this

key.

4. Enter the Pass Phrase that will be used to generate the key.

5. Optionally, provide a general description of the key.

6. Click OK to save the new encryption key.

Editing an Existing Encryption Key

In order to prevent any previously-encrypted data from becoming unreadable, only the

description of the encryption key can be modified.

To edit an existing encryption key:

1. In the Encryption Keys screen, select the appropriate encryption key, and then click

the Edit Key icon.

2. In the Edit Encryption Key dialog box, make the necessary modifications to the

description, and then click OK. The modifications will take effect the next time a

process loads the key values.

Deleting Encryption Keys

Important!

Data that has been encrypted with an encryption key may become unreadable if that

encryption key is deleted.

To delete an encryption key:

1. In the Encryption Keys screen, select an encryption key.

2. Click the Delete Key icon.

3. Click Yes to confirm the deletion.

Security Policy

Windows Authentication allows users of the PaperVision Capture Operator Console to

authenticate using their Windows domain and user name, eliminating the need to type in their

user name and password during each login. This requires that a PaperVision Capture user

account exists in the “Domain\User” format for the Windows user attempting to log in.

Windows Authentication can only be used when PaperVision Capture is connected directly to

the client database (in other words, it cannot be used when redirecting through a PaperVision

Capture application server).

When PaperVision Capture is connected directly to the client database from a remote station,

you must complete the following steps prior to enabling Windows Authentication:

1. Define the Master Batch Path as a UNC path (e.g., \\ServerName\MasterBatchPathFolder)

in the entity’s general properties.

2. Share the Master Batch Path folder with the appropriate users on the network.

3. Ensure that the PaperVision Data Transfer Agent service on the client workstation has

access to both the Master Batch Path and the Local Batch Path. If these paths do not

reside on the same machine, a domain account is recommended.

4. Ensure that the user specified in the previous step has full control (permissions) over the

Master Batch Path folder.

To configure the security policy for an entity:

1. In the General Security screen, double-click the Security Policy icon.

2. In the Security Policy screen, click the Configure Security Policy icon. The

Entity Security Policy screen appears.

Entity Security Policy

3. In the General System Settings section, select Enable Integrated Windows

Authentication to allow users to be authenticated using their Windows domain and

user name.

4. Enter the Max Session Idle Time (minutes), which is how long a user can remain idle before the

automation service automatically terminates the user session (logs the user out of the

system).

5. Click OK.

System Groups

Groups allow you to assign access and functionality to a set of similar users all at

once. In the System Groups screen, you can create, modify, and delete system groups. Groups

created in this screen can be assigned to job steps in the Job Definitions screen.

System Groups

To add a new system group:

1. In the General Security screen, double-click the System Groups icon.

2. In the System Groups screen, click the New Group icon in the toolbar. The

New Group dialog box appears.

New Group

3. In the New Group dialog box, enter the new group name.

4. From the Available Users list, highlight the users who will make up the group, and

then click the right arrow.

5. To add all available users to the new group, click Select All, and then click the right

arrow.

6. To remove a user from the new group, highlight the user in the Group Users list, and

then click the left arrow.

7. To remove all group users, click Select All in the Group Users list, and then click the

left arrow.

8. Click OK.

Deleting a System Group

To delete a system group:

1. Highlight the group in the list.

2. Click the Delete icon.

3. Click OK to proceed with the deletion.

4. Click Save.

Editing Properties of a Group

To edit properties of a group:

1. Highlight the group.

2. Click the Properties icon.

3. In the Group Properties dialog box, select the members who should make up the

group.

Note:

Group names cannot be edited; only the members can be edited.

4. Click Save.

System Users

In the System Users screen, you can create, modify, and delete system users who have access

to PaperVision Capture. Additionally, you can assign and reset users' passwords in this

screen.

System Users

Creating a New System User

To create a new system user:

1. In the General Security screen, double-click the System Users icon.

2. In the System Users screen, click the Create New User icon. The New User

dialog box appears.

New User

3. Enter the user name that will be used to log in to PaperVision Capture.

4. Enter the user’s full name (optional). The user’s full name is used for some of

PaperVision Capture’s reporting capabilities.

5. Enter the user's email address (optional).

6. Enter the user's password.

7. Enter the password once again to confirm it.

8. To force the user to change the password at the next login, select User must change

password at next login.

9. To allow the user to change the password at any time, select User can change

password when desired.

10. Select the appropriate User Type(s).

Note:

If you select System Administrator, the other user types will automatically be

assigned to the user. See the section on Supported Users in the Administration

Console in Chapter 1 for more information.

11. Click OK.

Setting the User Password

To set the user password:

1. Highlight the user in the list.

2. Click the Set Password icon.

3. In the Set Password dialog box, enter the new password for the user.

Note:

Passwords are case-sensitive.

4. Enter the new password once again to confirm it.

5. Select OK to set the new password.

Deleting a User

To delete a user:

1. Highlight the user in the list.

2. Click the Delete icon.

3. Click OK to proceed with the deletion.

Editing the Properties of a User

To edit the properties of a user:

1. Highlight the user in the list.

2. Click the Properties icon.

3. In the User Properties dialog box, make the appropriate changes to the user account.

4. Click OK.

Importing and Exporting Users

User lists can be imported and exported, populating most of the user’s configuration data.

Users can be imported using a pipe-delimited (“|”) or tab-delimited text file. Each line of the

text file can contain the following information (in this specific order):

• User Name

• Password

• Full Name

• Email Address

• System Administrator (if value is 1)

• Other Administrator (if value is 1, 2, or 3)

Note:

In the Other Administrator column, a Workflow Administrator has a value of 1; a

Capture Administrator has a value of 2; a Workflow and Capture Administrator

has a value of 3.

• User must change password at next login (if value is 1)

• User can change password when desired (if value is 1)

Only the first two fields (user name and password) are required on each line of text. If fields

are not specified, the default values are used. Below is a sample of an import file:

user1|password1|Test|test@test.com|0|1|1|1

user2|password2|Test2|test2@test.com|0|3|1|1
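
The field order above can be illustrated with a short Python sketch that parses one line of an import file. This is not PaperVision Capture code: the ImportedUser class and the parse_user_line function are hypothetical names, and the defaults used for omitted fields (empty strings and disabled flags) are assumptions, because this guide does not state the actual default values.

```python
from dataclasses import dataclass


@dataclass
class ImportedUser:
    """One line of a user import file (hypothetical helper, not a product API)."""
    user_name: str
    password: str
    full_name: str = ""
    email: str = ""
    system_admin: bool = False
    workflow_admin: bool = False
    capture_admin: bool = False
    must_change_password: bool = False
    can_change_password: bool = False


def parse_user_line(line: str) -> ImportedUser:
    """Parse a pipe-delimited or tab-delimited line in the documented field order."""
    delimiter = "|" if "|" in line else "\t"
    fields = [f.strip() for f in line.rstrip("\r\n").split(delimiter)]
    # Only the user name and password are required on each line.
    if len(fields) < 2 or not fields[0] or not fields[1]:
        raise ValueError("user name and password are required")
    # Pad omitted trailing fields; the empty defaults are an assumption.
    fields += [""] * (8 - len(fields))
    # Other Administrator: 1 = Workflow, 2 = Capture, 3 = Workflow and Capture.
    other = int(fields[5]) if fields[5] else 0
    return ImportedUser(
        user_name=fields[0],
        password=fields[1],
        full_name=fields[2],
        email=fields[3],
        system_admin=fields[4] == "1",
        workflow_admin=other in (1, 3),
        capture_admin=other in (2, 3),
        must_change_password=fields[6] == "1",
        can_change_password=fields[7] == "1",
    )
```

A script like this could be used to check a file for missing required fields before it is selected in the Import Users dialog.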

To import users:

1. In the System Users screen, click the Import Users icon.

2. Select the text file containing the user information.

Note:

Existing users are not recreated during the import process.

To export all users:

1. In the System Users screen, click the Export Users icon.

2. In the Export Users dialog box, locate the directory where the text file will be saved.

3. Enter the export file name.

4. Click Save.

Note:

User passwords are not exported from PaperVision Capture; rather, passwords are

exported as empty strings in the text file. Consequently, exported users will be

required to change their passwords the next time they log in to the Operator Console.

Current Sessions

When a user logs in to the PaperVision Capture Operator Console, a session is started. Every time a

user accesses the server, PaperVision Capture verifies that the session is still valid, performs

the requested operation, and then updates the Last Activity Time column for the user. If a user

sits idle for too long (as specified by the administrator), the user’s session may automatically

be terminated (essentially, logged off). Current Sessions also displays the number of available

and used concurrent licenses in PaperVision Capture. To view the Current Sessions, select

Current Activity > Current Sessions.
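
The session check described above can be sketched in Python as follows. This is only an illustration of the idea, not the server's implementation: the session dictionary, the touch_session function, and the idle_timeout parameter are hypothetical names, and the exact timeout comparison is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional


def session_is_valid(last_activity: datetime, now: datetime,
                     idle_timeout: Optional[timedelta]) -> bool:
    """Return True if the session has not been idle longer than the timeout."""
    if idle_timeout is None:
        # Assumption: no timeout configured means sessions never expire.
        return True
    return now - last_activity <= idle_timeout


def touch_session(session: dict, now: datetime,
                  idle_timeout: Optional[timedelta]) -> bool:
    """Verify the session on a server request, then update Last Activity Time."""
    if not session_is_valid(session["last_activity"], now, idle_timeout):
        return False  # the session would be terminated (logged off)
    session["last_activity"] = now
    return True
```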

Current Sessions

To kill a user session:

1. Highlight the user session.

2. Click the Kill Session icon.

3. Click Yes to confirm session termination.

Chapter 4 – Capture Job Configuration

In PaperVision Capture, a job is a defined workflow composed of one or more

job steps. For example, a job can be configured to scan documents, index

documents automatically, and then export documents. At least one job must be configured in

the PaperVision Capture Administration Console; otherwise, batches cannot be processed in

the PaperVision Capture Operator Console. Each job must contain, at minimum, a Capture

start step. Job steps are configured in the Job Definitions screen, which opens when you add a

new job. Once you configure all job steps and validate the job, you can activate the job and

check it in so that it is available for use in the PaperVision Capture Operator Console.

Capture Jobs

Creating a New Job

You can create a new job from the main Capture Jobs screen.

To create a new job:

1. Expand Entities > Company.

2. Highlight Capture Jobs.

3. Click the Create New Job icon.

4. Enter the name for the new job.

5. Click OK. The Job Definitions screen appears where you can add and configure job

steps for each PaperVision Capture job. For more information, see the next section on

Job Definitions.

Editing a Job

To edit an existing job:

1. In the Capture Jobs screen, highlight the job.

2. Click the Edit Job icon.

3. Make the necessary changes in Job Definitions.

4. Save the job.

Note:

For information on configuring jobs, see the section on Job Definitions in this

chapter.

Saving a Job

An unsaved job displays an asterisk (*) next to its name. To save the current job open in the

workspace, click the Save Job icon.

Saving All Jobs

Unsaved jobs display an asterisk (*) next to their names. To save all jobs that are open in the

workspace, click the Save All icon.

Deleting Jobs

You can delete one or more jobs from the Capture Job list.

To delete one or more jobs:

1. Highlight one or more jobs.

2. Click the Delete Job icon.

3. To proceed with the deletion, click OK.

Checking Out a Job

To edit a job, the job has to be checked out of the Capture Jobs screen. Only one administrator

can check out a job at a time. To check out a job, click the Check Out Job icon.

Checking In a Job

After editing a job, it has to be checked in before its new version can be used to process

batches in the PaperVision Capture Operator Console. To check in a job, click the Check In

Job icon.

Undoing a Job Checkout

If you make changes to a job and do not wish to save the changes, use the Undo Checkout

command.

To undo a checkout:

1. Click the Undo Checkout icon.

2. Click OK at the message prompt. Your changes are not saved.

Importing a Job

Existing jobs can be imported into the Capture Jobs screen for the entity.

To import a job:

1. Click the Import Job icon, and the Open dialog box appears.

2. Select the XML document to import.

3. Click Open.

Note:

If you cannot find the XML file, ensure that the job has already been successfully

exported from the Job Definitions screen.

Exporting a Job

To export a job:

1. Click the Export Job icon.

2. In the Save As dialog box, locate the directory to save the exported XML file.

Note:

Users (in the Assigned To field) are not exported with jobs from the PaperVision

Capture Administration Console. When these jobs are subsequently imported

back into Job Definitions, the Assigned To field will not contain any users.

3. Enter a file name.

4. Click Save.

Cloning a Job

Cloning a job copies the components of the open job including its steps, configurations, and

assigned users into a new job.

To clone a job:

1. Highlight the job to be cloned.

2. Click the Clone Job icon.

3. Enter the name of the new job. Job Definitions opens the new job, its steps,

configurations, and assigned users.

Job Definitions

The Job Definitions screen enables you to create and configure jobs and job steps in a

graphical user interface. The Job Step Toolbox holds the job steps that you can drag and drop

directly into the workspace area. The Properties grid displays the settings for each job and job

step. The Job Steps grid summarizes the selected job step by name, type, assigned user, next

job step, mode, age priority, and step priority. You can customize the appearance of the

workspace by moving the Job Step Toolbox, Properties grid, and Job Steps grid.

Job Step Toolbox

The Job Step Toolbox contains PaperVision Capture's job steps that you can drag and drop

into the workspace:

Job Step Toolbox

To insert a job step into the workspace:

1. Highlight the job step in the Job Step Toolbox.

2. Hold the left mouse button while you drag the job step into the workspace.

3. To configure a job step’s properties, double-click the job step. For more

information on configuration, see the section on Job Steps in this chapter.

Job Properties

The Job Properties grid contains the settings specific to the open job. Each property name is

listed in the grid's left column; the right column contains editable fields, drop-down menus, or

ellipsis buttons where you configure the properties. Properties that are not applicable to the

job or the selected job step, or that contain read-only information, are disabled. If you select a job

step in the workspace, the grid reveals the properties applicable to the selected job step.

Tip:

To clear a setting that was configured with an ellipsis button, right-click the ellipsis

button and select Reset.

Job Properties

Active

If the Active status is set to True, the job has been activated. If the status is False, the job has

not been activated.

Note:

Batches can only be created for active jobs that have been checked into the server.

Age Priority

The job's Age Priority value is used in the calculation of the overall batch priority assigned in

the PaperVision Capture Operator Console. For details on the batch priority calculation, see

the section on PaperVision Capture Terminology in Chapter 1.

Comments

This editable field contains additional details, comments, etc. about the job.

Custom QC Tags

You can define the QC tags available for selection in jobs requiring manual inspections on

batches, documents, pages, and indexes.

To add custom QC tags to a job:

1. Click the ellipsis button next to the Custom QC Tags row. The Custom QC Tags

dialog box appears.

Custom QC Tags

2. Select the appropriate category (Batch, Document, Index, Page).

3. To add a custom tag, click the Add button in the Custom Tags section, and

then enter the tag name.

4. The Predefined Tags are listed for your reference. Click the Hide Predefined link to

hide these tags.

5. When you are finished adding tags, click OK.

Note:

The Predefined Tags are provided for informational purposes only. All predefined

tags will be used in an Automated QC step and will be available for selection in the

Manual QC step.

Detail Set

In PaperVision Capture, detail sets define a collection of indexes that allow multiple sets of

field data to reference a single document. To configure a detail set for the job, click the

ellipsis button in the right column of the Detail Set field. For more information, see the

section on Detail Sets in this chapter.

Entity

This read-only field displays the name of the current entity.

Name

This editable field contains the name of the open job.

Number Steps

This read-only field displays the number of job steps that comprise the job.

Job Steps Grid

The Job Steps grid allows you to assign the job step to a user or group, connect job steps, and

assign age and step priorities. Additionally, you can view the job step type and mode (manual

or automated) and change the name of the job step.

Job Steps Grid

Name

This editable field contains the name of the job step.

Type

This read-only field displays the type of job step.

Assigned To

This editable field contains the user or group assigned to the job step.

The next job step field is editable and displays the job step that immediately follows the selected job step.

Fail

This selection is the job step to which a failed QC step returns.

Mode

The Mode indicates whether a user manually completes the job step or if it is completed

automatically without user intervention.

Age Priority

Age Priority is a value that you assign to the job step. This value is used in the calculation of

the overall batch priority that is assigned in the PaperVision Capture Operator Console. Type

the value directly in the field, or click the up and down arrows to select a value between 0 and

100. For details on the batch priority calculation, see the section on PaperVision Capture

Terminology in Chapter 1.

Step Priority

Step Priority is a value that you assign to the job step. This value is used in the calculation of

the overall batch priority that is assigned in the PaperVision Capture Operator Console. Type

the value directly in the field, or click the up and down arrows to select a value between 0 and

100.

Showing and Hiding Columns

To show/hide columns in the grid:

1. Click the Show/Hide Columns icon in the Job Steps grid, and the Select Columns

dialog box appears:

Select Columns

2. Select the columns to display in the grid.

3. Click the Move Up or Move Down buttons to reorder the columns.

4. Click OK.

Aligning Job Steps

You can align job steps by using the Alignment commands described in the table below:

Alignment Commands

Command           Description
Align Left        Aligns all selected steps to the left side of the last selected step
Align Center      Aligns all selected steps to the center of the last selected step
Align Right       Aligns all selected steps to the right side of the last selected step
Align Top         Aligns all selected steps to the top of the last selected step
Align Middle      Aligns all selected steps to the middle of the last selected step
Align Bottom      Aligns all selected steps to the bottom of the last selected step
Make Same Width   Resizes all selected job steps to match the width of the last selected step
Make Same Height  Resizes all selected job steps to match the height of the last selected step
Make Same Size    Resizes all selected job steps to match the size of the last selected step

Job Menu

The Job menu in the Job Definitions screen contains the same commands that are available in

the Capture Jobs screen. Additionally, the Close and Exit commands are accessible in the Job

Definitions Job menu.

Creating a New Job

To create a new job:

1. Click the New Job icon in the toolbar.

2. Select the appropriate entity in the New Job dialog box.

3. Click OK.

4. Enter the name for the new job.

5. Click OK, and a new job tab appears.

Opening a Job

To open an existing job:

1. Click the Open Job icon.

2. Select the entity.

3. Click OK.

4. In the Select Job dialog box, double-click the job to open, and it will open in the

workspace.

Saving a Job

Unsaved jobs will display an asterisk (*) next to the tab's name. To save the current job

open in the workspace, click the Save Job icon.

Saving All Jobs

Each unsaved job displays an asterisk (*) next to its name in its tab. To save all jobs that

have unsaved changes, click the Save All icon.

Deleting a Job

To delete a job:

1. Click the Delete Job icon.

2. To proceed with the deletion, click OK.

Exporting a Job

To export a job:

1. Click the Export Job icon.

2. In the Save As dialog box, locate the directory to save the exported XML file.

Note:

Users (in the Assigned To field) are not exported with jobs from the

PaperVision Capture Administration Console. When these jobs are

subsequently imported back into Job Definitions, the Assigned To field will

not contain any users.

3. Enter a file name.

4. Click Save.

Importing a Job

To import a job:

1. Click the Import Job icon, and the Open dialog box appears.

2. Locate the XML document, and click Open.

Cloning a Job

Cloning a job copies the components of the open job including its steps, configurations, and

assigned users into a new job.

To clone a job:

1. Open the job to be cloned.

2. Click the Clone Job icon.

3. Enter the name of the new job. Job Definitions opens the new job, its steps,

configurations, and assigned users.

Validating a Job

The Validate operation verifies that all job steps and job step paths have been

configured correctly. Because a job can contain two or more start steps, or a QC step with pass

and fail links, every path leading from a start step must end at the same job step for the job to be valid.

For example, you may see a message when executing the Validate operation if you did not

correctly configure all paths leading from three start steps:

Job Paths Invalid

To validate a job:

1. After configuring all job steps’ properties and paths, click the Validate Job icon.

If any errors exist, a message notifies you that the job is invalid and describes each

error for your reference. Steps containing errors will be highlighted in the workspace.

Tip:

If you hover the mouse over the step containing the error, the error appears in

a tooltip message.

2. Once you fix any existing errors, repeat the first step once again to validate the job.

3. Once no errors exist, a message notifies you that the job is valid.

4. Click OK. The job is ready to be activated and checked into the server.
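
The path rule checked by the Validate operation can be sketched as a small graph check in Python. This is a simplified illustration under stated assumptions, not the product's validator: the job is modeled as a dictionary that maps each step name to the steps its links point to (pass and fail links included), and the function names are hypothetical. The real Validate operation also checks each step's configuration, which is not modeled here.

```python
def reachable_end_steps(links: dict, start: str) -> set:
    """Collect the steps with no outgoing links that are reachable from start."""
    seen, stack, ends = set(), [start], set()
    while stack:
        step = stack.pop()
        if step in seen:
            continue  # already visited, for example through a QC fail link loop
        seen.add(step)
        next_steps = links.get(step, [])
        if not next_steps:
            ends.add(step)
        stack.extend(next_steps)
    return ends


def paths_are_valid(links: dict, start_steps: list) -> bool:
    """Return True when every start step leads to the same single end step."""
    all_ends = set()
    for start in start_steps:
        ends = reachable_end_steps(links, start)
        if not ends:
            return False  # this start step only leads into a loop
        all_ends |= ends
    return len(all_ends) == 1
```

For example, two Capture start steps that both flow through Indexing and Manual QC to one final Custom Code step pass this check, while a job in which one start step ends at a different final step does not.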

Activating a Job

To activate a job:

1. After you finish configuring and validating the job, click the Activate Job icon.

Note:

You must activate and check the job into the server to make it available for

use in the PaperVision Capture Operator Console.

2. A message will appear if a job is invalid and will describe the errors found in each job

step. Click OK after you view the error message.

Deactivating a Job

Only an active job can be deactivated. To deactivate a job, click the Deactivate Job icon.

Checking Out a Job

To edit a job, you have to first check out the job. Only one administrator can check out a job

at a time. To check out a job, click the Check Out Job icon.

Checking In a Job

To check in a job, click the Check In Job icon.

Note:

Checking in a job automatically saves the job.

Undoing a Job Checkout

If you make changes to a job and do not want to save the changes, use the Undo Checkout

command.

To undo a checkout:

1. Click the Undo Checkout icon.

2. Click OK to confirm that edits made during the checkout should be discarded.

Closing a Job

To close the current job window, select Job > Close.

Exiting Job Definitions

To exit Job Definitions and close all open Job windows, select Job > Exit.

Cutting, Copying, and Pasting Job Steps

To cut and paste a job step:

1. Select the job step.

2. Click the Cut Job Step(s) icon to place the job step(s) on the Clipboard. A gray

grid will appear over the job step.

3. In the new location, click the Paste Job Step(s) icon.

To copy and paste a job step:

1. Select the job step.

2. Click the Copy Job Step(s) icon to copy the job step(s) to the Clipboard.

3. In the new location, click the Paste Job Step(s) icon.

To delete a job step:

1. Select the job step.

2. Click the Delete Job Step(s) icon.

3. Click Yes to confirm the deletion.

Detail Sets

In PaperVision Capture, detail sets define a collection of indexes that allow multiple sets of

field data to reference a single document. Detail sets are configured at the job level within the

Job Definitions screen and can then be applied at the job step level.

For example, in an accounts payable job, index fields may be set up for check number, check

date, payee, invoice number, and invoice date. If you set up all of these fields as index fields,

a single document may be represented as follows:

Check Number  Check Date  Payee     Invoice Number  Invoice Date
12345         08/19/2008  ABC Corp  A0001           08/01/2008
12345         08/19/2008  ABC Corp  A0002           08/02/2008
12345         08/19/2008  ABC Corp  A0003           08/03/2008

The first three index fields (Check Number, Check Date, and Payee) are duplicated for

each invoice number. Rather than duplicating the information in the first three fields, you

can keep the first three fields as index fields and assign the remaining two fields, Invoice

Number and Invoice Date, to a detail set.

Index Fields

Check Number  Check Date  Payee     Document ID (system-generated)*
12345         08/19/2008  ABC Corp  654

* This system Document ID is generated behind the scenes, hidden from your view.

Detail Sets

Invoice Number  Invoice Date  Document ID (system-generated)*
A0001           08/01/2008    654
A0002           08/02/2008    654
A0003           08/03/2008    654
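
The split between index fields and a detail set shown in the tables above can be sketched in Python. This is only a model of the data layout, assuming each flat row is a dictionary keyed by field name; the split_into_detail_set function is a hypothetical helper, and the Document ID is passed in because the system generates it behind the scenes.

```python
def split_into_detail_set(rows, index_fields, detail_fields, document_id):
    """Split flat rows for one document into an index record and detail records."""
    # The index field values are identical on every row, so the first row suffices.
    index_record = {field: rows[0][field] for field in index_fields}
    index_record["Document ID"] = document_id
    # Each row contributes one detail record that references the same document.
    detail_records = []
    for row in rows:
        record = {field: row[field] for field in detail_fields}
        record["Document ID"] = document_id
        detail_records.append(record)
    return index_record, detail_records


rows = [
    {"Check Number": "12345", "Check Date": "08/19/2008", "Payee": "ABC Corp",
     "Invoice Number": "A0001", "Invoice Date": "08/01/2008"},
    {"Check Number": "12345", "Check Date": "08/19/2008", "Payee": "ABC Corp",
     "Invoice Number": "A0002", "Invoice Date": "08/02/2008"},
    {"Check Number": "12345", "Check Date": "08/19/2008", "Payee": "ABC Corp",
     "Invoice Number": "A0003", "Invoice Date": "08/03/2008"},
]
index_record, detail_records = split_into_detail_set(
    rows,
    index_fields=["Check Number", "Check Date", "Payee"],
    detail_fields=["Invoice Number", "Invoice Date"],
    document_id=654,
)
```

The check information is stored once, and each invoice becomes one detail record that carries the same Document ID.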

Configuring Detail Sets

To configure detail sets in PaperVision Capture:

1. In the Properties grid for the job, expand the General node.

Note:

Configuring detail sets for the job follows the same general steps as configuring

indexes for the job step.

2. Click the ellipsis button in the right column of the Detail Set property, which

opens the Detail Set Configuration dialog box.

Detail Set Configuration

3. To add an index value, click Add. For more information on configuring the index

properties, see the sections on General (Step Level) and Predefined Index Values

(Job Level) in Chapter 6.

Tip:

To prevent the programming language prompt from appearing each time you

configure custom code events, right-click the ellipsis button, and select Custom

Code Options. Select either the C# or Visual Basic programming language to use

by default, and then choose the option to suppress the dialog when creating new

custom code.

4. After configuring the index properties, click OK.

Tip:

To clear a configured detail set, right-click the ellipsis button in the Properties

grid and select Reset.

Job Steps

A job step is an automated or manual operation that is performed on a batch. Manual job steps

are performed by assigned users through the PaperVision Capture Operator Console;

automated job steps are completed by the PaperVision Capture Automation Service and

require no user intervention. The Job Definitions screen allows you to create and configure

the job steps that comprise each job. You can drag job steps directly from the Job Step

Toolbox and drop them anywhere in the workspace.

Job Step Toolbox

Capture

The Capture job step is a manual step that allows you to define the parameters of the

operator's electronic document capture process such as page rotation, auto document breaks,

maximum documents per batch, etc.

Indexing

The Indexing job step enables you to configure how index value population and validation

will be performed in the PaperVision Capture Operator Console.

Barcode

The Barcode job step allows you to configure a barcode reading process that is executed

automatically by the PaperVision Capture Automation Service.

Nuance Zonal OCR (Optical Character Recognition)

During the OCR process, PaperVision Capture automatically extracts information from

scanned or imported documents. You can configure this step to read textual information from

zonal regions.

Open Text Zonal OCR

During the OCR process, PaperVision Capture automatically extracts information from

scanned or imported documents. You can configure this step to read textual information from

zonal regions.

Nuance Full-Text OCR

During the Nuance Full-Text OCR process, PaperVision Capture automatically extracts pages

of text and converts recognized results to one or multiple file types such as .txt, .rtf, .csv, .pdf,

.doc (and .docx), .htm, .xls (and .xlsx), and others.

Open Text Full-Text OCR

During the Open Text Full-Text OCR process, PaperVision Capture automatically extracts

pages of text and converts recognized results to one or multiple file types including .pdf, .txt,

PaperVision Enterprise (.txt), and PaperFlow (.txt).

Image Processing

During the automated Image Processing job step, the system removes any unwanted noise,

lines, borders, and other extraneous objects from images as they are scanned or imported.

Additional filters identify color within images and delete or retain colors and pages as your

specified criteria are met.

Custom Code

The flexible and automated custom code capabilities of PaperVision Capture enable you to

define any action (including import, export, match and merge, etc.) through custom code.

Manual QC

The Manual QC step enables operators to visually inspect images and index values in order to

manually tag batches, documents, pages, and index fields for further review or processing in

the Operator Console.

Automated QC

The Automated QC job step provides automated functionality for quality control (QC) operations

on indexes and images, eliminating the need for user intervention in the Operator Console.

The Automated QC step is designed to greatly enhance QC accuracy and productivity for

PaperVision Capture batches and jobs.

Adding Links

The Add Link command connects two job steps together.

To connect two job steps:

1. Select the two job steps to link together.

2. Click the Add Link icon.

Flipping Link Direction

The Flip Link Direction command reverses the direction of the link that connects two job steps.

To flip a link between job steps:

1. Select the two linked job steps.

2. Click the Flip Link Direction icon.

Removing a Link

The Remove Link command disconnects two linked job steps.

To remove a link between job steps:

1. Select the two linked job steps.

2. Click the Remove Link icon.

Zooming In

To zoom in on the workspace, click the Zoom In icon.

Zooming Out

To zoom out of the workspace, click the Zoom Out icon.

Resetting the Zoom

To reset the view of the workspace, click the Zoom Reset icon.

General Properties

To configure each job step's general properties, select the job step in the workspace, and then

expand the General node in the Properties grid.

General Properties - Indexing Job Step

Age Priority

This value is used to calculate the overall batch priority in the PaperVision Capture Operator

Console. Click the Age Priority drop-down menu to open the slider, and you can rank the job

step on a scale from 0 to 100. For more information on batch priority, see the section on

PaperVision Capture Terminology in Chapter 1.

Assigned To

This property is applicable to all manual job steps. You can assign one or more users or

groups who can complete the selected job step.

To assign the user or group to the job step:

1. Click the ellipsis button in the Assigned To field.

2. In the Job Step Assignment dialog box, select the users and/or groups who will be

assigned the job step in the PaperVision Capture Operator Console.

3. Click OK.

Batch Destruction Offset

The Batch Destruction Offset property can be applied to any job step. The offset period begins

after the operator submits the batch for the job step. For example, suppose a Capture step has a

Batch Destruction Offset of one hour, and the operator creates a new

batch, scans documents, and then submits the batch. The next time the PaperVision Capture

Automation Service runs (provided that one hour has passed and the Batch Destruction

operation has been scheduled to run), the offset is applied and the batch is

purged.

To assign the Batch Destruction Offset to the job step:

1. Click the ellipsis button in the Batch Destruction Offset field.

2. In the Destruction Offset dialog box, enter the days, hours, and/or minutes. These

values represent the duration after which any batches that complete the step are to be

destroyed.

Destruction Offset

3. If you want to keep the batch's statistics, select the Retain Statistics check box.

4. Click OK.
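
The timing of the Batch Destruction Offset can be sketched in Python. This is an illustration only: the destruction_due function is a hypothetical name, and it assumes the offset is measured from the time the batch is submitted for the step and that a batch becomes eligible once the full offset has elapsed. The batch is actually purged only when the Automation Service next runs a scheduled Batch Destruction operation.

```python
from datetime import datetime, timedelta


def destruction_due(submitted_at: datetime, offset: timedelta, now: datetime) -> bool:
    """Return True once the offset has elapsed since the batch was submitted."""
    return now >= submitted_at + offset


# Offset entered in the Destruction Offset dialog box as days, hours, and minutes.
one_hour = timedelta(days=0, hours=1, minutes=0)
submitted = datetime(2024, 1, 1, 9, 0)
too_early = destruction_due(submitted, one_hour, datetime(2024, 1, 1, 9, 59))
eligible = destruction_due(submitted, one_hour, datetime(2024, 1, 1, 10, 0))
```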

Is Start Step

By default, this property is enabled (and editable) for Capture steps. You must assign a

Capture step as the Start Step; select True from the drop-down menu.

License Requirements

This read-only field displays the software licenses required for each job step. For example, the

Capture step requires, at minimum, the Capture Scan license. However, if image processing

will be performed on scanned images, the Capture step will then require both the Capture

Scan and Image Processing licenses. Automated steps, such as the Image Processing and

Custom Code steps, generally do not consume licenses upon execution, so they do not require

licenses.

Until you define a Barcode Zone or OCR Zone within the appropriate step, each step’s

License Requirements property will not display the Barcode or OCR license. The Barcode

step requires either the 1-D Barcode or 2-D Barcode license, depending on the type of

barcode you select. If you select both 1D and 2D barcode types to be recognized, both license

requirements will display in the field. The OCR step requires either the Optical Character

Recognition (OCR) or Intelligent Character Recognition (ICR) license. The OCR license is

required if you choose any of the Omnifont modules, Matrix Matching, or Draft Dot-Matrix

module. The ICR license is required if you select the Constrained Handprint (Numeric) or the

Constrained Handprint (Alphanumeric) module.
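
The license rules in the paragraph above can be summarized in a short Python sketch. The function names are hypothetical, the inputs are simplified (barcode types are reduced to "1D" or "2D", one entry per selected type), and matching Omnifont modules by a name prefix is an assumption because the individual Omnifont module names are not listed in this section.

```python
# Recognition modules named in this section.
OCR_MODULES = {"Matrix Matching", "Draft Dot-Matrix"}
ICR_MODULES = {
    "Constrained Handprint (Numeric)",
    "Constrained Handprint (Alphanumeric)",
}


def barcode_licenses(selected_dimensions):
    """Map selected barcode types ("1D" or "2D") to the licenses they require."""
    names = {"1D": "1-D Barcode", "2D": "2-D Barcode"}
    # With no zone defined, nothing is selected and no license is displayed.
    return {names[d] for d in selected_dimensions}


def ocr_licenses(selected_modules):
    """Map selected recognition modules to the OCR and/or ICR license."""
    licenses = set()
    for module in selected_modules:
        if module in ICR_MODULES:
            licenses.add("ICR")
        elif module in OCR_MODULES or module.startswith("Omnifont"):
            # Assumption: every Omnifont module name starts with "Omnifont".
            licenses.add("OCR")
    return licenses
```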

Merge Like Documents

The Merge Like Documents command merges pages from multiple documents with the same

index values into a single document. Documents that have not been indexed are not included

in the merge process. The Merge Like Documents command is performed on all documents in

the batch.

To configure the Merge Like Documents setting:

1. Click the ellipsis button in the Merge Like Documents field. The Merge Like

Documents Configuration dialog box appears.

Merge Like Documents Configuration

2. You can determine the page order of the merged document. Select Merge in Reverse

Direction to place the last page at the beginning of the resulting document. If all pages

should appear in the order in which they are merged, do not select this option.

3. All index values defined for the job appear in the Available list. Highlight the index

values to be included in the Merge Like Documents operation, and click the right arrow.

Your selected index values will appear in the Selected list.

4. Or, choose Select All, and then click the right arrow.

5. To remove a selected index value, highlight the index value in the Selected list, and then

click the left arrow.

6. Or, choose Select All to remove all index values from the Selected list, and then click the

left arrow.

7. By default, blank index values are not included in the merged document. If blank index

values should be included in the merged document, select the Allow Blank check box for

the appropriate index value. For example, if you select the Allow Blank check box for

the Invoice Number index value, all documents must contain blank Invoice Number

index values in order to be merged into one document. If at least one Invoice Number

index value is defined and the remaining index values are blank (or vice versa), the

documents will not be merged.

8. Click OK.
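The matching rules described in the steps above can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name, the dictionary layout of a document, and the choice to model Merge in Reverse Direction as a reversal of the merged page list are all assumptions made for illustration.

```python
def merge_like_documents(documents, selected_indexes, allow_blank=(), reverse=False):
    """Merge documents whose selected index values match (illustrative model).

    documents: list of {"indexes": {name: value}, "pages": [...]} (assumed layout)
    selected_indexes: index names moved to the Selected list
    allow_blank: index names whose Allow Blank check box is selected
    reverse: assumed model of Merge in Reverse Direction
    """
    merged = {}      # match key -> merged document (dicts keep insertion order)
    untouched = []   # documents left out of the merge
    for doc in documents:
        key = tuple(doc["indexes"].get(name, "") for name in selected_indexes)
        has_disallowed_blank = any(
            value == "" and name not in allow_blank
            for name, value in zip(selected_indexes, key)
        )
        if has_disallowed_blank:
            untouched.append(doc)
            continue
        if key not in merged:
            merged[key] = {"indexes": dict(doc["indexes"]), "pages": []}
        merged[key]["pages"].extend(doc["pages"])
    results = list(merged.values())
    if reverse:
        for doc in results:
            doc["pages"].reverse()
    return results, untouched
```

In this model a blank value only ever matches another blank value, so a mix of blank and defined Invoice Number values is never merged, which mirrors the example in step 7.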

Mode

The read-only field indicates that the step is either manual or automated.

Name

This editable field contains the name of the job step.

Pre-Caching

Applicable to manual job steps, this setting maximizes operator productivity by facilitating faster page downloading in the Operator Console. When this setting is configured, the number of pages you specify is downloaded from each document first, and the remaining pages are downloaded afterward, as operators take/open batches.

For example, if an operator manually indexes only the first page of every 10-page document,

you can enable the Pre-Caching setting in the Indexing step and set the Number Pages

setting to 1. Therefore, when an operator takes/opens a batch, only the first page is

downloaded from each document (before the remaining pages of each document). Pre-caching

maximizes productivity since operators do not have to wait for an entire batch (or entire

documents) to be downloaded to perform their work.

Note:

Even if the first page of every document has not yet been downloaded, the operator can still open the batch and begin indexing the initial documents in the batch.
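The download order that Pre-Caching produces can be pictured with a short Python sketch. This is only a model of the behavior described above, not the Operator Console's actual download logic; the function name and the (document ID, page count) tuples are hypothetical.

```python
def precache_download_order(documents, number_pages):
    """Return (document_id, page_number) pairs in the modeled download order.

    documents: list of (document_id, page_count) tuples (assumed layout)
    number_pages: value of the Number Pages setting
    """
    precached = []   # first N pages of every document, downloaded first
    remaining = []   # all other pages, downloaded afterward
    for document_id, page_count in documents:
        for page in range(1, page_count + 1):
            target = precached if page <= number_pages else remaining
            target.append((document_id, page))
    return precached + remaining
```

With Number Pages set to 1, every document's first page precedes any second page in the returned order, which is why an operator who indexes only first pages does not wait for whole documents.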

Source Image Step

To display images for a selected job step in the PaperVision Capture Operator Console, select

the job step from the Source Image Step drop-down menu. For example, you can select the

Capture step's images to display in the Operator Console for the Indexing step. When the

operator opens the Indexing step, images from the Capture step will appear.

Step Priority

This value is associated with the current job step and assigned by an administrator. To edit the

step priority, click the drop-down menu to open the slider. You can rank the job step on a

scale from 0 to 100. For more information on batch priority, see the section on PaperVision

Capture Terminology in Chapter 1.

Type

This read-only field displays the type of job step.

Use Non-Repudiation

This property is applicable to all job steps. When this value is set to True, a SHA-512 hash value is calculated and stored for each image as it is captured. The hash can be exported to content management systems so that when a user retrieves an image, the hash is recalculated against the retrieved image and compared with the stored hash value to validate that the image has not been tampered with.

WARNING!

When running a demo license, the application writes a watermark onto each

captured image. Therefore, non-repudiation is not supported in demo mode.
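The verification idea behind non-repudiation can be sketched in Python with the standard hashlib module. This shows the general hash-and-compare technique only; how PaperVision Capture stores or exports the hash is not shown here, and the function names are hypothetical.

```python
import hashlib
import hmac


def sha512_of(image_bytes):
    """Return the SHA-512 digest of the image content as a hex string."""
    return hashlib.sha512(image_bytes).hexdigest()


def image_is_unchanged(image_bytes, stored_hash):
    """Recalculate the hash of a retrieved image and compare it to the stored value."""
    return hmac.compare_digest(sha512_of(image_bytes), stored_hash)
```

Any change to the image bytes, including a watermark written onto the image, produces a different digest, which is consistent with the warning above about demo mode.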

Chapter 5 – Capture Step Configuration

The manual Capture job step contains scanning options so you can customize PaperVision Capture to the scanning needs of any task. You can also configure index values within the Capture step so operators can hand-key index values while they scan documents in the PaperVision Capture Operator Console. Auto Document Break settings allow you to automatically insert document breaks based on page count, file size, barcode content, and OCR text. Additionally, you can configure custom code events that the operator can manually execute while scanning.

Note:

You can have multiple Capture steps in the job, but at least one has to be assigned as

the start step.

To view the properties for the Capture job step:

1. In the Job Definitions screen, select the Capture job step in the workspace.

2. In the Properties grid, expand the Auto Document Break, Capture Step,

Custom Code Events (Step Level), General, and Indexes nodes.

Auto Document Break

While scanning documents, you can determine where one document ends and the next document begins using the Auto Document Break properties. You can separate documents manually, or you can have breaks inserted automatically by selecting one of the options described below.

• None: This is the default auto document break type for a newly created step. When set to

None, the system will expect you to manually separate new documents. No options are

available for this setting.

• Number of Pages Per Document: To assign a fixed number of pages per document,

enter the number of pages that PaperVision Capture will scan before starting a new

document. You can set the Prompt Operator property to True to display a message that

asks the operator for a fixed number of pages before breaking to a new document. If you

set this property to False, the operator is not prompted.

• Barcode: If you select the Barcode mode, click the ellipsis button to the right of the Barcode Zone field to define the zone. For the Save Page property, select True to leave the page with the barcode in the batch, or select False to remove the page with the barcode from the batch. See the section on Barcode Zones in Chapter 7 for more information.

• Blank Page: To automatically insert document breaks based on the file size of the image,

select Blank Page. Enter the size (in kilobytes) of images to be considered blank. You can

enter the file size in whole numbers with up to two decimal places. Select True to leave

the blank page in the batch, or select False to remove the blank page from the batch.

Note:

A job validation error will appear if both the Auto Document Break and Minimum

Page Size Detection properties are enabled.
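Two of the break modes described above, Number of Pages Per Document and Blank Page, can be modeled with a brief Python sketch. It rests on assumptions that the guide does not state: a page counts as blank when its file size is at or below the configured value, and a blank page that is kept becomes the first page of the new document. The function names and the (name, size) tuples are hypothetical.

```python
def break_by_page_count(pages, pages_per_document):
    """Start a new document after every fixed number of pages."""
    return [
        pages[start:start + pages_per_document]
        for start in range(0, len(pages), pages_per_document)
    ]


def break_on_blank_pages(pages, blank_size_kb, save_page=False):
    """Start a new document at each page small enough to be considered blank.

    pages: list of (page_name, size_kb) tuples (assumed layout)
    save_page: True keeps the blank page in the batch, False removes it
    """
    documents = []
    current = []
    for name, size_kb in pages:
        if size_kb <= blank_size_kb:          # assumed comparison
            if current:
                documents.append(current)
            current = [name] if save_page else []
        else:
            current.append(name)
    if current:
        documents.append(current)
    return documents
```

For example, a 1.2 KB separator sheet with a threshold of 2.0 KB splits the surrounding pages into two documents, and it is either dropped or kept at the start of the second document depending on the True/False selection.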

Capture Step Settings

Properties specific to the Capture step are described in this section, including those for page

rotation, image file type, page, and batch properties.

Auto Page Rotation

The Auto Page Rotation setting allows you to configure how pages are rotated as images are scanned.

To assign the page rotation settings:

1. In the Auto Page Rotation field, click the ellipsis button in the right column, which

opens the Auto Page Rotation dialog box.

Auto Page Rotation

2. Select the page rotation setting from the Apply Rotation To drop-down menu.

• None disables the automatic page rotation feature.

• All Pages automatically rotates all pages in a document by the specified rotation

value as the documents are scanned.

• Even Pages automatically rotates only the even numbered pages in a document

by the specified rotation value as the documents are scanned.

• Odd Pages automatically rotates only the odd numbered pages in a document by

the specified rotation value as the documents are scanned.

• Even Pages/Odd Pages automatically rotates the odd and even numbered pages

in a document by the specified rotation values as the documents are scanned.

Even pages and odd pages can be assigned different rotation values.

• First Page Only automatically rotates the first page of a document by the

specified rotation value as the documents are scanned.

• All Pages Except First automatically rotates all pages except the first page of a

document by the specified rotation value as the documents are scanned.

• First Page Only/All Pages Except First automatically rotates the first page of a

document by the specified rotation value as the documents are scanned. The

remaining pages can be assigned a different rotation value.

3. Select the rotation value (90°, 180°, or 270°) from the All Pages drop-down list.

4. Click OK.
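The Apply Rotation To options listed in step 2 amount to a lookup from page number to rotation value, sketched here in Python. The function and its parameter names are hypothetical; for the two combined options, the sketch assumes the first value applies to even pages (or the first page) and the second value to odd pages (or the remaining pages).

```python
def rotation_for_page(page_number, apply_to, value=0, second_value=0):
    """Return the rotation in degrees for a 1-based page number within a document."""
    is_even = page_number % 2 == 0
    is_first = page_number == 1
    if apply_to == "None":
        return 0
    if apply_to == "All Pages":
        return value
    if apply_to == "Even Pages":
        return value if is_even else 0
    if apply_to == "Odd Pages":
        return 0 if is_even else value
    if apply_to == "Even Pages/Odd Pages":
        return value if is_even else second_value
    if apply_to == "First Page Only":
        return value if is_first else 0
    if apply_to == "All Pages Except First":
        return 0 if is_first else value
    if apply_to == "First Page Only/All Pages Except First":
        return value if is_first else second_value
    raise ValueError("Unknown Apply Rotation To option: " + apply_to)
```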

Color Image File Type

You can specify the file type used to store scanned images that are not black and white. Click the Color Image File Type drop-down menu in the right column to make the selection. If you change this property after images have already been scanned into the batch, the file type changes only for images subsequently scanned into the batch. For example, suppose you change the Color Image File Type property from .bmp to .jpg after scanning ten out of twenty images in the batch. Images 1-10 will be .bmp files; images 11-20 will be .jpg files.

• BMP files are not compressed and can be large. These files are made up of pixels, so image quality can degrade when they are enlarged.

• JPG images are compressed, so they contain less data and have smaller file sizes than uncompressed image types.

Display Saved Images Only

If you select True, PaperVision Capture only displays the images that are saved (in the

manner that they are being saved). For example, if images are rotated as they are scanned,

only the correct rotation orientation will display. If you select True and you have specified a

minimum page size detection, blank pages will not display. If you select False, all images will

display, including blank images.

Max Number Documents Per Batch

You can limit the number of documents that comprise a batch. In the Max Number

Documents Per Batch field, enter the maximum number of documents that will comprise a

batch.

Minimum Page Size

Blank pages can be scanned accidentally or as the blank side of a duplex page. The Minimum Page Size Detection setting allows you to delete blank pages as they are scanned. In the Minimum Page Size field, enter the minimum page size (in kilobytes); scanned pages smaller than this size are treated as blank and deleted. You can enter the size in whole numbers with up to two decimal places.

Note:

Deleting blank pages as they are scanned could make the Number of Pages Per

Document Auto Document Break setting unusable.

New Batch Name (Regular Expression)

The New Batch Name property is a regular expression that you define to validate the batch name entered by the operator in the PaperVision Capture Operator Console.

To assign a regular expression to batch names:

1. Click the ellipsis button in the right column next to the New Batch Name field.

2. In the Regular Expression dialog box, enter the regular expression.

3. Enter the text to validate. Your entry will automatically be validated.

• A successful validation displays with a green icon.

• Invalid entries display with a red icon.
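The kind of check performed in the Regular Expression dialog box can be tried out with a small Python sketch. The pattern and batch names below are hypothetical examples, and the sketch assumes the whole batch name must match the expression. PaperVision Capture evaluates the expression with its own regular expression engine, so advanced syntax may behave differently from Python's re module.

```python
import re


def is_valid_batch_name(pattern, batch_name):
    """Return True when the entire batch name matches the regular expression."""
    return re.fullmatch(pattern, batch_name) is not None


# Hypothetical rule: "INV-", a four-digit year, a hyphen, then two uppercase letters.
EXAMPLE_PATTERN = r"INV-\d{4}-[A-Z]{2}"
```

Under this example rule, a name such as INV-2024-AB would show as valid, while inv-2024-ab or INV-24-AB would show as invalid.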

Prompt for New Batch Information (Auto)

If you enable this setting, the operator will be prompted for batch information once the

maximum number of documents per batch has been reached when a batch is imported or

scanned.

Rotate Before Barcode

If you enable this setting, the Auto Page Rotation setting is applied to the image before barcodes are read to populate index values.

Note:

This setting does not apply to the Auto Document Break setting; images are not

rotated before barcode document breaks are inserted.

Custom Code Events (Step Level)

You can configure custom code that operators can execute in the PaperVision Capture

Operator Console. Click the ellipsis button next to the appropriate event to select the

programming language and to configure the custom code.

Add Page

The Add Page event executes custom code just before images are appended to the batch,

including rotation or barcode indexing. When the script is enabled for this option, it will be

executed for all images that the operator scans in or when the operator imports a batch. This

script is not executed if the operator performs the Import Images operation.

Barcode Detected

The Barcode Detected event executes custom code after a barcode's value, location, size,

orientation, and type have been successfully read during scanning. When a script is enabled

for this option, it will be executed every time a barcode is successfully read during scanning

(multiple barcodes can be read per page). This event can also be used to apply a page-level

custom tag. The script is not executed if a barcode cannot be successfully read.

Batch Opened

Batch Opened executes custom code when the operator opens a batch in the Operator

Console. The following sample is a custom code event handler that can be inserted into the

code to display a message box, allowing the user to cancel the open batch operation:

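    // Cast the handler's Parameter object to the batch-opening event arguments.
    CCustomCodeBatchOpeningEventArgs eventArgs =
        (CCustomCodeBatchOpeningEventArgs)Parameter;

    // Ask for confirmation; cancel the open operation if the operator clicks Cancel.
    if (MessageBox.Show("Open Batch?", "Capture",
            MessageBoxButtons.OKCancel,
            MessageBoxIcon.Question) == DialogResult.Cancel)
    {
        eventArgs.CancelOpen = true;
    }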

Note:

The Batch Opened event will not execute if you have enabled the Max Number Documents Per Batch property and the user completes the Submit and Create New Batch operation.

Batch Submitted

Batch Submitted executes custom code when the operator submits a batch in the Operator

Console. The following sample is a custom code event handler that can be inserted into the

code to display a message box, allowing the operator to cancel the submit batch operation:

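    // Cast the handler's Parameter object to the batch-submitting event arguments.
    CCustomCodeBatchSubmittingEventArgs eventArgs =
        (CCustomCodeBatchSubmittingEventArgs)Parameter;

    // Ask for confirmation; cancel the submit operation if the operator clicks Cancel.
    if (MessageBox.Show("Submit Batch?", "Capture",
            MessageBoxButtons.OKCancel,
            MessageBoxIcon.Question) == DialogResult.Cancel)
    {
        eventArgs.CancelSubmit = true;
    }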

Custom Code Execution

The Custom Code Execution event executes when the operator clicks the Execute Custom

Code button in the PaperVision Capture Operator Console.

Match and Merge

The Match and Merge event executes when the operator clicks the Match and Merge button

in the PaperVision Capture Operator Console.

Saving Indexes

The Saving Indexes event executes prior to the operator saving the index values in the

PaperVision Capture Operator Console.

Tip:

To prevent the programming language prompt from appearing each time you

configure custom code events, right-click the ellipsis button, and select Custom

Code Options. Select either the C# or Visual Basic programming language to

use by default, and then choose the option to suppress the dialog when creating

new custom code.

General Properties

For information on the Capture step’s general properties that are applicable to all job steps,

see the section on General Properties in Chapter 4.

Indexes

You can configure index values in the Capture step if you enable the option, Allow Hand-Key

Indexing. For information on general Indexing settings and configuration, see Chapter 6 –

Indexing Configuration.

Allow Hand-Key Indexing

To maximize scanning and indexing efficiency within one step, you can enable this setting to

allow operators to enter index values while they scan documents in the Capture step. If you

enable this setting, you must define at least one index field.

Note:

Enabling this property will cause the Capture step to also consume a Capture Index

license (in addition to the Capture Scan license).

Manual Barcode and OCR Indexing

You can configure the Capture and Indexing steps so that indexing operators (or scanning operators tasked with indexing) can apply barcode or OCR zones directly on images in order to populate index fields. By manually applying barcode or OCR zones, operators can easily extract and index text or barcode data that may shift across pages and documents. When you enable the Allow Barcode Indexing property, a Capture Barcode license (1D or 2D, depending on the selected barcode type) is required in addition to the Capture Scan or Capture Indexing license. Similarly, when you enable the Allow OCR Indexing property, a Capture Nuance Zonal OCR, Nuance OCR Handwriting (depending on the selected Recognition Module), or Capture Open Text Zonal OCR license is required in addition to the Capture Scan or Capture Indexing license.

During configuration, you only need to draw one barcode or OCR zone to define the applicable properties. Operators are restricted to the properties you define for the zone, such as supported barcode types and OCR recognition languages, but they can apply any number of zones on an image. As with the automated barcode and OCR steps, you can test the zone to ensure its contents can be read successfully.

Configuring Manual Barcode Indexing

When you enable manual barcode indexing, the operator can apply barcode zones on an

image to populate required index values. During configuration, it is only required to draw one

barcode zone to define the applicable properties. Similar to the automated Barcode step, you

can test the zone to ensure barcodes can be read successfully prior to activating and checking

in the job.

To configure manual barcode indexing in the Capture or Indexing step:

1. Expand the Manual Barcode Indexing node in the Properties grid.

Manual Barcode Indexing Properties

2. Select True in the Allow Barcode Indexing drop-down list.

3. Click the ellipsis button in the Barcode Indexing field. The Configure Manual

Barcode Indexing screen appears.

Configure Manual Barcode Indexing

4. Draw the zone, and then configure the applicable barcode zone properties.

5. Click the Save Barcode Zones icon.

Note:

For descriptions of all barcode zone properties, see the section on Barcode

Zone Properties in Chapter 7. For descriptions of each operation in the

Configure Manual Barcode Indexing screen, see the section on Barcode

Explorer in Chapter 7.

Configuring Manual OCR Indexing

When you enable manual OCR indexing, the operator can apply OCR zones on an image to

populate required index values. During configuration, it is only required to draw one OCR

zone to define the applicable properties. Similar to the automated OCR step, you can test the

zone to ensure text can be read successfully prior to activating and checking in the job.

To configure manual OCR indexing in the Capture or Indexing step:

1. Expand the Manual OCR Indexing node in the Properties grid.

Manual OCR Indexing

2. Select the zonal OCR engine from the Engine drop-down list.

3. Click the ellipsis button in the OCR Indexing field. The Configure Manual OCR

Indexing screen appears. Properties specific to your engine selection will be available

for configuration.

Configure Manual OCR Indexing (Nuance Zonal OCR)

4. Draw the zone, and then configure the applicable OCR properties.

5. Click the Save OCR Zones icon.

Note:

For descriptions of all OCR page and zone properties, see the section on

OCR properties in Chapter 8. For descriptions of each operation in the

Configure Manual OCR Indexing screen, see the section on OCR Zones in

Chapter 8.

Manual QC

If you require Indexing operators to review and apply QC tags in the Indexing step, the

following Manual QC properties are available for configuration.

Allow Manual QC

You can enable this setting to allow operators to add your selected QC tags within the

Indexing job step.

Note:

When you enable this property, the Indexing step also consumes a Capture QC

Manual license (in addition to the Capture Index license).

Allow Review QC Tags

Applicable to manual job steps, this property allows the operator to view the Browse QC Tags

window in the PaperVision Capture Operator Console. Select True to allow the operator to

view the Browse QC Tags window. Select False to prevent the operator from viewing the

Browse QC Tags window.

Note:

The Capture QC Manual license is not required for the operator to review QC tags.

QC Auto Play

When the Allow Manual QC property is enabled in the Capture step, you can define how

long (in seconds) each image appears on screen so operators can perform visual inspections.

Click the ellipsis button next to the QC Auto Play field to configure the auto play settings.

QC Auto Play

• The Delay (sec) property determines how long each image or group of images remains

on screen at a time in the Manual QC step.

• The Skip Mode determines whether auto play skips batches or documents:

1. If you select the Batch skip mode, then you can define how pages are skipped. For

page skipping, you can require that operators inspect all pages (None), by page

number (Number, such as 1, 5, 10, etc.), or by a random number of pages

(Random).

2. If you select the Document skip mode, you can define how documents and pages

are skipped.

• For document skipping, you can require that operators inspect all documents

(None), by document number (Number, such as 1, 5, 10, etc.), or by a random

number of documents (Random).

• For page skipping, you can require that operators inspect all pages (None), by

page number (Number, such as 1, 5, 10, etc.), or by a random number of pages

(Random).

When you select the Random option, auto play skips an arbitrary number of pages or

documents (between zero and your assigned number). For example, if you enter “10,” then

three pages/documents may be skipped during the first auto play; nine pages/documents

during the second auto play; ten pages/documents during the third auto play; etc.
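The Random option can be pictured with a short Python sketch. It is a guess at the behavior based on the description above, not the product's algorithm: after each displayed page, a whole number between zero and the assigned number is drawn, and that many pages are skipped. The function name and the optional rng argument are hypothetical.

```python
import random


def auto_play_positions(total_pages, max_skip, rng=None):
    """Return the 1-based page numbers shown when Random skipping is modeled.

    max_skip: the assigned number; each skip is drawn from 0..max_skip inclusive
    rng: optional random.Random instance, useful for repeatable runs
    """
    rng = rng or random.Random()
    shown = []
    position = 0
    while position < total_pages:
        shown.append(position + 1)
        # Advance past the displayed page plus a random number of skipped pages.
        position += 1 + rng.randint(0, max_skip)
    return shown
```

With an assigned number of 0 nothing is skipped, which corresponds to inspecting every page; the same idea applies to documents in the Document skip mode.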

Operator Permissions

By default, operators can perform most document and page operations while scanning in the

Capture step. You can determine whether operators can import batches and images in the

Capture step. In addition, you can determine whether operators can view the Browse Batch

window in the Operator Console.

Browse Batch

When set to True, the operator can view the Browse Batch window.

Import Batch

When set to True, operators can import batches into the PaperVision Capture Operator

Console.

Import Images

When set to True, the operator can import images into a document.

Note:

When you enable this property, the Indexing step also consumes a Capture Scan

license (in addition to the Capture Index license).

Scanner Requirements

You can assign specific scanner requirements for a Capture step including color format,

minimum and maximum DPI, and scan type settings. As a result, your specified requirements

will be enforced in the Operator Console’s scanner settings and the operator will not be able

to edit these requirements.

Note:

Some settings may not be available for your scanner. If you select an unavailable

option, the property will become disabled and an error will be logged in the

Windows Event Viewer.

Color Format

You can select the scanner’s color format requirements, such as true color, grayscale, and

black and white.

To select the color format:

1. Click the ellipsis button next to the Color Format field. The Select Required Color

Format Options dialog box appears.

Select Required Color Format Options

2. Select the appropriate options from the list, and then click OK.

Vertical and Horizontal Resolution

You can assign the minimum and maximum vertical and horizontal resolution settings for the

scanner, such as 200 DPI, 1200 DPI, etc. As a result, the operator will not be able to assign a

value above or below your specified values.

Scan Type

You can select the scan type, such as duplex, back-only, front-only, and others. The available

scan types include the following:

• Transparency

• Flatbed

• Front-Only

• Duplex

• Back-Front

• Back-Only

Chapter 6 – Indexing Configuration

The Indexing job step allows you to customize PaperVision Capture to the indexing needs of any task. Configuration properties for the Indexing job step, such as predefined index values, auto-carry/auto-increment, and detail sets, are designed to enhance productivity in the PaperVision Capture Operator Console. Additional properties, such as blind index verification, regular expressions, and re-key verification, can be configured to monitor and verify operator indexing entries. Index zones configured in the Indexing job step define areas on the image that are zoomed into view when operators hand-key index values. When you configure individual indexes, four categories of settings are available: Custom Code Events (Step Level), General (Job Level), General (Step Level), and Predefined Index Values (Job Level).

To view the properties for the Indexing job step:

1. In the Job Definitions screen, select the Indexing job step in the workspace.

2. In the Properties grid, expand the Custom Code Events (Step Level), General,

and Indexes nodes.

Custom Code Events (Step Level)

You can configure custom code that operators can execute in the PaperVision Capture

Operator Console. Click the ellipsis button next to the appropriate event to select the

programming language and to configure the custom code. For more information on

configuring custom code, see Chapter 13 - Custom Code.

Add Page

Add Page executes custom code just before images are appended to the batch, including

rotation or barcode indexing. When the script is enabled for this option, it will be executed for

all images that the operator scans in or when the operator imports a batch. This script is not

executed if the operator performs the Import Images operation.

Batch Opened

Batch Opened executes custom code when the operator opens a batch in the Operator

Console. The following sample is a custom code event handler that can be inserted into the

code to display a message box, allowing the user to cancel the open batch operation:

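    // Cast the handler's Parameter object to the batch-opening event arguments.
    CCustomCodeBatchOpeningEventArgs eventArgs =
        (CCustomCodeBatchOpeningEventArgs)Parameter;

    // Ask for confirmation; cancel the open operation if the operator clicks Cancel.
    if (MessageBox.Show("Open Batch?", "Capture",
            MessageBoxButtons.OKCancel,
            MessageBoxIcon.Question) == DialogResult.Cancel)
    {
        eventArgs.CancelOpen = true;
    }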

Note:

The Batch Opened event will not execute if you have enabled the Max Number Documents Per Batch property and the user completes the Submit and Create New Batch operation.

Batch Submitted

Batch Submitted executes custom code when the operator submits a batch in the Operator

Console. The following sample is a custom code event handler that can be inserted into the

code to display a message box, allowing the operator to cancel the submit batch operation:

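    // Cast the handler's Parameter object to the batch-submitting event arguments.
    CCustomCodeBatchSubmittingEventArgs eventArgs =
        (CCustomCodeBatchSubmittingEventArgs)Parameter;

    // Ask for confirmation; cancel the submit operation if the operator clicks Cancel.
    if (MessageBox.Show("Submit Batch?", "Capture",
            MessageBoxButtons.OKCancel,
            MessageBoxIcon.Question) == DialogResult.Cancel)
    {
        eventArgs.CancelSubmit = true;
    }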

Custom Code Execution

Custom Code Execution executes when the operator clicks the Execute Custom Code button

in the PaperVision Capture Operator Console.

Match and Merge

Match and Merge executes when the operator clicks the Match and Merge button in the

PaperVision Capture Operator Console.

Saving Indexes

Saving Indexes executes prior to the operator saving the index values in the PaperVision

Capture Operator Console.

General Properties

For information on the Indexing step’s general properties that are applicable to all job steps,

see the section on General Properties in Chapter 4. If Indexing operators are required to

apply QC tags to index fields, the following QC properties are available for configuration.

Indexes

Four groups of properties can be configured for each index value, including Custom Code

Events (Step Level), General (Job Level), General (Step Level), and Predefined Index Values

(Job Level). In the Properties grid, click the ellipsis button in the right column of the Indexes

field, and the Index Configuration dialog box appears.

Index Configuration

Adding, Removing, and Sorting Indexes

You can add an individual or existing index, all indexes (including or excluding those defined

in detail fields), or a job detail set.

To add an index:

1. Click Add, and the Add Index dialog box appears.

Add Index

2. To add a new index, select New Index, and then enter the field name. Proceed to step 5.

3. To add an existing index, select Existing Index. From the drop-down list, you can select an individual index or all indexes (including or excluding those defined in detail fields). Proceed to step 5.

4. To add a new detail set for the job, select Job Detail Set. You can then create and configure each individual index comprising the detail set. For more information, see the section on Configuring Detail Sets.

5. Click OK. The Index Configuration dialog box will display your new index along

with its associated properties that you can configure.

To remove an existing index:

1. Highlight the appropriate index in the Indexes list.

2. Click Remove.

To sort indexes:

To move an index up or down the list, click the up or down arrow to the right of the

list of indexes.

Custom Code Events (Step Level)

In the Properties grid for the Indexing job step, the Index Populated and Index Validate events let you write Visual Basic or C# code for an action that is triggered immediately after an index field is populated (and the operator returns to re-enter the index value) or validated by the system. The Index Validate event is triggered after the operator returns to edit an index value, re-enters the index value, and then proceeds to a subsequent index field (or saves the edited index value).

To configure the code:

1. Click the ellipsis button in the right column of the Index Populated or Index

Validate field.

2. Select either Visual Basic or C# programming language, and the Script Editor opens.

See the section on the Script Editor for more information.

Tip:

To prevent the programming language prompt from appearing each time you

configure custom code events, right-click the ellipsis button, and select Custom

Code Options. Select either the C# or Visual Basic programming language to

use by default, and then choose the option to suppress the dialog when creating

new custom code.

General (Job Level)

These settings allow you to configure auto-carry and auto-increment values, index types, and

regular expressions. To view these settings, expand the General (Job Level) node within the

Index Configuration dialog box.

Auto-Carry/Auto-Increment

The Auto-Carry and Auto-Increment settings can greatly increase operator productivity while hand-keying repetitive or incremental values or characters. Both tools operate during scanning (optional) and hand-keying. To configure these settings, click the ellipsis button in the Auto-Carry/Auto-Increment field.

Note:

Auto-Carry settings only apply when the operator saves index values in the Operator

Console.

Auto-Carry/Auto-Increment

Auto-Carry Entire Index Value

This setting allows you to carry all characters from an index in one document to the

corresponding index in the next document. You can then enable Overwrite Existing

Values and/or Carry Values to Copied Document.

Auto-Carry Characters Preceding Number

This setting allows you to define the number of characters that precede a number. Your

specified number of characters will carry from an index in one document to the

corresponding index in the next document. For example, if you have an index that is

always (or nearly always) the letters ABC followed by a number, you may not want to

continuously re-enter ABC on each index value. You could set the number of characters to

carry to 3. When the operator is keying the information, ABC would automatically get

carried forward to the next document and they would only have to enter the numeric

portion of the index.

Auto-Carry Characters Following Number

This setting allows you to define the number of characters that follow a number. Your

specified number of characters will carry from an index in one document to the

corresponding index in the next document. For example, if you have an index that is

always (or nearly always) a number followed by the letters ABC, you may not want to

continuously re-enter ABC on each index value. You could set the number of characters to

carry to 3. When the operator is keying the information, ABC would automatically get

carried forward to the next document and they would only have to enter the numeric

portion of the index.

Auto-Increment Number

Auto-Increment takes Auto-Carry one step further. For example, if the numeric portion of

the value was an incremental numeric value, you could set Auto-Carry to 3 and Auto-

Increment to 1. This would increment the numeric value of any characters remaining after

the first three characters by a value of one. The Auto-Increment Number can also be used

without Auto-Carry if the value is completely numeric. The value entered in the Minimum

Number Digits field allows you to pad the new value with zeros. The Preview section

shows you how the carried value will appear.
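The combined effect of Auto-Carry, Auto-Increment, and Minimum Number Digits can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not PaperVision Capture code: the function name `carry_and_increment` and its parameters are hypothetical, and the handling of a non-numeric remainder (carrying the prefix only) is an assumption.

```python
def carry_and_increment(previous, carry_chars=3, increment=1, min_digits=0):
    """Build the next index value from the previous document's value.

    Keeps the first `carry_chars` characters, adds `increment` to the
    numeric remainder, and left-pads the result with zeros to `min_digits`.
    """
    prefix, remainder = previous[:carry_chars], previous[carry_chars:]
    if not remainder.isdigit():
        # Assumption: with nothing numeric to increment, only the prefix carries.
        return prefix
    return prefix + str(int(remainder) + increment).zfill(min_digits)


print(carry_and_increment("ABC0041", carry_chars=3, increment=1, min_digits=4))  # ABC0042
print(carry_and_increment("99", carry_chars=0))  # 100
```

Setting `carry_chars` to 0 corresponds to using Auto-Increment without Auto-Carry on a completely numeric value.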

Overwrite Existing Values

By default, Auto-Carry and Auto-Increment do not fill in an index value if there is already

information in the index. Selecting this check box will force Auto-Carry and Auto-

Increment to update the index regardless of whether information previously existed.

Carry Values to Copied Document

By default, when documents are copied, no index values are carried through to the copies.

This setting allows you to specify that the current index should also be copied, leaving the other

indices blank.

Auto-Fill Cursor Location

If you enable this setting, operators are allowed to append to an existing index value. The

setting places the cursor's focus at the end of the original index value so the original value

is retained.

Note:

This determines whether data will be highlighted or the cursor will be placed at the

end of the data when hand-keying an index that has the Auto-Carry or Auto-Fill

option selected.

Preview

This section displays the original value and displays a preview of the carried value.

Index Masking Regular Expression

The Index Masking Regular Expression property allows you to predefine a specific format for

index values entered during hand-key indexing. As operators enter index values, their entries

will be formatted (masked) automatically. For example, you can predefine social security

numbers to automatically insert dashes; as a result, operators only have to hand-key the 9-

digit social security numbers and not the dashes.

Tip:

Configuring this property does not validate the operator’s index value entries.

Validation is performed as operators enter index values in the Operator Console’s

Index Manager.

To configure index masking:

1. In the Index Configuration dialog box, expand the General (Job Level) node for the

appropriate index value.

2. Click the ellipsis button next to the Index Masking Regular Expression property,

and the Regular Expression Mask dialog box appears.

Regular Expression Mask - 5 + 4-Digit Zip Code

3. If you select a Predefined Value, select from the Masking drop-down list, and then

proceed to step 6.

4. If you select a Custom mask, enter the Pattern Expression. The Pattern Expression is

a regular expression that you define for the index mask. For example, for 5 + 4 digit

zip codes such as 80111-2841, type the following:

(\d{5})(\d{4})

5. If necessary, define a Replace Expression that will automatically format the

operator’s entry. To format an operator’s 9-digit entry to appear as 80111-2841, type

the following:

$1-$2

Note:

If you do not define a Replace Expression, the operator’s entry will not be

formatted.

6. To preview how masking formats the number, enter a sample index value that an

operator would hand-key in the Input Text field. The resulting masked index value

appears in the Mask Result field.

7. Click OK.

Note:

Only the Text, Long Text, and Text (900) index types apply to the Index Masking

Regular Expression property.
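To experiment with a pattern and replace expression outside the dialog box, the masking step can be approximated with Python's `re` module. This is a sketch under assumptions: the helper name `apply_mask` is hypothetical, and Python writes group references as `\1` and `\2` where the Replace Expression field uses `$1` and `$2`.

```python
import re


def apply_mask(entry, pattern, replacement):
    """Return the masked entry; an entry the pattern does not match is returned unchanged."""
    return re.sub(pattern, replacement, entry)


# 5 + 4 digit zip code: the operator keys nine digits, the mask inserts the dash.
print(apply_mask("801112841", r"(\d{5})(\d{4})", r"\1-\2"))  # 80111-2841
```

As in the Input Text and Mask Result preview, an entry that does not fit the pattern (for example, only five digits) comes back unformatted.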

Date Regular Expression Mask

The following pattern expression formats either a one- or two-digit month and day followed

by a two- or four-digit year:

(^\d{1,2})(\d{1,2})(\d{2,4}$)

The following replace expression separates the month, day, and year with a dash:

$1-$2-$3

To separate the month, day, and year with a slash mark, you can enter:

$1/$2/$3

Two-Digit Month and Day with Four-Digit Year

The same pattern expression formats a one-digit month and day followed by a two-digit year:

One-Digit Month/Day and Two-Digit Year
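Because each group in the date pattern accepts a range of digit counts, it can help to see how different entries are split. The following Python sketch applies the same pattern expression with Python's `re` module (which uses `\1` where the Replace Expression uses `$1`). The function name `mask_date` is hypothetical, and the results shown reflect Python's regular expression engine, which is assumed here to split entries the same way.

```python
import re

# Same pattern expression as above: month, day, then a two- to four-digit year.
DATE_PATTERN = r"(^\d{1,2})(\d{1,2})(\d{2,4}$)"


def mask_date(entry, separator="-"):
    """Insert the separator between the month, day, and year groups."""
    replacement = separator.join([r"\1", r"\2", r"\3"])
    return re.sub(DATE_PATTERN, replacement, entry)


print(mask_date("01152024"))       # 01-15-2024
print(mask_date("01152024", "/"))  # 01/15/2024
print(mask_date("1599"))           # 1-5-99
print(mask_date("11524"))          # 11-5-24
```

The last entry is ambiguous (it could mean 1/15/24 or 11/5/24). The pattern gives the month as many digits as it can, so operators keying single-digit months may need to enter a leading zero.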

Credit Card Regular Expression Mask

The following pattern expression formats a 16-digit credit card number:

(\d{4})(\d{4})(\d{4})(\d{4}$)

Enter the following replace expression to separate the digits with a dash:

$1-$2-$3-$4

16-Digit Credit Card Number
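A quick way to check a credit card mask is to run it against a sample number. The Python sketch below is illustrative only: `mask_card` is a hypothetical helper, and Python's `\1` stands in for `$1` in the Replace Expression. Note that the end-of-string anchor `$` has to follow the last group; placed inside an earlier group, it would prevent any 16-digit entry from matching.

```python
import re

# Four groups of four digits, anchored at the end of the entry.
CARD_PATTERN = r"(\d{4})(\d{4})(\d{4})(\d{4}$)"


def mask_card(entry):
    """Separate a 16-digit entry into four dash-delimited groups."""
    return re.sub(CARD_PATTERN, r"\1-\2-\3-\4", entry)


print(mask_card("4111111111111111"))  # 4111-1111-1111-1111

# With the anchor inside the third group, nothing can follow it, so no match is found.
print(re.search(r"(\d{4})(\d{4})(\d{4}$)(\d{4})", "4111111111111111"))  # None
```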

Index Formats and Types

Document indices contain values that enable you to identify key elements of documents

within a project during the capture process.

PaperVision Capture supports the following types of indices:

• Boolean stores Boolean values such as yes/no, on/off, and true/false.

• Currency stores currency (monetary) values.

• Date stores date/time values ranging from 12:00:00 midnight, January 1, 0001

through 11:59:59 P.M., December 31, 9999 A.D. This index type also supports

searches on date ranges.

• Double Number represents a double-precision 64-bit number with values ranging

from -1.79769E+308 to 1.79769E+308.

• Long Text stores textual data that exceeds 255 characters in length (up to

approximately 64,000 characters in total).

• Number stores whole-number values between -2,147,483,648 and 2,147,483,647.

This index type supports hyphens or dashes at the beginning of the number to indicate

a negative value, but it does not support hyphens or dashes within the number, such as

dashes within a social security number (555-55-5555). This index excludes these

dashes from the number.

• Text stores textual data up to 255 characters in length. This type of index is the most

common.

• Text(900) stores textual data up to 900 characters in length.
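The limits listed above can be checked before values are loaded into a job, for example when preparing test data. The Python sketch below is a rough illustration, not the product's validation logic: the names `TEXT_LIMITS`, `fits_text_index`, and `to_number_index` are hypothetical, and dropping interior hyphens follows the description of the Number index type above.

```python
INT32_MIN, INT32_MAX = -2_147_483_648, 2_147_483_647
TEXT_LIMITS = {"Text": 255, "Text(900)": 900}


def fits_text_index(value, index_type):
    """True if the value is within the character limit of the given text index type."""
    return len(value) <= TEXT_LIMITS[index_type]


def to_number_index(text):
    """Convert keyed text to a Number index value.

    A leading hyphen marks a negative value; hyphens inside the number are dropped.
    """
    negative = text.startswith("-")
    digits = text.lstrip("-").replace("-", "")
    if not digits.isdigit():
        raise ValueError(f"not a whole number: {text!r}")
    number = -int(digits) if negative else int(digits)
    if not INT32_MIN <= number <= INT32_MAX:
        raise ValueError(f"out of range for a Number index: {number}")
    return number


print(to_number_index("555-55-5555"))  # 555555555
print(to_number_index("-42"))          # -42
```

A value such as 3000000000 raises an error in this sketch because it exceeds the upper limit of the Number index type; a Double Number or Text index would be the alternative.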

Formatting the Date and Time

When you select a date index type, you can select from a predefined date/time format or you

can customize a date/time format.

To define the date/time format:

1. Click the ellipsis button in the right column of the Index Format field, which opens

the Date/Time Formatting dialog box.

Date/Time Formatting

2. Select either a Predefined Format (proceed to the next step) or a Custom Format

(proceed to the fifth step).

3. If you select a Predefined Format, select from the following Date/Time Order

options:

• Date Only

• Time Only

• Date/Time

• Time/Date

4. Depending on your Date/Time Order selection, you can choose from the Date/Time

Format drop-down menus.

5. If you select a Custom Format, enter the format in the blank field.

Note:

Some custom formats may not be supported in PaperVision Enterprise. Custom

formats could be assigned when using Custom Code to export to another format.

6. To preview a Predefined or Custom format, click the Format button in the Preview

section.

7. If you need to preview a calendar, click the Date drop-down menu.

8. If you need to set the time, enter it in the Time field. Or, use the up or down arrows to

set the time.

9. Click OK.

Double Number Formatting

When you select a Double Number index type, you can select a predefined or custom format.

To define the double number format:

1. Click the ellipsis button in the right column of the Index Format field, which opens

the Field Formatting dialog box.

Field Formatting

2. Select either a Predefined Format (proceed to the next step) or a Custom Format

(proceed to the fourth step).

3. If you select a Predefined Format, select from the following format types:

• Currency

• Fixed

• General

• Percent

• Scientific

• Standard

4. If you select a Custom Format, enter the format in the blank field.

Note:

Some custom formats may not be supported in PaperVision Enterprise.

5. Click OK.
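As a rough guide to what several of the predefined format types typically produce, the Python sketch below formats the same value four ways. This is an assumption-heavy illustration: the mapping of Fixed, Standard, Percent, and Scientific to two-decimal Python format strings is a guess at conventional behavior, the function name `format_double` is hypothetical, and actual output in PaperVision Capture may differ (for example, with regional settings). Currency and General are omitted because they depend on locale and value.

```python
def format_double(value, style):
    """Format a number in one of four illustrative styles (assumed, not product-defined)."""
    formats = {
        "Fixed": "{:.2f}",        # two decimal places, no grouping
        "Standard": "{:,.2f}",    # thousands separator, two decimal places
        "Percent": "{:.2%}",      # multiplied by 100 with a percent sign
        "Scientific": "{:.2E}",   # exponent notation
    }
    return formats[style].format(value)


print(format_double(1234.5, "Fixed"))       # 1234.50
print(format_double(1234.5, "Standard"))    # 1,234.50
print(format_double(0.256, "Percent"))      # 25.60%
print(format_double(1234.5, "Scientific"))  # 1.23E+03
```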

Index Verification Regular Expression

You can create a regular expression to validate operator data entry. A regular expression is a

pattern of text that consists of ordinary characters (for example, letters A through Z) and

special characters, known as metacharacters. The pattern describes one or more strings to

match when searching a body of text. The regular expression serves as a template for

matching a character pattern to the string being searched.
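For example, a verification expression for a dashed social security number could be tested as follows. This Python sketch is illustrative: `is_valid_entry` is a hypothetical helper, and it assumes the whole entry must match the expression, which may not be how the Operator Console applies it.

```python
import re


def is_valid_entry(entry, verification_pattern):
    """True if the entire entry matches the verification expression."""
    return re.fullmatch(verification_pattern, entry) is not None


SSN_PATTERN = r"\d{3}-\d{2}-\d{4}"  # ordinary characters (-) plus metacharacters (\d, {n})

print(is_valid_entry("555-55-5555", SSN_PATTERN))  # True
print(is_valid_entry("555555555", SSN_PATTERN))    # False
```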

Name

This editable field contains the name of the index value.

General (Step Level)

The General (Step Level) settings for each index value enable you to configure settings for

operators who will index documents within the PaperVision Capture Operator Console.

Blind Index Verification

This setting ensures the index entry of the first operator matches the second entry (or your

specified number of subsequent index entries). If you enable this setting, configure at least

two Indexing job steps.

For example, you assign the following for index field SSN:

1. For the first Indexing step, you select False.

2. Assign True for the second Indexing step.

3. Assign User 1 to the first Indexing step.

4. Assign User 2 to the second Indexing step.

5. User 1 enters 1 in the field and submits the batch.

6. User 2 enters 2 in the field, which differs from the first entry.

• Since Blind Index Verification has been enabled for the second Indexing step, the

original index value for this field is not visible for User 2.

• An error message notifies User 2 that the index values do not match.

Note:

Blind index verification is not an option available with detail fields.

Font Color/Customization

You can customize the font characteristics to modify how each index value and label displays

in the Operator Console. You can also change the cell color for each index value to emphasize

certain index values and assist operators who are visually challenged.

To customize the font and cell color:

1. Expand the Font Color/Customization node.

2. By default, each background cell color is white. To select another color, click the

Background Color drop-down list.

3. To change the label font for the index value, expand the Label node.

4. Click the ellipsis button next to the Label property. The Font dialog box appears.

Note:

You can also configure the individual properties directly in the Index

Configuration dialog box.

Font

The following font properties can be configured in the Font dialog box or in the Index

Configuration dialog box:

• Font or Name: This property indicates the name of the font, such as Microsoft

Sans Serif (default), Arial, Times New Roman, etc.

• Font Style: The font style defaults to Regular, but you can select from Italic, Bold,

or Bold Italic.

• Size: The font size defaults to 8 point, but you can select a larger font size.

• Effects: To emphasize the font, you can enable the Strikeout and/or the Underline

effect.

• Unit: This is the unit of measurement for the font size, which defaults to Point.

Not all units are available for all fonts.

• Bold: This property is false by default and indicates whether boldface type has

been applied to the font.

• Script: Western script is selected by default, but you can select other scripts such

as Arabic, Baltic, Greek, Vietnamese, etc.

• GDICharSet: Depending on the selected font, this byte value specifies the GDI

character set that the font uses.

• GDIVerticalfont: This property indicates whether the selected font originates

from a GDI vertical font.

• Italic: This property is false by default and indicates whether the font is italic.

• Strikeout: This property is false by default and indicates whether the font displays

with a horizontal line running through it.

• Underline: This property is false by default and indicates whether the font is

underlined.

Note:

For more information on Microsoft's Graphics Device Interface (GDI), see the

Microsoft Software Developer's Network:

http://msdn.microsoft.com/en-us/default.aspx

5. To change the font appearance of the operator’s index value entry, expand the Value

Font node. See the previous step for descriptions of each customizable property.

6. After you have finished configuring the font characteristics, click OK.

Hot Key Default Value

When operators press the assigned hot key while keying index fields, the specified default

value populates the index field.

Ignore Indexing Errors

If this setting is True, incorrect operator input will be ignored and no prompt will appear for

the operator. If this setting is False, the operator will be notified of an incorrect indexing

entry.

No Hand Key Indexing

If this setting is True, the operator will not be allowed to enter index values. If this setting is

False, the operator will be allowed to enter index values.

Re-Key Verification Count

To ensure indexing accuracy, this value forces the operator to enter the index value a

specified number of times, which can range from 0 to 99.

Valid Field Required

If this setting is True, the operator will be required to enter a valid index value for the field

type, such as a date-formatted value for a date field. If this setting is False, the operator will

be allowed to continue and keep the invalid value.

Verification Search Strings

The Verification Search Strings setting is used to validate index values when the operator

saves index values, tabs to the next field, submits the batch, or executes the Verify Index

Values operation. To ensure the accuracy of hand-key indexing, you can define multiple

search strings that can be verified when the operator executes the Verify Index Values

command. For example, you can assign individual characters or numbers to search for during

the index verification process. By default, the verification process will highlight the first

document in the batch that contains a blank value. However, you can exclude blank values

from the index verification process by removing <Blank> from the list of search strings.

Depending on the operator’s index verification settings in Tools > Options > Display

Preferences (Verify Starts from Current Document Forward or Verify Starts at the Beginning

of the Batch), the index verification process starts with the appropriate document in the batch

and will highlight the next document that contains your defined search strings.
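The search behavior described above can be pictured with a short Python sketch. It is a simplified model, not the Operator Console's implementation: the function name `next_flagged_document` is hypothetical, index values are represented as a plain list with one value per document, and a search string is assumed to match when it appears anywhere in the value.

```python
BLANK = "<Blank>"


def next_flagged_document(values, search_strings, start=0):
    """Return the position of the next document whose index value is flagged, or None.

    `start` models the operator's preference: 0 to verify from the beginning
    of the batch, or the current document's position to verify from there forward.
    """
    for position in range(start, len(values)):
        value = values[position]
        if value == "" and BLANK in search_strings:
            return position
        if any(s != BLANK and s in value for s in search_strings):
            return position
    return None


batch = ["A100", "", "B?00", "C300"]
print(next_flagged_document(batch, [BLANK, "?"]))           # 1 (blank value)
print(next_flagged_document(batch, ["?"]))                  # 2 (<Blank> removed from the list)
print(next_flagged_document(batch, [BLANK, "?"], start=3))  # None
```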

To assign verification search strings:

1. For the appropriate index, click the ellipsis button to the right of the Verification

Search Strings field.

2. In the Verification Search Strings dialog box, enter a search string in the first row.

3. Enter any subsequent search strings, if necessary.

4. To remove a search string, highlight the string, and then click the Remove icon.

Zoom Zone

This setting allows you to assign an area of the image that will be zoomed into view when

operators hand-key this index field.

If the Automatic Page Location setting is enabled, you can specify the page of the document

that is displayed when index values are entered, which is useful if index values are located on

different pages of the document. This value has to be greater than zero. If you enter a page

index value greater than the number of pages in the document, the last page will display. For

details on index zone configuration, see the next section.
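The page selection rule for Automatic Page Location amounts to a simple clamp, sketched here in Python. The function name `page_to_display` is hypothetical and pages are assumed to be numbered from 1.

```python
def page_to_display(configured_page, page_count):
    """Return the page shown for the index field, given the configured page value."""
    if configured_page < 1:
        raise ValueError("the page value has to be greater than zero")
    # A value beyond the end of the document falls back to the last page.
    return min(configured_page, page_count)


print(page_to_display(2, 5))  # 2
print(page_to_display(9, 5))  # 5
```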

Index Zone

Index Zones

Index zones help you define areas on the image that will be zoomed into view when operators

hand-key index values.

To draw an index zone:

1. In the Index Zone dialog box, click the Draw Zone button, and the Select Index

Zone screen opens.

Select Index Zone

The Select Index Zone commands are listed below:

Select Index Zone Commands

Scanner Setup: Allows you to set up the scanner's settings

Scan Image: Allows you to scan an image into the Select Index Zone screen

Open Image: Enables you to select a test image from disk that will open in the window

Reset Image: Reverts to the original view of the image

Rotate Image: Rotates the image 90 degrees clockwise

Zoom In: Zooms in the view of the image

Zoom Out: Zooms out the view of the image

Zoom In Region: Zooms in on the boundary of your specified region

Move, Zoom, or Region: Equips the left mouse button with the Zoom, Move, or Region command

• Zoom enlarges a specified area

• Move pans around a zoomed area

• Region defines a boundary to process

2. To scan a sample image, click the Scan Image icon. For more information on

scanner settings, see the section on Scanner Setup Settings in this chapter.

3. To open an existing image, click the Open icon.

4. In the toolbar, select the Region drop-down list.

5. Click the left mouse button and drag the cursor around the region.

6. If necessary, widen or narrow the boundaries of the index zone.

7. When you are finished configuring the index zone, click OK.

8. Click OK in the Index Zone dialog box.

Predefined Index Values (Job Level)

These settings allow you to predefine index field values at the job level. You can predefine

these values for the job as you configure the index field or you can allow operators' entries to

be added to the predefined values list. Your specified predefined values are used for the Auto-

Complete feature that finishes information as the operator types.

Add New Values

If this setting is True, all new operator-entered values can be added to the Predefined Values

list.

Auto-Complete

If this setting is True, the index field will automatically be completed as the operator types.

Force Predefined Values

If this setting is True, the operator can only select from your predefined index values. If the

entered data is not one of the predefined values, the operator will be alerted. If this setting is

False, the operator will be allowed to enter a value in the index field.
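How the three settings interact can be sketched in Python as follows. This is a simplified model with hypothetical names (`auto_complete`, `accept_entry`); the case-insensitive prefix match and the choice of the first matching value are assumptions, not documented behavior.

```python
def auto_complete(typed, predefined):
    """Return the first predefined value that starts with the typed text, or None."""
    lowered = typed.lower()
    for value in predefined:
        if typed and value.lower().startswith(lowered):
            return value
    return None


def accept_entry(entry, predefined, force_predefined=False, add_new_values=False):
    """Return True if the entry is accepted; optionally add it to the predefined list."""
    if entry in predefined:
        return True
    if force_predefined:
        return False  # the operator would be alerted
    if add_new_values:
        predefined.append(entry)
    return True


values = ["Invoice", "Insurance Claim", "Receipt"]
print(auto_complete("ins", values))                          # Insurance Claim
print(accept_entry("Memo", values, force_predefined=True))   # False
print(accept_entry("Memo", values, add_new_values=True))     # True
print(values)  # ['Invoice', 'Insurance Claim', 'Receipt', 'Memo']
```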

Predefined Values

In addition to adding predefined index values, you can also import and export the index

values as text (.txt) files for each index field.

To assign predefined values:

1. Click the ellipsis button in this field to assign predefined index values to the list, and

the Predefined Values dialog box appears.

Predefined Values

2. Enter the values directly in the grid.

3. When you are finished entering all values, click OK.

To import a list of predefined index values:

1. To import an index value, click the Import icon.

2. Select the text document to import.

3. Click Open. A text file is imported that contains any predefined values; each line of

the text file is imported as a separate value.

To export a list of predefined values:

1. Click the Export icon.

2. Enter the name of the text file.

3. Click Save. A text file is exported that contains all predefined values; each line of the

text file is exported as a separate value.
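Because the import and export files hold one value per line, they are easy to generate or inspect with a script. The Python sketch below shows the layout under assumptions: the helper names are hypothetical, UTF-8 encoding is assumed, and skipping blank lines on import is a choice made for this example rather than documented behavior.

```python
from pathlib import Path


def export_values(values, path):
    """Write each predefined value on its own line."""
    Path(path).write_text("\n".join(values) + "\n", encoding="utf-8")


def import_values(path):
    """Read a text file and return one value per non-blank line."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line for line in lines if line.strip()]
```

A file prepared this way, for example from a spreadsheet column of customer names, can then be selected with the Import icon.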

To delete a value:

1. Highlight the value.

2. Click the Delete icon.

3. Click OK.

Scanner Setup Settings

In the PaperVision Capture Administration Console, you can test and save scanner settings

during index, barcode, and OCR zone configuration. Black and white images are saved in an

industry standard Group IV TIFF file format, while color or grayscale images are saved in a

standard JPG or BMP file format. Settings in the Scanner Settings dialog box can be

accessed during index, barcode, and OCR zone configuration.

PaperVision Capture supports more than 300 ISIS-compatible scanners. The PaperVision

Capture installation media contains most of the currently available ISIS scanner drivers.

However, as this list is ever-growing, some newer drivers may not be available at the time of

distribution. If you need additional drivers, please contact Digitech Systems’ Technical

Support at support@digitechsystems.com or by phone at (877)374-3569. If the driver is

available, our support personnel will assist you in obtaining the driver.

PaperVision Capture also offers the ability to use TWAIN scanners. The use of TWAIN

scanners is generally intended for extremely low-volume scanners as ISIS drivers are

available for most scanners on the market.

Scanner Settings

Note:

Depending on the type of scanner that is used, some scanner options may be

disabled, and the number of options available in the drop-down menus may vary.

Saved Settings

This drop-down menu displays any scanner settings that were previously saved.

To save a new scanner setting:

1. Enter the name in the Saved Settings field.

2. Click Apply.

To remove a setting:

1. Select the setting from the Saved Settings drop-down list.

2. Click Delete.

Scanner Name

Click the Scanner Name drop-down menu to select a scanner that has been installed and

detected by PaperVision Capture. Select the Properties menu to configure scanner and file

import devices. Depending on the type of scanner, the menu options will display different

settings.

The Properties menu contains the following options:

• More Settings may contain additional scanner settings that are available for

configuration.

• About displays the driver's version, copyright, and other information specific to the

scanner.

• Area Settings allow you to assign the scanning area.

• Extended Settings may contain additional scanner settings that are available for

configuration.

• Windows Image Acquisition may contain additional settings if your scanner

supports Windows Image Acquisition.

• Calibrate allows you to calibrate the scanner driver.

• Configure allows you to configure the scanner driver settings.

Color Format

This setting, also known as the mode, lets you select from options such as black and white, color, etc.

Dither

Dithering converts and simulates unavailable colors. When dithering is turned on, the system

combines two or more colors to approximate the unavailable color.

Horizontal Resolution

Select the horizontal dots-per-inch resolution setting to apply during the scanning process.

Vertical Resolution

Select the vertical dots-per-inch resolution setting to apply during the scanning process.

Page Size

This setting determines the default page size of the image as it is scanned.

Scan Type

This setting determines if scanning should be two-sided (duplex), one-sided (simplex), etc.

Brightness

Brightness defines a pixel's lightness value from black (darkest) to white (brightest). Select

the brightness level to be applied during the scanning process and whether it should be

applied manually or automatically. If applying the brightness manually, use the slider to

increase or decrease its amount.

Contrast

Contrast is a measure of the rate of change of brightness in an image. A high-contrast image

contains defined transitions from black to white. Select the contrast level to be applied during

the scanning process and whether it should be applied manually or automatically. If applying

the contrast manually, use the slider to increase or decrease its amount.

Manual Barcode and OCR Indexing

You can configure the Capture and Indexing steps so that indexing operators (or scanning

operators tasked with indexing) can apply barcode or OCR zones directly on images in order

to populate index fields. For more information, see the section on Manual Barcode and

OCR Indexing in the previous chapter.

Manual QC

If you require Indexing operators to review and apply QC tags in the Indexing step, the

following Manual QC properties are available for configuration.

Allow Manual QC

You can enable this setting to allow operators to add your selected QC tags within the

Indexing job step.

Note:

When you enable this property, the Indexing step also consumes a Capture QC

Manual license (in addition to the Capture Index license).

Allow Review QC Tags

Applicable to manual job steps, this property allows you to choose whether the operator can

view the Browse QC Tags window in the PaperVision Capture Operator Console. Select True

to allow the operator to view the Browse QC Tags window. Select False to prevent the

operator from viewing the Browse QC Tags window.

Note:

No additional PaperVision Capture license is required for the operator to review QC

tags.

QC Auto Play

When the Allow Manual QC property is enabled in the Indexing step, you can define how

long (in seconds) each image appears on screen so operators can perform visual inspections.

Click the ellipsis button on the right to configure the auto play settings.

QC Auto Play

• The Delay (sec) property determines how long each image or group of images remains

on screen at a time in the Manual QC step.

• The Skip Mode determines whether auto play skips batches or documents:

1. If you select the Batch skip mode, then you can define how pages are skipped. For

page skipping, you can require that operators inspect all pages (None), by page

number (Number, such as 1, 5, 10, etc.), or by a random number of pages

(Random).

2. If you select the Document skip mode, you can define how documents and pages

are skipped.

• For document skipping, you can require that operators inspect all documents

(None), by document number (Number, such as 1, 5, 10, etc.), or by a random

number of documents (Random).

• For page skipping, you can require that operators inspect all pages (None), by

page number (Number, such as 1, 5, 10, etc.), or by a random number of pages

(Random).

When you select the Random option, auto play skips an arbitrary number of pages or

documents (between zero and your assigned number). For example, if you enter “10,” then

three pages/documents may be skipped during the first auto play; nine pages/documents

during the second auto play; ten pages/documents during the third auto play; etc.
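The skip options can be modeled with a short Python sketch that lists which pages an operator would see. This is an interpretation, not product code: the function name `pages_to_inspect` is hypothetical, the Number option is assumed to skip a fixed number of pages between inspected pages, and the Random option draws a skip count between zero and the assigned number, as described above.

```python
import random


def pages_to_inspect(page_count, mode="None", number=1, rng=random):
    """Return the 1-based page numbers shown during auto play."""
    shown = []
    page = 1
    while page <= page_count:
        shown.append(page)
        if mode == "None":
            skip = 0                       # inspect every page
        elif mode == "Number":
            skip = number                  # assumed: skip a fixed count each time
        else:
            skip = rng.randint(0, number)  # Random: between zero and the assigned number
        page += skip + 1
    return shown


print(pages_to_inspect(10, "None"))               # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(pages_to_inspect(10, "Number", number=4))   # [1, 6]
print(pages_to_inspect(50, "Random", number=10))  # varies from run to run
```

The same logic would apply to documents when the Document skip mode is selected.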

Operator Permissions

You can assign specific permissions that allow operators to perform operations on documents

and pages. In addition, you can determine whether operators can view the Browse Batch

window in the Operator Console. The Import Images operation is the only operation that

requires an additional Capture Scan license (in addition to the Capture Index license). The

remaining permissions do not require an additional license and are enabled by default to

provide operators the flexibility in manipulating documents and pages when indexing in the

Operator Console.

Add Documents

When set to True, the operator can append a blank document to the end of the batch.

Browse Batch

When set to True, the operator can view the Browse Batch window.

Copy Documents

When set to True, the operator can copy all pages and append the new document after the

selected document.

Copy/Move Pages

When set to True, the operator can copy/paste and cut/paste consecutive or non-consecutive

pages in one document or across multiple documents. The operator can also drag and drop

pages from one location to another in the Thumbnails window or multiple-display view.

Delete Documents

When set to True, the operator can delete a document and its associated images.

Delete Pages

When set to True, the operator can delete one or multiple page(s) within one document or

across multiple documents.

Extract and Copy Pages

When set to True, the operator can extract a region of an image and copy it to the next page

of the document.

Import Images

When set to True, the operator can import images into a document.

Note:

By default, this property is set to False. When you enable this property, the

Indexing step also consumes a Capture Scan license (in addition to the Capture

Index license).

Insert Document Breaks

When set to True, the operator can insert a document break within a document.

Invert and Save Pages

When set to True, the operator can invert one or multiple pages’ polarity and then save the

pages.

Remove Document Breaks

When set to True, the operator can remove an existing document break within a document.

Re-Save Pages

When set to True, the operator can save a page that has been rotated or whose polarity has

been inverted.

Rotate and Save Pages

When set to True, the operator can rotate one or multiple pages and then save the pages.

Shuffle Documents to Duplex

When set to True, the operator can shuffle documents to duplex.

Chapter 7 – Barcode Configuration

You can use barcodes to populate index values and insert document breaks.

PaperVision Capture recognizes one- and two-dimensional, black and white,

and color barcodes. The Barcode job step allows you to configure a barcode reading

process that executes automatically in the PaperVision Capture Operator Console or by the

PaperVision Capture Automation Service.

Note:

Use of the binary scaling image processing filter can improve the recognition rate of

barcode detection.

To view the properties of the Barcode job step:

1. In the Job Definitions screen, select the Barcode job step in the workspace.

2. In the Properties grid, expand the Auto Document Break, General, and Indexes

nodes.

Auto Document Break

While scanning documents, you can determine where one document ends and the next

document begins using the Auto Document Break properties. Although you can separate

documents manually, you can select from options that are described below:

• By default, no auto-document breaks are inserted. When set to None, the system will

expect you to manually separate new documents. No options are available for this setting.

• If you select the Barcode mode, click the ellipsis button to the right of the Barcode Zone

field to define the zones in the Edit Document Break Barcodes screen. Select True for

the Save Page property to leave the page with the barcode in the batch, or select False to

remove the page with the barcode from the batch. For more information, see the section

on Barcode Zones in this chapter.
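The effect of the Save Page property can be sketched in a few lines of Python. This is an illustration only, not PaperVision Capture code: the function name, the page labels, and the assumption that a break page starts the next document are all hypothetical.

```python
from typing import Callable, List, Sequence


def split_on_break_pages(
    pages: Sequence[str],
    is_break_page: Callable[[str], bool],
    save_page: bool,
) -> List[List[str]]:
    """Group pages into documents, starting a new document at each break page."""
    documents: List[List[str]] = []
    current: List[str] = []
    for page in pages:
        if is_break_page(page):
            # Close the document that was being built, if it has any pages.
            if current:
                documents.append(current)
            # Save Page = True keeps the barcode page; False drops it.
            current = [page] if save_page else []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents


batch = ["SEP", "a1", "a2", "SEP", "b1"]
print(split_on_break_pages(batch, lambda p: p == "SEP", save_page=False))
```

With Save Page set to False, the separator sheets are dropped and only the content pages remain in each document.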

General Properties

For information on the Barcode step’s general properties, see the section on General

Properties in Chapter 4.

Indexes

You can configure additional index values and barcode zones for the Barcode job step. For

more information on configuring index values, see the section on Index Configuration in

Chapter 6.

Barcode Parsing

During indexing configuration in a Barcode step, you can configure a text delimiter or a

regular expression to parse specific index fields from a barcode. You can then specify which

field’s index is parsed from the barcode (e.g., you can select the third field's index so only the

last four digits of a social security number are parsed). Optionally, you can verify that an

exact number of index fields results from the parse operation (e.g., three index fields

indicative of a social security number in the format xxx-xx-xxxx).

Note:

The Verify Number of Fields setting is intended to verify that an exact number of

index fields (two or more) results from the parse operation.

If errors occur during barcode parsing, such as when the parsed number of index fields differs

from your specified number of fields, you can select one of three subsequent actions. First, the

entire index value can be skipped (therefore, no barcode parsing occurs). In the second option,

the entire barcode value is used (therefore, no barcode parsing occurs). In the last option, you

can specify the text used as the parsed value (e.g., you can enter “unknown value”).

To configure barcode parsing:

1. In the Properties grid for the Barcode step, click the ellipsis button to the right of the

Indexes row.

2. In the Index Configuration dialog box, expand the General (Step Level) node.

3. Click the ellipsis button to the right of the Barcode Parsing row. The Configure

Barcode Parsing dialog box appears.

Configure Barcode Parsing

4. In the Delimiter section, select whether to use a text delimiter or regular expression

to split the original value into fields. If you enter an invalid text delimiter or regular

expression, the error symbol will appear to the right of the field.

Note:

Additional information on regular expressions can be located at:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/script56/html/js56reconIntroductionToRegularExpressions.asp

5. In the Field Parsing section, specify the field index position from which to parse

data.

6. Optionally, you can verify that an exact number of index fields (two or more) results

from the parse operation.

For example, you can set the Field Index value to “3” to parse only the last four

digits of a social security number that exists in the format xxx-xx-xxxx. You can

then select the Verify Number of Fields option to verify that three index fields

(indicative of a social security number) result from the parse operation.

7. In the Parsing Errors section, select the action that will be executed if parsing errors

occur:

• Skip Index Value: The entire index value is skipped, so no barcode parsing

occurs.

• Use Complete Barcode Value: The complete barcode value is used, so no

barcode parsing occurs.

• Use Error Text: Your specified text is used as the parsed value.

8. In the Preview section, you can enter a sample index value to ensure the text

delimiter or regular expression parses the value correctly.

Configure Barcode Parsing (Configured)
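The parsing options described in this procedure can be sketched in Python. This is a minimal illustration under stated assumptions, not the product's implementation: the function name, the parameter names, and the on_error keywords ("skip", "complete", "error_text") are hypothetical stand-ins for the Skip Index Value, Use Complete Barcode Value, and Use Error Text options, and the field index is assumed to be 1-based.

```python
import re
from typing import Optional


def parse_index_value(
    raw: str,
    field_index: int,
    delimiter: Optional[str] = None,
    pattern: Optional[str] = None,
    verify_fields: Optional[int] = None,
    on_error: str = "skip",
    error_text: str = "",
) -> Optional[str]:
    """Split a barcode value into fields and return the field at field_index."""
    # Split with a regular expression if one is given, otherwise a text delimiter.
    if pattern is not None:
        fields = re.split(pattern, raw)
    elif delimiter:
        fields = raw.split(delimiter)
    else:
        raise ValueError("a delimiter or a regular expression is required")

    count_ok = verify_fields is None or len(fields) == verify_fields
    index_ok = 1 <= field_index <= len(fields)
    if count_ok and index_ok:
        return fields[field_index - 1]

    # Parsing error: apply the selected action.
    if on_error == "skip":
        return None            # no index value is populated
    if on_error == "complete":
        return raw             # the unparsed barcode value is used
    if on_error == "error_text":
        return error_text      # the configured text is used
    raise ValueError(f"unknown error action: {on_error}")


print(parse_index_value("123-45-6789", 3, delimiter="-", verify_fields=3))
```

Entering a sample value in the Preview section serves the same purpose as the final line here: it confirms that the delimiter and field index return the expected field.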

Barcode Zones

During index value configuration for a Capture step, you can configure barcode zones to be

recognized during the scanning process in the PaperVision Capture Operator Console.

To open the barcode zone settings:

1. In the Index Configuration dialog box, expand the General (Step Level) Settings

node for the appropriate index.

2. Click the ellipsis button to the right of the Barcode Zones field. The Edit Barcode

Zones screen opens.

Edit Barcode Zones

Note:

If you define more than one barcode zone in a multi-page document, the last

barcode value that is read on the last page overrides all others and populates the

index. If you define more than one barcode zone in a single-page document, the last

barcode value that passes through the system populates the index.

The Edit Barcode Zones screen contains the following components:

• The main window, where you draw the barcode zones, displays the individual images.

To draw a barcode zone, press the left mouse button while you drag a rectangular

region around the barcode. You can then widen and narrow the boundaries of the

barcode zone region to adjust its size.

• The Barcode Explorer provides an expandable view of each defined barcode zone,

its dimensions, and test results.

• The Properties grid, viewable when you highlight a zone in the Barcode Explorer

tree, displays all properties associated with the selected barcode zone.

• Thumbnails windows are found in the Edit Barcode Zones, Edit OCR Zones, Edit

Nuance Full-Text OCR, Edit Open Text Full-Text OCR, and Edit Image Processing

Filters screens. You can right-click within any Thumbnails window to perform basic

operations on images, such as the cut/paste, copy/paste, delete, or select all operations.

The cut, copy, paste, and delete operations can be performed on consecutive or non-

consecutive images. Additionally, you can select multiple images and simultaneously

rotate them. The scrolling capability, displayed with up/down or left/right arrows as you

drag and drop images, allows you to quickly scroll through remaining images not shown

in the current window.

Note:

Images viewed as thumbnails can have maximum dimensions of 32,768 x 32,768

pixels.

• The status bar on the bottom of the screen displays each image’s page number, page

size (in KB), and page dimensions (in mm).

Note:

The page dimensions 215 x 279 mm are approximately equivalent to 8.5 x 11 inches.
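The conversion behind this note is a division by 25.4 (millimeters per inch). The short Python sketch below, with a hypothetical helper name, shows the arithmetic for the page size quoted above.

```python
MM_PER_INCH = 25.4


def mm_to_inches(mm: float) -> float:
    """Convert millimeters to inches."""
    return mm / MM_PER_INCH


width_in = mm_to_inches(215)
height_in = mm_to_inches(279)
print(f"{width_in:.2f} x {height_in:.2f} in")  # prints 8.46 x 10.98 in
```

The result is slightly under 8.5 x 11 inches because the status bar values are whole millimeters.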

Saving Barcodes

To save all defined barcode zones and return to index configuration, click the Save Barcodes

icon.

Configuring a Scanner

The Configure Scanner command allows you to assign scanner settings for barcode zone

recognition. To configure these settings, click the Configure Scanner icon. For more

information on each setting, see the section on Scanner Setup in Chapter 6.

Starting the Scanning Process

After loading images, you can scan them to ensure the barcode zones are being read

successfully. To start the scanning process, click the Start Scanning icon.

Stopping the Scanning Process

To stop the scanning process, click the Stop Scanning icon.

Removing a Single Image

To remove a single image:

1. In the Thumbnails section, select the image to delete.

2. Click the Remove Single Image icon.

3. Click Yes to confirm the removal.

Rotating an Image 90° Counter-Clockwise

To rotate the image 90 degrees counter-clockwise, click the Rotate Image 90° Counter-

Clockwise icon.

Rotating an Image 90° Clockwise

To rotate the image 90 degrees clockwise, click the Rotate Image 90° Clockwise icon.

Removing All Images

This command removes all current images from the main scanning window and from the

Thumbnails section.

To remove all images:

1. Click the Remove All Images icon.

2. Click Yes to confirm the removals. If you have defined barcode zones prior to

clearing all images, these barcode zones are retained.

Importing Images

To import images:

1. Click the Import Images icon.

2. Locate the directory of the image(s).

3. Select the image to import.

4. Click Open.

Exiting the Edit Barcode Zones Screen

To close and exit out of the Edit Barcode Zones screen:

1. Click the Exit icon.

2. Click Yes to save all barcode changes.

Testing All Barcode Zones

This operation verifies that all defined barcode zone regions read barcodes successfully.

Note:

If you test multiple barcode zones that exist for the same index, the last barcode read

by the system overrides the others. Results for every barcode will then populate the

Results row in the Barcode Explorer.

To test all barcodes:

1. After you insert all barcode zones and assign properties to each, click the Test All

Barcode Zones icon.

• The Barcode Explorer tree updates the Results row for each zone that contains

your defined barcodes.

• A successful reading, indicated with a green check mark, will populate the

Results row in the Barcode Explorer tree.

2. If you do not receive a successful test result, select more barcode types, enable

decoding, and/or enable checksum reading as appropriate, and run the test once again.

Tip:

Poor image quality might result in an unsuccessful reading. Import a clearer

barcode image if the first reading was unsuccessful.

Zooming In, Zooming Out, and Resetting the Zoom

• To zoom in on an area of the image, click the Zoom In icon.

• To zoom out of the current view of the image, click the Zoom Out icon.

• To reset the image to its original view, click the Zoom Reset icon.

Barcode Explorer

The Barcode Explorer summarizes your defined barcode zones per page and allows you to

add, remove, test, and modify each barcode zone.

• To view the properties of a barcode zone, highlight the Zone node in the tree, and its

properties appear in the grid below.

• Expand the Zone node to view a barcode zone's X and Y coordinates, dimensions (in

millimeters), orientation, and test results.

Barcode Explorer

Adding a Barcode Zone to a Page

You can add a new barcode zone to the current page or a new page. The Barcode Explorer

tree updates with each addition or modification.

To add a new barcode zone to the current page:

1. Click the down arrow in the Add Zone icon, and select Add Zone (Selected

Page).

2. Use the cursor to drag a rectangular region around a barcode.

3. Move and/or edit the barcode zone if necessary.

To add a new barcode zone to a new page:

1. Click the down arrow in the Add Zone icon, and select Add Zone (New Page).

2. In the Page Index dialog box, enter the page number where the new barcode zone

will reside.

Note:

If you enter a page that already exists or if you enter an invalid number, a

reminder message appears.

3. With the left mouse button, drag a rectangular region around a barcode.

4. Move and/or edit the barcode zone if necessary.

Removing a Barcode Zone

To remove a barcode zone:

1. In the tree, highlight the zone(s) to remove.

2. Click the Remove Zone icon.

3. Click OK at the confirmation prompt.

Removing All Zones on a Page

To remove all barcode zones on a page:

1. In the Barcode Explorer tree, highlight the page where the zones will be removed.

2. Click the Remove All Zones On This Page icon.

3. Click OK at the confirmation prompt.

Testing a Barcode Zone

This operation verifies that individual barcode zones can be read successfully. If more than

one barcode exists in one zone, the engine returns the value read from the first barcode.

To test a barcode zone:

1. Highlight the zone in the Barcode Explorer.

2. Click the Test Barcode Zone icon. A successful reading, indicated with a green

check mark, populates the Results row in the Barcode Explorer tree.

3. If you do not receive a successful test result, select more barcode types, enable

decoding, and/or enable checksum reading as appropriate, and run the test once again.

Tip:

Poor image quality might result in an unsuccessful reading. Import a clearer

barcode image if the reading was unsuccessful.

Expanding All and Collapsing All Barcode Zones

• To expand all zones, click the Expand All icon.

• To collapse all zones, click the Collapse All icon.

Barcode Zone Properties

The properties described in this section can be configured for each barcode zone.

Image Size

This field is read-only; if no barcode zone is defined, the page size appears in this field. If a

barcode zone is defined, the size of the zone and the page size display in this field. All sizes

appear in millimeters.

Barcode Types

The following two-dimensional (2D) barcode types are supported in PaperVision Capture:

• DataMatrix

• PDF417

• QR Code

• Royal Post

• Australian Post

• Intelligent Mail

The following one-dimensional (1D) barcode types are supported in PaperVision Capture:

• Addon 2

• Addon 5

• BCD Matrix

• Codabar

• Code25 Datalogic

• Code25 IATA

• Code25 Industrial

• Code25 Interleaved

• Code25 Invert

• Code25 Matrix

• Code 32

• Code 39

• Code 93

• EAN 13

• EAN 8

• Postnet

• Type 128

• UCC 128

• UPC-A

• UPC-E

To select the barcode types:

1. Click the ellipsis button in the Barcode Types field in the Properties grid.

2. Select the barcode types to be recognized.

3. Click the Select All button if you want PaperVision Capture to recognize all types.

4. Click OK.

Decode

Some barcode types, such as Code 128, do not represent their data as ASCII characters. Other

barcode types, such as Code 3 of 9, use special characters to extend the basic character set to

include the entire ASCII set. When this setting is enabled, barcode values are converted into

human-readable ASCII strings. For example, if the barcode uses escape characters, as in

"*%K123%M?*", and the Decode property is True, then "[123]" will be returned. If the

Decode property is False, the raw barcode is returned.

Note:

You should enable this setting unless the barcode results should not be converted

into ASCII strings. For example, this setting should be disabled if you are detecting

Code 3 of 9 barcodes that represent dates using the slash mark “/” character (e.g.

01/01/1999). If this setting is enabled, no results are returned because “/0” and “/1”

are not valid ASCII characters.
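The kind of conversion that the Decode property performs can be illustrated with the publicly documented Code 39 Full ASCII escape pairs. The Python sketch below is not the PaperVision Capture barcode engine: the function names are hypothetical, the table covers only part of the Full ASCII set, and any pair missing from this partial table is treated as invalid.

```python
from typing import Dict, Optional


def build_escape_table() -> Dict[str, str]:
    """Build a partial Code 39 Full ASCII table (escape pair -> character)."""
    table: Dict[str, str] = {}
    for i, letter in enumerate("ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
        table["+" + letter] = letter.lower()   # +A..+Z -> a..z
        table["$" + letter] = chr(i + 1)       # $A..$Z -> control characters 1..26
    for i, letter in enumerate("ABCDEFGHIJKLMNO"):
        table["/" + letter] = chr(33 + i)      # /A../O -> ! through /
    table["/Z"] = ":"
    for i, letter in enumerate("FGHIJ"):
        table["%" + letter] = chr(59 + i)      # %F..%J -> ; < = > ?
    for i, letter in enumerate("KLMNO"):
        table["%" + letter] = chr(91 + i)      # %K..%O -> [ \ ] ^ _
    table["%V"] = "@"
    table["%W"] = "`"
    return table


ESCAPES = build_escape_table()


def decode_full_ascii(raw: str) -> Optional[str]:
    """Return the decoded string, or None if an escape pair is not recognized."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch in "+$/%":
            pair = raw[i:i + 2]
            if pair not in ESCAPES:
                return None        # e.g. "/0" is not a defined pair
            out.append(ESCAPES[pair])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(decode_full_ascii("%K123%M"))      # prints [123]
print(decode_full_ascii("01/01/1999"))   # prints None
```

The second call shows why a date that contains slash characters returns no result when decoding is enabled: the slash is read as the start of an escape pair.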

Orientation

PaperVision Capture detects horizontal and vertical barcodes with skew angles of no more

than 15 degrees from the horizontal and vertical axes, respectively. Horizontal barcode

detection is slightly faster than vertical barcode detection. If you are unsure of the expected

barcode orientation or if the documents might contain barcodes with different orientations,

select Both from the drop-down menu.
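The 15-degree skew tolerance can be expressed as a simple angle check. The Python sketch below is illustrative only: the function name and the convention of measuring the angle in degrees from the horizontal axis are assumptions, not documented product behavior.

```python
from typing import Optional

MAX_SKEW_DEGREES = 15.0


def classify_orientation(angle_degrees: float) -> Optional[str]:
    """Classify a barcode angle, measured from the horizontal axis."""
    a = angle_degrees % 180.0   # a barcode at 190 degrees looks like one at 10
    if a <= MAX_SKEW_DEGREES or a >= 180.0 - MAX_SKEW_DEGREES:
        return "horizontal"
    if abs(a - 90.0) <= MAX_SKEW_DEGREES:
        return "vertical"
    return None                 # outside both tolerance bands


for angle in (10, -10, 80, 45):
    print(angle, classify_orientation(angle))
```

An angle such as 45 degrees falls outside both bands, so neither the horizontal nor the vertical setting would be expected to detect it.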

Required for Delete (for Auto Document Breaks)

This property is applicable when you define Auto Document Breaks with barcodes. When

set to True, the break page will be deleted when all defined barcode zones are read

successfully.

Region

The Region property displays a barcode zone's X and Y coordinates and its height and width.

To change the dimensions of the barcode zone:

1. Click the ellipsis button in the right column next to the Region field. The Zone

Rectangle dialog box appears.

Zone Rectangle

2. In the Zone Rectangle dialog box, select Whole Page if you want the barcode zone

to comprise the entire height and width of the page.

3. To specify the dimensions of the barcode zone, enter the left, top, width, and height

(in millimeters) of the zone rectangle.

4. Click OK.
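Because the zone rectangle is entered in millimeters, its size in pixels depends on the image resolution. The Python sketch below shows one way to model the rectangle and convert it. The ZoneRect class, the whole_page helper, and the sample values are hypothetical and are not part of PaperVision Capture.

```python
from dataclasses import dataclass
from typing import Tuple

MM_PER_INCH = 25.4


@dataclass(frozen=True)
class ZoneRect:
    """A zone rectangle in millimeters: left, top, width, height."""
    left_mm: float
    top_mm: float
    width_mm: float
    height_mm: float

    def to_pixels(self, dpi: int) -> Tuple[int, int, int, int]:
        """Convert the rectangle to whole pixels at the given resolution."""
        scale = dpi / MM_PER_INCH   # pixels per millimeter
        return (
            round(self.left_mm * scale),
            round(self.top_mm * scale),
            round(self.width_mm * scale),
            round(self.height_mm * scale),
        )


def whole_page(page_width_mm: float, page_height_mm: float) -> ZoneRect:
    """Equivalent of the Whole Page option: the zone covers the full page."""
    return ZoneRect(0.0, 0.0, page_width_mm, page_height_mm)


zone = ZoneRect(left_mm=25.4, top_mm=50.8, width_mm=76.2, height_mm=12.7)
print(zone.to_pixels(300))   # prints (300, 600, 900, 150)
```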

Regular Expression Verification (for Auto Document Breaks)

This field is applicable when you define Auto Document Breaks with barcodes. If you enter

an exact value or regular expression into the Regular Expression Verification field, a

document break is only inserted when the system reads barcodes matching your exact value or

regular expression. If you leave this field blank, any barcode read by the system will cause a

document break to be inserted. A regular expression is a pattern of text that consists of

ordinary characters (for example, letters A through Z) and special characters, known as

metacharacters. The pattern describes one or more strings to match when searching a body of

text. The regular expression serves as a template for matching a character pattern to the string

being searched.

To configure a regular expression:

1. Click the ellipsis button in the right column next to the Regular Expression field.

The Regular Expression dialog box appears.

Regular Expression

2. In the Regular Expression dialog box, enter the regular expression.

3. Enter the text to validate.

• A successful validation displays with a check mark icon.

• Invalid entries display with an “X” icon.
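The verification rule described above can be sketched in Python. The sketch assumes that the whole barcode value must match the expression, and the function name is hypothetical. The guide does not state whether the product matches the whole value or any part of it, so test your own expressions in the Regular Expression dialog box.

```python
import re


def triggers_document_break(barcode_value: str, verification: str = "") -> bool:
    """Return True if a barcode read should insert a document break."""
    if not verification:
        # Blank field: any barcode that is read causes a break.
        return True
    # Assumption: the entire barcode value must match the expression.
    return re.fullmatch(verification, barcode_value) is not None


print(triggers_document_break("SEP-001", r"SEP-\d{3}"))   # prints True
print(triggers_document_break("INV-001", r"SEP-\d{3}"))   # prints False
```

An exact value that contains metacharacters (for example, a period or a plus sign) must be escaped so that those characters are matched literally.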

Use Checksum

A checksum is an error detection process where additional characters are appended to a

barcode to ensure more accurate readings. Enable this setting if you want the checksum to be

recognized during the scanning process.
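As an illustration of how a checksum detects misreads, the Python sketch below computes the standard EAN 13 check digit. It shows the general technique only. The function names are hypothetical, and other barcode types use different checksum schemes.

```python
def ean13_check_digit(first_twelve: str) -> int:
    """Compute the EAN 13 check digit from the first twelve digits."""
    if len(first_twelve) != 12 or not (first_twelve.isascii() and first_twelve.isdigit()):
        raise ValueError("expected exactly twelve digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first_twelve))
    return (10 - total % 10) % 10


def is_valid_ean13(code: str) -> bool:
    """Return True if the thirteenth digit matches the computed check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    return ean13_check_digit(code[:12]) == int(code[12])


print(is_valid_ean13("4006381333931"))   # prints True
print(is_valid_ean13("4006381333932"))   # prints False
```

A single misread digit changes the weighted sum, so the value no longer agrees with its check digit and the read can be rejected.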

Chapter 8 – Zonal OCR

PaperVision® Capture Administration Guide 147

PaperVision Capture enables you to customize Optical Character Recognition

(OCR) settings for individual index fields and pages of text that you define

within zones. The Nuance and Open Text OCR job steps allow you to configure an OCR

process that executes automatically in the PaperVision Capture Operator Console or by the

PaperVision Capture Automation Service. You can also configure OCR zones to insert

document breaks. Character recognition options allow you to customize how values are

recognized by processes such as OCR, Intelligent Character Recognition (ICR), and Magnetic

Ink Character Recognition (MICR).

During index value configuration for the Nuance OCR or Open Text OCR job step, you can

define the OCR zones that will be recognized during OCR processing. Your selected step

determines the properties available for zonal OCR configuration. For more information

on specific settings for each step, see the sections on Nuance Zonal OCR or Open Text Zonal

OCR in this chapter.

Maximum Image Sizes

The Nuance OCR engine supports incoming images ranging from 75 to 2400 dots per inch

(DPI). In pixels, this range is 16 x 16 to 8400 x 8400 pixels.

The maximum supported image dimensions that can be processed through the Open Text

engine vary with resolution. The maximum width is approximately 32,000

pixels, and the maximum height is approximately 24,000 pixels. For example, the maximum

supported image dimensions at 300 dpi are approximately 106 inches x 80 inches. Images that

are processed through the Open Text OCR engine must contain matching horizontal and

vertical resolutions.
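The relationship between the pixel limits and the physical size is a division by the resolution. The Python sketch below reproduces the 300 dpi example and adds a simple pre-check. The constants are the approximate limits quoted above, and the function names are hypothetical.

```python
from typing import Tuple

# Approximate Open Text limits quoted in this section.
MAX_WIDTH_PX = 32_000
MAX_HEIGHT_PX = 24_000


def max_size_inches(dpi: int) -> Tuple[float, float]:
    """Largest supported width and height, in inches, at a given resolution."""
    return MAX_WIDTH_PX / dpi, MAX_HEIGHT_PX / dpi


def within_open_text_limits(width_px: int, height_px: int, x_dpi: int, y_dpi: int) -> bool:
    """Check the pixel limits and the matching-resolution requirement."""
    return x_dpi == y_dpi and width_px <= MAX_WIDTH_PX and height_px <= MAX_HEIGHT_PX


width_in, height_in = max_size_inches(300)
print(f"{width_in:.1f} x {height_in:.1f} inches")   # prints 106.7 x 80.0 inches
```

A fax image scanned at 200 x 100 dpi would fail this check regardless of its size, because its horizontal and vertical resolutions differ.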

Note:

Larger images can be ingested into PaperVision Capture provided that:

1. No Full-Text OCR will be performed on the images (unless they are processed

using the Image Fit filter and cropped to meet size requirements)

2. No image processing will be performed on the images (unless they are

processed using the Image Fit filter and cropped to meet size requirements)

3. Images will not be viewed as thumbnails

To view the properties for the Nuance OCR or Open Text OCR job step:

1. In the Job Definitions screen, select the Nuance OCR or Open Text OCR job

step in the workspace.

2. In the Properties grid, expand the Auto Document Break, General, and Indexes

nodes.

Auto Document Break

While scanning documents, you can determine where one document ends and the next

document begins by inserting an auto document break. You can separate documents

manually or select one of the options described below. Select an option in the drop-down

list in the right column of the Mode field:

• None: This is the default auto-document break type for a newly created step. When set

to None, the system will expect you to manually separate new documents. No options are

available for this setting.

• OCR: If you select the OCR mode, click the ellipsis button to the right of the OCR Zone

field to define the zones in the Edit OCR Document Breaks screen. For the Save Page

property, select True to leave the page with the auto-document break in the batch, or

select False to remove the auto-document break page from the batch.

General Properties

For more information, see the section on General Properties in Chapter 4.

Indexes

You can configure OCR zones specific to each index. The Line Feed Delimiter property,

specific to OCR zones, allows you to define extra spaces, characters, etc. that will replace

carriage returns located during OCR processing. To configure the settings for an index, click

the ellipsis button next to the Indexes row in the Properties grid. For more information on

assigning index types, see the section on Index Types and Formats in Chapter 6.

Line Feed Delimiter

To define the line feed delimiter for the OCR Zone:

1. In the Properties grid for the OCR step, click the ellipsis button to the right of the

Indexes row.

2. In the Index Configuration dialog box, expand the General (Step Level) node.

3. Click the ellipsis button to the right of the OCR Line Feed row.

OCR Line Feed

4. In the OCR Line Feed dialog box, select the Replace checkbox.

5. Enter the Delimiter that will be used to replace the OCR line feed.

6. Click OK.
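The effect of the Replace option can be pictured as a text substitution on the recognized value. The Python sketch below is an illustration with a hypothetical function name. It assumes that each carriage return, line feed, or carriage return and line feed pair counts as one line break.

```python
import re


def replace_line_feeds(ocr_text: str, delimiter: str) -> str:
    """Replace every line break in recognized text with the delimiter."""
    # A function is used as the replacement so that backslashes in the
    # delimiter are inserted literally.
    return re.sub(r"\r\n|\r|\n", lambda _match: delimiter, ocr_text)


zone_text = "ACME Corp\r\n123 Main St\nSpringfield"
print(replace_line_feeds(zone_text, ", "))   # prints ACME Corp, 123 Main St, Springfield
```

A multi-line OCR zone, such as an address block, then populates the index as a single delimited line.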

OCR Parsing

During indexing configuration in an OCR step, you can configure a text delimiter or a regular

expression to parse specific index fields from OCR text. You can then specify which field’s

index is parsed (e.g., the fourth field’s index from a credit card number). Optionally, you can

verify that a certain number of index fields results from the parse operation (e.g., four index

fields indicative of a complete credit card number).

Note:

The Verify Number of Fields setting is intended to verify that an exact number of

index fields (two or more) results from the parse operation.

If errors occur during OCR parsing, such as when the parsed number of index fields differs

from your specified number of fields, you can select one of three subsequent actions. First, the

entire index value can be skipped (therefore, no OCR parsing occurs). In the second option,

the entire OCR value is used (therefore, no OCR parsing occurs). In the last option, you can

specify the text used as the parsed value (e.g., you can enter “unknown value”).

To configure OCR parsing:

1. In the Properties grid for the OCR step, click the ellipsis button to the right of the

Indexes row.

2. In the Index Configuration dialog box, expand the General (Step Level) node.

3. Click the ellipsis button to the right of the OCR Parsing row. The Configure OCR

Parsing dialog box appears.

Configure OCR Parsing

4. In the Delimiter section, select whether to use a text delimiter or regular expression

to split the original value into fields. If you enter an invalid text delimiter or regular

expression, the error symbol will appear to the right of the field.

Note:

Additional information on regular expressions can be located at:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/script56/html/js56reconIntroductionToRegularExpressions.asp

5. In the Field Parsing section, specify the field index position from which to parse

data.

6. Optionally, you can verify that an exact number of index fields (two or more) results

from the parse operation.

For example, you can set the Field Index value to “4” to parse only the last four

digits of a credit card number. You can then select the Verify Number of Fields

option to verify that four index fields (indicative of a credit card number)

result from the parse operation.

7. In the Parsing Errors section, select the action that will be executed if parsing errors

occur:

• Skip Index Value: The entire index value is skipped, so no OCR parsing

occurs.

• Use Complete OCR Value: The complete OCR value is used, so no OCR

parsing occurs.

• Use Error Text: Your specified text is used as the parsed value.

8. In the Preview section, you can enter a sample index value to ensure the text

delimiter or regular expression parses the value correctly.

Configure OCR Parsing (Configured)
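OCR text often contains irregular spacing, which is where a regular expression delimiter is more forgiving than a fixed text delimiter. The Python sketch below applies the credit card example with a pattern that splits on runs of spaces or hyphens and falls back to error text when four fields are not found. The function name and the sample numbers are hypothetical.

```python
import re


def last_four_digits(ocr_text: str, error_text: str = "unknown value") -> str:
    """Return the fourth field of a card number, or the error text."""
    # Split on one or more whitespace or hyphen characters.
    fields = re.split(r"[\s-]+", ocr_text.strip())
    if len(fields) != 4:
        return error_text       # corresponds to the Use Error Text action
    return fields[3]


print(last_four_digits("4111  1111 1111-1234"))   # prints 1234
print(last_four_digits("4111 1111 1111"))         # prints unknown value
```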

OCR Zones

PaperVision Capture recognizes OCR zones that you define in Job Definitions. During index

value configuration for the Nuance OCR or Open Text OCR job step, you can define the

OCR zones that will be recognized during OCR processing.

To view OCR zone settings:

1. In the Job Definitions workspace, select the Nuance Zonal OCR or Open Text Zonal

OCR job step.

2. In the Properties grid, expand the Indexes node, and then click the ellipsis button next

to the Indexes field.

3. In the Index Configuration dialog box, highlight the index in the Indexes section.

4. Under the Index Properties section, expand the General (Step Level) node.

5. Click the ellipsis button to the right of the OCR Zones field. The Edit OCR Zones

screen appears.

Edit OCR Zones (Nuance Zonal OCR)

The Edit OCR Zones screen contains the following components:

• The main window, where you draw the OCR zones, displays the individual images.

To draw an OCR zone, press the left mouse button while you drag a rectangular

region around the OCR region. You can widen and narrow the region's boundaries to

adjust its size.

• OCR Explorer provides an expandable view of each defined OCR zone, its

dimensions, and test results.

• The Properties grid, viewable when you highlight a zone in the OCR Explorer tree,

displays all properties associated with the selected OCR zone.

• Thumbnails windows are found in the Edit Barcode Zones, Edit OCR Zones, Edit Full-

Text OCR, and Edit Image Processing Filters screens. You can right-click within any

Thumbnails window to perform basic operations on images, such as the cut/paste,

copy/paste, delete, or select all operations. The cut, copy, paste, and delete operations can

be performed on consecutive or non-consecutive images. Additionally, you can select

multiple images and simultaneously rotate them. The scrolling capability, displayed with

up/down or left/right arrows as you drag and drop images, allows you to quickly scroll

through remaining images not shown in the current window.

Note:

Images viewed as thumbnails can have maximum dimensions of 32,768 x

32,768 pixels.

• The status bar on the bottom of the screen displays each image’s page number, page

size (in KB), and page dimensions (in mm).

Note:

The page dimensions 215 x 279 mm are approximately equivalent to 8.5 x 11

inches.

Saving All OCR Zones

To save all defined OCR zones and return to index configuration, click the Save All OCR

Zones icon.

Configuring the Scanner

To configure the scanner settings, click the Configure Scanner icon. For details on each

setting, see the section on Scanner Setup Settings in Chapter 6.

Starting the Scanning Process

After loading images, scan them to ensure OCR zones are being read successfully. To scan

the images, click the Start Scanning icon.

Stopping the Scanning Process

To stop the scanning process, click the Stop Scanning icon.

Removing a Single Image

To remove a single image:

1. In the Thumbnails section, select the image to delete.

2. Click the Delete Single Image icon.

3. Click Yes in the confirmation message.

Removing All Images

This command removes all current images from the main scanning window and from the

Thumbnails section.

To remove all images:

1. Click the Remove All Images icon.

2. Click Yes in the confirmation message.

Note:

If you have defined OCR zones prior to clearing all images, these zones are retained.

Rotating the Image 90° Counter-Clockwise

To rotate the image 90 degrees counter-clockwise, click the Rotate Image 90° Counter-

Clockwise icon.

Rotating the Image 90° Clockwise

To rotate the image 90 degrees clockwise, click the Rotate Image 90° Clockwise icon.

Importing Images

To import images:

1. Click the Import Images icon.

2. Locate the directory of the image(s).

3. Click Open, and the image appears in the main OCR window.

Testing All OCR Zones

The Test All OCR Zones command verifies that all defined OCR zone regions will recognize

OCR characters.

To test all OCR zones:

1. After you insert all OCR zones and assign properties to each, click the Test All OCR

Zones icon.

• The OCR Explorer updates the Results row for each page containing your defined

zones.

• A successful reading, indicated with a green check mark, populates the Results

row.

2. If you do not receive a successful test result, adjust one or more properties, and run

the test once again.

Tip:

Poor image quality might result in an unsuccessful reading, so try importing a

clearer image.

Zooming Commands

• To zoom in on an area of the image, click the Zoom In icon.

• To zoom out of the current view of the image, click the Zoom Out icon.

• To reset the image to its original view, click the Zoom Reset icon.

Exiting the OCR Zones Screen

To close and exit out of the Edit OCR Zones screen:

1. Click the Exit icon.

2. Click Yes to save all changes.

General OCR Properties

You can assign the general OCR properties described in this section.

Region Size

This field is read-only; the OCR zone's X and Y coordinates are displayed along with its

height and width in millimeters.

Image Size

This field is read-only; if no OCR zone is defined, the page size appears in this field. If an

OCR zone is defined, the zone and page size display in millimeters.

Regular Expression Verification

A regular expression is a pattern of text that consists of ordinary characters (for example,

letters A through Z) and special characters, known as metacharacters. The pattern describes

one or more strings to match when searching a body of text. The regular expression serves as

a template for matching a character pattern to the string being searched.

Regular expressions are applied on a per-zone basis. When you define Auto Document Breaks

using OCR zones, you can assign an exact value or regular expression, and a document break

will only be inserted when the system reads an OCR zone matching that exact value or regular

expression. If you leave this field blank, any OCR zone recognized by the system will cause a

document break to be inserted.

To assign a search value:

1. Click the ellipsis button next to the Regular Expression Verification field.

2. Enter the regular expression or exact value.

3. Enter the text to validate.

• A successful validation displays with a green icon.

• Invalid entries display with a red icon.

Note:

To clear the field, right-click the ellipsis button and select Reset.
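
The document break rule described above can be sketched in Python. This is only an illustration, not PaperVision Capture code: the function name zone_triggers_break is hypothetical, Python's re module may differ in detail from the regular expression engine the product uses, and requiring the pattern to match the whole zone value is an assumption.

```python
import re


def zone_triggers_break(zone_text: str, pattern: str = "") -> bool:
    """Return True if the recognized zone text should insert a document break."""
    value = zone_text.strip()
    if not pattern:
        # Blank verification field: any recognized (non-empty) zone value breaks.
        return bool(value)
    # Assumption: the exact value or regular expression must match the whole value.
    return re.fullmatch(pattern, value) is not None


print(zone_triggers_break("INV-004211", r"INV-\d{6}"))
print(zone_triggers_break("INV-42", r"INV-\d{6}"))
```

An exact value such as SEPARATOR works the same way, since a pattern without metacharacters matches only that literal text.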


Nuance OCR Page Properties

The Nuance OCR settings described in this section can be configured for each page. Some of

the settings refer to the temporary black and white image that is created during OCR

processing.

Additional Character Filters

This setting allows you to define additional characters to recognize during OCR processing.

Characters that you define here are processed when you have selected the Plus or Numbers character filter.

Additional Language Filters

You can assign additional characters to increase the number of acceptable characters as

determined by your selected spelling language.

Brightness

You can assign the brightness value (between 0 and 100) for the image. A value of 0 is

lightest; 100 results in the darkest image. The default value is 50.

Brightness Threshold

You can assign a brightness threshold value (between 0 and 255) for the image. The default

value is 128.

Enable Fax-Handling (Omnifont Multi-Lingual)

You should enable this setting if you are processing a scanned image that was faxed in draft

mode (200 x 100 dpi).

Hand-Printed Character Height

You can assign the expected character height (in 1/1200 of an inch) for the Constrained

Handprint Recognition (Numeric) module. The default value is 0.

Note:

1/1200 of an inch is equivalent to approximately 0.021mm.
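
The unit conversion in the note above can be checked with a short Python sketch. The helper names units_to_mm and mm_to_units are hypothetical and are not part of PaperVision Capture.

```python
MM_PER_INCH = 25.4
UNITS_PER_INCH = 1200  # hand-print sizes are expressed in 1/1200 of an inch


def units_to_mm(units: float) -> float:
    """Convert a value in 1/1200-inch units to millimeters."""
    return units * MM_PER_INCH / UNITS_PER_INCH


def mm_to_units(mm: float) -> int:
    """Convert millimeters to the nearest whole 1/1200-inch unit."""
    return round(mm * UNITS_PER_INCH / MM_PER_INCH)


print(round(units_to_mm(1), 3))  # one unit, in millimeters
print(mm_to_units(5.0))          # a 5 mm character height, in units
```

A measured character height in millimeters can be converted with mm_to_units before you enter it in the Hand-Printed Character Height field.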


Hand-Printed Character Width

You can assign the expected character width (in 1/1200th of an inch) for the Constrained

Handprint Recognition (Numeric) module. The default value is 0.

Hand-Printed Detect Spaces

If this setting is enabled, the Constrained Handprint Recognition (Numeric) module will

detect spaces between characters.

Hand-Printed Leading Spaces

You can assign the expected leading spaces (in 1/1200th of an inch) for the Constrained

Handprint Recognition (Numeric) module. The default value is 0.

Hand-Printed Style

You can select either the European or U.S. writing style for the Constrained Handprint Recognition (Numeric) module. For example, the number seven is crossed in the European style and uncrossed in the U.S. style.

Recognition Languages

The default recognition language is English, and any combination of recognition languages

can be selected. You can increase the number of recognized characters by assigning the

Additional Language Filter property, and you can narrow them by selecting from the

Character Filter list.

To select the Recognition Languages:

1. Click the ellipsis button next to the Recognition Languages field.

2. Select the languages to include during the OCR process. Characters from your selected

language will be recognized during OCR.

3. Click OK.

Note:

Recognition is faster when the Spelling Language matches your selected Recognition Language.


Recognition Process Setting

The Recognition Process Setting is applied at the page level during OCR and involves a trade-

off between accuracy and speed.

• Accurate, the default setting, results in the most accurate recognition.

• Balanced provides a compromise between recognition accuracy and speed.

• Fast results in the fastest recognition, but accuracy may be compromised.

Rejection Symbol

This property represents rejected characters in output documents. A rejected character is one that the active OCR recognition engine configuration could not recognize. The default value is the tilde character (~). Only a single character can be entered in this field.

Tip:

To prevent unrecognized characters from appearing in output documents, leave this

field blank.

Spelling Language

This property accepts all possible recognition languages. The Auto setting matches the

recognition language with the corresponding spelling language. Only one spelling language

can be selected at a time.

Vertical Dictionaries

By default, Vertical Dictionaries are disabled; however, you can select any combination of

dictionaries to include during OCR processing. PaperVision Capture supports the following

dictionaries:

• Dutch Legal Professional Dictionary

• Dutch Medical Professional Dictionary

• English Financial Professional Dictionary

• English Legal Professional Dictionary

• English Medical Professional Dictionary

• French Legal Professional Dictionary

• French Medical Professional Dictionary

• German Legal Professional Dictionary

• German Medical Professional Dictionary


Nuance Zonal OCR Properties

The OCR settings described in this section can be configured for each zone.

Capitalize Proper Names

If this setting is enabled, the correction feature of the recognition subsystem will capitalize

names inside recognized text.

Character Filter

Character filters defined at the zone level narrow recognition to the sets of characters that you specify. By default, all character filters are selected, but you can select a specific set of characters to be recognized during OCR processing.

Your selected recognition module may restrict the character filters recognized during OCR processing. For example, the Constrained Handprint Recognition (Numeric) module supports only numerals and four other characters, so if you select the Alpha character filter with that module, the filter will not be recognized. All character filters are supported by the Omnifont Multi-Lingual, Constrained Handprint Recognition (Alphanumeric), Omnifont Multi-Lingual (FRX), and Draft Dot-Matrix modules.


The table below describes each character filter that you can define for the zone:

Character Filter  Description
All               Since all filters are enabled, no filtering is applied
Alpha             Recognizes only upper- and lower-case letters
Default           Causes the zone to be handled globally; do not combine with any other filter
Digit             Recognizes only numerals (1, 2, 3, etc.)
Lower-case        Recognizes only lower-case letters (a, b, c, etc.), including accented letters
Miscellaneous     Only recognizes other miscellaneous characters (+, -, etc.)
Numbers           Recognizes only the digits and any values defined in the Additional Character Filters field for the page
Plus              Enables the use of only defined Additional Character Filters; these characters are added after all other filters
Punctuation       Recognizes only punctuation signs (!, @, #, etc.)
Upper-case        Recognizes only upper-case letters (A, B, C, etc.), including accented letters
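
As a rough illustration of how the filters combine, the following Python sketch builds the set of accepted characters from a list of filter names. Everything in it is an assumption made for illustration: the character classes are ASCII-only (no accented letters), the split between Punctuation and Miscellaneous is a guess, the Default filter is omitted, and showing a filtered-out character as the rejection symbol is a simplification of what the OCR engine does.

```python
import string

# Hypothetical ASCII-only character classes, loosely following the table above.
MISC = set("+-*/=<>$%&|\\_^~")
FILTER_SETS = {
    "Alpha": set(string.ascii_letters),
    "Digit": set(string.digits),
    "Lower-case": set(string.ascii_lowercase),
    "Upper-case": set(string.ascii_uppercase),
    "Miscellaneous": MISC,
    "Punctuation": set(string.punctuation) - MISC,
}


def allowed_characters(filters, additional=""):
    """Return the union of the selected filters, or None when nothing is filtered."""
    if "All" in filters:
        return None  # all filters enabled: no filtering is applied
    allowed = set()
    for name in filters:
        if name == "Numbers":
            # Digits plus the page-level Additional Character Filters.
            allowed |= set(string.digits) | set(additional)
        elif name == "Plus":
            # Only the page-level Additional Character Filters.
            allowed |= set(additional)
        else:
            allowed |= FILTER_SETS[name]
    return allowed


def apply_filter(text, allowed, rejection_symbol="~"):
    """Replace every character outside the allowed set with the rejection symbol."""
    if allowed is None:
        return text
    return "".join(c if c in allowed else rejection_symbol for c in text)


print(apply_filter("A1b2", allowed_characters(["Digit"])))
print(apply_filter("12/05", allowed_characters(["Numbers"], additional="/")))
```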


Filling Method

This setting is based on the selected recognition module and contains the filling method for

the specified OCR zone. The filling method corresponds with the zone’s contents. If an

incorrect filling method is chosen for the zone, its contents will not be recognized. The

following table displays the filling methods, their descriptions, and the supported recognition

modules.

Filling Method                      Description                                                                  Supported Recognition Modules
Default                             This is the filling method to be used, acquired from the recognition module  N/A
Omnifont                            (Default setting) Indicates machine-printed text with any typeface           Omnifont Plus (2W); Omnifont Plus (3W); Omnifont Multi-Lingual (FRX); Omnifont Matrix
Draft-Dot 9                         9-pin draft dot-matrix printout                                              Draft Dot-Matrix; Omnifont Matrix
Hand-Printed                        Hand-printing within the zone                                                Constrained Handprinted Recognition (Numeric); Constrained Handprinted Recognition (Alphanumeric)
Draft-Dot 24                        24-pin draft dot-matrix printout                                             Omnifont Multi-Lingual; Omnifont Matrix
OCR-A                               OCR-A filling method                                                         Omnifont Multi-Lingual; Omnifont Matrix; Matrix Matching Recognition
OCR-B                               OCR-B filling method                                                         Omnifont Multi-Lingual; Omnifont Matrix; Matrix Matching Recognition
Magnetic Ink Character Recognition  Magnetic ink character filling method                                        Matrix Matching Recognition
Dash-digit                          Dash-digit zone filling method                                               Matrix Matching Recognition
Dot-digit                           Indicates the dot-digit zone filling method                                  Matrix Matching Recognition
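
Because an incorrect filling method prevents the zone's contents from being recognized, it can help to check a planned configuration against the table before you assign it. The Python sketch below transcribes the table into a lookup. The names SUPPORTED_MODULES, is_supported, and filling_methods_for are hypothetical, and treating Default as valid for every module is an assumption.

```python
# Filling method -> recognition modules, transcribed from the table above.
SUPPORTED_MODULES = {
    "Omnifont": {
        "Omnifont Plus (2W)",
        "Omnifont Plus (3W)",
        "Omnifont Multi-Lingual (FRX)",
        "Omnifont Matrix",
    },
    "Draft-Dot 9": {"Draft Dot-Matrix", "Omnifont Matrix"},
    "Hand-Printed": {
        "Constrained Handprinted Recognition (Numeric)",
        "Constrained Handprinted Recognition (Alphanumeric)",
    },
    "Draft-Dot 24": {"Omnifont Multi-Lingual", "Omnifont Matrix"},
    "OCR-A": {"Omnifont Multi-Lingual", "Omnifont Matrix", "Matrix Matching Recognition"},
    "OCR-B": {"Omnifont Multi-Lingual", "Omnifont Matrix", "Matrix Matching Recognition"},
    "Magnetic Ink Character Recognition": {"Matrix Matching Recognition"},
    "Dash-digit": {"Matrix Matching Recognition"},
    "Dot-digit": {"Matrix Matching Recognition"},
}


def is_supported(filling_method: str, module: str) -> bool:
    """Return True if the table lists the module for the filling method."""
    if filling_method == "Default":
        return True  # assumption: Default defers to the module's own method
    return module in SUPPORTED_MODULES.get(filling_method, set())


def filling_methods_for(module: str) -> list:
    """List the filling methods that the table associates with a module."""
    return sorted(m for m, mods in SUPPORTED_MODULES.items() if module in mods)


print(filling_methods_for("Matrix Matching Recognition"))
```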

Ignore Blank Spaces

If this setting is enabled, white space characters (including white space created by the

SPACEBAR and TAB keys) will be excluded (ignored) during OCR processing.

Ignore Character Case

If this setting is enabled, the distinction between upper- and lower-case characters is ignored during OCR processing. If this setting is disabled, upper- and lower-case characters are distinguished during OCR processing.

Include Punctuation

If this setting is enabled, punctuation will be recognized during OCR processing.


Recognition Module

All zones must have a recognition module assigned before OCR processing can be

successfully completed. See the next section on OCR Recognition Modules for detailed

descriptions of each module.

Verify Complete Lines

If you enable this setting, entire lines of text (instead of individual words) will be processed

through OCR. Select False to pass individual words through OCR processing.

Zone Type

This setting describes the area inside the OCR zone, and whether that area should be

recognized or ignored. You can assign zone types to be treated as text, a table, or a form.

• Auto runs a parsing algorithm and may create several OCR zone types, including Flow, Table, and Form.

• Flow contains flowed text without a table structure inside the zone.

• Form represents an unfilled form.

• Table contains a table with rows and columns, with or without a grid.


Nuance OCR Recognition Modules

A Nuance OCR license includes all recognition modules except the Constrained Handprint Recognition (Numeric) and Constrained Handprint Recognition (Alphanumeric) modules, which require a separate Intelligent Character Recognition (ICR) license.

Omnifont Matrix

The Omnifont Matrix recognition module recognizes machine-printed text from printed

publications, laser and ink-jet printers, and electric typewriters. Mechanical typewriters may

also produce readable output. This module can also be used with Letter Quality (LQ) or Near

Letter Quality (NLQ) output from dot-matrix printers, and can also be used for Draft Quality

(DQ).

Omnifont Matrix detects and transmits bold, italic, and underlined text (including

combinations). This module also detects and transmits character size and classifies font types

into the serif, sans serif, and monospaced categories.

Supported Filling Methods:

• Omnifont

• Draft-Dot 9

• Draft-Dot 24

• OCR-A

• OCR-B

Supported Filter Types:

• All

• Digit

• Alphanumeric

Supported Recognition Processing Settings:

• Fast

• Balanced and Accurate (merged into one value)


Omnifont Multi-Lingual

The Omnifont Multi-Lingual module recognizes machine printed text from printed

publications, laser and ink jet printers, and electric typewriters. Mechanical typewriters may

produce readable output. Additionally, dot matrix printers with NLQ and LQ output may

produce readable results. Use the Draft-Dot 24 (DRAFTDOT24) filling method for draft-quality, 24-pin dot-matrix documents. NLQ and LQ output is recognized better without this filling method. A maximum of 500 OCR zones can be defined on one image for this module.

Omnifont Multi-Lingual detects and transmits bold, italic, and underlined text (including

combinations). This module also detects and transmits character size and classifies font types

into serif, sans serif, and monospaced categories.

Character Range:

• Latin, Greek, and Cyrillic alphabets and accented letters

• 500 characters

Character Set:

Characters

Non-Accented

Accented

Latin alphabet upper-case letters

Latin alphabet lower-case letters

Digits

Punctuation

Miscellaneous symbols

Cyrillic upper-case letters

Cyrillic lower-case letters

Greek upper-case letters

Greek lower-case letters

OCR (OCR-A and MICR) characters


Supported Filling Methods:

• Omnifont

• Draft-Dot 24

• OCR-A

• OCR-B

Supported Filter Types:

• Default

• Digit

• Upper-Case

• Lower-Case

• Punctuation

• Miscellaneous

• Plus

• All

• Alphanumeric

• Number

Supported Recognition Process Settings:

• Fast

• Balanced

• Accurate


Draft Dot-Matrix

The Draft Dot-Matrix recognition module is designed only for draft-quality, 9-pin dot-matrix text. No recognition process settings are supported, but all filters are supported in the module. Expanded characters are not recognized; condensed characters can be recognized, although accuracy may be low.

For NLQ or LQ text, the following Omnifont modules produce better results:

• Omnifont Plus (2W)

• Omnifont Plus (3W)

• Omnifont Matrix

• Omnifont Multi-Lingual

Character Range:

Upper- and Lower-Case    Lower-Case Only
A Acute (A')             A Circumflex (a^)
AE (Ae)                  A Macron (a-)
A Ring (Ao)              A Grave (a`)
A Umlaut (A:)            E Umlaut (e:)
A Tilde (A~)             E Circumflex (e^)
C Cedilla (C,)           E Grave (e`)
E Acute (E')             I Umlaut (I:)
I Acute (I')             I Circumflex (I^)
N Tilde (N~)             I Grave (I`)
O Double Acute (O")      O Circumflex (O^)
O Acute (O')             O Macron (O-)
O Umlaut (O:)            O Grave (O`)
O Tilde (O~)             S Hacek (Sv)
O Slash (O/)             U Circumflex (U^)
OE (OE)                  U Grave (U`)
U Double Acute (U")
U Acute (U')
U Umlaut (U:)


Constrained Handprint Recognition (Numeric)

The Constrained Handprint Recognition (Numeric) module recognizes hand-printed numeric characters and four calculation signs. The Constrained Handprint Recognition (Numeric) module is included with the ICR license.

• For better recognition, characters should not touch one another, and each character must be between 30 and 180 pixels in height.

• Well-formed numbers written in pen are best recognized; pencil and felt-tip pens

result in poorer recognition.

• The maximum number of characters that can be contained in a zone is 3000.

• The maximum number of lines that can be contained in a zone is 40.

• The maximum number of characters that can be contained per line is 600.

• An OCR zone can contain a single character or several lines of characters.

• Optimally, each OCR zone region should measure 5 x 6 mm, with regions separated by 3 mm.
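
The numeric limits listed above can be expressed as a simple pre-check. The Python sketch below is illustrative only: the function check_numeric_zone is hypothetical, and it assumes you already know the expected lines of text and the character height in pixels for the zone.

```python
MAX_CHARS_PER_ZONE = 3000
MAX_LINES_PER_ZONE = 40
MAX_CHARS_PER_LINE = 600
MIN_CHAR_HEIGHT_PX = 30
MAX_CHAR_HEIGHT_PX = 180


def check_numeric_zone(lines, char_height_px):
    """Return a list of problems; an empty list means the zone is within limits."""
    problems = []
    if len(lines) > MAX_LINES_PER_ZONE:
        problems.append(f"{len(lines)} lines exceeds {MAX_LINES_PER_ZONE}")
    total = sum(len(line) for line in lines)
    if total > MAX_CHARS_PER_ZONE:
        problems.append(f"{total} characters exceeds {MAX_CHARS_PER_ZONE}")
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(f"line {number} exceeds {MAX_CHARS_PER_LINE} characters")
    if not MIN_CHAR_HEIGHT_PX <= char_height_px <= MAX_CHAR_HEIGHT_PX:
        problems.append(f"character height {char_height_px}px is outside 30-180px")
    return problems


print(check_numeric_zone(["12345", "678.90"], char_height_px=60))
```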

Character range:

• Digits (0-9)

• Plus sign (+)

• Minus sign (-)

• Period or full-stop (.)

• Comma (,)

Supported Filter Types:

• All

• Digit

• Punctuation

• Miscellaneous

Note:

You can use the Digit filter to exclude the Plus Sign, Minus Sign, Period, and

Comma during processing.

Supported Recognition Processing Settings:

• Fast

• Balanced and Accurate (merged into one value)


Constrained Handprint Recognition (Alphanumeric)

The Constrained Handprint Recognition (Alphanumeric) module recognizes hand-printed

alphanumerical characters such as upper- and lower-case letters, digits, and others. The

Constrained Handprint Recognition (Alphanumeric) module is included with the ICR license.

This module can read flowed text, but is applied mainly in hand-printed forms.

The Constrained Handprint Recognition (Alphanumeric) module differentiates over 150

characters, including digits, punctuation marks, miscellaneous characters, English alphabet

letters, and accented characters.

Note:

Cyrillic and Greek languages are not supported in this module.

The only supported filling method is Hand-Printed, but all filter types are supported. Hand-printed text is more difficult to recognize, but enhanced character quality can improve recognition. Structured forms and zone filters can improve OCR processing for this module.

• For better recognition, characters should not touch one another.

• Each character must be between 30 and 180 pixels in height.

• Well-formed characters written in pen are best recognized.

• Pencil and felt-tip pens result in poorer recognition.

• The maximum number of characters per line is 200.

• There is no limit on the number of lines per zone.


Recognized Punctuation and Miscellaneous Characters:

• Exclamation Mark (!)

• Question Mark (?)

• Apostrophe or Single Quote (')

• Quotation Mark (")

• Semicolon (;)

• Comma (,)

• Colon (:)

• Period or full-stop (.)

• Hyphen or Minus Sign (-)

• Opening and Closing Parentheses ( )

• Opening and Closing Square Brackets [ ]

• Opening and Closing Curly Brackets { }

• Number Sign (#)

• Percent Sign (%)

• At (@)

• Ampersand (&)

• Vertical Bar ( | )

• Dollar Sign ($)

• Asterisk (*)

• Plus Sign (+)

• Equals Sign (=)

• Underscore (_)

• Slash Mark (/)

• Backslash (\)

• Less Than ( < )

• Greater Than ( > )

Supported Recognition Process Settings:

• Fast

• Balanced

• Accurate


Matrix Matching Recognition

The Matrix Matching Recognition module reads groups of fixed-font characters designed

specifically for OCR or imaging applications in which no two characters have similar shapes.

Relevant applications include banking, check handling, product distribution, and document

validation, where accuracy is critical. Each character group has its own filling method.

Additionally, some non-fixed print styles are also recognized. No recognition processing

settings are supported, but all filters (except the Lower-Case filter) are supported in the

module.

Character Range:

Character Type           Characters Included
OCR-A*                   Upper-case English letters; digits; some punctuation; OCR symbols (Chair, Hook, and Fork)
OCR-B                    Upper-case English letters; digits; some punctuation
Magnetic Ink Character*  Digits; some punctuation; Magnetic Ink Character symbols (OCR Branch Bank, OCR Amount of Check, OCR Dash, and OCR Customer Account Number)
Dot-Digit Zone           Ten digits and period; commas are read, but converted to periods
Dash-Digit Zone          Ten digits and period; commas are read, but converted to periods

* Only recognized when selected for the Filling Method


Supported Filling Methods:

• OCR-A

• OCR-B

• Magnetic Ink Character Recognition

• Dot-Digit

• Dash-Digit


Omnifont Plus (2W) and (3W)

The Omnifont Plus (2W) and (3W) modules recognize machine-printed text from printed

publications, laser and ink-jet printers, and electric typewriters. Mechanical typewriters may

also produce good output. These modules provide improved recognition by combining the results of several modules: Omnifont Plus (2W) combines the Omnifont Multi-Lingual and Omnifont Matrix modules, and Omnifont Plus (3W) combines the Omnifont Multi-Lingual, Omnifont Matrix, and Omnifont Multi-Lingual (FRX) modules. Only the Omnifont filling method is supported in these modules.

Both modules detect and transmit bold, italic, and underlined text (including combinations).

They also detect and transmit character size and classify font types into serif, sans serif, and

monospaced categories.

Character Set:

Characters

Non-accented

Accented

Latin alphabet upper-case letters

Latin alphabet lower-case letters

Digits

Punctuation

Miscellaneous symbols

Cyrillic upper-case letters

Cyrillic lower-case letters

Greek upper-case letters

Greek lower-case letters

OCR (OCR-A and MICR) characters

Supported Filters:

• All

• Digit

• Alphanumeric

Supported Recognition Processing Settings:

• Fast

• Balanced

• Accurate


Omnifont Multi-Lingual (FRX)

The Omnifont Multi-Lingual (FRX) module recognizes machine-printed text from printed

publications, laser and ink jet printers, and electric typewriters. Mechanical typewriters may

produce readable output. Additionally, dot-matrix printers with NLQ and LQ output may

produce readable results. No recognition process settings are supported, but all filters are supported in this module. Only the Omnifont filling method is supported in this module.

This module supports Latin, Greek, and Cyrillic alphabets with accented letters. Omnifont

Multi-Lingual (FRX) detects and transmits bold, italic, and underlined text (including

combinations). This module also detects and transmits character size and classifies font types

into serif, sans serif, and monospaced categories.

You can select multiple languages for OCR recognition, but languages are only recognized if

they belong to the same code page. For example, OCR can process English, Spanish, and

French since they belong to the Latin 1 code page. OCR may fail to recognize both English

and Russian since they belong to different code pages.

Supported Languages per Code Page:

Code Page  Supported Languages
Latin 1    English, German, French, Spanish, Italian, Dutch, Swedish, Norwegian, Finnish, Danish, Portuguese, Portuguese Brazilian, Catalan, Afrikaans, Aymara, Basque, Breton, Faroese, Friulian, Gaelic, Galician, Eskimo, Icelandic, Indonesian, Latin, Malaysian, Pidgin English, Swahili, Tahitian, Welsh, Frisian, Zulu
Latin 2    Polish, Czech, Hungarian, Romanian, Albanian, Croatian, Wend (Sorbian), Slovak, Slovenian
Cyrillic   Russian, Ukrainian, Byelorussian, Bulgarian, Macedonian, Serbian
Greek      Greek
Turkish    Turkish, Kurdish (written in Latin alphabet)
Baltic     Estonian, Hawaiian, Latvian, Lithuanian
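
The same-code-page rule can be checked before you select languages. The Python sketch below uses an abbreviated, hand-copied subset of the table above (rows and languages are omitted for brevity), and the names CODE_PAGES, code_pages_for, and share_code_page are hypothetical.

```python
# Abbreviated subset of the code page table; extend it as needed.
CODE_PAGES = {
    "Latin 1": {"English", "German", "French", "Spanish", "Italian", "Dutch"},
    "Latin 2": {"Polish", "Czech", "Hungarian", "Romanian"},
    "Cyrillic": {"Russian", "Bulgarian", "Serbian"},
    "Turkish": {"Turkish", "Kurdish"},
    "Baltic": {"Estonian", "Latvian", "Lithuanian"},
}


def code_pages_for(languages):
    """Return the set of code pages that the given languages belong to."""
    pages = set()
    for language in languages:
        matches = [page for page, members in CODE_PAGES.items() if language in members]
        if not matches:
            raise ValueError(f"language not in the sample table: {language}")
        pages.update(matches)
    return pages


def share_code_page(languages):
    """Return True when every selected language belongs to one code page."""
    return len(code_pages_for(languages)) == 1


print(share_code_page(["English", "Spanish", "French"]))
print(share_code_page(["English", "Russian"]))
```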


Open Text Zonal OCR

The Open Text Zonal OCR step has its own distinct set of configurable properties. Open Text® OCR processing recognizes machine-printed text but not handwritten text. Additionally, newline characters are removed during Open Text OCR processing.

The properties described in this section are available for configuration in the Open Text Zonal

OCR step.

To configure Open Text OCR zones:

1. In the Job Definitions workspace, select the Open Text Zonal OCR job step.

2. In the Properties grid, expand the Indexes node, and then click the ellipsis button

next to the Indexes field. Proceed to step 4.

3. Or, expand the Auto Document Break node to configure OCR zones that will

automatically break documents. Proceed to step 7.

4. In the Index Configuration dialog box, click the Add button.

5. Under the Index Properties section, expand the General (Step Level) node.

6. Click the ellipsis button to the right of the OCR Zones field. The Edit OCR Zones

screen appears.

Edit OCR Zones (Open Text Zonal OCR)


7. Drag the cursor around the OCR zone on the image, and the properties appear in the

grid. The next section describes the properties available for configuration.

OCR Statistics

You can configure custom code that reports specific OCR statistics when an OCR zone is processed through the Open Text OCR engine. For example, you can configure custom code to record statistics when an OCR zone populates an index value by using the OCRIndexZoneStatistics sample script. Custom code samples are located (as text or XML files) in the Library\Samples directory where PaperVision Capture was installed.

The following OCR sample scripts are available for configuration:

• OCRFullTextPageStatistics

• OCRIndexZoneStatistics

• OCRMarkSenseZoneStatistics

To configure custom code OCR statistics:

1. In the Edit OCR Zones screen, click the ellipsis button next to the OCR Statistics

field. The Select Custom Code Generator dialog appears.

Select Custom Code Generator - Basic


2. Select the Basic custom code generator, and then click OK. The Script Editor opens.

Script Editor

3. If desired, you can import code from the OCRIndexZoneStatistics or OCRMarkSenseZoneStatistics sample script into the Script Editor. Click the Import icon, and then browse to the Library\Samples directory where PaperVision Capture was installed.

4. Otherwise, insert your custom code into the Script Editor.

5. Click OK.


Auto Rotate

By default, this property is set to True, and the Open Text Zonal OCR engine attempts to recognize text in all orientations (vertically and horizontally) within the zone. To limit recognition to a single orientation (vertically only) within the zone, set this property to False.

Brightness Sample Size

This value (indicating both width and height) specifies the rectangle size used to calculate the

brightness threshold. You can specify a value between 1 and 32, and the default value is 15.

Note:

Smaller brightness sample sizes may cause the OCR engine to recognize extraneous

noise on the image.

Brightness Threshold

You can assign a brightness threshold value (between 0 and 255) for the image. The default

value is 75.

Country/Language

When you select from the Country/Language property, your selection may reflect not only a country or language, but also country groups (e.g., Western Europe), language groups (e.g., Latin), and character sets (e.g., OCR). Each country corresponds to one or more languages, and countries are automatically expanded into language sets (e.g., Germany corresponds to the German language; Switzerland corresponds to the German, French, Italian, and Rhaeto-Romanic languages). Specific languages are also available for selection under the Country/Language property (e.g., English, German, Dutch, Italian, etc.).

Narrow your selection as much as possible, since OCR recognition may become slower as more countries or languages are selected. Where possible, select a country rather than a language or country group (e.g., Western Europe, South America, Scandinavia), since this may improve the recognition of certain types of addresses and money transfer forms.

Note:

You cannot select the OCR character set individually; it must be selected with

another language, language group, country, or country group. For a complete list of

supported countries, languages, country groups, language groups, and character sets,

see Appendix G.


Language Groups

If you select a language group, it is recommended to select only one, since each group encompasses multiple languages, countries, and code pages:

1. Cyrillic: Code page 1251

2. Greek: Code page 1253

3. Latin: Code pages 1250, 1252, 1254 and 1257 (i.e. Central Europe, Western Europe,

Turkey, Baltic)

4. Azerbaijanian

Note:

For language groups, recognition results are always represented by Unicode

characters. The English character set (A-Z, a-z) is implicitly available with all

country-language selections, even Greek or Cyrillic.

Minimum Confidence

The confidence level reflects the reliability of the OCR recognition results. Values range from

zero (the default setting), the lowest confidence level, to 255, the highest confidence level

indicating the most reliable recognition results. Characters with lower confidence levels than

your specified value will display as the rejection symbol, which is the tilde (~) character by

default.
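
The effect of the Minimum Confidence setting can be pictured with the following Python sketch. It assumes that the engine reports one confidence value (0 to 255) per character, which the guide does not state in detail, and the function apply_minimum_confidence is hypothetical.

```python
def apply_minimum_confidence(chars, minimum=0, rejection_symbol="~"):
    """Replace characters whose confidence is below the minimum.

    chars is an iterable of (character, confidence) pairs, confidence 0..255.
    """
    if not 0 <= minimum <= 255:
        raise ValueError("minimum confidence must be between 0 and 255")
    return "".join(
        char if confidence >= minimum else rejection_symbol
        for char, confidence in chars
    )


recognized = [("A", 250), ("8", 90), ("C", 200)]
print(apply_minimum_confidence(recognized, minimum=100))
print(apply_minimum_confidence(recognized, minimum=0))
```

With the default value of zero, no character falls below the minimum, so nothing is replaced.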

Timeout Value

This property allows you to define the maximum amount of time that the Open Text OCR

engine processes a single image before it fails. By default, this property is set to 180 seconds

(3 minutes). You can assign a timeout between one second and 3,600 seconds (1 hour).

Note:

Raising the timeout setting may increase the amount of time to process all images.

Reader Engine

Two internal OCR reader engines, RecoStar and AEGReader, are available for selection in the

Open Text Zonal OCR step. Document content may cause one engine to generate more

accurate recognition results, so the Voter option is selected by default. The Voter option

automatically "votes" between both engines' recognition results, and generates results from

the engine with the highest confidence level.
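
Conceptually, the Voter option behaves like the following Python sketch. This is an assumption-laden illustration rather than the product's algorithm: it assumes each engine returns one result with a single confidence value per zone, and the EngineResult type and vote function are hypothetical.

```python
from typing import NamedTuple


class EngineResult(NamedTuple):
    engine: str
    text: str
    confidence: int  # 0..255, higher is more reliable


def vote(results):
    """Return the result reported with the highest confidence level."""
    if not results:
        raise ValueError("at least one engine result is required")
    # On a tie, max() keeps the first result in the list.
    return max(results, key=lambda result: result.confidence)


winner = vote([
    EngineResult("RecoStar", "INV-1O1", 140),
    EngineResult("AEGReader", "INV-101", 212),
])
print(winner.engine, winner.text)
```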


Rejection Symbol

This property represents rejected characters in output documents. A rejected character is one that the active OCR recognition engine configuration could not recognize. The default value is the tilde character (~). Only a single character can be entered in this field.

Tip:

To prevent unrecognized characters from appearing in output documents, leave this

field blank.

Syntax Mode

When you set the syntax mode to alphanumerical, the default character set is alphanumeric. If a character is ambiguous, the OCR engine attempts to process the character as a letter before a number. For example, the OCR engine will process a "G" before "6", "S" before "5", etc.

When you set the syntax mode to numerical, the default character set is numeric. If a character is ambiguous, the OCR engine attempts to process the character as a number before a letter. For example, the OCR engine will process a "6" before "G", "5" before "S", etc.
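
The preference order described above can be sketched in Python. The function resolve_ambiguous is hypothetical, and it assumes the engine has already narrowed an ambiguous glyph down to a small set of candidate readings.

```python
def resolve_ambiguous(candidates, syntax_mode):
    """Pick one reading for an ambiguous glyph according to the syntax mode."""
    letters = sorted(c for c in candidates if c.isalpha())
    digits = sorted(c for c in candidates if c.isdigit())
    if syntax_mode == "alphanumerical":
        ordered = letters + digits  # try letters before numbers
    elif syntax_mode == "numerical":
        ordered = digits + letters  # try numbers before letters
    else:
        raise ValueError(f"unknown syntax mode: {syntax_mode}")
    if not ordered:
        raise ValueError("no letter or digit candidates")
    return ordered[0]


print(resolve_ambiguous({"G", "6"}, "alphanumerical"))
print(resolve_ambiguous({"G", "6"}, "numerical"))
```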

Chapter 9 – Nuance Full-Text OCR


The Nuance Full-Text OCR job step allows you to configure an automated

process that reads pages of text and converts recognized results to one or multiple

file types. Once configured, this step executes automatically in the PaperVision Capture

Automation Service. To execute the Nuance Full-Text OCR step, a Capture Full-Text OCR

license is required.

The Nuance Full-Text OCR step converts extracted text into various file types such as .txt, .rtf,

.csv, .pdf, .doc (and .docx), .htm, .xls (and .xlsx), and others. Each converter output type

contains unique settings that you can configure to support your full-text OCR requirements.

Prior to activating the job, you can test and preview the full-text OCR results. Once the Nuance Full-Text OCR step is executed, each full-text output file contains a maximum of 500 pages; if a document exceeds that limit, a subsequent full-text output file is created for the same document.
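
To estimate how many full-text output files a long document produces, you can apply the 500-page limit as in the Python sketch below. The function split_into_output_files is hypothetical and only models the page ranges, not the conversion itself.

```python
def split_into_output_files(page_count, max_pages=500):
    """Return (first_page, last_page) ranges, one per full-text output file."""
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    return [
        (start, min(start + max_pages - 1, page_count))
        for start in range(1, page_count + 1, max_pages)
    ]


print(split_into_output_files(1200))
```

A 1,200-page document therefore yields three output files under this model.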

Note:

The Nuance OCR engine supports incoming images ranging from 75 to 2400 dots per

inch (DPI). In pixels, this range is 16 x 16 to 8400 x 8400 pixels.

Larger images can be ingested into PaperVision Capture provided that:

1. No Full-Text OCR will be performed on the images (unless they are processed

using the Image Fit filter and cropped to meet size requirements)

2. No image processing will be performed on the images (unless they are processed

using the Image Fit filter and cropped to meet size requirements)

3. Images will not be viewed as thumbnails

Additionally, if you process multiple pages containing large amounts of text, testing

and executing the Nuance Full-Text OCR step may take a few minutes.
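
The size limits in the note above can be turned into a quick eligibility check. The Python sketch below is an illustration under two assumptions: the limits are inclusive, and the pixel limits apply to each side of the image separately. The function ocr_eligible is hypothetical.

```python
MIN_DPI, MAX_DPI = 75, 2400
MIN_SIDE_PX, MAX_SIDE_PX = 16, 8400


def ocr_eligible(width_px, height_px, dpi):
    """Return True if the image falls within the documented OCR limits."""
    dpi_ok = MIN_DPI <= dpi <= MAX_DPI
    size_ok = all(MIN_SIDE_PX <= side <= MAX_SIDE_PX for side in (width_px, height_px))
    return dpi_ok and size_ok


print(ocr_eligible(2550, 3300, 300))     # letter-size page scanned at 300 DPI
print(ocr_eligible(10200, 13200, 1200))  # same page at 1200 DPI
```

Images that fail the check would need to be resized or cropped (for example, with the Image Fit filter) before Full-Text OCR.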

Auto Image Orientation

By default, this property is set to True, and the Nuance Full-Text OCR engine may

automatically rotate some images in order to recognize text. If you do not want the Nuance

Full-Text OCR engine to automatically rotate images prior to text recognition, set this property

to False.

Note:

Since the engine may automatically rotate some images in order to recognize text, the

resulting output images may also be rotated.

Outputs

By default, no conversion types are selected. To select and configure an output type, click the

ellipsis button in the Outputs field. See the next section on Converter Output Properties for a

list of properties specific to each output type.

Chapter 9 – Nuance Full-Text OCR

PaperVision® Capture Administration Guide 183

Override Invalid Pages

When this property is set to True, the Nuance Full-Text OCR engine processes each image

using the specified Recognition Process Setting (Speed, Balanced, or Accuracy) within the

allotted time specified in the Timeout (sec) setting. If the image cannot be processed with your

selected Recognition Process Setting, then PaperVision Capture attempts to process the image

with the remaining Recognition Process Settings. If the image still cannot be processed after

PaperVision Capture cycles through all Recognition Process Settings, the page is processed as a

picture for image-based outputs or a blank page for text-based outputs (in both cases, these

pages are also tagged with the "Skipped Full Text Processing" QC tag for future review). As a

result, the remaining documents are processed.

When this property is set to True and an error occurs during the conversion to the selected

output format (e.g., PDF Searchable Image), the entire batch will now be processed as

images and not full-text (therefore, no error will be returned). As a result, all batches will be

processed through the Nuance Full-Text OCR step without requiring any user intervention.

When this property is set to False, the Nuance Full-Text OCR engine processes each image

using the specified Recognition Process Setting (Speed, Balanced, or Accuracy) within the

allotted time specified in the Timeout (sec) setting. If the image cannot be processed with your

selected Recognition Process Setting, then PaperVision Capture attempts to process the image

with the remaining Recognition Process Settings. If the image still cannot be processed after

PaperVision Capture cycles through all Recognition Process Settings, a timeout error appears in

the Administration Console and is logged in the Event Viewer. As a result, the remaining

documents are not processed.

Note:

A batch can potentially stop processing in a full-text OCR step only if this property is

disabled.
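
The fallback behavior described for Override Invalid Pages can be summarized in a short Python sketch. This is a simplified model, not the product's implementation: the `recognize` callable is a hypothetical stand-in for the OCR engine that returns True when a page is processed within the Timeout (sec) setting, and the order in which the remaining Recognition Process Settings are tried is an assumption.

```python
from typing import Callable, List, Tuple

# Recognition Process Settings and QC tag named in this guide.
SETTINGS = ["Speed", "Balanced", "Accuracy"]
SKIPPED_TAG = "Skipped Full Text Processing"


def process_page(
    selected: str,
    recognize: Callable[[str], bool],
    override_invalid_pages: bool,
) -> Tuple[str, List[str]]:
    """Model how one page is handled; returns (result, qc_tags).

    `recognize` is hypothetical: it stands in for the OCR engine and
    returns True if the page was processed within the timeout.
    """
    if selected not in SETTINGS:
        raise ValueError(f"unknown Recognition Process Setting: {selected}")
    # Try the selected setting first, then the remaining ones
    # (the order of the remaining settings is assumed here).
    attempts = [selected] + [s for s in SETTINGS if s != selected]
    for setting in attempts:
        if recognize(setting):
            return f"recognized with {setting}", []
    if override_invalid_pages:
        # Page becomes a picture (image-based outputs) or a blank page
        # (text-based outputs) and is tagged for later review.
        return "picture or blank page", [SKIPPED_TAG]
    # With the property set to False, a timeout error is reported and
    # the remaining documents are not processed.
    raise TimeoutError("page could not be processed with any setting")


if __name__ == "__main__":
    print(process_page("Balanced", lambda s: s == "Accuracy", True))
    print(process_page("Speed", lambda s: False, True))
```

In this model, only the False case can stop a batch, which matches the note above.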

Timeout (sec)

This property allows you to define the maximum amount of time that the OCR engine processes

a single image before it fails. By default, this property is set to 180 seconds (3 minutes). You

can assign a timeout between one second and 86,400 seconds (24 hours).

Note:

Raising the timeout setting may increase the amount of time to process all images.
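
If timeout values are prepared outside the Administration Console (for example, in a planning spreadsheet or script), the allowed range can be checked with a few lines of Python. The helper below is a hypothetical sketch; only the default and the minimum and maximum values are taken from this guide.

```python
# Values stated for the Timeout (sec) property in this guide.
DEFAULT_TIMEOUT_SEC = 180
MIN_TIMEOUT_SEC = 1
MAX_TIMEOUT_SEC = 86_400  # 24 hours


def validate_timeout(seconds: int = DEFAULT_TIMEOUT_SEC) -> int:
    """Return the timeout unchanged if it is within the allowed range.

    Hypothetical helper: raises ValueError for values outside
    1 to 86,400 seconds.
    """
    if not MIN_TIMEOUT_SEC <= seconds <= MAX_TIMEOUT_SEC:
        raise ValueError(
            f"timeout must be between {MIN_TIMEOUT_SEC} and "
            f"{MAX_TIMEOUT_SEC} seconds, got {seconds}"
        )
    return seconds


if __name__ == "__main__":
    print(validate_timeout())      # default of 180 seconds
    print(validate_timeout(600))   # 10 minutes
```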

Converter Output Properties

To configure the Nuance Full-Text OCR job step, you must select one or more output types and

configure the properties specific to each output.

To configure the converter output properties:

1. In the Job Definitions screen, select the Nuance Full-Text OCR job step in the

workspace.

2. In the Properties grid, expand the Nuance Full-Text OCR Step node, and click the

ellipsis button next to the Outputs field. The Edit Nuance Full-Text OCR Settings

screen appears.

Edit Nuance Full-Text OCR Settings

OCR Page Properties

Within the Edit Nuance Full-Text OCR Settings screen, you can select one or more full-text

OCR outputs and configure various properties for each output. Within this screen, you can also

scan and test sample images prior to saving the configurations.

Saving Full-Text OCR Configurations

To save the full-text OCR configuration for the job step, click the Save Full-Text OCR

Configuration icon.

Configuring the Scanner

To configure the scanner settings, click the Configure Scanner icon. For details on each

setting, see the section on Scanner Setup Settings in Chapter 6.

Starting the Scanning Process

Prior to configuring properties for one or more output types, you can scan and load images into

the Edit Nuance Full-Text OCR Settings screen. To scan the images, click the Start Scanning icon.

Stopping the Scanning Process

To stop the scanning process, click the Stop Scanning icon.

Removing a Single Image

To remove a single image:

1. In the Thumbnails section, select the image to delete.

2. Click the Delete Single Image icon.

3. Click Yes to the confirmation message.

Removing All Images

This command removes all current images from the main scanning window and from the

Thumbnails section.

To remove all images:

1. Click the Remove All Images icon.

2. Click Yes to confirm the removal.

Note:

If you have defined OCR zones prior to clearing all images, these zones are retained.

Importing Images

To import images:

1. Click the Import Images icon.

2. Locate the directory of the image(s).

3. Click Open, and the image appears in the main OCR window.

Rotating the Image 90° Counter-Clockwise

To rotate the image 90 degrees counter-clockwise, click the Rotate Image 90° Counter-

Clockwise icon.

Rotating the Image 90° Clockwise

To rotate the image 90 degrees clockwise, click the Rotate Image 90° Clockwise icon.

Testing Full-Text OCR (Current Page Only)

The Test Full-Text OCR command verifies that the current page’s text can be read successfully

and will open the output file in the selected output’s application.

To test full-text OCR for the current page:

1. Click the Import Images icon to load a test page.

2. Select one or more output configurations.

3. Adjust the appropriate output configuration properties and OCR page properties.

4. Click the Test Full-Text OCR (Selected Filter, Current Page Only) icon. The

Specify Output Files dialog box appears.

Specify Output Files

5. Enter the output file path where the full-text OCR results will reside. Proceed to step 8.

6. Or, click the ellipsis button to browse to the location. Proceed to the next step.

7. If you browsed to the file location, enter the file name in the Save As dialog box, and

then click Save.

8. To view the results, select the Open check box.

9. Click OK. The Nuance Full-Text OCR engine will process the results. If you opted to

open the resulting output file, it will open in its respective application or editor.

10. If the resulting file is not acceptable, adjust the OCR page properties and/or the

converter’s properties, and run the test again.

Testing Full-Text OCR (Selected Filter, All Pages)

This operation verifies that text from all pages can be read successfully.

To test full-text OCR for all pages:

1. Load more than one test page.

2. Select one or more output configurations.

3. Adjust the appropriate output configuration properties and OCR page properties.

4. Click the Test Full-Text OCR (Selected Filter, All Pages) icon, and follow steps

5 through 10 from the previous section.

Zooming Commands

• To zoom in on an area of the image, click the Zoom In icon.

• To zoom out of the current view of the image, click the Zoom Out icon.

• To reset the image to its original view, click the Zoom Reset icon.

Thumbnails

Thumbnails windows are found in the Edit Barcode Zones, Edit OCR Zones, Edit Nuance Full-

Text OCR, and Edit Image Processing Filters screens. You can right-click within any

Thumbnails window to perform basic operations on images, such as the cut/paste, copy/paste,

delete, or select all operations. The cut, copy, paste, and delete operations can be performed on

consecutive or non-consecutive images. Additionally, you can select multiple images and

simultaneously rotate them. The scrolling capability, displayed with up/down or left/right

arrows as you drag and drop images, allows you to quickly scroll through remaining images not

shown in the current window.

Note:

Images viewed as thumbnails can have maximum dimensions of 32,768 x 32,768

pixels.

Exiting the Edit Nuance Full-Text OCR Settings Screen

To close and exit out of the Edit Nuance Full-Text OCR Settings screen:

1. Click the Exit icon.

2. Click Yes to save all changes.

Converter Output Formats

Each full-text OCR converter contains unique properties that you can configure within the

Nuance Full-Text OCR step. Options that are available for specific properties, such as the

Headers/Footers, Output Format, and Tables properties, may differ per converter.

To select a converter’s output configuration:

1. In the Output Configuration section, highlight one or more output types from the

Available Outputs list.

Output Configuration

2. Click the right arrow to move the selection to the Selected Outputs list.

3. To remove one or more selected outputs, highlight the appropriate types in the Selected

Outputs list, and then click the left arrow. Properties specific to each converter

populate the right column.

eBook

This converter generates the eBook .opf output (packaged in a .zip file) that can be uploaded to

hand-held devices.

Bullets: Retains bullets in output file

Cross-References: Retains cross-references (hyperlinks) in output file

Headers/Footers: Specifies handling of headers and footers in output file

• Convert to Plain Text: Converts headers and footers to plain text

• Ignore: Ignores header and footer text from original file

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Line Numbering Zones: Retains line numbering zones in output file

Output Format: Specifies type of format retention in output file

• Formatted Text: Retains text (without columns); also retains paragraph format, font,

graphics, table styles, highlights, and strikeouts (ignores layout-related formatting)

• Ignore All: Ignores all format styles in original file

Tables: Specifies handling of tables in output file

• Convert to Separated by Tabs: Does not retain tables, but converts tables to

columns separated by tabs

• Retain Tables: Retains all tables from original file

HTML 3.2

The HTML 3.2 converter is supported by many HTML editors and creates a clear, small

HTML file format. After it is processed, the HTML output is packaged in a .zip file to facilitate

its transmission.
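
Because the HTML output is delivered as a .zip file, a script that consumes it must first look inside the archive. The Python sketch below lists the HTML pages in such an archive using the standard `zipfile` module. The archive path and the internal file names are assumptions for illustration; this guide does not document the layout of the .zip file.

```python
import zipfile
from pathlib import PurePosixPath


def list_html_members(zip_path: str) -> list:
    """Return the sorted names of .htm/.html files inside a .zip archive.

    Hypothetical helper: the archive layout produced by the converter
    is not documented here, so every member is inspected.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if PurePosixPath(name).suffix.lower() in {".htm", ".html"}
        )


if __name__ == "__main__":
    import sys

    # Usage: python list_html.py <path to converter output .zip>
    for member in list_html_members(sys.argv[1]):
        print(member)
```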

Bullets: Retains bullets in output file

Cross-References: Retains cross-references (hyperlinks) in output file

Headers/Footers: Specifies handling of headers and footers in output file

• Convert to Plain Text: Converts headers/footers to plain text

• Ignore: Ignores header and footer text from original file

Horizontal Rule Line: Places horizontal rule line between sections

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Index Page: Specifies how index page will be created in output file

• In Frame (index page appears in a separate column on same page as full-text output

file)

• None

• Simple HTML (index page displays thumbnail preview and hyperlink to full-text

output file)

Line Breaks: Inserts line breaks between lines of recognized text

Navigation (Next): Displays "Next" navigation text (for Simple HTML or In Frame index

pages)

Navigation (Previous): Displays "Previous" navigation text (for Simple HTML or In Frame

index pages)

Navigation (TOC): Displays Table of Contents navigation text (for Simple HTML or In Frame

index pages)

Output Format: Specifies type of format retention in output file

• Formatted Text: Retains text (without columns); also retains paragraph format, font,

graphics, table styles, highlights, and strikeouts (ignores layout-related formatting)

• Spreadsheet: Exports results in tabular form (suitable for spreadsheet use) and places

each document in separate worksheet

• Ignore All: Ignores all format styles in original file

Page Breaks: Specifies handling of page breaks in output file

HTML 4.0

The HTML 4.0 converter uses Cascading Style Sheet (CSS) technology to absolutely position

box-like objects, apply styles, and control all paragraph and character attributes. After it is

processed, the HTML output is packaged in a .zip file to facilitate its transmission.

Cross-References: Retains cross-references (hyperlinks) in output file

CSS (External): Enables external Cascading Style Sheet (CSS)

File (Subdirectory): Places every file into a subdirectory

Headers/Footers: Specifies handling of headers and footers in output file

• Convert to Plain Text: Converts headers/footers to plain text

• Ignore: Ignores header and footer text from original file

Horizontal Rule Line: Places horizontal rule line between sections

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Index Page: Specifies how index page will be created in output file

• In Frame (index page appears in a separate column on same page as full-text output

file)

• None

• Simple HTML (index page displays thumbnail preview and hyperlink to full-text

output file)

Line Breaks: Inserts line breaks between lines of recognized text

Line Numbering Zones: Retains line numbering zones in output file

Name (Output File): Displays name of output file

Navigation (Next): Displays "Next" navigation text (for Simple HTML or In Frame index

pages)

Navigation (Previous): Displays "Previous" navigation text (for Simple HTML or In Frame

index pages)

Navigation (TOC): Displays Table of Contents navigation text (for Simple HTML or In Frame

index pages)

Output Format: Specifies type of format retention in output file

• Formatted Text: Retains text (without columns); also retains paragraph format, font,

graphics, table styles, highlights, and strikeouts (ignores layout-related formatting)

• True Page: Retains original page and column layout (involves absolute positioning

of text, pictures, tables, and frames)

• Ignore All: Ignores all format styles in original file

Rule Lines: Retains rule lines in output file

Styles: Retains styles from original document

InfoPath

This converter supports the saving of various form elements such as check boxes and input lines

and generates a Microsoft InfoPath (.xsn) file.

Cross-References: Retains cross-references (hyperlinks) in output file

Headers/Footers: Specifies handling of headers and footers in output file

• Convert to Ordinary Text: Converts headers/footers to plain text

• Ignore: Ignores header and footer text from original file

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Line Numbering Zones: Retains line numbering zones in output file

Output Format: Specifies type of format retention in output file

• Formatted Text: Retains text (without columns); also retains paragraph format, font,

graphics, table styles, highlights, and strikeouts (ignores layout-related formatting)

• True Page: Retains original page and column layout (involves absolute positioning

of text, pictures, tables, and frames)

• Ignore All: Ignores all format styles in original file

Rule Lines: Retains rule lines in output file

Microsoft Excel 2007

This converter generates a Microsoft Excel 2007 (.xlsx) file using features only supported by

Excel 2007.

Bullets: Retains bullets in output file

Cross-References: Retains cross-references (hyperlinks) in output file

Headers/Footers: Specifies handling of headers and footers in output file

• Auto Format: Automatically formats headers and footers to match original style

• Convert to Ordinary Text: Converts headers/footers to plain text

• Tabulated Form:

Leader Dots: Inserts leader dots in output file

Line Breaks: Inserts line breaks between lines of recognized text

Line Numbering Zones: Retains line numbering zones in output file

Overview Sheet Name (Include): Includes name of last sheet (in Formatted Text output

format, every table appears in a separate sheet; all other text and images will appear on last

Overview Sheet)

Output Format: Specifies type of format retention in output file

• Formatted Text: Retains text (without columns); also retains paragraph format, font,

graphics, table styles, highlights, and strikeouts (ignores layout-related formatting)

• Spreadsheet: Exports results in tabular form (suitable for spreadsheet use) and

places each document in separate worksheet

• Ignore All: Ignores all format styles in original file

Overview Sheet Name: Specifies name of overview sheet

Page Breaks: Specifies the handling of page breaks in output file

Page Color: Retains page background color in output file

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Tabs: Retains original tab positions in output file

Microsoft Excel 97

This converter generates a Microsoft Excel 97 binary (.xls) file.

Bullets: Retains bullets in output file

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Headers/Footers: Specifies handling of headers and footers in output file

• Convert to Ordinary Text: Converts headers/footers to plain text

• Ignore: Ignores header and footer text from original file

• Tabulated Form:

Line Numbering Zones: Retains line numbering zones in output file

Output Format: Specifies type of format retention in output file

• Formatted Text: Retains text (without columns); also retains paragraph format, font,

graphics, table styles, highlights, and strikeouts (ignores layout-related formatting)

• Spreadsheet: Exports results in tabular form (suitable for spreadsheet use) and

places each document in separate worksheet

• Ignore All: Ignores all format styles in original file

Page Breaks: Specifies the handling of page breaks in output file

Page Color: Retains page background color in output file

Microsoft Excel XP

This converter generates a Microsoft Excel XP binary (.xls) file.

Bullets: Retains bullets in output file

Headers/Footers: Specifies handling of headers and footers in output file

• Convert to Ordinary Text: Converts headers/footers to plain text

• Ignore: Ignores header and footer text from original file

• Tabulated Form:

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies DPI setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Output Format: Specifies type of format retention in output file

• Formatted Text: Retains text (without columns); also retains paragraph format, font,

graphics, table styles, highlights, and strikeouts (ignores layout-related formatting)

• Spreadsheet: Exports results in tabular form (suitable for spreadsheet use) and

places each document in separate worksheet

• Ignore All: Ignores all format styles in original file

Page Breaks: Specifies the handling of page breaks in output file

Page Color: Retains page background color in output file

Read-Only: Marks output file as read-only

Microsoft PowerPoint 2007

This converter generates a Microsoft PowerPoint 2007 (.pptx) file.

Bullets: Retains bullets in output file

Character Colors: Retains character colors in output file

Character Scaling: Retains character scaling in output file

Character Spacing: Retains character spacing in output file

Note:

If this property is set to True, text characters can be expanded or condensed in output

file. If images contain text with approximately two spaces between words, a single

space will be generated; if four or five spaces exist between words, a tab will be

generated.

Column Breaks: Inserts column breaks in output file

Cross-References: Retains cross-references (hyperlinks) in output file

Drop Caps: Retains drop caps (drop caps display enlarged first letter of paragraph that drops

down two or more lines)

Field Codes: Retains field codes in output file

Headers/Footers: Specifies handling of headers and footers in output file

• Auto Format: Automatically formats headers and footers to match original style

• Formatted Text: Retains text (without columns); also retains paragraph, font,

graphics, and table styles

• Ignore: Ignores header and footer text from original file

• In Boxes:

• Tabulated Form:

• Tabulated Form in Box:

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Line Breaks: Inserts line breaks between lines of recognized text

Line Numbering Zones: Retains line numbering zones in output file

Output Format: Specifies type of format retention in output file

• Formatted Text: Retains text (without columns); also retains paragraph format, font,

graphics, table styles, highlights, and strikeouts (ignores layout-related formatting)

• True Page: Retains original page and column layout (involves absolute positioning

of text, pictures, tables, and frames)

• Ignore All: Ignores all format styles in original file

Page Breaks: Specifies the handling of page breaks in output file

Page Color: Retains page background color in output file

Page Margins: Retains page margins in output file

Rule Lines: Retains rule lines in output file

Tabs: Retains original tab positions in output file

Title: Displays title of output file

Microsoft PowerPoint 97

This converter generates an .rtf file interpreted by Microsoft PowerPoint 97.

Bullets: Retains bullets in output file

Headers/Footers: Specifies handling of headers and footers in output file

• Convert to Ordinary Text: Converts headers/footers to plain text

• Ignore: Ignores header and footer text from original file

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Line Numbering Zones: Retains line numbering zones in output file

Tabs: Retains original tab positions in output file

Microsoft Publisher

This converter generates an .rtf file interpreted by Microsoft Publisher.

Bullets: Retains bullets in output file

Character Colors: Retains character colors in output file

Character Scaling: Retains character scaling in output file

Character Spacing: Retains character spacing in output file

Note:

If this property is set to True, text characters can be expanded or condensed in output

file. If images contain text with approximately two spaces between words, a single

space will be generated; if four or five spaces exist between words, a tab will be

generated.

Headers/Footers: Specifies handling of headers and footers in output file

• Convert to Ordinary Text: Converts headers/footers to plain text

• Ignore: Ignores header and footer text from original file

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Line Breaks: Inserts line breaks between lines of recognized text

Line Numbering Zones: Retains line numbering zones in output file

Output Format: Specifies type of format retention in output file

• Formatted Text: Retains text (without columns); also retains paragraph format, font,

graphics, table styles, highlights, and strikeouts (ignores layout-related formatting)

• Ignore All: Ignores all format styles in original file

Tables: Specifies handling of tables in output file

• Convert to Separated by Tabs: Does not retain tables, but converts tables to

columns separated by tabs

• Retain Tables: Retains all tables from original file

Tabs: Retains original tab positions from original file

Microsoft Reader

This converter generates a Microsoft Reader (.lit) file that can be uploaded to Windows-based

hand-held devices.

Bullets: Retains bullets in output file

Cross-References: Retains cross-references (hyperlinks) in output file

Headers/Footers: Specifies handling of headers and footers in output file

• Convert to Ordinary Text: Converts headers/footers to plain text

• Ignore: Ignores header and footer text from original file

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Line Numbering Zones: Retains line numbering zones in output file

Output Format: Specifies type of format retention in output file

• Formatted Text: Retains text (without columns); also retains paragraph format, font,

graphics, table styles, highlights, and strikeouts (ignores layout-related formatting)

• Ignore All: Ignores all format styles in original file

Tables: Specifies handling of tables in output file

• Convert to Separated by Tabs: Does not retain tables, but converts tables to

columns separated by tabs

• Retain Tables: Retains all tables from original file

Microsoft Word 2007

This converter generates a Microsoft Word .docx file that uses features supported by Word

2007.

Note:

Page width and height must be between 0.1 and 22 inches for all Microsoft Word and

RTF converters. Otherwise, an error will appear if you use the Flowing Page or True

Page output formats with .doc(x) and .rtf file extensions.
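
To anticipate the page-size error described in this note, you can convert an image's pixel dimensions to inches and compare them with the stated limits. The Python sketch below assumes that page size in inches equals pixels divided by the image DPI; the function name is hypothetical, and how the converter itself measures the page is not documented here.

```python
# Page-size limits (in inches) stated for the Word and RTF converters.
MIN_PAGE_IN = 0.1
MAX_PAGE_IN = 22.0


def page_size_ok(width_px: int, height_px: int, dpi: int) -> bool:
    """Check whether both page dimensions fall within 0.1 to 22 inches.

    Hypothetical helper: assumes inches = pixels / DPI.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    width_in = width_px / dpi
    height_in = height_px / dpi
    return all(MIN_PAGE_IN <= side <= MAX_PAGE_IN for side in (width_in, height_in))


if __name__ == "__main__":
    # Letter-size page at 300 DPI: 8.5 x 11 inches.
    print(page_size_ok(2550, 3300, 300))
    # 7200 pixels wide at 300 DPI is 24 inches, which exceeds the limit.
    print(page_size_ok(7200, 3300, 300))
```

Pages that fail this check are candidates for a different output format or for resizing before the Flowing Page or True Page formats are used.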

Bullets: Retains bullets in output file

Character Colors: Retains character colors in output file

Character Scaling: Retains character scaling in output file

Character Spacing: Retains character spacing in output file

Note:

If this property is set to True, text characters can be expanded or condensed in output

file. If images contain text with approximately two spaces between words, a single

space will be generated; if four or five spaces exist between words, a tab will be

generated.

Column Breaks: Inserts column breaks in output file

Columns: Retains columns in output file

Cross-References: Retains cross-references (hyperlinks) in output file

Drop Caps: Retains drop caps (drop caps display enlarged first letter of paragraph that drops

down two or more lines)

Field Codes: Retains field codes in output file

Headers/Footers: Specifies handling of headers and footers in output file

• Auto Format: Automatically formats headers and footers to match original style

• Formatted Text: Retains text (without columns); also retains paragraph, font,

graphics, and table styles

• Ignore: Ignores header and footer text from original file

• In Boxes:

• Tabulated Form:

• Tabulated Form in Box:

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Image in Text Box: Surrounds images with text boxes

Line Breaks: Inserts line breaks between lines of recognized text

Line Numbering Zones: Retains line numbering zones in output file

Output Format: Specifies type of format retention in output file

• Flowing Page: Available for applications that handle columns, preserves original

page and column layout so text flows across columns (boxes, frames used only when

necessary)

• Formatted Text: Retains text (without columns); also retains paragraph, font,

graphics, and table styles

• True Page: Retains original page and column layout (involves absolute positioning

of text, pictures, tables, and frames)

• Ignore All: Ignores all format styles in original file

Page Breaks: Specifies handling of page breaks in output file (Auto, Always, or Never)

Page Color: Retains page background color in output file

Page Consolidation: Combines pages in output file

Read-Only: Marks output file as read-only

Rule Lines: Retains rule lines in output file

Styles: Retains styles from original file

Tables: Specifies handling of tables in output file

• Convert to Separated by Tabs: Does not retain tables, but converts tables to

columns separated by tabs

• Retain Tables: Retains tables from original file

Tabs: Retains original tab positions from original file

Microsoft Word 2003 (WordML)

This converter generates an XML file and uses features supported by Microsoft Word 2003.

Note:

Page width and height must be between 0.1 and 22 inches for all Microsoft Word and

RTF converters. Otherwise, an error will appear if you use the Flowing Page or True

Page output formats with .doc(x) and .rtf file extensions.

Bullets: Retains bullets in output file

Character Colors: Retains character colors in output file

Character Scaling: Retains character scaling in output file

Character Spacing: Retains character spacing in output file

Note:

If this property is set to True, text characters can be expanded or condensed in output file. If images contain text with approximately two spaces between words, a single space will be generated; if four or five spaces exist between words, a tab will be generated.
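The spacing rule in the note above can be pictured with a small text-based sketch. It is only an analogy: the converter works on recognized image content, not on strings, and the exact thresholds are assumptions (here "approximately two" is read as two or three spaces, and any run of four or more is treated like "four or five").

```python
import re


def collapse_gaps(line: str) -> str:
    """Replace runs of two or more spaces with a single space or a tab."""

    def replace(match: re.Match) -> str:
        # Assumed thresholds: 2-3 spaces -> one space, 4 or more -> one tab.
        return " " if len(match.group()) <= 3 else "\t"

    return re.sub(r" {2,}", replace, line)


if __name__ == "__main__":
    print(repr(collapse_gaps("Name  Date")))
    print(repr(collapse_gaps("Name     Date")))
```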

Column Breaks: Inserts column breaks in output file

Cross-References: Retains cross-references (hyperlinks) in output file

Drop Caps: Retains drop caps (drop caps display enlarged first letter of paragraph that drops down two or more lines)

Field Codes: Retains field codes in output file

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Line Breaks: Inserts line breaks between lines of recognized text

Line Numbering Zones: Retains line numbering zones in output file

Output Format: Specifies type of format retention in output file

• Flowing Page: Available for applications that handle columns; preserves original page and column layout so text flows across columns (boxes and frames used only when necessary)

• Formatted Text: Retains text (without columns); also retains paragraph format, font, graphics, table styles, highlights, and strikeouts (ignores layout-related formatting)

• True Page: Retains original page and column layout (involves absolute positioning of text, pictures, tables, and frames)

• Ignore All: Ignores all format styles in original file

Page Color: Retains page background color in output file

Page Consolidation: Combines pages in output file

Read-Only: Marks output file as read-only

Rule Lines: Retains rule lines in output file

Tabs: Retains original tab positions from original file

Microsoft Word 2000/XP

This converter generates a .doc file and uses features supported by Microsoft Word 2000 and later.

Note:

Page width and height must be between 0.1 and 22 inches for all Microsoft Word and RTF converters. Otherwise, an error will appear if you use the Flowing Page or True Page output formats with .doc(x) and .rtf file extensions.

Bullets: Retains bullets in output file

Character Colors: Retains character colors in output file

Character Scaling: Retains character scaling in output file

Character Spacing: Retains character spacing in output file

Note:

If this property is set to True, text characters can be expanded or condensed in output file. If images contain text with approximately two spaces between words, a single space will be generated; if four or five spaces exist between words, a tab will be generated.

Column Breaks: Inserts column breaks in output file

Cross-References: Retains cross-references (hyperlinks) in output file

Drop Caps: Retains drop caps (drop caps display enlarged first letter of paragraph that drops down two or more lines)

Field Codes: Retains field codes in output file

Headers/Footers: Specifies handling of headers and footers in output file

• Auto Format: Automatically formats headers and footers to match original style

• Formatted Text: Retains text (without columns); also retains paragraph, font, graphics, and table styles

• Ignore: Ignores header and footer text from original file

• In Boxes:

• Tabulated Form:

• Tabulated Form in Box:

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Line Breaks: Inserts line breaks between lines of recognized text

Line Numbering Zones: Retains line numbering zones in output file

Output Format: Specifies type of format retention in output file

• Flowing Page: Available for applications that handle columns; preserves original page and column layout so text flows across columns (boxes and frames used only when necessary)

• Formatted Text: Retains text (without columns); also retains paragraph format, font, graphics, table styles, highlights, and strikeouts (ignores layout-related formatting)

• True Page: Retains original page and column layout (involves absolute positioning of text, pictures, tables, and frames)

• Ignore All: Ignores all format styles in original file

Page Consolidation: Combines pages in output file

Rule Lines: Retains rule lines in output file

Tabs: Retains original tab positions from original file

PaperFlow Full-Text

The PaperFlow converter generates a .txt file containing the full-text results that you can subsequently import into the OCRFlow application. You can configure OCR page properties that are described in the section on OCR Page Properties in Chapter 8.

PaperVision Enterprise Full-Text

The PaperVision Enterprise converter generates a .txt file containing the full-text results that you can subsequently import into the PaperVision Enterprise application. You can configure OCR page properties that are described in the section on OCR Page Properties in Chapter 8.

Note:

To export full-text data using either the PaperFlow or PaperVision Enterprise (PVE) export script, specify the Nuance Full-Text OCR job step name in the OCR_JOB_STEP_NAME constant within the script. The following line appears in the script:

private const string OCR_JOB_STEP_NAME = "";

PDF

This converter supports several PDF features and is dependent upon the positions of recognized characters. Exported in the True Page output format, the resulting PDF is viewable, searchable, and editable in a PDF viewer.

Color Quality: Specifies color quality in output file

• Good

• Minimum

• Lossless (Best Quality)

Compression Types: Specifies type of compression applied to PDF output file

• Contents: Compresses text content and line art

• Embedded Files: Compresses embedded files

• Flate: Applies flate compression (suitable for use on images with large areas of single colors or repeating patterns)

• JBIG2: Applies JBIG2 compression (suitable for high compression of black and white, or monochrome, images)

• JPEG2000: Applies JPEG2000 compression (suitable for photographs or images with gradual color changes)

• LZW: Applies LZW compression (reduces file size; suitable for text files, .gif images from web sites, and TIFF images)
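To see why Flate is described above as suited to large single-color areas and repeating patterns, the sketch below compresses three synthetic byte strings with Python's zlib module, which implements the same DEFLATE algorithm that PDF Flate compression is based on. The sample data is invented for illustration and does not come from PaperVision Capture output.

```python
import os
import zlib


def flate_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


# Synthetic stand-ins for image data (assumptions for illustration only).
flat = bytes([255]) * 10_000       # large area of a single color
pattern = bytes(range(16)) * 625   # short repeating pattern
noise = os.urandom(10_000)         # random detail with no repetition

if __name__ == "__main__":
    for name, data in (("flat", flat), ("pattern", pattern), ("noise", noise)):
        print(f"{name}: {flate_ratio(data):.3f}")
```

The flat and patterned samples shrink to a small fraction of their size, while the random sample barely compresses at all.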

Cross-References: Retains cross-references (hyperlinks) in output file

Encryption Level: Type of encryption applied to PDF output file

• None

• 40-bit RC4 (used in Adobe Acrobat 3.x and 4.x; lowest encryption level)

• 128-bit RC4 (used in Adobe Acrobat 5.x and later; medium encryption level)

• 128-bit AES (used in Adobe Acrobat 7.x and later; highest encryption level)

Headers/Footers: Specifies handling of headers and footers in output file

• Auto Format: Automatically formats headers and footers to match original style

• Ignore: Ignores header and footer text from original file

Image Color: Assigns image color in output file

• 24-bit Color (True Color)

• Grayscale

• Black and White

• Original

Image DPI: Specifies dots per inch (DPI) resolution setting for images in output file

• DPI 72

• DPI 100

• DPI 150

• DPI 200

• DPI 300

• None

• Original

Image Substitutes: Covers suspect words with small images

Linearized PDF: If enabled, this setting optimizes PDF files for efficient web display. The first page will load quickly into a web page, and the remaining pages will load while the PDF file is being viewed. The browser determines which page elements appear first (typically, headings and text) and the elements that follow (e.g., larger pictures). This property also optimizes efficiency when you skip to another page in the PDF file.
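To confirm that an exported file was written in linearized form, you can look for the linearization marker near the start of the file. The sketch below is a heuristic rather than a validator: it relies on the PDF format placing a dictionary with a /Linearized key within the first 1024 bytes of a linearized file. The function name is hypothetical.

```python
def looks_linearized(pdf_path: str) -> bool:
    """Heuristic check for a linearized ("fast web view") PDF.

    Reads only the first 1024 bytes and looks for the /Linearized key.
    It does not verify that the rest of the file is laid out correctly.
    """
    with open(pdf_path, "rb") as f:
        head = f.read(1024)
    return head.startswith(b"%PDF-") and b"/Linearized" in head
```

A file that fails this check can still open normally in a viewer; it simply may not display its first page before the whole file has downloaded.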

Line Numbering Zones: Retains line numbering zones in output file

Mixed Raster Content: Specifies level of Mixed Raster Content (MRC) in output file (MRC is a process that uses image segmentation methods to improve contrast resolution of raster images composed of pixels.)

• No MRC

• Medium Compression

• Lossless Compression (Best Quality)

• Best Compression (Smallest File Size)

Outline Props: Specifies whether to retain bookmarks for pages

Output Format: Specifies type of format retention in output file

• True Page: Retains original page and column layout (involves absolute positioning of text, pictures, tables, and frames)

Password (Open): Displays password required to open PDF file

Password (Permissions): Displays password required to perform restricted actions on PDF file, such as editing, printing, and copying content

Note:

To apply passwords to PDF files, you must select an appropriate Encryption Level setting.
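The dependency described in the note above can be expressed as a simple consistency check. This is a minimal sketch, assuming the settings are available as plain strings that match the option labels in this guide; the function and parameter names are hypothetical and do not correspond to a PaperVision Capture API.

```python
from typing import Optional

# Option labels as listed under Encryption Level in this guide.
ENCRYPTION_LEVELS = ("None", "40-bit RC4", "128-bit RC4", "128-bit AES")


def check_pdf_security(
    encryption_level: str,
    open_password: Optional[str] = None,
    permissions_password: Optional[str] = None,
) -> list[str]:
    """Return a list of problems; an empty list means the settings agree."""
    problems = []
    if encryption_level not in ENCRYPTION_LEVELS:
        problems.append(f"unknown encryption level: {encryption_level!r}")
    elif encryption_level == "None" and (open_password or permissions_password):
        # A password has no effect unless an encryption level is selected.
        problems.append("a password is set but Encryption Level is None")
    return problems
```

For instance, `check_pdf_security("None", open_password="example")` reports one problem, while the same password with "128-bit AES" reports none.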

PDF Compatibility: Specifies compatible PDF version (offers widest usability and is designed to display identically in most environments; excludes audio and video files)

• Optimize for Quality

• Optimize for Size

• PDF 1.0

• PDF 1.1

• PDF 1.2

• PDF 1.3

• PDF 1.4

• PDF-A

• PDF 1.5

• PDF 1.6

PDF Form Visuality: Displays PDF form’s visual components

PDF Form Visuality (User Set):

PDF Thumbnails: Creates thumbnail images in output file

Rule Lines: Retains rule lines in output file

Signature (Certification Description): Description for signature's certificate

Signature (SHA Thumbprint): Signature’s SHA1 thumbprint

Signature Type: Signature’s handler type (a digital signature authenticates PDF documents to ensure that recipients receive unaltered versions from a trusted source)

URL (Highlight): Highlights URL address in output file

URL (Underline): Underlines URL address in output file
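For the Signature (SHA Thumbprint) property above, a certificate thumbprint is conventionally the SHA-1 hash of the certificate's DER-encoded bytes. The sketch below computes that value with Python's standard library. The function name is hypothetical, and the exact text format the property expects (letter case, spacing) is not stated in this guide, so the uppercase hex output is an assumption.

```python
import hashlib


def sha1_thumbprint(cert_der: bytes) -> str:
    """Return the SHA-1 digest of DER-encoded certificate bytes.

    Assumption: uppercase hex without separators; the format expected by
    the Signature (SHA Thumbprint) property is not documented here.
    """
    return hashlib.sha1(cert_der).hexdigest().upper()


if __name__ == "__main__":
    # Usage sketch: pass the raw bytes of a DER-encoded (.cer) file.
    import sys

    with open(sys.argv[1], "rb") as f:
        print(sha1_thumbprint(f.read()))
```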

PDF Edited

Unlike the PDF converter, the PDF Edited converter does not rely on recognized characters’ positions, so you can insert sections of text in the editor. This converter is recommended if you have made significant edits in the recognition results. The resulting PDF file is viewable, searchable, and editable.

Bullets: Retains bullets in output file

Color Quality: Specifies color quality in output file

• Good

• Minimum

• Lossless (Best Quality)

Compression Types: Specifies type of compression applied to PDF output file

• Contents: Compresses text content and line art

• Embedded Files: Compresses embedded files

• Flate: Applies flate compression (suitable for use on images with large areas of single colors or repeating patterns)

• JBIG2: Applies JBIG2 compression (suitable for high compression of black and white, or monochrome, images)

• JPEG2000: Applies JPEG2000 compression (suitable for photographs or images with gradual color changes)

• LZW: Applies LZW compression (reduces file size; suitable for text files, .gif images from web sites, and TIFF images)

Cross-References: Retains cross-references (hyperlinks) in output file

Drop Caps: Retains drop caps (drop caps display enlarged first letter of paragraph that drops down two or more lines)

Encryption Level: Type of encryption applied to PDF output file

• None

• 40-bit RC4 (used in Adobe Acrobat 3.x and 4.x; lowest encryption level)

• 128-bit RC4 (used in Adobe Acrobat 5.x and later; medium encryption level)

• 128-bit AES (used in Adobe Acrobat 7.x and later; highest encryption level)

Field Codes: Retains field codes in output file

Fonts (External): Includes external fonts in output file

Headers/Footers: Specifies handling of headers and footers in output file (e.g., converts headers and footers to plain text, excludes them, etc.)