Loading Data

2015-03-03

: Pdf Loading Data Loading_Data 5.6.3 summation


AccessData Legal and Contact Information

Document date: March 2, 2015

Legal Information
©2015 AccessData Group, Inc. All rights reserved. No part of this publication may be reproduced, photocopied,
stored on a retrieval system, or transmitted without the express written consent of the publisher.
AccessData Group, Inc. makes no representations or warranties with respect to the contents or use of this
documentation, and specifically disclaims any express or implied warranties of merchantability or fitness for any
particular purpose. Further, AccessData Group, Inc. reserves the right to revise this publication and to make
changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes.
Further, AccessData Group, Inc. makes no representations or warranties with respect to any software, and
specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose.
Further, AccessData Group, Inc. reserves the right to make changes to any and all parts of AccessData
software, at any time, without any obligation to notify any person or entity of such changes.
You may not export or re-export this product in violation of any applicable laws or regulations including, without
limitation, U.S. export regulations or the laws of the country in which you reside.

AccessData Group, Inc.
1100 Alma Street
Menlo Park, California 94025
USA

AccessData Trademarks and Copyright Information
The following are either registered trademarks or trademarks of AccessData Group, Inc. All other trademarks are
the property of their respective owners.
AccessData®

DNA®

PRTK®

AccessData Certified Examiner® (ACE®)

Forensic Toolkit® (FTK®)

Registry Viewer®

AD Summation®

Mobile Phone Examiner Plus®

Resolution1™

Discovery Cracker®

MPE+ Velocitor™

SilentRunner®

Distributed Network Attack®

Password Recovery Toolkit®

Summation®
ThreatBridge™


A trademark symbol (®, ™, etc.) denotes an AccessData Group, Inc. trademark. With few exceptions, and
unless otherwise notated, all third-party product names are spelled and capitalized the same way the owner
spells and capitalizes its product name. Third-party trademarks and copyrights are the property of the
trademark and copyright holders. AccessData claims no responsibility for the function or performance of
third-party products.
Third party acknowledgements:
FreeBSD

AFF® and AFFLIB® Copyright® 2005, 2006, 2007, 2008 Simson L. Garfinkel and Basis Technology
Corp. All rights reserved.

Copyright

BSD License: Copyright (c) 2009-2011, Andriy Syrov. All rights reserved. Redistribution and use in source and
binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following
disclaimer; Redistributions in binary form must reproduce the above copyright notice, this list of conditions and
the following disclaimer in the documentation and/or other materials provided with the distribution; Neither the
name of Andriy Syrov nor the names of its contributors may be used to endorse or promote products derived
from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE
COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
WordNet License

This license is available as the file LICENSE in any downloaded version of WordNet.
WordNet 3.0 license: (Download)
WordNet Release 3.0 This software and database is being provided to you, the LICENSEE, by Princeton
University under the following license. By obtaining, using and/or copying this software and database, you agree
that you have read, understood, and will comply with these terms and conditions.: Permission to use, copy,
modify and distribute this software and database and its documentation for any purpose and without fee or
royalty is hereby granted, provided that you agree to comply with the following copyright notice and statements,
including the disclaimer, and that the same appear on ALL copies of the software, database and documentation,
including modifications that you make for internal use or for distribution. WordNet 3.0 Copyright 2006 by
Princeton University. All rights reserved. THIS SOFTWARE AND DATABASE IS PROVIDED "AS IS" AND
PRINCETON UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY
WAY OF EXAMPLE, BUT NOT LIMITATION, PRINCETON UNIVERSITY MAKES NO REPRESENTATIONS OR
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE
USE OF THE LICENSED SOFTWARE, DATABASE OR DOCUMENTATION WILL NOT INFRINGE ANY THIRD
PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS. The name of Princeton University or
Princeton may not be used in advertising or publicity pertaining to distribution of the software and/or database.
Title to copyright in this software, database and any associated documentation shall at all times remain with
Princeton University and LICENSEE agrees to preserve same.

Documentation Conventions
In AccessData documentation, a number of text variations are used to indicate meanings or actions. For
example, a greater-than symbol (>) is used to separate actions within a step. Where an entry must be typed in
using the keyboard, the variable data is set apart using [variable_data] format. Steps that require the user to
click on a button or icon are indicated by Bolded text. This Italic font indicates a label or non-interactive item in
the user interface.
A trademark symbol (®, ™, etc.) denotes an AccessData Group, Inc. trademark. Unless otherwise notated, all
third-party product names are spelled and capitalized the same way the owner spells and capitalizes its product
name. Third-party trademarks and copyrights are the property of the trademark and copyright holders.
AccessData claims no responsibility for the function or performance of third-party products.

Registration
The AccessData product registration is done at AccessData after a purchase is made, and before the product is
shipped. The licenses are bound to either a USB security device, or a Virtual CmStick, according to your
purchase.

Subscriptions
AccessData provides a one-year licensing subscription with all new product purchases. The subscription allows
you to access technical support, and to download and install the latest releases for your licensed products during
the active license period.
Following the initial licensing period, a subscription renewal is required annually for continued support and for
updating your products. You can renew your subscriptions through your AccessData Sales Representative.
Use License Manager to view your current registration information, to check for product updates and to
download the latest product versions, where they are available for download. You can also visit our web site,
www.accessdata.com anytime to find the latest releases of our products.
For more information, see Managing Licenses in your product manual or on the AccessData website.

AccessData Contact Information
Your AccessData Sales Representative is your main contact with AccessData. The general AccessData
telephone number and mailing address, and the telephone numbers for individual departments, are also
listed below.


Mailing Address and General Phone Numbers
You can contact AccessData in the following ways:

AccessData Mailing Address, Hours, and Department Phone Numbers
Corporate Headquarters:

AccessData Group, Inc.
1100 Alma Street
Menlo Park, California 94025 USA
Voice: 801.377.5410; Fax: 801.377.5426

General Corporate Hours:

Monday through Friday, 8:00 AM – 5:00 PM (MST)
AccessData is closed on US Federal Holidays

State and Local Law Enforcement Sales:

Voice: 800.574.5199, option 1; Fax: 801.765.4370
Email: Sales@AccessData.com

Federal Sales:

Voice: 800.574.5199, option 2; Fax: 801.765.4370
Email: Sales@AccessData.com

Corporate Sales:

Voice: 801.377.5410, option 3; Fax: 801.765.4370
Email: Sales@AccessData.com

Training:

Voice: 801.377.5410, option 6; Fax: 801.765.4370
Email: Training@AccessData.com

Accounting:

Voice: 801.377.5410, option 4

Technical Support
Free technical support is available on all currently licensed AccessData solutions.
You can contact AccessData Customer and Technical Support in the following ways:
AD Customer & Technical Support Contact Information

AD SUMMATION and AD EDISCOVERY
Americas/Asia-Pacific:
800.786.8369 (North America)
801.377.5410, option 5
Email: legalsupport@accessdata.com

AD IBLAZE and ENTERPRISE
Americas/Asia-Pacific:
800.786.2778 (North America)
801.377.5410, option 5
Email: support@summation.com

All other AD SOLUTIONS
Americas/Asia-Pacific:
800.658.5199 (North America)
801.377.5410, option 5
Email: support@accessdata.com

AD INTERNATIONAL SUPPORT
Europe/Middle East/Africa:
+44 (0) 207 010 7817 (United Kingdom)
Email: emeasupport@accessdata.com

Hours of Support
Americas/Asia-Pacific:
Monday through Friday, 6:00 AM – 6:00 PM (PST), except corporate holidays.
Europe/Middle East/Africa:
Monday through Friday, 8:00 AM – 5:00 PM (UK-London), except corporate holidays.

Web Site
http://www.accessdata.com/support/technical-customer-support
The Support website allows access to Discussion Forums, Downloads, Previous
Releases, our Knowledge base, a way to submit and track your “trouble tickets”, and
in-depth contact information.

Documentation
Please email AccessData regarding any typos, inaccuracies, or other problems you find with the documentation:
documentation@accessdata.com

Professional Services
The AccessData Professional Services staff comes with a varied and extensive background in digital
investigations including law enforcement, counter-intelligence, and corporate security. Their collective
experience in working with both government and commercial entities, as well as in providing expert testimony,
enables them to provide a full range of computer forensic and eDiscovery services.
At this time, Professional Services provides support for sales, installation, training, and utilization of Summation,
FTK, FTK Pro, Enterprise, eDiscovery, Lab, and the entire Resolution1 platform. They can help you resolve
any questions or problems you may have regarding these solutions.

Contact Information for Professional Services
Contact AccessData Professional Services in the following ways:

AccessData Professional Services Contact Information

Contact Method   Number or Address
Phone            North America Toll Free: 800-489-5199, option 7
                 International: +1.801.377.5410, option 7
Email            services@accessdata.com


Contents

AccessData Legal and Contact Information
Contents
Chapter 1: Introduction to Loading Data
    Importing Data
Chapter 2: Using the Evidence Wizard
    Using the Evidence Wizard
        About Associating People with Evidence
        Using the CSV Import Method for Importing Evidence
        Using the Immediate Children Method for Importing
    Adding Evidence to a Project Using the Evidence Wizard
        Evidence Time Zone Setting
Chapter 3: Importing Evidence
    About Importing Evidence Using Import
        About Mapping Field Values
    Importing Evidence into a Project
Chapter 4: Data Loading Requirements
    Document Groups
        Images
        Full-Text or OCR
        DII Load File Format for Image/OCR
    Email & eDocs
    Coding
    Related Documents
    Transcripts and Exhibits
        Transcripts
        Exhibits
    Work Product
    Sample DII Files
        eDoc DII Load Files
        eMail DII Load Files
    DII Tokens
Chapter 5: Analyzing Document Content
    Using Cluster Analysis
        About Cluster Analysis
        Filtering Documents by Cluster Topic
    Using Entity Extraction
        About Entity Extraction
        Enabling Entity Extraction
        Viewing Entity Extraction Data
Chapter 6: Editing Evidence
    Editing Evidence Items in the Evidence Tab
    Evidence Tab

Chapter 1

Introduction to Loading Data

Importing Data
This document will help you import data into your project. You create projects in order to organize data. Data can
be added to projects in the forms of native files, such as DOC, PDF, XLS, PPT, and PST files, or as evidence
images, such as AD1, E01, and AFF files.
To manage evidence, administrators and users with the Create/Edit Projects permission can do the following:
• Add evidence items to a project
• View properties about evidence items in a project
• Edit properties about evidence items in a project
• Associate people to evidence items in a project

Note: You will normally want to have people created and selected before you process evidence.
See About Associating People with Evidence on page 13.
See the following chapters for more information:

To import data
1. Click the Add Data button next to the project in the Project List panel.
2. In the Add Data dialog, select one of the methods by which you want to import data. The following
   methods are available:
   • Evidence (wizard): See Using the Evidence Wizard on page 11.
   • Job (Resolution1 applications): See About Jobs on page 377.
   • Import: See Importing Evidence on page 20.
   • Cluster Analysis: See Using Cluster Analysis on page 44.


Chapter 2

Using the Evidence Wizard

Using the Evidence Wizard
When you add evidence to a project, you can use the Add Evidence Wizard to specify the data that you want to
add. You can add either parent folders or individual files.
Note: If you activated Cluster Analysis as a processing option when you created the project, cluster analysis will
automatically run after processing data.
You select sets of data that are called “evidence items.” It is useful to organize data into evidence items because
each evidence item can be associated with a unique person.
For example, you could have a parent folder with a set of subfolders.

\\10.10.3.39\EvidenceSource\
\\10.10.3.39\EvidenceSource\John Smith
\\10.10.3.39\EvidenceSource\Bobby Jones
\\10.10.3.39\EvidenceSource\Samuel Johnson
\\10.10.3.39\EvidenceSource\Edward Peterson
\\10.10.3.39\EvidenceSource\Jeremy Lane

You could import the parent \\10.10.3.39\EvidenceSource\ as one evidence item. If you associated a person to it,
all files under the parent would have the same person.
On the other hand, you could have each subfolder be its own evidence item, and then you could associate a
unique person to each item.
An evidence item can either be a folder or a single file. If the item is a folder, it can have other subfolders, but
they would be included in the item.
When you use the Evidence Wizard to import evidence, you have options that will determine how the evidence is
organized in evidence items.


When you add evidence, you select from the following types of files.

Evidence File Types

File Type         Description
Evidence Images   You can add AD1, E01, or AFF evidence image files.
Native Files      You can add native files, such as PDF, JPG, DOC, PPT, PST, XLSX, and so on.

When you add evidence, you also select one of the following import methods.

Import Methods
Method

Description

CSV Import

This method lets you create and import a CSV file that lists multiple paths of
evidence and optionally automatically creates people and associates each
evidence item with a person.
Like the other methods, you specify whether the parent folder contains native
files or image files.
See Using the CSV Import Method for Importing Evidence on page 13.
This is similar to adding people by importing a file.
See the Project Manager Guide for more information on adding people by
importing a file.

Immediate Children

This method takes the immediate subfolders of the specified path and imports
each of those subfolders’ content as a unique evidence item. You can
automatically create a person based on the child folder’s name (if the child folder
has a first and last name separated by a space) and have it associated with the
data in the subfolder.
See Using the Immediate Children Method for Importing on page 15.
Like the other methods, you specify if the parent folder contains native files or
image files.

Folder Import

This method lets you select a parent folder and all data in that folder will be
imported. You specify that the folder contains either native files (JPG, PPT) or
image files (AD1, E01, AFF).
A parent folder can have both subfolders and files.
Using this method, each parent folder that you import is its own evidence item
and can be associated with one person.
For example, if a parent folder had several AD1 files, all data from each AD1 file
can have one associated person. Likewise, if a parent folder has several native
files, all of the contents of that parent folder can have one associated person.

Individual File(s)

This method lets you select individual files to import. You specify that these
individual files are either native files (JPG, PPT) or image files (AD1, E01, AFF).
Using this method, each individual file that you import is its own evidence item
and can be associated with a person.
For example, all data from an AD1 file can have an associated person. Likewise,
each PDF or JPG can have its own associated person.

Note: The source network share permissions are defined by the administrator credentials.


About Associating People with Evidence
When you add evidence items to a project, you can specify people, or custodians, that are associated with the
evidence. These custodians are listed as People on the Data Sources tab.
In the Add Evidence Wizard, after specifying the evidence that you want to add, you can then associate that
evidence to a person. You can select an existing person or create a new person.
Important: If you want to select an existing Person, that person must already be associated to the project. You
can either do that for the project on the Home page > People tab, or you can do it on the Data
Sources page > People tab.
You can create people in the following ways:
• On the Data Sources tab before creating a project.
  See the Data Sources chapter.
• When adding evidence to a project within the Add Evidence Wizard.
  See Adding Evidence to a Project Using the Evidence Wizard on page 17.
• On the People tab on the Home page for a project that has already been created.

About Creating People when Adding Evidence Items
In the Add Evidence Wizard, you can create people as you add evidence. There are three ways you can create
people while adding evidence to a project:
• Using a CSV Evidence Import.
  See Using the CSV Import Method for Importing Evidence on page 13.
• Importing immediate children.
  See Using the Immediate Children Method for Importing on page 15.
• Adding a person in the Add Evidence Wizard.
  You can select a person from the drop-down in the wizard or enter a new person name.
See the Project Manager Guide for more information on creating people.

Using the CSV Import Method for Importing Evidence
When specifying evidence to import in the Add Evidence Wizard, you can use one of two general options:
• Manually browse to all evidence folders and files.
• Specify folders, files, and people in a CSV file.

There are several benefits of using a CSV file:
• You can more easily and accurately plan for all of the evidence items to be included in a project by
  including all sources of evidence in a single file.
• You can more easily and accurately make sure that you add all of the evidence items to be included in
  a project.
• If you have multiple folders or files, it is quicker to enter all of the paths in the CSV file than to browse
  to each one in the wizard.
• If you are going to specify people, you can specify the person for each evidence item. This will
  automatically add those people to the system rather than having to manually add each person.


When using a CSV, each path or file that you specify will be its own evidence item. The benefit of having multiple
items is that each item can have its own associated person. This is in contrast with the Folder Import method,
where only one person can be associated with all data under that folder.
Specifying people is not required. However, if you do not specify people, when the data is imported, no people
are created or associated with evidence items. Person data will not be usable in Project Review.
See the Project Manager Guide for information on associating a person to an evidence item.
If you do specify people in the CSV file, you use the first column to specify the person’s name and the second
column for the path.
If you do not specify people, you will only use one column for paths. When you load the CSV file in the Add
Evidence Wizard, you will specify that the first column does not contain people’s names. That way, the wizard
imports the first column as paths and not people.
If you do specify people, they can be in one of two formats:
• A single name or text string with no spaces
  For example, JSmith or John_Smith
• First and last name separated by a space
  For example, John Smith or Bill Jones

In the CSV file, you can optionally have column headers. You will specify in the wizard whether it should use the
first row as data or ignore the first row as headers.
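
The two name formats described above can be checked before a CSV file is loaded. The following is a minimal Python sketch and not part of the product: the function name person_name_format and its return labels are hypothetical, and this guide does not state how the wizard treats names that fit neither format.

```python
import re

# Hypothetical helper: classifies a person name against the two documented formats.
SINGLE_TOKEN = re.compile(r"\S+")      # a single name or text string with no spaces
FIRST_LAST = re.compile(r"\S+ \S+")    # first and last name separated by one space

def person_name_format(name):
    """Return "single", "first_last", or None if the name fits neither format."""
    if SINGLE_TOKEN.fullmatch(name):
        return "single"
    if FIRST_LAST.fullmatch(name):
        return "first_last"
    return None

for candidate in ["JSmith", "John_Smith", "John Smith", "John Q Smith"]:
    print(candidate, "->", person_name_format(candidate))
```

A name such as John Q Smith is only flagged (None) by this sketch; whether the wizard accepts it is an open question here.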

CSV Example 1
This example includes headers and people.
In the wizard, you select both First row contains headers and First column contains people names check
boxes.
When the data is imported, the people are created and associated to the project and the appropriate evidence
item.

People, Paths
JSmith,\\10.10.3.39\EvidenceSource\JSmith
JSmith,\\10.10.3.39\EvidenceSource\Sales\Projections.xlsx
Bill Jones,\\10.10.3.39\EvidenceSource\BJones
Sarah Johnson,\\10.10.3.39\EvidenceSource\SJohnson
Evan_Peterson,\\10.10.3.39\EvidenceSource\EPeterson
Evan_Peterson,\\10.10.3.39\EvidenceSource\HR
Jill Lane,\\10.10.3.39\EvidenceSource\JLane
Jill Lane,\\10.10.3.39\EvidenceSource\Marketing

This will import any individual files that are specified as well as all of the files (and additional subfolders) under a
listed subfolder.


You would normally use a single naming convention for people. This example mixes conventions
only to illustrate the accepted formats.

CSV Example 2
This example does not include headers or people.
In the wizard, you clear both First row contains headers and First column contains people names check
boxes.
When the data is imported, no people are created or associated with evidence items.

\\10.10.3.39\EvidenceSource\JSmith
\\10.10.3.39\EvidenceSource\Sales\Projections.xlsx
\\10.10.3.39\EvidenceSource\BJones
\\10.10.3.39\EvidenceSource\SJohnson
\\10.10.3.39\EvidenceSource\EPeterson
\\10.10.3.39\EvidenceSource\HR
\\10.10.3.39\EvidenceSource\JLane
\\10.10.3.39\EvidenceSource\Marketing
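
To show how the two check boxes change the way a CSV file is read, here is a minimal Python sketch. It is an illustration only, not the wizard's implementation: the function parse_evidence_csv and its parameters first_row_headers and first_column_people are hypothetical names that stand in for the First row contains headers and First column contains people names check boxes.

```python
import csv
import io

def parse_evidence_csv(text, first_row_headers, first_column_people):
    """Return a list of (person, path) pairs; person is None when no people column is used."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if first_row_headers:
        rows = rows[1:]  # ignore the header row
    items = []
    for row in rows:
        if first_column_people:
            person, path = row[0].strip(), row[1].strip()
        else:
            person, path = None, row[0].strip()
        items.append((person, path))
    return items

# Shortened versions of CSV Example 1 and CSV Example 2 (raw strings keep the backslashes).
example_1 = r"""People, Paths
JSmith,\\10.10.3.39\EvidenceSource\JSmith
Bill Jones,\\10.10.3.39\EvidenceSource\BJones
"""
example_2 = r"""\\10.10.3.39\EvidenceSource\JSmith
\\10.10.3.39\EvidenceSource\BJones
"""

with_people = parse_evidence_csv(example_1, first_row_headers=True, first_column_people=True)
paths_only = parse_evidence_csv(example_2, first_row_headers=False, first_column_people=False)
print(with_people)
print(paths_only)
```

Each returned pair corresponds to one evidence item. In the second call every person is None, which matches the case where no people are created or associated.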

Using the Immediate Children Method for Importing
If you have a parent folder that has children subfolders, when importing it through the Add Evidence Wizard, you
can use one of three methods:
• Folder Import
• Immediate Children
• CSV Import
  See Using the CSV Import Method for Importing Evidence on page 13.

When using the Immediate Children method, each child subfolder of the parent folder will be its own evidence
item. The benefit of having multiple evidence items is that each item can have its own associated person. This is
in contrast with the Folder Import method, where all data under that folder is a single evidence item with only one
possible person associated with it.
Specifying people is not required. However, if you do not specify people, when the data is imported, no people
are created or associated with evidence items. Person data will not be usable in Project Review.
See the Project Manager Guide for more information on associating a person to evidence.
When you select a parent folder in the Add Evidence Wizard, you select whether or not to specify people.
If you do specify people, the names of people are based on the name of the child folders.
Names of people can be imported in one of two formats:
• A single name or text string with no spaces
  For example, JSmith or John_Smith
• First and last name separated by a space
  For example, John Smith or Bill Jones

For example, suppose a parent folder had four subfolders, each containing data from a different user. Using the
Immediate Children method, each subfolder would be imported as a unique evidence item and the subfolder
name could be the associated person.
\Userdata\ (parent folder that is selected)
\Userdata\lNewstead (unique evidence item with lNewstead as a person)
\Userdata\KHetfield (unique evidence item with KHetfield as a person)
\Userdata\James Ulrich (unique evidence item with James Ulrich as a person)
\Userdata\Jill_Hammett (unique evidence item with Jill_Hammett as a person)
Note: In the Add Evidence Wizard, you can manually rename the people if needed.
The child folder may be a parent folder itself, but anything under it would be one evidence item.
This method is similar to the CSV Import method in that it automatically creates people and associates them to
evidence items. The difference is that when using this method, everything is configured in the wizard and not in
an external CSV file.
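
The behavior described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's code: the function immediate_children_items and its parameter subfolders_are_people are hypothetical names, and skipping files that sit directly under the parent folder is an assumption, because this guide does not say how such files are handled.

```python
import tempfile
from pathlib import Path

def immediate_children_items(parent, subfolders_are_people=True):
    """Return one (person, folder) pair per immediate subfolder of parent."""
    items = []
    for child in sorted(Path(parent).iterdir(), key=lambda p: p.name):
        if child.is_dir():  # assumption: files directly under the parent are skipped
            person = child.name if subfolders_are_people else None
            items.append((person, child))
    return items

# Build a throwaway folder tree that mirrors part of the \Userdata\ example.
with tempfile.TemporaryDirectory() as tmp:
    userdata = Path(tmp) / "Userdata"
    for name in ["KHetfield", "James Ulrich", "Jill_Hammett"]:
        (userdata / name / "mail").mkdir(parents=True)  # nested folders stay inside the item
    found = [(person, folder.name) for person, folder in immediate_children_items(userdata)]
    print(found)
```

Only the first level of subfolders produces evidence items. The nested mail folders are not listed separately, which reflects the rule that anything under a child folder belongs to that one item.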


Adding Evidence to a Project Using the Evidence Wizard
You can import evidence for projects for which you have permissions.
When you add evidence, it is processed so that it can be reviewed in Project Review.
Some data cannot be changed after it has been processed. Before adding and processing evidence, do the
following:
• Configure the Processing Options the way you want them.
  See the Admin Guide for more information on default processing options.
• Plan whether or not you want to specify people.
  See the Project Manager Guide for more information on associating a person to evidence.
• Unless you are importing people as part of the evidence, you must have people already associated with
  the project.
  See the Project Manager Guide for more information on creating people.

Note: Deduplication can only occur with evidence brought into the application using evidence processing.
Deduplication cannot be used on data that is imported.

To import evidence for a project
1.

In the project list, click

(add evidence) in the project that you want to add evidence to.

Select Evidence.

In the Add Evidence Wizard, select the Evidence Data Type and the Import Method.
See Using the Evidence Wizard on page 11.

Click Next.

Select the evidence folder or files that you want to import.
This screen will differ depending on the Import Method that you selected.
If you are using the CSV Import method, do the following:

5a.
If

the CSV file uses the first row as headers rather than folder paths, select the First row contains
headers check box, otherwise, clear it.

If

the CSV file uses the first column to specify people, select the First column contains people’s
names check box, otherwise, clear it.
See Using the CSV Import Method for Importing Evidence on page 13.
 Click Browse.
 Browse to the CSV file and click OK.
The CSV data is imported based on the check box settings.
Confirm that the people and evidence paths are correct.
You can edit any information in the list.
If the wizard can’t validate something in the CSV, it will highlight the item in red and place a red
box around the problem value.
If a new person will be created, it will be designated by

5b.

If you are using the Immediate Children method, do the following:
 If you want to automatically create people, select Sub folders are people’s names,
otherwise, clear it.
See Using the Immediate Children Method for Importing on page 15.
 Click Browse.
 Enter the IP address of the server where the evidence files are located and click Go.

Using the Evidence Wizard

Adding Evidence to a Project Using the Evidence Wizard

| 17

For example, 10.10.2.29
to the parent folder and click Select.
Each child folder is listed as a unique evidence item.
If you selected to create people, they are listed as well.
Confirm that the people and evidence paths are correct.
You can edit any information in the list.
If the wizard can’t validate something, it will highlight the item in red and place a red box
around the problem value.

 Browse

If a new person will be created, it will be designated by
5c.

If you are using the Folder Input or Individual Files method, do the following:
 Click Browse.
 Enter the IP address of the server where the evidence files are located and click Go.
For example, 10.10.2.29
 Expand the folders in the left pane to browse the server.
 In the right pane highlight the parent folder or file and click Select.
If you are selecting files, you can use Ctrl-click or Shift-click to select multiple files in one
folder.
The folder or file is listed as a unique evidence item.

6. If you want to specify a person to be associated with this evidence, select one from the Person Name
drop-down list or type in a new person name to be added.
See About Associating People with Evidence on page 13.
If you enter a new person that will be created, it is designated by an icon.
You can also edit a person’s name if it was imported.

7. Specify a Timezone.
From the Timezone drop-down list, select a time zone.
See Evidence Time Zone Setting on page 19.

8. (Optional) Enter a Description.
This is used as a short description that is displayed with each item in the Evidence tab.
For example, “Imported from Filename.csv” or “Children of path”.
This can be added or edited later in the Evidence tab.

9. (Optional) If you need to delete an evidence item, click the delete icon for the item.

10. Click Next.
11. In the Evidence to be Added and Processed screen, you can view the evidence that you selected so far.

From this screen, you can perform one of the following actions:
 Add More: Click this button to return to the Add Evidence screen.
 Add Evidence and Process: Click this button to add and process the evidence listed.
When you are done, you are returned to the project list. After a few moments, the job will start and the
project status should change to Processing.

12. If you need to manually update the list or status, click Refresh.

13. When the evidence import is completed, you can view the evidence items in the Evidence and People
labels.

Evidence Time Zone Setting
Because of worldwide differences in time zone implementation and Daylight Saving Time, you select a time
zone when you add an evidence item to a project.
In a FAT volume, times are stored in a localized format according to the time zone information the operating
system has at the time the entry is stored. For example, if the actual date is Jan 1, 2005, and the time is 1:00
p.m. on the East Coast, the time would be stored as 1:00 p.m. with no adjustment made for relevance to
Greenwich Mean Time (GMT). Anytime this file time is displayed, it is not adjusted for time zone offset prior to
being displayed.
If the same file is then stored on an NTFS volume, an adjustment is made to GMT according to the settings of
the computer storing the file. For example, if the computer has a time zone setting of -5:00 from GMT, this file
time is advanced 5 hours to 6:00 p.m. GMT and stored in this format. Anytime this file time is displayed, it is
adjusted for time zone offset prior to being displayed.
For proper time analysis to occur, it is necessary to bring all times and their corresponding dates into a single
format for comparison. When processing a FAT volume, you select a time zone and indicate whether or not
Daylight Saving Time was being used. If the volume (such as removable media) does not contain time zone
information, select a time zone based on other associated computers. If no associated computers exist,
select your local time zone settings.
With this information, the system creates the project database, converts all FAT times to GMT, and stores
them as such. Adjustments are made for each entry depending on historical use data and Daylight Saving
Time. Times from NTFS volumes are stored with no adjustment made.
With all times stored in a comparable manner, you need only set your local machine to the same time and date
settings as the project evidence to correctly display all dates and times.
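
The FAT example in this section can be worked through in code. The following Python sketch is only an
illustration and is not the application's implementation: the function name fat_local_to_utc is hypothetical,
and a fixed offset stands in for a full time zone, so Daylight Saving Time is not modeled.

```python
from datetime import datetime, timedelta, timezone


def fat_local_to_utc(local_time: datetime, utc_offset_hours: float) -> datetime:
    """Attach the selected offset to a naive FAT timestamp and convert it to UTC."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    return local_time.replace(tzinfo=zone).astimezone(timezone.utc)


# 1:00 p.m. on Jan 1, 2005, recorded on a computer set to -5:00 from GMT.
stored = datetime(2005, 1, 1, 13, 0)
print(fat_local_to_utc(stored, -5))
```

With an offset of -5:00, the stored 1:00 p.m. value corresponds to 6:00 p.m. GMT, which matches the NTFS
example above.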

Chapter 3

Importing Evidence

About Importing Evidence Using Import
As an Administrator or Project Manager with the Create/Edit Projects permissions, you can import evidence for a
project.
You import evidence by using a load file, which allows you to import metadata and physical files, such as native,
image, and/or text files that were obtained from another source, such as a scanning program or another
processing program. You can import the following types of load files:
 Summation DII - A proprietary file type from Summation. See Data Loading Requirements on page 24.
 Generic - A delimited file type, such as a CSV file.
 Concordance/Relativity - A delimited DAT file type that has established guidelines as to what delimiter
should be used in the fields. This file should have a corresponding LFP or OPT image file to import.

Transcripts and exhibits are uploaded from Project Review and not from the Import dialog. See the Project
Manager Guide for more information on how to upload transcripts and exhibits.

About Mapping Field Values
When importing, you must specify which import file fields map to which database fields. Mapping the
fields puts the correct information about each document in the correct columns in Project Review.
After clicking Map Fields, a process runs that checks the imported load file against existing project fields. Most
of the import file fields will automatically be mapped for you. Any fields that could not be automatically mapped
are flagged as needing to be mapped.
Note: If you need custom fields, you must create them in the Custom Fields tab on the Home page before you
can map to those fields during the import. If the custom names are the same, they will be automatically
mapped as well.
Any errors that have to be corrected before the file can be imported are reported at this time.
When importing a CSV or DAT load file that is missing the unique identifier used to map to the DocID field, an
error message will be displayed.
Notes:
 If a record contains the same values for the DocID as the ParentID, an error is logged in the log file and
the record is not imported. This allows you to correct the problem record and make sure all records in the
family are included in the load file correctly.

 In review, the AttachmentCount value is displayed under the EmailDirectAttachCount column.
 The Importance value is not imported as a text string but is converted and stored in the database as an
integer representing a value of either Low, Normal, High, or blank. These values are case sensitive and
must match exactly in the import file.
 The Sensitivity value is not imported as a text string but is converted and stored in the database as an
integer representing a value of either Confidential, Private, Personal, or Normal. These values are case
sensitive and must match exactly in the import file.
 The Language value is not imported as a text string but is converted and stored in the database as an
integer representing one of 67 languages.
 Body text that is mapped to the Body database field is imported as an email body stream and is viewable
in the Natural viewer. When importing all file types, the import Body field is now automatically mapped to
the Body database field.
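
The notes above can be turned into a pre-import check on a load file. The following Python sketch assumes
each record has already been read into a dictionary keyed by field name; the function name check_record and
the message texts are hypothetical, and the integer codes that the database stores are not modeled because
they are not documented here.

```python
IMPORTANCE_VALUES = {"Low", "Normal", "High", ""}
SENSITIVITY_VALUES = {"Confidential", "Private", "Personal", "Normal"}


def check_record(record: dict) -> list:
    """Return a list of problems found in one load file record."""
    problems = []
    doc_id = record.get("DocID", "")
    if not doc_id:
        problems.append("missing DocID")
    elif doc_id == record.get("ParentID"):
        problems.append("DocID equals ParentID")
    # The comparison is case sensitive, so "high" does not match "High".
    if record.get("Importance", "") not in IMPORTANCE_VALUES:
        problems.append("Importance is not an exact match")
    if "Sensitivity" in record and record["Sensitivity"] not in SENSITIVITY_VALUES:
        problems.append("Sensitivity is not an exact match")
    return problems


print(check_record({"DocID": "ABC001", "ParentID": "ABC001"}))
```

Running a check like this before the import lets you correct problem records instead of finding them in the
log file afterward.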

Importing Evidence into a Project
To import evidence into a project
1. Log in to the application as an Administrator or a user with Create/Edit Project rights.

2. In the Project List panel, click Add Evidence next to the project.

3. Click Import.

4. In the Import dialog, select the file type (EDII, Concordance/Relativity, or Generic).
4a. Enter the location of the file or Browse to the file’s location.
4b. (Optional; available only for Concordance/Relativity) Select the Image Type and enter the location
of the file, or Browse to the file’s location. You can choose from the following file options:
 OPT - Concordance file type that contains preferences and option settings associated with the
files.
 LFP - Ipro file type that contains load images and related information.

5. Perform field mapping.
Most fields will be automatically mapped. If some fields need to be manually mapped, you will see an
orange triangle.
5a. Click Map Fields to map the fields from the load file to the appropriate fields.
See About Mapping Field Values on page 20.
5b. To skip any items that do not map, select Skip Unmapped.
5c. To return the fields back to their original state, click Reset.
Note: Every time you click the Map Fields button, the fields are reset to their original state.
6. Select the Import Destination.
6a. Choose from one of the following:
 Existing Document Group: This option adds the documents to an existing document group.
Select the group from the drop-down menu.
See the Project Manager Guide for more information on managing document groups.
 Create New Document Group: This option adds the documents to a new document group.
Enter the name of the group in the field next to this radio button.

7. Select the Import Options for the file. These options will differ depending on whether you select DII,
Concordance/Relativity, or Generic.
General Options:
 Enable Fast Import: This will exclude database indexes while importing.
DII Options:
 Page Count Follows Doc ID: Select this option if your DII file has an @T value that contains
both a Doc ID and a page count.
 Import OCR/Full Text: Select this option to import OCR or Full Text documents for each
record.
 Import Native Documents/Images: Select this option to import Native Documents and
Images for each record.
 Process files to extract metadata: Selecting this option will import only the metadata that
exists on the load file and not process native files as you import them with a load file.
Concordance/Relativity or Generic Options:
 First Row Contains Field Names: Select this option if the file being imported contains a row
header.
 Field, Quote, and Multi-Entry Separators: From the pull-down menu, select the symbols for
the different separators that the file being imported contains. Each separator value must match
the imported file separators exactly or the field being imported for each record is not populated
correctly.
 Return Placeholder: From the pull-down menu, select the same value contained in the file
being imported as a replacement value for carriage return and line feed characters. Each
return placeholder value must match the imported file separators.

8. Configure the Date Options.
 Select the date format from the Date Format drop-down menu.
This option specifies the date format used in the load file so that the system can properly parse
each date for storage in the database. All dates are stored in the database in a
yyyy-mm-dd hh:mm:ss format.
 Select the Load File Time Zone.
Choose the time zone that the load file was created in so the date and time values can be converted
to a normalized UTC value in the database.
See Normalized Time Zones on page 118.

9. Select the Record Handling Options.
New Record:
 Add: Select to add new records.
 Skip: Select to ignore new records.
Existing Record:
 Update: Select to update duplicate records with the record being imported.
 Overwrite: Select to overwrite any duplicate records with the record being imported.
 Skip: Select to skip any duplicate records.

10. Validation: This option verifies that:
 The path information within the load file is correct.
 The records contain the correct fields. For example, the system verifies that the delimiters and fields
in a Generic or Concordance/Relativity file are correct.
 You have all of the physical files (that is, Native, Image, and Text) that are listed in the load file.

11. (Optional) Drop DB Indexes. Database indexes improve performance, but slow processing when
inserting data. If this option is checked, all of the data reindexes every time more data is loaded. Only
select this option if you want to load a large amount of data quickly before data is reviewed.
12. Click Start.
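
The Record Handling Options in the procedure above can be pictured with a small Python sketch. It is
illustrative only. The function name apply_record and the dictionary keyed by DocID are hypothetical, and the
difference shown between Update (merge the imported fields into the stored record) and Overwrite (replace the
stored record entirely) is an assumption, because this guide does not define the two options in that detail.

```python
def apply_record(database: dict, record: dict, new_option: str, existing_option: str) -> None:
    """Apply one imported record to a dict of records keyed by DocID."""
    doc_id = record["DocID"]
    if doc_id not in database:
        if new_option == "Add":
            database[doc_id] = dict(record)
        return  # "Skip" ignores new records
    if existing_option == "Update":
        # Assumption: fields that are absent from the import are kept.
        database[doc_id].update(record)
    elif existing_option == "Overwrite":
        # Assumption: the stored record is replaced as a whole.
        database[doc_id] = dict(record)
    # "Skip" leaves the existing record untouched.


project = {"A1": {"DocID": "A1", "Author": "Smith", "Title": "Old"}}
apply_record(project, {"DocID": "A1", "Title": "New"}, "Add", "Update")
print(project["A1"])
```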

Chapter 4

Data Loading Requirements

This chapter describes the data loading requirements of Resolution1 Platform and Summation and contains the
following sections:
 Document Groups (page 24)
 Email & eDocs (page 27)
 Coding (page 29)
 Related Documents (page 32)
 Transcripts and Exhibits (page 33)
 Work Product (page 35)
 Sample DII Files (page 36)
 DII Tokens (page 40)

Document Groups
Note: You can import and display Latin and non-Latin Unicode characters. While the application supports the
display of fielded data in either Latin or non-Latin Unicode characters, the modification of fielded data is
supported only in Latin Unicode characters.

Note: The display of non-Latin Unicode characters does not apply to transcript filenames, since transcript
deponents are defined by project users, or work product filenames, which are not displayed in the
application.

Images
The following describes the required and recommended formats for images.

Required
 A DII load file is required to load image documents.
 Group IV TIFFS: single or multi-page, black and white (or color), compressed images, no DPI minimum.
 Single page JPEGs for color images.

Full-Text or OCR
The following describes the required and recommended formats for full-text or OCR.

Required
 If submitting document level OCR, page breaks should be included between each page of text in the
document text file.
Failure to insert page breaks will result in a one page text file for a multi-page document. The ASCII
character 12 (decimal) is used for the “Page Break” character. All instances of character 12 are
interpreted as page breaks.
 Document level OCR or page level OCR.
 All OCR files should be in ANSI or Unicode text file format, with a *.txt extension.
 A DII load file. Loading Control List (.LST) files are not supported.

Recommended
 OCR text files should be stored in the same directories as image files.
 Page level OCR is recommended to ensure proper page breaks.
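
To check that a document level OCR file contains the expected page breaks, you can split its text on ASCII
character 12 (the form feed character). The following Python sketch illustrates the idea; the function name
split_ocr_pages is hypothetical.

```python
PAGE_BREAK = chr(12)  # ASCII character 12 (decimal), the form feed character


def split_ocr_pages(document_text: str) -> list:
    """Split document level OCR text into one string per page."""
    return document_text.split(PAGE_BREAK)


text = "Text of page one" + PAGE_BREAK + "Text of page two"
pages = split_ocr_pages(text)
print(len(pages))
```

A file with no page break characters yields a single page, which matches the behavior described above for a
multi-page document without page breaks.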

DII Load File Format for Image/OCR
Note: When selecting the Copy ESI option, the DII and source files must reside in a location accessible by the
IEP server; otherwise, import jobs will fail during the Check File process.
The following describes the required format for a DII load file to load images and OCR.

Required
 A blank line after each document summary.
 @T to identify each document summary.
 @T should equal the beginning Bates number.
 If OCR is included, then use @FULLTEXT at the beginning of the DII file (@FULLTEXT DOC or
@FULLTEXT PAGE).
 If @FULLTEXT DOC is included, OCR text files are assumed to be in the Image folder location with the
same name as the first image (TIFF or JPG) file.
 If @FULLTEXT PAGE is included, OCR text files are assumed to be in the Image folder location with the
same name as the image files (each page should have its own txt file).
 If @O token is used, @FULLTEXT token is not required.
 If Fulltext is located in another directory other than images, use @FULLTEXTDIR followed by the
directory path.

 The page count identifier on the @T line can be interpreted ONLY if it is separated from the Doc ID by a
space character.
For example:
@FULLTEXT PAGE
@T AAA0000001 2
@D @I\IMAGES\01\
AAA0000001.TIF
AAA0000002.TIF

@T AAA0000003 1
@D @I\IMAGES\02\
AAA0000003.TIF

The Page Count Follows Doc ID import option controls this behavior. If this option is deselected, the page
count identifier on the @T line is not recognized.
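
To show how the @T, page count, and @D @I lines fit together, the following Python sketch reads an
image DII file into a list of records. It is a simplified illustration and not the importer itself: the function
name parse_image_dii and the dictionary keys are hypothetical, only the @T and @D @I tokens are handled, and
dropping the trailing number when the option is off is an assumption.

```python
def parse_image_dii(text: str, page_count_follows_doc_id: bool = True) -> list:
    """Parse @T and @D @I lines from DII text into a list of record dicts."""
    records, current = [], None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("@T "):
            parts = line[3:].split()
            current = {"doc_id": parts[0], "page_count": None, "image_dir": "", "images": []}
            # The page count is recognized only when it follows the Doc ID after a space.
            if page_count_follows_doc_id and len(parts) > 1 and parts[1].isdigit():
                current["page_count"] = int(parts[1])
            records.append(current)
        elif line.startswith("@D @I") and current is not None:
            current["image_dir"] = line[len("@D @I"):]
        elif line and not line.startswith("@") and current is not None:
            # Image file names are listed on the lines after @D @I.
            current["images"].append(current["image_dir"] + line)
    return records


sample = "\n".join([
    "@FULLTEXT PAGE",
    "@T AAA0000001 2",
    "@D @I\\IMAGES\\01\\",
    "AAA0000001.TIF",
    "AAA0000002.TIF",
    "",
    "@T AAA0000003 1",
    "@D @I\\IMAGES\\02\\",
    "AAA0000003.TIF",
])
for record in parse_image_dii(sample):
    print(record["doc_id"], record["page_count"], record["images"])
```

Comparing page_count with the number of image lines in each record is a quick way to spot a mismatch before
import.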

Recommended
 DII load file names should mirror that of the respective volume (for easy association and identification).
 @T values (that is, the BegBates) and EndBates should include no more than 50 characters.
Non-alphabetical and non-numerical characters should be avoided.

Email & eDocs
You can host email, email attachments, and eDocs (electronic documents in native format) for review and
attorney coding, as well as associated full-text and metadata. It is also possible to include an imaged version (in
TIFF format) of the file at loading. A DII load file is required in order to load e-mail and electronic documents.
Note: You can import and display Latin and non-Latin Unicode characters. While the application supports the
display of fielded data in either Latin or non-Latin Unicode characters, the modification of fielded data is
supported only in Latin Unicode characters.

Note: The display of non-Latin Unicode characters does not apply to transcript filenames, since transcript
deponents are defined by users, or work product filenames, which are not displayed.

General Requirements
The following describes the required and recommended formats for DII files that are used to load email, email
attachments, and eDocs.
 A DII load file with a *.dii file extension, using only the tokens listed in DII Tokens (page 40).
 @T to identify each email, email attachment, or eDoc record.
 @T is the first line for each summary.
 @T equals the unique Docid for each email, email attachment, or eDoc record. There should be only one
@T per record.
 A blank line between document records.
 @EATTACH token is required for email attachments and @EDOC for eDocs. These tokens contain a
relative path to the native file.
 @MEDIA is required for email data with a value of eMail or Attachment. For eDocs, the @MEDIA value
must be eDoc.
 @EATTACH is required when @MEDIA has a value of Attachment and is not required when @MEDIA
has a value of eMail.
 To maintain the parent/child relationship between an e-mail and its attachments (family relationships for
eDocs), the @PARENTID and @ATTACH tokens are used.
 To include images along with the native file delivery, use the @D @I tokens at the end of the record.
 The @O token is extended to support loading FullText into eDocs and eMails as well.
If a record has both @O and @EDOC/@EATTACH tokens, FullText is loaded from the file specified by the
@O token. If the @O token does NOT exist for the record, FullText is extracted from the file specified by the
@EDOC/@EATTACH token.
 @AUTHOR and @ITEMTYPE tokens are NOT supported.

Recommended
 @T values (Begbates/Docid) should include no more than 50 characters. Non-alphabetical and
non-numerical characters should be avoided.
 Specify parent-child relationship in the DII file based on the following rule:

In the DII file, email attachments should immediately follow the parent record, that is:
@T ABC000123
@MEDIA eMail
@EMAIL-BODY
Please reply with a copy of the completed report.
Thanks for your input.
Beth
@EMAIL-END
@ATTACH ABC000124; ABC000125

@T ABC000124
@MEDIA Attachment
@EATTACH \Native\ABC000124.doc
@PARENTID ABC000123

@T ABC000125
@MEDIA Attachment
@EATTACH \Native\ABC000125.doc
@PARENTID ABC000123
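
The family rules above (attachments need @EATTACH and should immediately follow their parent) can be
checked before import. The following Python sketch is a simplified illustration: the function names
parse_records and check_families are hypothetical, only single-line tokens are collected, and an e-mail that is
itself an attachment is not handled.

```python
def parse_records(dii_text: str) -> list:
    """Collect single-line tokens per record, skipping @EMAIL-BODY ... @EMAIL-END."""
    records, current, in_body = [], None, False
    for raw_line in dii_text.splitlines():
        line = raw_line.strip()
        if in_body:
            in_body = line != "@EMAIL-END"
            continue
        if line == "@EMAIL-BODY":
            in_body = True
        elif line.startswith("@T "):
            current = {"T": line[3:].strip()}
            records.append(current)
        elif line.startswith("@") and current is not None:
            token, _, value = line[1:].partition(" ")
            current[token] = value.strip()
    return records


def check_families(records: list) -> list:
    """Report attachments that lack @EATTACH or do not follow their parent e-mail."""
    problems, last_parent = [], None
    for record in records:
        media = record.get("MEDIA")
        if media == "Attachment":
            if "EATTACH" not in record:
                problems.append(record["T"] + ": @EATTACH is required for attachments")
            if last_parent is None or record.get("PARENTID") != last_parent:
                problems.append(record["T"] + ": does not follow its parent record")
        else:
            last_parent = record["T"] if media == "eMail" else None
    return problems


sample = "\n".join([
    "@T ABC000123", "@MEDIA eMail", "@EMAIL-BODY", "Thanks for your input.", "@EMAIL-END",
    "@ATTACH ABC000124; ABC000125", "",
    "@T ABC000124", "@MEDIA Attachment", "@EATTACH \\Native\\ABC000124.doc", "@PARENTID ABC000123",
])
print(check_families(parse_records(sample)))
```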

Coding
The following describes the required and recommended formats for coded data.

Recommended
 Coded data should be submitted in a delimited text file, with a *.txt extension.
 Use the following default delimiter characters:

Field Separator

Multi-entry Separator

;

Return Placeholder

Quote Separator

Users can, however, specify any custom character in the Import user interface for any of the separators above.
 The standard comma and quote characters (‘,’ ‘”’) are accepted. When these characters are present
within coded data, different characters must be used as separators.
For instance,
DOCID|SUMMARY|AUTHOR
^DOJ000001^|^Test “Summary1”^|^Smith, John^
In the above file,
Field Separator |
Quote Separator ^
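
A delimited file like the one above can be read with a standard CSV parser configured with the same
separators. The following Python sketch uses the csv module from the standard library; the function name
read_coded_data is hypothetical, and splitting values on the multi-entry separator is shown as one possible
way to handle multi-entry fields.

```python
import csv
import io


def read_coded_data(text, field_sep="|", quote_sep="^", multi_entry_sep=";"):
    """Read delimited coded data into a list of dicts, one per record."""
    reader = csv.DictReader(io.StringIO(text), delimiter=field_sep,
                            quotechar=quote_sep, restval="")
    rows = []
    for row in reader:
        rows.append({
            # A value that contains the multi-entry separator becomes a list.
            name: value.split(multi_entry_sep) if multi_entry_sep in value else value
            for name, value in row.items()
        })
    return rows


load_file = 'DOCID|SUMMARY|AUTHOR\n^DOJ000001^|^Test "Summary1"^|^Smith, John^\n'
print(read_coded_data(load_file))
```

Because the quote separator is ^ rather than a double quote, the comma and the double quotes inside the
values are read as ordinary data.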
 Date field values should have any of the following formats. The date 16th August 2009 can be
represented in the load file as:
08/16/2009
16/08/2009
20090816

In addition, fuzzy dates are also supported. Currently, only the DOCDATE field supports fuzzy dates.
 If a day is fuzzy, then replace dd with 00.
 If a month is fuzzy, then replace mm with 00.
 If a year is fuzzy, then replace yyyy with 0000.

Format        Example

mm/dd/yyyy    00/16/2009 (month fuzzy)
              08/00/2009 (day fuzzy)
              08/16/0000 (year fuzzy)
              00/16/0000 (month and year fuzzy)
              08/00/0000 (day and year fuzzy)
              00/00/2009 (month and day fuzzy)
              00/00/0000 (all fuzzy)
              08/16/2009 (no fuzzy)

yyyymmdd      00000816 (year fuzzy)
              20090016 (month fuzzy)
              20090800 (day fuzzy)
              00000016 (year and month fuzzy)
              00000800 (year and day fuzzy)
              20090000 (month and day fuzzy)
              00000000 (all fuzzy)
              20090816 (no fuzzy)

dd/mm/yyyy    00/08/2009 (day fuzzy)
              16/00/2009 (month fuzzy)
              16/08/0000 (year fuzzy)
              16/00/0000 (month and year fuzzy)
              00/08/0000 (day and year fuzzy)
              00/00/2009 (day and month fuzzy)
              00/00/0000 (all fuzzy)
              16/08/2009 (no fuzzy)
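
The fuzzy date convention can be read mechanically: any part that is all zeros is unknown. The following
Python sketch illustrates this for the three formats listed above; the function name parse_fuzzy_date and the
use of None for a fuzzy part are choices made for this example only.

```python
def parse_fuzzy_date(value: str, date_format: str):
    """Return (year, month, day); a fuzzy (all-zero) part is returned as None."""
    if date_format == "yyyymmdd":
        year, month, day = value[0:4], value[4:6], value[6:8]
    else:
        first, second, year = value.split("/")
        if date_format == "mm/dd/yyyy":
            month, day = first, second
        else:  # dd/mm/yyyy
            day, month = first, second
    # int("00") and int("0000") are 0, which is mapped to None.
    return tuple(int(part) or None for part in (year, month, day))


print(parse_fuzzy_date("08/00/2009", "mm/dd/yyyy"))
print(parse_fuzzy_date("00000816", "yyyymmdd"))
```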

 Time values should have any of the following formats. The time 1:27 PM can be represented in the load
file as:
1:27
01:27
1:27:00
01:27:00
13:27
13:27:00

Time values for the standard tokens @TIMESENT/@TIMERCVD/@TIMESAVED/@TIMECREATED will not be loaded
for a document unless accompanied by the corresponding date token @DATESENT/@DATERCVD/
@DATESAVED/@DATECREATED.

Recommended
 You can use Field Mapping to select which fields are populated from the DII/CSV files. Fields are
automatically mapped during Import if the name of the database field matches the name of the field
within the DII/CSV file.
 Field names within the header row will appear exactly as they appear within the delimited text file. Use
consistent field naming for subsequent data deliveries.
 DocID/BegBates/EndBates values should include no more than 50 characters. Non-alphabetical and
non-numerical characters should be avoided.

Related Documents
You can review related documents by using the @ATTACHRANGE token or the @PARENTID and @ATTACH tokens.
The related documents must be coded in sequential order by their DOCID. The sequence determines the first
document and the last document in the related document set.
Note: The Bates number of the first document in @ATTACHRANGE populates the ParentDoc column.

Note: @PARENTID populates the ParentDoc field and @ATTACH populates the AttachIDs.
Only one of @ATTACHRANGE or @PARENTID can be used at a time.
For example:
@ATTACHRANGE ABC001-ABC005
OR
@PARENTID ABC001
OR
@ATTACH ABC001;ABC002;ABC003;ABC004;ABC005
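
To see which documents an @ATTACHRANGE value covers, the range can be expanded into individual IDs.
The following Python sketch assumes that both ends of the range share the same prefix and end in a
zero-padded number, and that the IDs themselves contain no hyphens; the function name expand_attach_range is
hypothetical.

```python
import re


def expand_attach_range(value: str) -> list:
    """Expand a range such as 'ABC001-ABC005' into the list of IDs it covers."""
    first, last = [part.strip() for part in re.split(r"[-–]", value)]
    match_first = re.fullmatch(r"(\D*)(\d+)", first)
    match_last = re.fullmatch(r"(\D*)(\d+)", last)
    if not match_first or not match_last or match_first.group(1) != match_last.group(1):
        raise ValueError("range ends must share a prefix and end in digits")
    prefix, width = match_first.group(1), len(match_first.group(2))
    start, end = int(match_first.group(2)), int(match_last.group(2))
    return [f"{prefix}{number:0{width}d}" for number in range(start, end + 1)]


ids = expand_attach_range("ABC001-ABC005")
print(ids[0], ids[1:])
```

The first ID in the expanded list is the one that populates the ParentDoc column.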

Transcripts and Exhibits
Note: You can import and display Latin and non-Latin Unicode characters. While the application supports the
display of fielded data in either Latin or non-Latin Unicode characters, the modification of fielded data is
supported only in Latin Unicode characters.

Note: The display of non-Latin Unicode characters does not apply to transcript filenames, since transcript
deponents are defined by users, or work product filenames, which are not displayed.
From Menu > Transcript > Manage, you can upload new transcripts to any transcript collection to which you
have access. All transcripts are displayed individually, and each has its own menu that controls various transcript
management functions.

Transcripts
The following describes the required and recommended formats for transcripts.

Required
 ASCII or Unicode files (*.txt) in AMICUS format.

Recommended
 Transcript size is less than one megabyte.
 Page number specifications:
     All transcript pages are numbered.
     Page numbers are up against the left margin. The first digit of the page number should appear in
    Column 1. See the figure below.
     Page numbers appear at the top of each page.
     Page numbers contain no more than six digits, including zeros, if necessary. For example, Page 34
    would be shown as 0034, 00034, or 000034.
     The first line of the transcript (Line 1 of the title page) contains the starting page number of that
    volume. For example, if the volume starts on Page 1, either 0001 or 00001 is correct. If the volume
    starts on Page 123, either 0123 or 00123 is correct.
 Line numbers appear in Columns 2 and 3.
 Text starts at least one space after the line number. It is recommended to start text in Column 7.
 No lines are longer than 78 characters (including letters and spaces).
 No page breaks, if possible. If page breaks are necessary, they should be on the line preceding the
page number.
 Consistent numbers of lines per page, if neither page breaks nor page number formats are used.
 No headers or footers.
 All transcript lines are numbered.
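
Two of the recommendations above are easy to check automatically before upload: the first line holds the
starting page number, and no line is longer than 78 characters. The following Python sketch covers only those
two checks; the function name check_transcript and the message texts are hypothetical.

```python
import re

# Up to six digits starting in Column 1, with nothing else on the line.
PAGE_NUMBER = re.compile(r"\d{1,6}\s*")


def check_transcript(text: str) -> list:
    """Return a list of problems found in a plain-text transcript."""
    problems = []
    lines = text.split("\n")
    if not PAGE_NUMBER.fullmatch(lines[0]):
        problems.append("line 1 does not contain the starting page number")
    for number, line in enumerate(lines, start=1):
        if len(line.rstrip("\r")) > 78:
            problems.append(f"line {number} is longer than 78 characters")
    return problems


print(check_transcript("0001\n 1    Q. Please state your name."))
```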

Preferred Transcript Format

Exhibits
The following describes the required format for Exhibits.

Required
 Exhibits that will be loaded must be in PDF format.
 If an Exhibit has multiple pages, all pages must be contained in one file instead of a file per page.

Work Product
Note: You can import and display Latin and non-Latin Unicode characters. While the application supports the
display of fielded data in either Latin or non-Latin Unicode characters, the modification of fielded data is
supported only in Latin Unicode characters.

Note: The display of non-Latin Unicode characters does not apply to transcript filenames, since transcript
deponents are defined by users, or work product filenames, which are not displayed.

From Menu > Work Product > Manage you can upload, view, and review Work Product files. Work Product can
be any type of file: text, word processing, PDF, or even MP3. (MP3 files are useful when you wish to send an
audio transcript or message to the members of the group who have access to Work Product). The application
does not maintain edits or keep version control information for the documents stored. Users working with Work
Product documents must have the appropriate native application, such as Microsoft Word or Adobe Acrobat, to
open them.

Sample DII Files
Note: You can import and display Latin and non-Latin Unicode characters. While the application supports the
display of fielded data in either Latin or non-Latin Unicode characters, the modification of fielded data is
supported only in Latin Unicode characters.

Note: The display of non-Latin Unicode characters does not apply to transcript filenames, since transcript
deponents are defined by users, or work product filenames, which are not displayed.

Note: When selecting the Copy ESI option, the DII and source files must reside in a location accessible by the
IEP server; otherwise, import jobs will fail during the Check File process.

eDoc DII Load Files
Required DII Format (eDocs)
@T SSS00000007
@MEDIA eDoc
@EDOC \folder\SSS00000007.xls

@T SSS00000008
@MEDIA eDoc
@EDOC \Native\SSS00000008.doc

Recommended DII format (eDocs)
@T ABC00000123
@MEDIA eDoc
@EDOC \Natives\ABC00000123.xls
@APPLICATION Microsoft Excel
@DATECREATED 05/25/2002
@DATESAVED 06/05/2002
@SOURCE Dee Vader
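
If you generate eDoc records from a script, the required tokens come first and any optional tokens follow.
The following Python sketch builds one record as text; the function name build_edoc_record is hypothetical,
and it does not check that the optional token names appear in the DII Tokens list.

```python
def build_edoc_record(doc_id: str, native_path: str, **optional_tokens) -> str:
    """Build one eDoc DII record: @T first, then @MEDIA and @EDOC, then optional tokens."""
    lines = [f"@T {doc_id}", "@MEDIA eDoc", f"@EDOC {native_path}"]
    for token, value in optional_tokens.items():
        lines.append(f"@{token.upper()} {value}")
    return "\n".join(lines) + "\n"


records = [
    build_edoc_record("ABC00000123", "\\Natives\\ABC00000123.xls",
                      application="Microsoft Excel", datecreated="05/25/2002"),
    build_edoc_record("ABC00000124", "\\Natives\\ABC00000124.doc"),
]
# Joining with a newline leaves a blank line between document records.
print("\n".join(records))
```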

eMail DII Load Files
Required DII File Format for Parent Email (Emails)
@T ABC000123
@MEDIA eMail
@EMAIL-BODY
Please reply with a copy of the completed report.
Thanks for your input.
Beth
@EMAIL-END
@ATTACH ABC000124;ABC000125

Required DII File Format for Related Email Attachment (Emails)
@T ABC000124
@MEDIA Attachment
@EATTACH \Native\ABC000124.doc
@PARENTID ABC000123

Recommended DII Format for Parent Email (Emails)
@T ABC000123
@MEDIA eMail
@ATTACH ABC000124; ABC000125
@EMAIL-BODY
Please reply with a copy of the completed report.
Thanks for your input.
Beth
@EMAIL-END
@FROM Abe Normal (anormal@ctsummation.com)
@TO abcody@ctsummation.com; rob.hood@wolterskluwer.com
@CC Willie Jo
@BCC Jopp@ctsummation.com
@SUBJECT Please reply
@APPLICATION Microsoft Outlook
@DATECREATED 06/16/2006
@DATERCVD 06/16/2006
@DATESENT 06/16/2006
@FOLDERNAME \ANormal\Sent Items
@READ Y
@SOURCE Abe Normal
@TIMERCVD 1:36 PM
@TIMESENT 1:35 PM

Recommended DII Format for Related Email Attachments (Emails)
@T ABC000124
@MEDIA Attachment
@EATTACH \Native\ABC000124.doc
@PARENTID ABC000123
@APPLICATION Microsoft Word
@DATECREATED 05/25/2005
@DATESAVED 06/05/2005
@SOURCE Abe Normal
@AUTHOR Abe Normal
@DOCTITLE Sales Report June 2005

Recommended DII Format for Native Plus Images Deliveries (Email and eDocs)
(Append to the previous recommended DII formats for eDocs or email.)
@D @I\Images\
ABC000124-001.tif
ABC000124-002.tif

DII Tokens
Data for all tokens must be on a single line, except for the @OCR … @OCR-END, @EMAIL-BODY … @EMAIL-END,
and @HEADER … @HEADER-END token pairs.

TOKEN

FIELD POPULATED

DESCRIPTION OF USAGE

@T

DOCID & BEGBATES

This token is required for each DII record. This must be the first
token listed for the document. This must be unique in the case.
The @BEGBATES or @DOCID tokens should not be used. For example:
@T ABC000123

@APPLICATION

Application

The application used to view the electronic document. For
example: @APPLICATION Microsoft Word

@ATTACH

AttachDocs

IDs of attached documents. For example: @ATTACH
ABC000124;ABC000125

@ATTACHRANGE

ParentDoc

The document number range of all attachments if more than one
attachment exists. The beginning number in the range populates
the PARENTDOC. For example:
@ATTACHRANGE WGH000008 – WGH0000010

@ATTMSG

Media & Native file is
copied into the
filesystem using the
path provided

The file name of the e-mail attachment (that is an e-mail message
itself) including the relative or absolute path to the document. The
relative path is evaluated using the path to the DII file as the root
path. The native file is then loaded. The Media field is populated
with the value eMail.

@BATESBEG

Begbates

Beginning Bates number, used with @BATESEND.
For example: @BATESBEG SGD00001

@BATESEND

EndBates

Ending Bates number. For example: @BATESEND SGD00055

@BCC

EmailBCC

Anyone sent a blind copy on an e-mail message.
For example: @BCC Nick Thomas

@C

Custom Field

Code used to load a custom field in the database. The syntax for
the @C token is: @C FIELDNAME value. The FIELDNAME
value cannot contain spaces. For example, to fill in the
DEPARTMENT field of the database with the value Accounting,
the line would read: @C DEPARTMENT Accounting

@CC

EmailCC

Anyone copied on an e-mail message. For example: @CC John
Ace

@D @I

Link to images

Required token for each DII record that has an image associated
with it. This designates the directory location of the image file(s).
Note that only the “@D @I” sequence is allowed. The “@D @V”
sequence is not recognized.
The following two examples are equivalent:
--Example 1
@D @I\Images\001\
ABC00123.tif
ABC00124.tif
--Example 2
@D @I\Images\
001\ABC00123.tif
001\ABC00124.tif
Note that the directory should be relative to the load file. If this
token is in the record, it must be the last token in the record.
UNC paths in the Image Directory field
(for example, @D \\Server\PFranc\Images) are also recognized, but
hard-coded drive letters are not.

@DATECREATED

CreationDateFT

The date that the file was created. For example:
@DATECREATED 01/04/2003

@DATERCVD

DeliveryTimeFT

Date that the e-mail message was received.

@DATESAVED

ModificationDateFT

Date that the file was saved.

@DATESENT

SubmitTimeFT

Date that the e-mail message was sent.

@EATTACH

Native file is copied
into the filesystem
using the path
provided

Relative path (from the load file location) of the native file to be
loaded. Valid for Attachments.

@EDOC

Native file is copied
into the filesystem
using the path
provided

Same as @EATTACH except for eDocs.
For example
@EDOC \Attachments\ABC000123.xls
Valid for edocs only.

@EMAIL-BODY
@EMAIL-END

Email body is copied
into a file in the file
system.

Body of an e-mail message. Must be a string of text contained
between @EMAIL-BODY and @EMAIL-END. The @EMAIL-END
token must be on its own line.
For example:
@EMAIL-BODY
Bill, This looks excellent. Ted
@EMAIL-END

@FILENAME

Filename of the
native

Original filename of the native file (eDoc, e-mail, or attachment).
For example:
@FILENAME AnnualReport.xls

@FOLDERNAME

FolderNameID

The name of the folder that the e-mail message came from.
For example: @FOLDERNAME \Inbox\Projects\ARProject

@FROM

EmailFrom

From field in an e-mail message.
For example: @FROM Kelly Morris


@FULLTEXT

N/A (text processing
directive)

Determines how OCR is associated with the document. This token
should be placed at the top of the file, before any @T tokens. The
OCR files must have the same names as the images (not
including the extension), and they must be located in the same
directory. Variations:
@FULLTEXT DOC - One text file exists for each database record. The
name of the file must be the same name as the first image file.
@FULLTEXT PAGE - One text file exists for each page.
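
The difference between the two variations can be sketched in Python as follows. The function ocr_text_files is hypothetical, and the .txt extension is an assumption made for illustration; this section states only that the text files share the image names without the extension.

```python
from pathlib import PureWindowsPath


def ocr_text_files(mode, image_files):
    """Return the text file names expected for one record's images."""
    def as_text(name):
        # Same name as the image, with an assumed .txt extension.
        return str(PureWindowsPath(name).with_suffix(".txt"))

    if not image_files:
        return []
    if mode == "DOC":
        # One text file per record, named after the first image file.
        return [as_text(image_files[0])]
    if mode == "PAGE":
        # One text file per page (per image file).
        return [as_text(name) for name in image_files]
    raise ValueError(f"unknown @FULLTEXT variation: {mode!r}")


images = ["ABC00123.tif", "ABC00124.tif"]
print(ocr_text_files("DOC", images))
print(ocr_text_files("PAGE", images))
```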

@FULLTEXTDIR

Link to Full text
Directory

The @FULLTEXTDIR token is a partner to the @FULLTEXT
token. @FULLTEXTDIR allows specifying a directory from which
the full-text will be copied during the import. Therefore, the full-text
files do not have to be located in the same directory as the images
at the time of import. The @FULLTEXTDIR token gives you the
flexibility to import the DII file and full-text files without requiring
you to copy the full-text files to the network first.
For example: @FULLTEXTDIR Vol001\Box001\ocrFiles
The above example shows a relative path. The application
searches for the full-text files in the same location as the DII file
that is imported and follows any subdirectories listed after the
@FULLTEXTDIR token. The @FULLTEXTDIR token applies to all
subsequent records in the DII file until it is changed or turned off.

@HEADER
@HEADER-END

EmailHeader

E-mail header content. The @HEADER-END token must be on its
own line. For example:
@HEADER
@HEADER-END

@INTMSGID

InternetMessageID

Internet message ID. For example: @INTMSGID
<00180c34fe5$bf2d5$050@SKEETER>

@MEDIA

Media

Indicates the type of document. This value is REQUIRED and must
be one of the following: email, attachment, or eDoc. The
application uses this value to determine how to display the
document. For example: @MEDIA eDoc

@MSGID

EntryID

E-mail message ID generated by Microsoft Outlook or Lotus
Notes. For example:
@MSGID 00E8324B3A0A800F4E954B8AB427196A1304012000

@MULTILINE

Any custom field
with multiple lines

Allows carriage returns and multiple lines of text to populate a
specified text field. Text must be between @MULTILINE and
@MULTILINE-END. The @MULTILINE-END token must be on its
own line.
For example:
@MULTILINE FIELDNAME Here is the first line.
Here is the second line.
Here is the third line.
Here is the last line.
@MULTILINE-END
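
The following Python sketch shows one way to read such a block. The function read_multiline is hypothetical. It assumes that the field name is the first word after @MULTILINE, that any text after it on the same line is the first line of the value, and that the block ends at a line containing only @MULTILINE-END.

```python
def read_multiline(lines):
    """Return (field name, text) for one @MULTILINE ... @MULTILINE-END block."""
    parts = lines[0].rstrip("\r\n").split(None, 2) if lines else []
    if len(parts) < 2 or parts[0] != "@MULTILINE":
        raise ValueError("block must start with '@MULTILINE FIELDNAME'")
    field = parts[1]
    text_lines = [parts[2]] if len(parts) > 2 else []
    for line in lines[1:]:
        # The end token must be on its own line.
        if line.strip() == "@MULTILINE-END":
            return field, "\n".join(text_lines)
        text_lines.append(line.rstrip("\r\n"))
    raise ValueError("missing @MULTILINE-END")


block = [
    "@MULTILINE FIELDNAME Here is the first line.",
    "Here is the second line.",
    "Here is the third line.",
    "Here is the last line.",
    "@MULTILINE-END",
]
field, text = read_multiline(block)
print(field)
print(text)
```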

@O

OCRTEXT /
FULLTEXT is copied
into a file in the file
system

This token is used to load full-text documents. The text files can be
located someplace other than the image location as specified by
the @D line of the DII file. There can only be one text file for the
record. The value following the @O should contain the relative
path (from the load file location) of the .txt file. For example:
@O \Text\ABC000123.txt


@OCR
@OCR-END

OCRTEXT is copied
into a file in the file
system

The @OCR and @OCR-END tokens offer the flexibility to include
the full-text (including carriage returns) in the DII file. The
@OCR-END token must appear on a separate line. For example:
@OCR
@OCR-END

@PARENTID

ParentDoc

Parent document ID of an attachment. For example: @PARENTID
ABC000123

@PSTFILE

PSTFilePath and
PSTStoreNameID

The original PST file name and ID. This token takes two values:
1) The name and/or location of the .PST file.
2) The unique ID of the .PST file.
The two values are separated by a comma. The unique ID can be
any unique value that identifies the .PST file. For example:
@PSTFILE EMAIL001\PFranc.pst, PFranc_14April_07
The .PST file’s unique ID (the second value) is populated into the
PST ID field designated in eMail Defaults.
The PST ID value specified by the @PSTFILE token is assigned
to the record it appears in and will apply to all subsequent e-mail
records. The value is applied until either the @PSTFILE token is
turned off by setting the token to a blank value or the value
changes. The @PSTFILE token can occur multiple times in a
single DII file and assign a different value each time. This allows
processing multiple .PST files and presenting the data for all .PST
files in a single DII file.
As a best practice, the @PSTFILE token should be placed above
the @T token.
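
The way a @PSTFILE value carries forward can be sketched in Python. The function track_pst_ids is hypothetical. The sketch assumes that each record begins with a @T line carrying the record ID, that @PSTFILE is placed above the @T token as recommended, and that the unique ID itself contains no comma.

```python
def track_pst_ids(lines):
    """Map each record ID (from @T) to the PST ID in effect for that record."""
    current = None
    assigned = {}
    for raw in lines:
        line = raw.strip()
        if line.startswith("@PSTFILE"):
            value = line[len("@PSTFILE"):].strip()
            if not value:
                # A blank value turns the token off.
                current = None
            elif "," not in value:
                raise ValueError("expected 'path, unique ID'")
            else:
                # Split on the last comma: path first, unique ID second.
                _path, _, pst_id = value.rpartition(",")
                current = pst_id.strip()
        elif line.startswith("@T "):
            assigned[line[3:].strip()] = current
    return assigned


records = track_pst_ids([
    "@PSTFILE EMAIL001\\PFranc.pst, PFranc_14April_07",
    "@T ABC001",
    "@T ABC002",
    "@PSTFILE",
    "@T ABC003",
])
print(records)
```

In this sketch, the first two records receive the same PST ID, and the third receives none because the token was set to a blank value before it.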

@READ

IsUnread (stores 0 if
Y and 1 if N)

Notes whether the e-mail message was read. For example:
@READ Y
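
Because the IsUnread field stores the opposite sense of the @READ value, the mapping is easy to get backwards. The following Python sketch, with a hypothetical function name, spells it out. Accepting lowercase input is an assumption.

```python
def is_unread(read_value):
    """Convert an @READ value (Y or N) to the stored IsUnread value."""
    value = read_value.strip().upper()
    if value == "Y":
        return 0  # message was read, so it is not unread
    if value == "N":
        return 1  # message was not read, so it is unread
    raise ValueError(f"@READ expects Y or N, got {read_value!r}")


print(is_unread("Y"))
```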

@RELATED

LinkedDocs

The document IDs of related documents.
For example: @RELATED WGH000006

@SOURCE

Source

Custodian of the data. You can quickly filter documents by this
field. For example: @SOURCE Joe Custodian

@SUBJECT

Subject

The subject of an e-mail message. For example: @SUBJECT RE:
Town Issues

@TIMECREATED

CreationDateFT

Time that the file, e-mail, or eDoc was created.

@TIMERCVD

DeliveryTimeFT

Time that the e-mail message was received.

@TIMESAVED

ModificationDateFT

Time that the file, e-mail, or eDoc was last saved.

@TIMESENT

SubmitTimeFT

Time that the e-mail message was sent.

@TO

EmailTo

To field in an e-mail message. For example: @TO Conner Stevens

@UUID

UUID

Customer-specific and unique identifier for a record (not used
internally by the application).
For example: @UUID AE01R95


Chapter 5

Analyzing Document Content

Using Cluster Analysis
About Cluster Analysis
You can use Cluster Analysis to group Email Threaded data and Near Duplicate data together for quicker review.
Note: If you activated Cluster Analysis as a processing option when you created the project, cluster analysis will
automatically run after processing data and will not need to be run manually.
Cluster Analysis is performed on the following file types:
• Documents (including PDFs)
• Spreadsheets
• Presentations
• Emails
Cluster Analysis is also performed on text extracted from OCR if the OCR text comes from a PDF. Cluster
Analysis cannot be performed on OCR text extracted from a graphic.

To perform cluster analysis
1. Load the email thread or near duplicate data using Evidence Processing or Import.
2. On the Home page, in the Project List panel, click the Add Evidence button next to the project.
3. In the Add Data dialog, click Cluster Analysis.
4. Click Start.
   You can view the similarity results in the Similar Panel in Review.
   The data for the email thread appears in the Conversation tab in Project Review. The data for Near
   Duplicate appears in the Related tab in Project Review.
   An entry for cluster analysis will appear in the Work List.

Words Excluded from Cluster Analysis Processing
Noise words, such as “if,” “and,” “or,” are excluded from Cluster Analysis processing. The following words are
excluded in the processing:
a, able, about, across, after, ain't, all, almost, also, am, among, an, and, any, are, aren't, as, at, be, because,
been, but, by, can, can't, cannot, could, could've, couldn't, dear, did, didn't, do, does, doesn't, don't, either, else,
ever, every, for, from, get, got, had, hadn't, has, hasn't, have, haven't, he, her, hers, him, his, how, however, i, if,
in, into, is, isn't, it, it's, its, just, least, let, like, likely, may, me, might, most, must, my, neither, no, nor, not, of, off,
often, on, only, or, other, our, own, rather, said, say, says, she, should, shouldn't, since, so, some, than, that, the,
their, them, then, there, these, they, they're, this, tis, to, too, twas, us, wants, was, wasn't, we, we're, we've, were,
weren't, what, when, where, which, while, who, whom, why, will, with, would, would've, wouldn't, yet, you, you'd,
you'll, you're, you've, your
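
To illustrate the effect of excluding noise words, the following Python sketch removes them from a piece of text before any further analysis. NOISE_WORDS holds only a short excerpt of the list above, and the tokenization (lowercase letters and apostrophes) is an assumption; the application's own tokenizer is not documented here.

```python
import re

# Excerpt of the noise word list above, for illustration only.
NOISE_WORDS = {"a", "able", "about", "and", "if", "is", "of", "or", "the", "to"}


def content_words(text):
    """Return the words of text in order, with noise words removed."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in NOISE_WORDS]


print(content_words("The price of oil is able to rise"))
```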

Filtering Documents by Cluster Topic
Documents processed with Cluster Analysis can be filtered by the content of the documents in the evidence. The
Cluster Topic filter is created in Review under the Document Contents filter from data processed with Cluster
Analysis. Data included in the Cluster Topic is taken from the following types of documents: Word documents
and other text documents, spreadsheets, emails, and presentations.
See the following for details about filtering data with the Cluster Topic filter:
• Prerequisites for Cluster Topic
• How Cluster Topic Works
• Filtering with Cluster Topic
• Considerations of Cluster Topic

Prerequisites for Cluster Topic
Before Cluster Topic filter facets can be created, the data in the project must be processed by Cluster Analysis.
The data can be processed automatically when Cluster Analysis is selected in the Processing options or you can
process the data manually by performing Cluster Analysis in the Add Evidence dialog.
See Evidence Processing and Deduplication Options on page 120.

How Cluster Topic Works
The application uses an algorithm to cluster the data. The algorithm accomplishes this by creating an initial set
of cluster centers called pivots. The pivots are created by sampling documents that are dissimilar in content. For
example, a pivot may be created by sampling one document that may contain information about children’s books
and sampling another document that may contain information about an oil drilling operation in the Arctic. Once
this initial set of pivots is created, the algorithm examines the entire data set to locate documents that contain
content that might match the pivot’s parameters. The algorithm continues to create pivots and to cluster
documents around the pivots. As more data is added to the project and processed, the algorithm uses the
additional data to create more clusters.
Word frequency or occurrence count is used by the algorithm to determine the importance of content within the
data set. Noise words that are excluded from Cluster Analysis processing are also not included in the Cluster
Topic pivots or clusters.
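
The general idea of pivot-based clustering can be sketched in Python. This is not the application's algorithm, which is not documented here. It is a minimal illustration that assumes word-count vectors, cosine similarity, and an arbitrary similarity threshold of 0.2. All names in the sketch are hypothetical.

```python
import math
from collections import Counter

NOISE = frozenset({"the", "a", "of", "and", "in", "about"})  # small excerpt


def vectorize(text):
    """Count word occurrences, skipping noise words."""
    return Counter(w for w in text.lower().split() if w not in NOISE)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(docs, threshold=0.2):
    """Pick dissimilar documents as pivots, then group every document with its closest pivot."""
    vectors = [vectorize(d) for d in docs]
    pivots = []
    for i, vec in enumerate(vectors):
        # A document becomes a pivot if it is dissimilar to every existing pivot.
        if all(cosine(vec, vectors[p]) < threshold for p in pivots):
            pivots.append(i)
    clusters = {p: [] for p in pivots}
    for i, vec in enumerate(vectors):
        best = max(pivots, key=lambda p: cosine(vec, vectors[p]))
        clusters[best].append(i)
    return clusters


docs = [
    "children books stories children",
    "oil drilling arctic operation",
    "books for children reading",
    "arctic oil rig drilling",
]
print(cluster(docs))
```

In this sketch, the first two documents share no words, so each becomes a pivot, and the remaining documents join the pivot they most resemble.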


Filtering with Cluster Topic
Once data has been processed by Cluster Analysis and facets created under the Cluster Topic filter, you can
filter the data by these facets.

Cluster Topic Filters

The available facets are named for the cluster terms that were created. Documents containing these terms are
included in the cluster and are displayed when the filter is applied. Topics are composed of two-word phrases
that occur in the documents, which makes the topics more legible.
The UNCLUSTERED facet contains any documents that are not included under a Cluster Topic filter.
For more information, see Filtering Data in Case Review in the Reviewer Guide.

Considerations of Cluster Topic
You need to be aware of the following considerations when examining the Cluster Topic filters:
• Not all data will be grouped into clusters at once. The application creates clusters in an incremental
  fashion in order to return results as quickly as possible. Since the application is continually creating
  clusters, the Cluster Topic facets are continually updated.
• Duplicate documents are clustered together as they match a specific cluster. However, if a project is
  particularly large, duplicate documents may not be included as part of any cluster. This is to avoid
  performance issues. You can examine any duplicate documents or any documents not included in a
  cluster by applying the UNCLUSTERED facet of the Cluster Topic filter.


Using Entity Extraction
About Entity Extraction
You can extract entity data from the content of files in your evidence and then view those entities.
You can extract the following types of entity data:
• Credit Card Numbers
• Email Addresses
• People
• Phone Numbers
• Social Security Numbers

The data is extracted from the body of documents, not from the metadata.
For example, email addresses that are in the To: or From: fields in emails are already extracted as metadata
and available for filtering. This option extracts email addresses that are contained in the body text of an email.
Using entity extraction is a two-step process:
1. Process the data with the Entity Extraction processing options enabled.
   You can select which types of data to extract.
2. View the extracted entities in Review.

The following tables provide details about the type of data that is identified and extracted:

Type
Credit Card Numbers

Examples
Numbers in the following formats will be extracted as credit card numbers:

16-digit numbers used by VISA, MasterCard, and Discover in the following formats.
For example,
• 1234-5678-9012-3456 (segmented by dashes)
• 1234 5678 9012 3456 (segmented by spaces)
Not:
• 1234567890123456 (no segments)
• 12345678-90123456 (other segments)

15-digit numbers used by American Express in the following formats.
For example,
• 1234-5678-9012-345 (segmented by dashes)
• 1234 5678 9012 345 (segmented by spaces)

Notes:
Other formats, such as 14-digit Diners Club numbers, will not
be extracted as credit card numbers.
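
The formats in this table can be approximated with a regular expression. The following Python sketch is an illustration only, not the pattern the application uses. It assumes that the same separator (dash or space) is used between all segments and that the match cannot be surrounded by further digits or dashes.

```python
import re

# Three 4-digit segments plus a final segment of 3 digits (15-digit form)
# or 4 digits (16-digit form), all joined by the same separator.
CARD_RE = re.compile(r"(?<![\d-])\d{4}([- ])\d{4}\1\d{4}\1\d{3,4}(?![\d-])")


def find_card_numbers(text):
    """Return every substring of text that matches the segmented formats."""
    return [match.group(0) for match in CARD_RE.finditer(text)]


print(find_card_numbers("Paid with 1234-5678-9012-3456, not 1234567890123456."))
```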


Type
Email Addresses

Examples
Text in standard email format, such as jsmith@yahoo.com, will be extracted.
Note:
Email addresses that are in the To: or From: fields in emails are
already extracted as metadata and available for filtering. This
option will extract email addresses that are contained in the
body text of an email.

Type
People

Examples
Text that is in the form of proper names will be extracted as people.
Proper names in the content are compared against personal
names from 1880 - 2013 U.S. census data in order to validate
names.

Type
Phone Numbers

Examples
Numbers in the following formats will be extracted as phone numbers:

Standard 7-digit
For example:
• 123-4567
• 123.4567
• 123 4567
Not:
• 1234567 (not segmented)

Standard 10-digit
For example:
• (123)456-7890
• (123)456 7890
• (123) 456-7809
• (123) 456.7809
• +1 (123) 456.7809
• 123 456 7809
Not:
• 1234567890 (not segmented)
Note: A leading 1 for long distance, or 001 for international, is
not included in the extraction; however, a +1 is.


International
Some international formats are extracted, for example,
• +12-34-567-8901
• +12 34 567 8901
• +12-34-5678-9012
• +12 34 5678 9012
Not:
• 12345678901 (not segmented)
Other international formats are not extracted, for example,
• 123-45678
• (10) 69445464
• 07700 954 321
• (0295) 416,72,16
Notes:
Be aware that you may get some false positives.
For example, a credit card number 5105-1051-051-5100 may also
be extracted as the phone number 510-5100.
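
The standard 7-digit and 10-digit formats listed above can be approximated with a regular expression. The following Python sketch is an illustration only and does not cover the international formats. It assumes that a dash, period, or space is accepted wherever a separator appears in the examples, which may be more permissive than the application.

```python
import re

PHONE_RE = re.compile(
    r"(?<!\d)"                                   # not preceded by a digit
    r"(?:(?:\+1 )?(?:\(\d{3}\) ?|\d{3}[-. ]))?"  # optional +1 and area code
    r"\d{3}[-. ]\d{4}"                           # segmented 7-digit number
    r"(?!\d)"                                    # not followed by a digit
)


def find_phone_numbers(text):
    """Return every substring of text that matches the segmented formats."""
    return [match.group(0) for match in PHONE_RE.finditer(text)]


print(find_phone_numbers("Call (123)456-7890 or 123.4567, not 1234567890."))
```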

Type
Social Security Numbers

Examples
Numbers in the following formats will be extracted as Social Security Numbers:
• 123-45-6789 (segmented by dashes)
• 123 45 6789 (segmented by spaces)
The following will not be extracted as Social Security Numbers:
• 123456789 (not segmented)
• 12345-6789 (other segments)
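
As with the other entity types, the Social Security Number formats can be approximated with a regular expression. This Python sketch is an illustration only. It assumes that both separators in a number are the same character and that the match cannot be surrounded by further digits or dashes.

```python
import re

# 3-2-4 digit segments joined by the same separator (dash or space).
SSN_RE = re.compile(r"(?<![\d-])\d{3}([- ])\d{2}\1\d{4}(?![\d-])")


def find_ssns(text):
    """Return every substring of text that matches the segmented formats."""
    return [match.group(0) for match in SSN_RE.finditer(text)]


print(find_ssns("SSN 123-45-6789 on file; 123456789 is ignored."))
```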

Enabling Entity Extraction
To enable Entity Extraction processing options
1. Enable Entity Extraction when creating a project and configuring processing options.
   See Evidence Processing and Deduplication Options on page 120.

Viewing Entity Extraction Data
To view extracted entity data
1. For the project, open Review.
2. In the Facet pane, expand the Document Content node.
3. Expand the Document Content category.
4. Expand a sub-category, such as Credit Card Numbers or Phone Numbers.
5. Apply one or more facets to show the files in the Item List that contain the extracted data.


Chapter 6

Editing Evidence

Editing Evidence Items in the Evidence Tab
Users with Create/Edit project admin permissions can view and edit evidence for a project using the Evidence
tab on the Home page.

To edit evidence in the Evidence tab
1. Select a project from the Project List panel.
2. Click on the Evidence tab.
3. Select the evidence item you want to edit and click the Edit button.
4. In the External Evidence Details form, edit the desired information.


Evidence Tab
Users with permissions can view information about the evidence that has been added to a project. To view the
Evidence tab, users need one of the following permissions: Administrator, Create/Edit Project, or Manage
Evidence.

Evidence Tab

Elements of the Evidence Tab
Element

Description

Filter Options

Allows the user to filter the list.

Evidence Path List

Displays the paths of evidence in the project. Click the column headers to sort by the
column.

Refresh

Refreshes the Evidence Path List.


Columns

Click to adjust what columns display in the Evidence Path List.

External Evidence Details

Includes editable information about imported evidence. Information includes:
• The path from which the evidence was imported
• A description of the project, if you entered one
• The evidence file type
• What people were associated with the evidence
• Who added the evidence
• When the evidence was added

Processing Status

Lists any messages that occurred during processing.


Source Exif Data:

File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.6
Linearized                      : Yes
Author                          : bedwards
Create Date                     : 2015:03:02 14:35:35Z
Modify Date                     : 2015:03:02 14:38:10-07:00
XMP Toolkit                     : Adobe XMP Core 4.2.2-c063 53.352624, 2008/07/30-18:12:18
Creator Tool                    : FrameMaker 9.0
Metadata Date                   : 2015:03:02 14:38:10-07:00
Producer                        : Acrobat Distiller 9.0.0 (Windows)
Format                          : application/pdf
Title                           : Loading Data.book
Creator                         : bedwards
Document ID                     : uuid:b5f55c00-ebc5-499e-ad3c-b6d2b8479548
Instance ID                     : uuid:a1334d9b-81bd-477b-8062-c95b4c424497
Page Mode                       : UseOutlines
Page Count                      : 52
