Email Archive Migration Manager NEAMM User Guide V1.2.3

Email Archive Migration Manager

User Guide
Version 1.2
December 2017

DISCLAIMER
© 2017 Nuix. All rights reserved.
This publication is intended for informational purposes only. The information contained
herein is provided “as-is” and is subject to change without notice. Although reasonable care
has been taken to ensure that the facts stated in this publication are accurate and that the
opinions expressed are fair and reasonable, no representation or warranty, express or
implied, is made as to the fairness, accuracy or completeness of the information or opinions
contained herein, and no reliance should be placed on such information or opinions. Neither
Nuix nor any of its respective members, directors, officers or employees nor any other person
accepts any liability whatsoever for any loss arising from any use of such information or
opinions or otherwise arising in connection with this publication. Furthermore, this
publication contains the confidential and/or proprietary information of Nuix which may not be
reproduced, redistributed, or published in any form or by any means, in whole or in part,
without the express prior written consent of Nuix. The use, reproduction, and/or distribution
of any Nuix software described in this publication requires an applicable software license.

Revision History:
The following changes have been made to this document:

Version Number   Revision Date   Author                Description
1.2.3            December 2017   Alex Chatzistamatis   Initial release

Content
INTRODUCTION
  About Email Archive Migration Manager
  About this Guide
  Document Conventions
SYSTEM REQUIREMENTS
  Architecture
  Hardware
    CPU
    Memory (RAM)
    Storage
    Virtualization
    Example System Configurations
  Software
INSTALLATION
  Files
  Setup
GETTING STARTED
  Main Menu
    Interface Overview
  Global Settings
    Interface Overview
    Nuix License
    Nuix Directories
    Lightspeed Settings
    Exchange Web Services
    Database Settings
  Extracting Email Data from Legacy Email Archives
    Interface Overview
    Working with Veritas Enterprise Vault
    Working with EMC EmailXtender/SourceOne
    Working with HP/Autonomy Zantaz EAS
    Working with Daegis AXS-One
  Converting Legacy Email Data
    Interface Overview
    Performing a User NSF to User PST Conversion
  Ingesting Email Data into Exchange
    Interface Overview
    Working with Exchange Web Services
  Extracting Email Data from Exchange
    Interface Overview
    Working with Exchange Web Services
APPENDIX I: BACKUP/RESTORE SQL DATABASES
  SQL Backup Workflow
    Accessing SQL Server Management
    Targeting Desired Archive Database(s)
    Configuring Desired Archive Database(s)
    Assigning Database Location and Confirming Drive Space
    Correctly naming the Database and choosing the proper File Extension
  SQL Restore Workflow
    Accessing SQL Server Management
    Restoring Desired Archive Database(s)
    Configuring Desired Archive Database(s)
APPENDIX II: ARCHIVE SQL QUERIES
  Veritas Enterprise Vault
  EMC EmailXtender
  EMC SourceOne
  HP/Autonomy Zantaz EAS
APPENDIX III: ARCHIVE METADATA
  Veritas Enterprise Vault
  EMC EmailXtender/Source
  HP/Autonomy Zantaz EAS
  Daegis AXS-One
APPENDIX IV: EWS BEST PRACTICES
  Leveraging Azure Virtual Machines
  Reduced Number of Nuix Workers
  EWS Throttling Workarounds
ABOUT NUIX


Introduction
Welcome to the Nuix Email Archive Migration Manager User Guide.

About Email Archive Migration Manager
Nuix Email Archive Migration Manager (NEAMM) provides functionality to efficiently
manage legacy email archive migration projects including:



• Extracting email-based content from legacy email archive platforms such as
  Veritas Enterprise Vault, EMC EmailXtender/SourceOne, HP/Autonomy Zantaz EAS,
  and Daegis AXS-One.
• Converting legacy NSF data to modern email formats such as PST, MSG or EML.
• Ingesting data to Exchange on-premise or Exchange Office 365
  mailboxes/personal archives.
• Reporting real-time statistics and progress of the migration project.
• Extracting data from Exchange on-premise or Exchange Office 365
  mailboxes/personal archives.

Once set up, NEAMM can be configured to perform these actions on a single system.
NEAMM can also be installed and managed on multiple systems to expedite the progress
of the migration project.

About this Guide
This guide provides step-by-step instructions to help you configure and use NEAMM.

Document Conventions
The following conventions are used in this guide:
//This is a line of code
This is a Menu > Option.

Note

This is a note.

Tip

Tips, written in highlighted text boxes, are useful pieces
of information on how to put what is in the guide into
practice, or they provide an example.

Warning

This symbol is used to indicate critical information
that must be reviewed.


System Requirements
NEAMM has been designed to be as lightweight and portable as possible while working
on the most common Windows Operating Systems currently available. For more details,
please see the detailed sections below.

Architecture
The architecture of NEAMM installed in an on-premise environment may look like
this:
[Figure: NUIX Migration Solution architecture diagram. Components shown: Legacy
Email Archive with Archive Source Data (EV .dvs, EMC .emx, EAS .eas, AXS-One .pgi,
EMC Centera); NEAMM Migration 1 (SQLite, SQL, Exchange Web Services, Redis*);
Nuix Management Server (licensing); Shared Storage / Storage Solution; Exports
(PSTs or EML); On-Prem/O365 Mailbox/Archive; NEAMM Engineer; Network Switch
(Management) and Network Switch (Data) on the Management Network and Data
Network; iSCSI SAN (optional).]

Server HW Specs
• 2x 6-Core processors
• 256GB Memory
• OS – 2x300GB 15K SAS or 1x256GB SSD (preferred)
• Temp – 4x500GB 15K SATA RAID0 or 2x500GB SSD RAID0 (preferred)
• Logs – 2x500GB 10K SATA RAID5
• SQL DB – Required for Legacy Email Archive Migration
• SQLite DB – Required for NEAMM
• Redis* - NOT Required for Legacy Email Archive Migration

Note – The NEAMM Archive Server(s) will require SQL Server
to support the .bak files needed to extract the associated data
from the various legacy email archive platforms.


Hardware
Please refer to the section below for sizing the hardware required for NEAMM.
Several factors must be taken into consideration for appropriate sizing including,
but not limited to: source data type, source data volume, expected processing
throughput and the project timeline. Please speak to your Nuix Account Manager and
Nuix Solutions Consultant for more details.
This section outlines the system configuration required to extract optimal
performance from Nuix systems. This includes specifications for:
• CPU
• Memory (RAM)
• Storage
• Latency
• Virtualization

CPU
The number of physical CPU cores available should always be equal to or greater
than the number of Nuix workers. When comparing CPUs, faster clock speeds should be
preferred over additional cores.
The current processor architecture of Advanced Micro Devices (AMD) is based on
multiple, low-efficiency processor cores working in tandem. Nuix Workstation is
optimized and licensed around fewer, faster processor cores like those found in
Intel processors. In our most recent benchmarks, Intel Xeon series processors
provide more than double the per-core (per-worker) performance of comparable AMD
Opteron series processors. Because of this, Intel CPUs provide better value and
performance with Nuix Workstation. We will continue to test and evaluate both AMD
and Intel offerings as they become available.

Memory (RAM)
NEAMM runs a number of independent processes called Workers. Each worker runs as a
separate system process to perform the required task with the main application
acting as a broker to distribute the items being actioned.
The number of workers available to you is determined by your project.
When determining the amount of RAM needed for your system:
• A minimum of 8 GB of RAM for each Nuix worker is acceptable; however, Nuix
  recommends allocating 16 GB of RAM for each Nuix worker for optimal
  performance.
• 1 GB of RAM + (1 GB * Number of Workers) should be available to the main
  application.
• A minimum of 4 GB should be left unallocated to be used by the Operating
  System.
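
The three rules above can be added up to get a rough total for a given worker count. The following is a minimal sketch in Python, not part of NEAMM; the function name `estimate_ram_gb` and the returned keys are made up for illustration, and the sum simply applies each rule literally.

```python
def estimate_ram_gb(workers: int, per_worker_gb: int = 16) -> dict:
    """Add up the three RAM sizing rules from this section (values in GB).

    per_worker_gb: 8 is the stated minimum, 16 the stated recommendation.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if per_worker_gb < 8:
        raise ValueError("the guide gives 8 GB per worker as the minimum")

    worker_total = workers * per_worker_gb  # RAM reserved for the Nuix workers
    application = 1 + 1 * workers           # 1 GB + (1 GB * number of workers)
    operating_system = 4                    # minimum left unallocated for the OS

    return {
        "workers": worker_total,
        "application": application,
        "operating_system": operating_system,
        "total": worker_total + application + operating_system,
    }


# Example: compare the minimum and recommended allocations for 4 workers.
for label, per_worker in (("minimum", 8), ("recommended", 16)):
    print(label, estimate_ram_gb(4, per_worker))
```

Treat the result as a planning floor only; source data type and volume can push real requirements higher.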

Storage
NEAMM processing is very I/O intensive, with a number of processes using your
storage simultaneously. At a high level, the storage locations involved are:
• Worker Temporary Directory – Copies of the files to be processed are saved to
  the worker temp directory for processing. This ensures the original data is
  preserved and not altered during processing.
• Source Data – The original source location from which the data is unpacked
  into the temporary directory ready for processing.
• Export Location – The location used to export data from the case.

The type of drives and their configuration will play a major role in the overall
processing and export performance.

Storage Types
The cost of storage increases as the performance of the drive increases. The
following table provides examples of the I/O Operations per second (IOPS) each type
is capable of.

Drive Type                            IOPS
7.2k SATA                             100
10k SAS                               150
15k SAS                               200
Desktop/Laptop MLC SSD (cheaper)      2,000-10,000
Server MLC SSD                        10,000-50,000
Write Intensive SLC SSD (expensive)   100,000
PCIe IO Card                          1,000,000+

Solid State Drives (SSDs) provide excellent performance, especially when placed into
a Redundant Array of Inexpensive Disks (RAID) configuration; however, using SSDs
for all storage needs may not be necessary. It is recommended to always
balance CPU, RAM and Storage requirements.

Storage Optimization
Maximizing the potential of a suitably powerful system requires a well configured
storage solution. Fast disk speeds and dedicated logical units for different Nuix
locations are vital to achieving the best possible performance. Storage size will
depend heavily on project scope and the volume of data.
To achieve optimal performance from your setup, be sure to keep separate locations
(LUNs) for the following storage areas:
Worker Temp Directory
• Very fast disk (SSD if possible or RAID 0 array) with low latency.
• Minimum capacity: largest single job size x 5 (for example, 500 GB to process
  100 GB).
• No redundancy, as the required data is only accessed at the time of processing
  or export and then deleted.

Source Data
• Heavy Read activity during processing or export, light Read activity during a
  review.
• RAID 5/6 provides high Read performance and maximum storage capacity with
  moderate redundancy.

Export Directory
• If you plan to move data to another location (off site, Review system and so
  on) after Export, you can maximize export performance by using a RAID 0
  storage configuration (similar to Worker Temporary Directory).
• If the Export location holds data for an extended period, a redundant storage
  configuration with good Write performance is recommended.

Finally, if you are looking to configure your storage for optimal Nuix performance,
you can also look at array stripe size. A smaller stripe size improves performance
for reading or writing smaller files. Conversely, if you deal primarily with larger
files, a larger stripe size can significantly improve performance for both
reading and writing.
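
The Worker Temp Directory rule in this section (largest single job size x 5) can be checked against the free space on a candidate volume before a job starts. This is a hedged sketch in Python rather than anything NEAMM provides; the helper names `required_temp_gb` and `has_enough_temp_space` are hypothetical.

```python
import shutil

GIB = 1024 ** 3  # bytes per gibibyte


def required_temp_gb(largest_job_gb: float, multiplier: float = 5.0) -> float:
    """Worker temp capacity rule from this guide: largest single job size x 5."""
    if largest_job_gb < 0:
        raise ValueError("largest_job_gb must not be negative")
    return largest_job_gb * multiplier


def has_enough_temp_space(temp_dir: str, largest_job_gb: float) -> bool:
    """Return True if temp_dir currently has enough free space for the job."""
    free_gb = shutil.disk_usage(temp_dir).free / GIB
    return free_gb >= required_temp_gb(largest_job_gb)


# Example: a 100 GB job needs 500 GB of worker temp space.
print(required_temp_gb(100))
```

Running such a check per batch is a cheap way to catch an undersized temp volume before workers start failing mid-job.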

Example Configuration
The following table shows an example configuration which provides a compromise
between cost and performance and would be suitable for high levels of processing.

Usage                        Disk Type                                          Configuration
Worker Temporary Directory   Local SSD                                          RAID 0
Source Data                  SAN with Fiber Connection or 10 Gigabit Ethernet   -
Export                       Local SSD                                          RAID 0

Latency
Latency represents a delay in sending or receiving data between two devices. In the
case of a Nuix system and a storage volume it is seeking to Read or Write to,
latency can significantly impact performance. Network connections to storage,
especially Windows File Share connections (NTFS/SMB) have inherently higher latency
than directly attached devices (DAS).
To minimize latency, Nuix recommends:
• Configuring fewer workers with more RAM per worker when there is high
  latency with large/spanned/compressed file types. This helps minimize the
  amount of disk traffic involved.
• Reviewing the array stripe size. A smaller stripe size improves performance
  in reading or writing smaller files. Conversely, for primarily larger
  files, a larger stripe size can significantly improve performance for both
  reading and writing.

A Nuix system utilizes all the available RAM, CPU, and I/O resources when processing
data, up to the limits of your license. Beyond that, large amounts of RAM provide
an excellent environment for a multi-user concurrent review system.
As noted above, disk configuration is an often-overlooked factor in maximizing
processing performance.

Virtualization
To install Nuix in a Virtual Machine (VM) environment, ensure the VM host has
better hardware than the proposed Nuix VM specification to account for the 5-10%
virtualization performance degradation.

Processor Resource Pool
Directly mapping the VM’s Virtual Cores to the Host’s Physical Cores is preferred,
however, it is recommended to maintain 2 – 4 Physical Cores without a Virtual Core
Mapping in order for the VM Host OS and VM Host Software to function properly. It
is strongly recommended to isolate the Nuix VM to its own resource pool or set its
Share value to “High” if the VM Host is running multiple VM Guests.

Virtual Memory Resource Pool
For VMware vSphere, Nuix recommends that:
• The VM Host should have at least 25% more RAM (not page/disk swap) than the
  RAM allocated to the Nuix VM.
• The VM Host should be capable of hardware-assisted memory virtualization.
• The Nuix VM should be configured with a memory Share value of “High” to
  maximize its access to VM Host memory resources.
• The Nuix VM should be configured with a Reservation level of at least 50% of
  its total available memory allocation.
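
The two numeric recommendations above (25% more host RAM, at least 50% reservation) reduce to simple arithmetic. The sketch below is illustrative only; `vm_memory_plan` is a made-up helper name, and it does not read or change any hypervisor settings.

```python
def vm_memory_plan(guest_ram_gb: float) -> dict:
    """Apply the VM memory recommendations from this section (values in GB)."""
    if guest_ram_gb <= 0:
        raise ValueError("guest_ram_gb must be positive")
    return {
        "guest_ram_gb": guest_ram_gb,
        # Host should have at least 25% more physical RAM than the Nuix VM.
        "host_ram_min_gb": guest_ram_gb * 1.25,
        # Reservation should be at least 50% of the VM's memory allocation.
        "reservation_min_gb": guest_ram_gb * 0.50,
    }


# Example: a Nuix VM with 16 GB of RAM.
print(vm_memory_plan(16))
```

A 16 GB guest gives a 20 GB host minimum, which is consistent with the Virtual Servers table under Example System Configurations.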

Virtual Hard Disk Drive Configuration
The Virtual Hard Disk Drive (VHD) must be backed by storage capable of sustaining
the minimum average Input/Output Operations Per Second (IOPS) for your
configuration (see Example System Configurations).

Note

If the Logical Unit Number (LUN) hosting the VHD for the
Nuix VM is shared with other VM VHDs, the Nuix VM Share
Level should be set to High or equivalent. This helps the
Nuix VM receive higher priority when reading and writing to
the LUN than other VMs accessing VHDs on that LUN.

Example System Configurations
The following tables outline sample configurations which provide a generic starting
point for designing a Nuix processing environment.

Note

Performance will vary based on hardware configuration,
source data type, source data volume and Case settings
selected at the time of processing.


Physical Servers
Nuix License   CPU Cores   Minimum / Recommended RAM   Minimum non-OS Storage
2 Workers      2 i7/Xeon   16 GB / 32 GB               300 READ & 300 WRITE IOPS DAS
4 Workers      4 i7/Xeon   32 GB / 64 GB               600 READ & 600 WRITE IOPS DAS
8 Workers      8+ Xeon     64 GB / 128 GB              1200 READ & 1200 WRITE IOPS DAS
12 Workers     12+ Xeon    96 GB / 192 GB              1800 READ & 1800 WRITE IOPS DAS
16 Workers     16+ Xeon    128 GB / 256 GB             2400 READ & 2400 WRITE IOPS DAS
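
Every row of the Physical Servers table above scales linearly with the worker count: one core, 8 GB (minimum) or 16 GB (recommended) of RAM, and 150 READ plus 150 WRITE IOPS per worker. The sketch below encodes that pattern in Python. The names are hypothetical, and using it for worker counts not listed in the table is an extrapolation, not a Nuix recommendation.

```python
from dataclasses import dataclass

# Per-worker figures read off the Physical Servers table.
MIN_RAM_GB_PER_WORKER = 8
RECOMMENDED_RAM_GB_PER_WORKER = 16
IOPS_PER_WORKER = 150  # applies to READ and to WRITE separately


@dataclass(frozen=True)
class PhysicalSizing:
    workers: int
    min_physical_cores: int
    min_ram_gb: int
    recommended_ram_gb: int
    min_read_iops: int
    min_write_iops: int


def size_physical_server(workers: int) -> PhysicalSizing:
    """Return the table's sizing pattern for a given number of Nuix workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return PhysicalSizing(
        workers=workers,
        min_physical_cores=workers,  # cores >= workers (see CPU section)
        min_ram_gb=workers * MIN_RAM_GB_PER_WORKER,
        recommended_ram_gb=workers * RECOMMENDED_RAM_GB_PER_WORKER,
        min_read_iops=workers * IOPS_PER_WORKER,
        min_write_iops=workers * IOPS_PER_WORKER,
    )


# Example: reproduce the 8-worker row.
print(size_physical_server(8))
```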

Virtual Servers
Nuix License   VM Host CPU Cores   VM Host Minimum / Recommended RAM   VM Guest Minimum / Recommended RAM   Minimum non-OS Storage
2 Workers      4 i7/Xeon           20 GB / 40 GB                       16 GB / 32 GB                        300 READ & 300 WRITE IOPS DAS
4 Workers      8 i7/Xeon           40 GB / 80 GB                       32 GB / 64 GB                        600 READ & 600 WRITE IOPS DAS
8 Workers      12+ Xeon            80 GB / 160 GB                      64 GB / 128 GB                       1200 READ & 1200 WRITE IOPS DAS
12 Workers     16+ Xeon            120 GB / 240 GB                     96 GB / 192 GB                       1800 READ & 1800 WRITE IOPS DAS
16 Workers     20+ Xeon            160 GB / 320 GB                     128 GB / 256 GB                      2400 READ & 2400 WRITE IOPS DAS

Software
NEAMM requires that the following prerequisites are met:
• Operating System (OS):
  o Server: Windows Server 2008 (minimum), Windows Server 2012/2016
    (recommended)
  o Desktop: Windows 7 (minimum), Windows 10 (recommended)
• Nuix:
  o Nuix Management Server: 7.0 and above
  o Nuix Workstation: 7.0 and above
• Other:
  o Microsoft .NET Framework 3.5+
  o Microsoft SQL Server 2008/2012/2016 (for Legacy Archives)
  o Redis: 3.2
    Redis is not required for all deployments. Please consult with
    your Nuix representative to understand if this is necessary for
    your project.


Installation
Files
NEAMM requires installation and consists of the following files/folders:
• Application Files
• NEAMM.application
• Setup.exe
These files and folders should be copied together into a single folder on the
system where NEAMM will be installed.

Tip

NEAMM can be installed on any system which:
- can obtain a local Desktop license
- can obtain a Server license using Nuix Management Server
- has Nuix Workstation installed
- has the necessary access to the email archive source
  data/databases
- has access to the Exchange on-premise/Office 365
  mailbox/personal archives.


Setup
In order to install NEAMM, please perform the following steps:
1. Launch “setup.exe”.
2. Click Install.
3. When installation completes, you will see the NEAMM Main Menu.

Note

Before performing any migration work, you must configure the
embedded SQLite database. Please review the NEAMM
Global Settings section to perform this mandatory step.


Getting Started
Before you start using Nuix, we will go through the primary components of the user
interface, the menu systems and all other major components.
Once you are familiar with the interface and layout, it will be easy for you to
complete migration tasks.

Main Menu
Interface Overview
Upon launching NEAMM, you will see the main menu, which is broken down into several
components. Aside from Global Settings, each of the large buttons has a primary
function, with a specific workflow associated with it.


#   Name                       Description
1   Global Settings            The component which controls settings globally for each of the different components that make up NEAMM.
2   Email Archive Extraction   Allows you to extract email data to various formats from the most common legacy email archive systems.
3   Email Conversion           Allows you to convert email data from one format to another.
4   EWS Ingestion              Allows you to ingest email data into an Exchange mailbox or archive.
5   EWS Extraction             Allows you to extract email data from an Exchange mailbox or archive.

Global Settings
Interface Overview
Global Settings allows you to configure and control many different aspects required
for a migration ranging from Nuix licensing to directories, processing, extraction
and more. These settings must be configured every time NEAMM is launched and should
be checked prior to starting any migration work.
NEAMM Global Settings are saved to a standard XML file, which can be reloaded into
NEAMM whenever it is launched.


#   Name                Description
1   Settings Location   Enter a location to save your re-usable Settings XML. Use the ellipsis to browse the file system.
2   Reload Settings     Reload an existing Settings XML file.
3   Save Settings       Save a Settings XML file that can be reused.
4   OK                  Accept your Global Settings.
5   Cancel              Cancel and close Global Settings.

Nuix License
The Nuix License tab is used to allow NEAMM to obtain a license from an existing
Nuix Management Server deployment.

Name           Description
Source Type    Desktop or Server license
NMS Hostname   The IP address or FQDN of your NMS instance
NMS Port       The port for your NMS instance (default: 27443)
NMS Username   Username for your NMS instance
NMS Password   Password for your NMS instance

Warning

You must enter the IP Address and Port Number used by the
previously configured Nuix Management Server (NMS).

Tip

Create an account that is used only by the toolkit. This
makes it easy to determine which Audit Events were generated
by the toolkit and which by other users.


Nuix Directories
The Nuix Directories settings tab is used to configure the directory locations
that NEAMM and Nuix use during any extractions or ingestions. These settings must
be configured before starting any migration work.

Name                              Description
Case Directory                    The location where Nuix Cases will be stored. The Nuix Case will contain critical information for each batch, including Summary Reports and Success/Exception Reports.
Nuix Processing Files Directory   The location where NEAMM-related files will be stored. The Nuix Files are related to each batch that is run, including BAT files, Ruby scripts, JSON files, etc.
Log Directory                     The location where logs will be stored. The Nuix Logs can be used to troubleshoot errors or other issues. This directory can be purged on a routine basis if necessary.
Java Temp Directory               The location where Java Temp will be stored. This is a temporary location that will be used/cleared during each processing job.
Worker Temp Directory             The location where Worker Temp will be stored. This is a temporary location that will be used/cleared during each processing job.
Export Directory                  The location where exports will be stored. This should be treated as a critical location, as it is where extracted data will exist, as well as critical reports for each export job.
Nuix App Location                 The location where the Nuix App is installed. Nuix must already be installed before reaching this step.


Tip

It is very important to ensure that each Nuix Directory
location has the proper storage configuration in place.
Please review the System Requirements section for more
details.

Lightspeed Settings
The Nuix Lightspeed Settings tab is used to configure the necessary Nuix instance
settings when performing an extraction from a legacy email archive platform.

Lightspeed Settings:
Name                             Description
System Memory (RAM)              Displays the total amount of RAM available on the system.
Number of Nuix Instances         Controls the total number of concurrent Nuix instances that are able to run on the system.
Nuix App Memory                  Controls the maximum amount of memory each Nuix instance will utilize.
Available Memory (RAM)           Displays the total amount of memory left before factoring in memory for each Nuix instance.
Available After Nuix Instances   Displays the total amount of memory left after factoring in memory for each Nuix instance.
Number of Nuix Workers           Controls the total number of Workers per Nuix instance.
Memory Per Worker (MB)           Controls the maximum amount of memory that each Nuix Worker will utilize.
Worker Timeout                   Controls the amount of time in seconds that a worker will attempt to process an item before it times out, flags the item as poisoned, and moves on.


MAPI Export Options:
Name                             Description
PST Export Size (GB)             Controls the size of each PST that Nuix exports.
Add Distribution List Metadata   Includes any Distribution List recipients from the Exchange Journal envelope in the “mapi-expanded-dl” metadata of each MSG.

EML Export Options:
Name                             Description
Add Distribution List Metadata   Includes any Distribution List recipients from the Exchange Journal envelope in the “Expanded-DL” Delivered-To header of each RFC822 message.

Note

You must properly balance RAM across the OS, any other
running applications (such as SQL), the Nuix application and
the Nuix Workers. The majority of your memory should
generally be allocated to your workers, as these will be
performing the most intensive work. Insufficient memory for
the Workers will cause inconsistencies, item poisoning or
other memory-related errors.
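
To make the balancing act in the note above concrete, the sketch below estimates how much memory remains once every Nuix instance and its workers have been accounted for. It rests on assumptions this guide does not confirm: that each instance consumes its Nuix App Memory plus (Number of Nuix Workers x Memory Per Worker), and that all values are in MB. The function and parameter names are hypothetical and do not correspond to NEAMM's internal calculation.

```python
def memory_budget_mb(
    system_mb: int,
    reserved_for_os_and_other_mb: int,
    instances: int,
    app_memory_mb: int,
    workers_per_instance: int,
    memory_per_worker_mb: int,
) -> dict:
    """Estimate memory left after allocating all Nuix instances (assumed model)."""
    # Assumption: one instance = application heap + all of its workers.
    per_instance = app_memory_mb + workers_per_instance * memory_per_worker_mb
    available_before = system_mb - reserved_for_os_and_other_mb
    available_after = available_before - instances * per_instance
    return {
        "per_instance_mb": per_instance,
        "available_before_instances_mb": available_before,
        "available_after_instances_mb": available_after,
        "fits": available_after >= 0,
    }


# Example: 64 GB system, 8 GB kept for the OS and SQL, 2 instances,
# 4 GB application memory and 2 workers at 8 GB each per instance.
print(memory_budget_mb(65536, 8192, 2, 4096, 2, 8192))
```

A negative remainder is a signal to reduce the number of instances or workers before starting a job.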

Exchange Web Services
The Nuix EWS settings tab is used to configure the necessary Nuix instance settings
when performing ingestions or extractions with Exchange, either on-premise or
Office 365.


EWS Connection:
Name                   Description
Exchange Server        The complete URL to the Exchange server.
Domain                 The domain for the Exchange environment.
Username               The account used to connect to Exchange (this can either be a user account, an account with delegate access or an account with the Application Impersonation role assigned to it).
Password               The password for the account connecting to Exchange.
Enable Impersonation   Designates whether the authenticating account has the Application Impersonation role assigned to it.

Tip

It is recommended that an account with the Application
Impersonation role assigned to it is used when working with
multiple Exchange mailboxes/personal archives in parallel.

Lightspeed Settings:
Name                             Description
System Memory (RAM)              Displays the total amount of RAM available on the system.
Number of Nuix Instances         Controls the total number of concurrent Nuix instances that are able to run on the system.
Nuix App Memory                  Controls the maximum amount of memory each Nuix instance will utilize.
Available Memory (RAM)           Displays the total amount of memory left before factoring in memory for each Nuix instance.
Available After Nuix Instances   Displays the total amount of memory left after factoring in memory for each Nuix instance.
Number of Nuix Workers           Controls the total number of Workers per Nuix instance.
Memory Per Worker (MB)           Controls the maximum amount of memory that each Nuix Worker will utilize.
Worker Timeout                   Controls the amount of time in seconds that a worker will attempt to process an item before it times out, flags the item as poisoned, and moves on.

EWS Upload Control:
Name                                     Description
Maximum Message Size (MB)                Controls the maximum individual message size of each top-level item being ingested to Exchange.
Enable Bulk Upload                       Enables bulk uploads instead of single item uploads.
Bulk Upload Size (MB)                    Controls the maximum size of messages being ingested to Exchange in bulk.
Remove [ ] levels from the folder path   Trims the folder path of the PST data that may be ingested into an Exchange mailbox/personal archive.


EWS Download Control:
Name                            Description
Enable Bulk Download            Enables bulk downloads instead of single item downloads.
Maximum Message Size (MB)       Controls the maximum message sizes of bulk downloads (works together with Maximum Download Count).
Maximum Download Count          Controls the maximum number of messages to download in bulk (works together with Maximum Message Size).
Enable Collaborative Fetching   Used to download messages faster from mailboxes with folders less than 1 GB in size.
Enable Mailbox Slack Space      Used to extract deleted items from EWS mailbox or archive slack space. May be used when looking to capture all aspects of a mailbox or archive.
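
Maximum Message Size and Maximum Download Count are described as working together. One plausible reading is that a bulk request is closed as soon as either limit would be exceeded. The sketch below illustrates that reading only; NEAMM's actual batching logic is not documented here, and `batch_messages` is a hypothetical name.

```python
def batch_messages(sizes_mb, max_batch_mb, max_count):
    """Group message sizes into batches bounded by total size and by count.

    A message larger than max_batch_mb is placed in a batch of its own.
    """
    if max_batch_mb <= 0 or max_count < 1:
        raise ValueError("limits must be positive")

    batches, current, current_size = [], [], 0
    for size in sizes_mb:
        over_size = bool(current) and current_size + size > max_batch_mb
        over_count = len(current) >= max_count
        if over_size or over_count:
            batches.append(current)       # close the batch that hit a limit
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)           # flush the final partial batch
    return batches


# Example: a 6 MB size limit and a 2-message count limit.
print(batch_messages([1, 2, 3, 4, 5], max_batch_mb=6, max_count=2))
```

In this example the first batch closes on the count limit and the later ones on the size limit, which shows why tuning only one of the two settings may have no visible effect.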

EWS Throttle Control:
Name              Description
Retry Count       Controls the number of times an item is retried after it is first flagged for being throttled.
Retry Delay       Controls the amount of time (in seconds) to wait before attempting to retry the item the first time.
Retry Increment   Controls the amount of time (in seconds) to wait before attempting to retry the item for each subsequent retry.

Tip

It is recommended that the throttling controls be set to a
minimum of 5 retries, a 5-second delay, and a 5-second
subsequent delay. For Office 365 migrations, it may be
necessary to set the number of retries much higher.
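
To estimate how long a throttled item may be held back, the three throttle controls can be combined as in the sketch below. It reads the table literally: the first retry waits Retry Delay seconds and every later retry waits Retry Increment seconds. Whether NEAMM instead grows the wait on each retry is not stated in this guide, so treat the totals as a lower-bound estimate; the function name is hypothetical.

```python
from typing import List


def retry_waits(retry_count: int, retry_delay: int, retry_increment: int) -> List[int]:
    """Seconds waited before each retry, following the table descriptions."""
    if retry_count <= 0:
        return []
    # First retry uses the delay; each subsequent retry uses the increment.
    return [retry_delay] + [retry_increment] * (retry_count - 1)


# Recommended minimum: 5 retries, 5 second delay, 5 second subsequent delay.
waits = retry_waits(5, 5, 5)
print(waits, "total seconds:", sum(waits))
```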

Database Settings
The Database tab is used to configure different aspects of the external databases
that are integrated with NEAMM.

Name

Description

SQLite Database
Location

a path to a folder where the NEAMM SQLite database will be
stored. This database will maintain all historic job
activity within NEAMM.

Redis Host Name

used to specify the hostname of the system running Redis

Port

used to specify the port Redis is listening on

Auth

used to specify the password for the Redis instance

Note

This SQLite database can be copied for Backup/DR purposes.
It is recommended to back this database up when there is no
Nuix activity.

Note

Redis is not installed or configured by NEAMM. It must be
set up prior to using NEAMM. Please consult with your Nuix
representative to understand if this is necessary for your
project.

Creating the NEAMM Database
NEAMM uses an embedded SQLite database to store any job-related activity. In order
to configure this database upon installation, perform the following:

From Global Settings, click on the Database tab:
1. Enter a location for the SQLite Database location.
2. Enter a location for the Settings XML file.
3. Click Save.
4. Upon completion of steps 1-3, a pre-configured SQLite database will be
created and ready for any migration work.

Extracting Email Data from Legacy Email
Archives
Interface Overview
Upon launching the Email Archive Extraction module, you will see the interface as
shown below. Many of the options in this interface will be enabled or disabled
based on the selections made. This is by design, as several options may not be
relevant or necessary for specific source archives or workflows.

Name

Description

1

Legacy Archive

Select the source email archive that data will be
extracted from.

2

Archive Type

Select whether the email archive is Mailbox or
Journal based (source archive dependent).

3

Output Type

Select whether you want to perform a flat extraction
or user-based extraction.

4

Lightspeed
Extraction Output

Select the format of the extracted data (PST,
EML or MSG).

5

From/To Date

Used to filter the email data by date.

6

Source Information

Select whether the source data is located in a folder, is a
physical file itself, or is stored on an EMC
Centera storage device.

7

SQL Connection Info

Connect to a backup copy of the Email Archive SQL
databases (source archive dependent).

8

Archive Specific
Settings

Additional settings for each supported legacy email
archive.

9

WSS (Worker Side
Script)

Custom filter settings used for Journal Archives that
require User output, or used with AXS-One for
User output (source archive dependent).

10

Add Job to Grid

Add the pre-configured job to the grid.

11

Grid

Where added jobs will be displayed.

12

Start Job

Start the selected job in the grid for processing.

13

Export Grid to CSV

Exports out the current grid view to CSV format.

14

Lightspeed Exporter
Report Consolidator

Consolidates all Lightspeed Exporter Metrics and
Exporter Error reports into a single report.

15

Reload Grid

Reloads jobs in the grid from previous migrations.

16

Global Settings

View/Change previously configured Global Settings.

Tip

Before attempting to perform any migration work, be sure to
check Global Settings and make sure that the Nuix
Directories and Archive Extraction tab are configured
correctly.

Working with Veritas Enterprise Vault
When working with Veritas Enterprise Vault source data, you can use NEAMM in
various ways to target the data directly on the file system in its proprietary
.DVS file format (or on Centera when applicable) and export it to disk in PST,
MSG or EML format. Below are several workflows that NEAMM can handle.

Overview
Enterprise Vault archiving solutions have two components:
• Source Data on Disk (.DVS)
• SQL Database (EnterpriseVaultDirectory + Mailbox and/or Journal vault store
databases)

Nuix must have access to the SQL database while processing EV files on disk in
order to gather additional metadata and, most importantly, in order to reconstitute
emails and their single-instanced parts.

Note

SharePoint, File System and Instant Message vault stores are
not currently supported for legacy archive migrations.

Prerequisites
The following prerequisites must be met before Nuix can process Enterprise Vault
data:
EV database restoration [1]
• A copy of the production EV database(s) must be restored to a local MSSQL
instance on the Nuix server.
• These include EnterpriseVaultDirectory and all of the mailbox and/or
journal vault store databases.

EV database configuration
• The dbo.PartitionEntry table must be modified to reflect the correct Vault
Store Partition file system path.

Source data identified with a specific EV Vault Store [2]

Source Data
Physical Files
Enterprise Vault can be considered one of the most complex email archive solutions
in terms of data handling. Data can be heavily compressed, especially when collections
are enabled, and reconstituting the single-instanced parts adds additional
overhead. Enabling sharing at the Vault Store Group level adds further complexity.
Vault Stores are created in EV Admin Console and can be compared to a physical
volume.
Vault Store partitions are created in EV Admin Console and are containers within
the Vault Store.
Enterprise Vault stores each archived item and its associated SIS parts to the
currently open vault store partition for the archive into which the item is being
archived. There are three main types of files used in the archive:
• *.DVS – This file contains all un-sharable (per user) properties relating to
the archived item.
• *.DVSSP – This file (or files) contains an individual SIS part.
• *.DVSCC – This file contains the HTML content conversion of the SIS part.

Large files, in Enterprise Vault, are items being archived that are larger than 50
MB in size and these are processed differently from smaller files. To begin with,
the archived item will not be compressed (regardless of the compression setting)
and it will not be eligible for collection. If Enterprise Vault is configured to
migrate data, then the SIS parts and content collections are migrated directly
rather than being placed into a CAB file first. Large files will also have a
content conversion copy (DVSCC) created but it will only be indexed if the copy is
less than 30 MB in size. File extensions for large files are also slightly
different:


• If sharing is not enabled, large files are stored as *.DVF files.
• If sharing is enabled, large files are stored as *.DVFSP files.
• Large file converted content files are stored as *.DVFCC files.

[1] EV data can be processed without a database connection; however, the database is
required for expansion of additional metadata and reconstituting single-instanced
parts.
[2] Many large EV archive deployments include multiple Vault Stores and archive
databases, each of which will have a separate database.

Collections (.CAB Files)
In many environments, collections can be enabled in a vault store partition. This
will cause DVS and Vault Stream files (.DVSSP/.DVSCC) to be containerized in .CAB
(cabinet) files. Collections are typically enabled in order to further reduce the
overall size of the archive. While the intention is good, this complicates
migrations downstream. While Nuix can process these .CAB files, it can reduce
processing throughput, since these are compressed files. Additionally, these .CAB
files present challenges if needing to perform mailbox-level extractions.



• .CAB files can reduce processing throughput due to their heavily compressed
nature.
• If performing a journal archive extraction with Lightspeed, the .CAB is only
processed once.
• If performing a mailbox archive extraction with Lightspeed, data will be processed
from Enterprise Vault by user, thus it will only extract the specific data that
is located within each .CAB.

Sharing (single-instancing / SIS)
Enterprise Vault uses Vault Store Groups to configure item sharing. A Vault Store
Group consists of one or more Vault Stores. A Vault Store Group is also a single
instance sharing boundary and, as such, any items stored within a Vault Store in
the same Vault Store Group can potentially be eligible for sharing with other items
stored within that Group. This sharing option does not apply between Vault Stores
in different Vault Store Groups.
The Vault Stores contained within Vault Store Groups can each be set to one of
three levels of sharing, depending on storage requirements:
• Share within group – Items are eligible for sharing with other items from any
other Vault Store in the Group that is set to the same level of sharing.
• Share within Vault Store – Items are only eligible for sharing with other
items archived into the same Vault Store.
• No sharing – Do not initiate any sharing at all.

When processing Enterprise Vault, Nuix must have access to ALL data in order to
ensure correct reconstruction of single instanced emails and attachments. When Nuix
processes an item in the .DVS file, Nuix will then query the item in the database
and determine whether it has any single-instanced parts. A .DVSSP file may exist in
the same directory as the parent .DVS file or it could be in a completely different
Vault Store Partition or Vault Store.
Failure to ensure the following conditions are met will result in missed or incomplete
data:
• The entire EV source data corpus must be presented to Nuix during processing.
• All of the appropriate EV databases must be correctly restored.
• Nuix must be able to connect to SQL successfully.

Note

.DVSSP/.DVSCC files were not implemented in EV until
version 8. Prior to EV v8, top-level items and all
attachments were stored in a single .DVS file. If the EV
environment does not contain any data that was archived with
EV v8 or above, a SQL database may not be needed.

File Distribution & Naming
Prior to EV 8.0, files are named in a hash and date-based format:
(XXXXXXXXXXXXXXX~YYYYMMDDHHMMSSXXXX~X.DVS)
For example: 182000000000000~200608111431200000~0.DVS
In order to identify these items in the SQL database, these file names are queried
against the IdChecksumHigh, IdChecksumLow, IdDateTime, and IdUniqueNo in various
tables and databases to determine the EV Transaction ID. This Transaction ID is
then used for any remaining queries that need to occur.
In EV 8.0 and above, the convoluted naming scheme was replaced with a much simpler
one using the actual Transaction ID. Having the files named using the Transaction
ID removes a lot of the complexity with previous versions. Files are named in a
hash format:
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.DVS
For example: 6085D3C8F2E240B1BF80A518B2D68670.DVS
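
When triaging a folder of source data, it can help to tell the two naming schemes apart programmatically. The following is a minimal sketch based only on the two patterns and examples shown above. The function name and the returned field names are hypothetical, the character classes are assumptions inferred from the examples, and the sketch does not attempt to map the pre-8.0 parts to the IdChecksumHigh, IdChecksumLow, IdDateTime or IdUniqueNo columns.

```python
import re
from datetime import datetime

# Pre-8.0: 15 characters ~ 14-digit timestamp plus 4 characters ~ 1+ characters.
LEGACY_NAME = re.compile(
    r"^([0-9A-Za-z]{15})~(\d{14})([0-9A-Za-z]{4})~([0-9A-Za-z]+)\.DVS$",
    re.IGNORECASE,
)
# 8.0 and above: the 32-character Transaction ID.
MODERN_NAME = re.compile(r"^([0-9A-Fa-f]{32})\.DVS$", re.IGNORECASE)


def classify_dvs_name(name: str) -> dict:
    """Return the naming scheme of a .DVS file name and its visible parts."""
    match = MODERN_NAME.match(name)
    if match:
        return {"scheme": "8.0+", "transaction_id": match.group(1).upper()}
    match = LEGACY_NAME.match(name)
    if match:
        return {
            "scheme": "pre-8.0",
            "prefix": match.group(1),
            "timestamp": datetime.strptime(match.group(2), "%Y%m%d%H%M%S"),
            "suffix": match.group(3),
            "sequence": match.group(4),
        }
    return {"scheme": "unknown"}


print(classify_dvs_name("182000000000000~200608111431200000~0.DVS"))
print(classify_dvs_name("6085D3C8F2E240B1BF80A518B2D68670.DVS"))
```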

Typically, each EV archive server will write EV files to a specific Vault Store
Partition location as specified in the EV Admin Console. It is also common for the
Vault Store Partition to include date-based subfolders (2010, 2011, or 2012). All
EV files are stored in Vault Store Partitions.
The names for each of these files and the path to the items within the Vault Store
partition are generated from the Enterprise Vault Transaction ID of the item, the
current date, and a number of other attributes. The first level folder under the
Vault Store Partition root is named using the current year, the second level uses
the current month and day (hyphenated), the third level uses the first character of
the item’s Transaction ID, and the final level is the next three characters of the
Transaction ID.
The actual names of the DVS, DVSSP, and DVSCC files start with the full Transaction
ID; in this case, 50BACC0E626DE74E9422921811B69E31.
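
The folder layout described above can be sketched as a small helper that builds the expected location of a .DVS file. This is an illustration under stated assumptions: the partition root and the archive date are hypothetical, the month and day are assumed to be zero-padded, and the file is assumed to be named exactly with the Transaction ID plus the .DVS extension.

```python
from datetime import date
from pathlib import PureWindowsPath


def expected_dvs_path(partition_root: str, transaction_id: str,
                      archived_on: date) -> PureWindowsPath:
    """Build year / month-day / first character / next three characters / file."""
    txid = transaction_id.upper()
    return PureWindowsPath(
        partition_root,
        f"{archived_on.year}",
        f"{archived_on.month:02d}-{archived_on.day:02d}",  # assumed zero-padded
        txid[0],
        txid[1:4],
        f"{txid}.DVS",
    )


# Hypothetical partition root and archive date, with the Transaction ID from the text.
path = expected_dvs_path(r"E:\EVStore\Ptn1",
                         "50BACC0E626DE74E9422921811B69E31",
                         date(2012, 3, 7))
print(path)
```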

Note

Typical best practice is to process all Enterprise Vault
data from the original storage locations. However, it is
possible to move the data, provided the following conditions
are met:
• Data from individual Vault Store partitions is always
kept separate.
• Any moved files are identified with a specific Vault
Store Partition.
• If single-instancing is in place, ALL data must be
moved and presented to the Nuix server to ensure all
email can be fully rehydrated.

Tip

A typical best practice is to break up processing of EV data
into logical subsets based on server, date, and/or overall
volume. If moving EV data for processing, the database will
require edits to reflect the updated Vault Store Partition
location. Depending on the sharing level, all data may have
to be moved together in order for SIS content to be
reconstituted correctly.

Centera
It is common for EV deployments to be archived using Centera storage. In order to
target data on a Centera device, you must produce a list of the C-Clips that
correspond to your data. For Enterprise Vault archives, the list of Clips is
available in the “dbo.SavesetStore” table in each individual EV Vault Store DB.
Prerequisites
The following prerequisites must be met before EMC Centera data can be processed
using Nuix:
Pool Entry Authorization (PEA) File – if applicable
• If the EMC Centera device has been configured with additional storage
pools, a PEA file will need to be obtained and placed in a dedicated
location on the Nuix server. A PEA file is used to
communicate and distribute authentication credentials to Centera and
contains the default key, key, and credential.

Environment User Variable – if applicable
• If a PEA file is necessary, it must be obtained and placed in a dedicated
location on the Nuix server. Next, an Environment Variable must be created
on the Nuix server with a value of the PEA file path.

List of Centera Access Nodes (IPs)
• The Centera Access Nodes are simply the IP addresses of the nodes
available on Centera. This is typically a list of two or four IP
addresses. If replication is enabled, more addresses may be available.

List of C-Clips
• C-Clips, also known as Clips, are unique alphanumeric strings that
reference specific source data within Centera. The source data varies
depending on the archive; however, the concept is always the same. A list
of C-Clips can be generated using several techniques and passed in to Nuix
in order to process data.
Pool Entry Authorization
A Pool Entry Authorization (PEA) file, generated while creating or updating an
access profile, is a clear-text, XML-formatted, non-encrypted file that can be used
by system administrators to communicate and distribute authentication credentials
to application administrators. A PEA file is optional for profiles with non-encoded
secrets (created using the File and Prompt options) but is mandatory for profiles
with base-64 encoded secrets (created with the Generate option).

Note

A PEA file is only necessary if Centera has been
specifically set up and configured to require one. For
example, a Centera setup with only the default pool may
not require a PEA file for authentication.
If a PEA file is not needed for authentication, the
Environment Variable is not necessary.

Configuration
If the EMC Centera device has been configured with additional storage pools, a PEA
file will need to be obtained and placed in a dedicated location on the Nuix
server. An Environment Variable will also need to be created for Nuix to reference
the PEA file.
1. Obtain the PEA file from the customer.
2. Place the PEA file in a dedicated directory on the Nuix server.
3. Create an Environment Variable that references the location of the PEA file.

Warning

It is critical that the Environment Variable is configured
properly. Refer to the image below for more information.

• The Variable name must be: CENTERA_PEA_LOCATION
• The Variable value must be the absolute path to the
physical location of the PEA file on the Nuix server
• Click OK to save the variable
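
Before starting a Centera job, the variable can be verified from the account that will run Nuix. The following is a minimal sketch of such a check; only the variable name CENTERA_PEA_LOCATION comes from this guide, and the function name and messages are hypothetical.

```python
import os
from pathlib import Path


def check_pea_variable(env=os.environ) -> str:
    """Report whether CENTERA_PEA_LOCATION is set to an absolute path of an existing file."""
    value = env.get("CENTERA_PEA_LOCATION")
    if not value:
        return "CENTERA_PEA_LOCATION is not set"
    path = Path(value)
    if not path.is_absolute():
        return f"not an absolute path: {value}"
    if not path.is_file():
        return f"no file found at: {value}"
    return "ok"


print(check_pea_variable())
```

A newly created variable is normally only visible to processes started after it was saved, so run the check from a fresh session.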

Access Node IP Addresses
An EMC Centera access node is a node that has the access role applied to it. Access
nodes are gateways to the data stored in Centera. These nodes have IP addresses on
the network and are responsible for authentication. If you successfully connect to
one such node, you have access to the entire cluster. However, to connect faster,
you can specify several available nodes with the access role.
Configuration
The Centera access nodes are simply the IP addresses of the nodes available on
Centera. This is typically a list of two or four IP addresses. If replication is
enabled on Centera, more addresses may be available.
After a list of Centera access nodes has been obtained from the customer, next
steps include:
1. Create an .IPF file (a standard text file, the “IP Address File”) with each line
representing the IP address of one Centera access node.
2. Place this .IPF file in a dedicated directory on the Nuix server.
• This .IPF file will be required when processing data on Centera.
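
As an illustration, the .IPF content can be produced and validated with a few lines of Python. The sketch below only follows the one-address-per-line rule described above; the function name, the output file name in the comment, and the sample addresses (taken from the reserved documentation range) are hypothetical.

```python
import ipaddress
from typing import Iterable


def format_ipf(nodes: Iterable[str]) -> str:
    """Return .IPF text: one validated IP address per line."""
    lines = []
    for node in nodes:
        node = node.strip()
        if node:
            # ip_address raises ValueError for anything that is not a valid IP.
            lines.append(str(ipaddress.ip_address(node)))
    return "\n".join(lines) + "\n"


text = format_ipf(["192.0.2.10", " 192.0.2.11 "])
print(text, end="")
# To save it, write the text to a file such as centera_nodes.ipf
# in the dedicated directory on the Nuix server.
```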

C-Clip List
Centera-based data is referenced by C-Clips, which must be passed in to Nuix at
time of processing. Unless the client provides these Clips, you will need to
generate them using the methods outlined in this section.
Generating the Clip List(s) for Enterprise Vault
The list of C-Clips is available in the ‘dbo.SavesetStore’ table in each of the
Vault Store SQL databases.
• Copy the list of clips out of each Vault Store database table and paste
them into a .CLP file (a standard text file, the “Clip List file”).
• The clip list can now be carved up into manageable sets of .CLP files.

For Enterprise Vault, a common workflow may include creating Clip Lists containing
around 6,000,000 clips per file. This should average out to around 1 TB of
compressed source data.
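
Carving a large clip list into smaller .CLP files can be scripted. The following is a minimal sketch, assuming one clip per line in each .CLP file (by analogy with the .IPF file) and using hypothetical function names, file names and sample clip values.

```python
import tempfile
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator, List


def chunk_clips(clips: Iterable[str], clips_per_file: int) -> Iterator[List[str]]:
    """Yield lists of at most clips_per_file non-empty clip strings."""
    cleaned = (clip.strip() for clip in clips)
    cleaned = (clip for clip in cleaned if clip)
    while True:
        chunk = list(islice(cleaned, clips_per_file))
        if not chunk:
            return
        yield chunk


def write_clp_files(clips: Iterable[str], out_dir: str,
                    clips_per_file: int = 6_000_000, stem: str = "clips") -> List[Path]:
    """Write numbered .CLP files, one clip per line, and return their paths."""
    written = []
    for number, chunk in enumerate(chunk_clips(clips, clips_per_file), start=1):
        target = Path(out_dir) / f"{stem}_{number:04d}.CLP"
        target.write_text("\n".join(chunk) + "\n", encoding="ascii")
        written.append(target)
    return written


# Demonstration with placeholder clip values and a tiny chunk size.
with tempfile.TemporaryDirectory() as tmp:
    files = write_clp_files(["CLIP-A", "CLIP-B", "", "CLIP-C"], tmp, clips_per_file=2)
    names = [f.name for f in files]
    print(names)
```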

Note

Generating a list of C-Clips is dependent on the number of
pools that exist in Centera. If only the default pool
exists, the list of C-Clips can only be generated for the
default pool. Every additional pool that is created will
have its own set of C-Clips associated with it.

Warning

Understanding the differences between legacy archive file
types is critical when processing the same archive data
on Centera. For example, processing EMC EmailXtender or
EMC SourceOne C-Clips is fundamentally different from
processing Symantec EV C-Clips.
A single EmailXtender C-Clip is equivalent to one .EMX
file, which could contain hundreds or thousands of top-level
emails, whereas a single Enterprise Vault C-Clip is
equivalent to one .DVS file, which can only contain a
single top-level email.
Processing 10,000 EmailXtender C-Clips would take
substantially longer than processing 10,000 Enterprise
Vault C-Clips.

Database
EV email archive deployments require at least a single database
(EnterpriseVaultDirectory) and often have multiple associated databases (one per
Vault Store). Additionally, if the archive is stored on a Centera device, database
access will be required for retrieval of Centera Clips.
A best practice is to restore a copy of all EV archive databases locally on the
Nuix server(s).

Tip

Always restore local database copies for best performance.

Authentication
The credentials used to access SQL will be passed via the Nuix startup file. It is
recommended that the account provided uses built-in SQL authentication. Since the
SQL instance is local and used only for Nuix processing, we recommend simply using
the ‘sa’ (SQL Administrator) account for Nuix access.
Configuration
After the EV SQL databases are restored, the dbo.PartitionEntry table will need to
be updated in order to point to the location of the source data as it is presented on
the Nuix server(s). More information on dbo.PartitionEntry can be found below.
Critical Tables
It is important to understand the data contained in several tables, and the way
Nuix uses these tables to interpret EV data.
dbo.PartitionEntry
The full list of Vault Store Partitions is located in the dbo.PartitionEntry table
under the PartitionRootPath column. The value in this column must be updated to
reflect the path to the Vault Store Partitions on the Nuix Server(s).
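
One way to prepare the PartitionRootPath change is to generate the UPDATE statement and review it before running it against the restored copy. The sketch below only builds the statement text. The table and column names come from this guide, while the helper names and both paths are hypothetical, and the statement assumes that every affected row shares the same old root prefix. Run it only against the restored copy on the Nuix server, never against production.

```python
def tsql_literal(value: str) -> str:
    """Quote a string as a T-SQL Unicode literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def partition_update_sql(old_root: str, new_root: str) -> str:
    """Build an UPDATE that swaps old_root for new_root at the start of the path."""
    n = len(old_root)
    return (
        "UPDATE dbo.PartitionEntry\n"
        f"SET PartitionRootPath = {tsql_literal(new_root)}"
        f" + SUBSTRING(PartitionRootPath, {n + 1}, LEN(PartitionRootPath))\n"
        f"WHERE LEFT(PartitionRootPath, {n}) = {tsql_literal(old_root)};"
    )


# Hypothetical old (production) and new (Nuix server) partition roots.
sql = partition_update_sql("E:\\EV", "D:\\Restore")
print(sql)
```

Selecting the current PartitionRootPath values first, and again after the update, gives a simple before-and-after check.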

dbo.Archive
The full list of users within the SQL Database can be found within the dbo.Archive
table under the ArchiveName column. Use this list to cull down to the set of users
to be exported. This information is used in conjunction with the EV Manifest
Workflow.

Supported Workflows
Mailbox Archive to User PST
1. In the Legacy Archive dropdown box, select Veritas Enterprise Vault
2. Select Mailbox from the Archive Type radio button selections
3. Select User from the Output Type radio button selections
4. PST will automatically be selected as the Lightspeed Extraction Output

Note

Due to the complex nature of mailbox folder structures that
may exist in EV Mailbox Archives, PST is the only option.
PSTs are much easier to manage since emails will be grouped
by User on the file system, and long file path name issues
will be avoided.

5. If a date filter is requested, enter dates in the From Date: and To Date:
date chooser fields.
6. Enter the SQL connection string details in the SQL Connection Info settings.
Be sure to click Test SQL Connection to ensure you can properly connect to
SQL.

Warning

Testing the SQL connection is critical, as you will want to ensure that
you can correctly query the SQL databases for additional
information.

Tip

It is recommended to use a built-in SQL service account
(instead of Windows authentication) with READ access to ALL
of the Enterprise Vault databases.

7. The Enterprise Vault Settings will default to:
Name

Default Value

Description

Skip Additional SQL
Lookups

True

Using the connected EV SQL database, Nuix
will not perform any additional EV SQL
queries, other than the basic queries required for
reconstituting single-instanced attachments
(SIS).
When set to False, additional lookups will
be performed and will slow down processing
items substantially.
It is recommended to ALWAYS set this to
True unless otherwise noted.

Use
FileTransactionID
over
ParentTransactionID

False

When enabled, EV SQL lookups will use
the FileTransactionID column name instead of
the usual ParentTransactionID.

User List:

empty

A User List is relevant only when
extracting data from an EV Mailbox Archive.
The User List CSV file must contain 1 EV
Mailbox Archive Name per line, for example:
John.Doe
John.Smith

Note

The EV Mailbox Archive Names in the User List CSV file must
match EXACTLY the way they do in the ArchiveName column
located in the EnterpriseVaultDirectory.dbo.Archive table.

Warning

Worker Side Script (WSS) is not available for EV Mailbox
Archive extractions. EV Mailbox Archive extractions
typically require all data to be exported without any content
filtering.

8. Review the settings you’ve selected for this job and click the Add Batch to
Grid button.
a. If you have more mailbox archives you need to process, select a new
list of users and click Add Batch to Grid, continuing this until all
users have been added to a Batch on the Grid.

Warning

Performance may vary, especially based on the settings
configured on the Archive Extraction tab in Global Settings.
Be sure to ALWAYS review settings prior to starting Jobs.

9. When you are ready to begin processing, click on the Start Job button.
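
Because the names in the User List CSV must match the ArchiveName column exactly, it can be worth checking a list before adding it to a Batch. The following is a minimal sketch that compares User List lines against a set of archive names exported from the dbo.Archive table. The function name and the sample values are hypothetical, and the comparison is deliberately case-sensitive with no trimming of spaces.

```python
from typing import Iterable, List


def unmatched_users(user_list_lines: Iterable[str],
                    archive_names: Iterable[str]) -> List[str]:
    """Return User List entries that do not exactly match a known ArchiveName."""
    known = set(archive_names)
    # Remove only the line ending; any other difference counts as a mismatch.
    wanted = [line.rstrip("\r\n") for line in user_list_lines]
    return [name for name in wanted if name and name not in known]


archive_names = ["John.Doe", "John.Smith"]    # values copied from ArchiveName
user_list = ["John.Doe\n", "john.smith\n"]    # lines read from the User List CSV
missing = unmatched_users(user_list, archive_names)
print(missing)
```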
Journal Archive to User PST
1. In the Legacy Archive dropdown box, select Veritas Enterprise Vault
2. Select Journal from the Archive Type radio button selections
3. Select User from the Output Type radio button selections
4. Select PST as the Lightspeed Extraction Output

Note

It is highly recommended that you select PST when your Output
Type is set to User. PSTs are much easier to manage since
emails will be grouped by User on the file system, and long
file path name issues will be avoided.

5. If a date filter is requested, enter dates in the From Date: and To Date:
date chooser fields.
6. Choose whether the Source Data exists in Folders, Files or on Centera.
a. If Folders or Files, use the navigation pane to select the drive letter
where the source data exists. You may also choose to Compute Batch
Size. This will allow you to obtain additional metrics while the job
is processing, like Percentage Completed and Total Bytes to process.

Tip

It is recommended that each Job target no more than 500
GB to 1 TB of compressed source data. This is for
optimal performance as well as an advantage in system
failure scenarios. If a system crashes mid-processing, it
would be more efficient to restart a Job that will take 24
hours to complete, versus a Job that may take 7 days to
complete.

b. If Centera is selected, select the Clip List (*.CLP) you want to
process, browse to the PEA File (*.PEA) and browse to the IP File
(*.IPF).

Tip

It is recommended that each Job target no more than
10,000 Clips of compressed data. This is for optimal
performance as well as an advantage in system failure
scenarios. If a system crashes mid-processing, it would be
more efficient to restart a Job that will take 24 hours to
complete, versus a Job that may take 7 days to complete.

7. Enter the SQL connection string details in the SQL Connection Info settings.
Be sure to click Test SQL Connection to ensure you can properly connect to
SQL.

Warning

Testing the SQL connection is critical, as you will want to ensure that
you can correctly query the SQL databases for additional
information.

Tip

It is recommended to use a built-in SQL service account
(instead of Windows authentication) with READ access to ALL
of the Enterprise Vault databases.

8. The Enterprise Vault Settings will default to:
Name

Default Value

Description

Skip Additional SQL
Lookups

True

Using the connected EV SQL database, Nuix
will not perform any additional EV SQL
queries, other than the basic queries required for
reconstituting single-instanced attachments
(SIS).
When set to False, additional lookups will
be performed and will slow down processing
items substantially.
It is recommended to ALWAYS set this to
True unless otherwise noted.

Use
FileTransactionID
over
ParentTransactionID

False

When enabled, EV SQL lookups will use
the FileTransactionID column name instead of
the usual ParentTransactionID.

User List:

empty

Not applicable to a Journal Archive
workflow.

Warning

For emails that were archived from a journal, Nuix will add
the distribution list recipients to a new metadata field
called “Expanded-DL”. However, based on your Output Type
(PST, MSG or EML), the metadata may not be preserved unless
Add Distribution List Recipients is enabled under MAPI
Export Options or EML Export Options on the Lightspeed
Settings tab in Global Settings.

9. The Worker Side Script (WSS) settings will allow you to:
Name

Default Value

Description

Exclude
Unresponsive
Items

True

When enabled, Nuix will not export any items that
do not respond to your Search Terms and Mapping
CSV.
If you set to False, you will need to adjust your
Mapping CSV to include an entry for:
unresponsive,unresponsive.pst

Verbose
Logging

False

When disabled, Nuix will not include any verbose
logging at the WSS-level for troubleshooting
purposes.
If set to True, this will allow for easier
troubleshooting; however, the size of the logs
will be substantially larger.

Content
Filtering

Email, RSS Feed,
Calendar,
Contact

When enabled, all top-level emails (RSS Feeds
included), Calendar and Contact items will be
extracted.
If you want to filter any of these kinds out,
simply de-select the item kind you wish to
filter.

Search Terms
CSV

empty

You must select a list of Search Terms in CSV
format. Search Terms is a 2-column CSV that
includes a Flag in Column A and a Search Term in
Column B. You do not need a header row.
Each line should include an SMTP, X400 or
X500 address as the Search Term. You can also add
multiple Search Terms to the same Flag, for
example:
Alex.Chatzistamatis,alex.chatzistamatis@nuix.com
Alex.Chatzistamatis,alex@nuix.com
Nuix will scan for these search terms across
Communication Metadata (From, To, Cc, Bcc) and
the Expanded-DL metadata field.
Mapping CSV

empty

You must select a Mapping in CSV format. Like
Search Terms, Mapping is a 2-column CSV that
includes a Flag Name in Column A and an Output
Name in Column B. You do not need a header
row.
Each line should include the Flag from your
Search Terms CSV and the Output Name. If you add
multiple Search Terms to a single Flag, you do
not need to add multiple Flags in the Mapping
CSV, for example:
Alex.Chatzistamatis,Alex.Chatzistamatis.pst

10. Review the settings you’ve selected for this job and click the Add
Batch to Grid button.
a. If you have more mailbox archives you need to process, select a new
list of users and click Add Batch to Grid, continuing this until all
users have been added to a Batch on the Grid.

Warning

Performance may vary, especially based on the settings
configured on the Lightspeed Settings tab in Global
Settings. Be sure to ALWAYS review settings prior to
starting Jobs.

11. When you are ready to begin processing, click on the Start Job button.
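
The relationship between the Search Terms CSV and the Mapping CSV used in this workflow can be checked before a job is added to the grid. The following is a minimal sketch that reports Flags without a Mapping entry, including the unresponsive entry that is needed when Exclude Unresponsive Items is set to False. The function name is hypothetical, and the CSV content is passed as text to keep the example self-contained.

```python
import csv
import io
from typing import List


def flags_without_mapping(search_terms_csv: str, mapping_csv: str,
                          exclude_unresponsive: bool = True) -> List[str]:
    """Return Flags used in Search Terms that have no row in the Mapping CSV."""
    term_flags = {row[0] for row in csv.reader(io.StringIO(search_terms_csv)) if row}
    mapped = {row[0] for row in csv.reader(io.StringIO(mapping_csv)) if row}
    required = set(term_flags)
    if not exclude_unresponsive:
        # Unresponsive items are exported too, so they need an output name.
        required.add("unresponsive")
    return sorted(required - mapped)


search_terms = (
    "Alex.Chatzistamatis,alex.chatzistamatis@nuix.com\n"
    "Alex.Chatzistamatis,alex@nuix.com\n"
)
mapping = "Alex.Chatzistamatis,Alex.Chatzistamatis.pst\n"

print(flags_without_mapping(search_terms, mapping))
print(flags_without_mapping(search_terms, mapping, exclude_unresponsive=False))
```

An empty result means every Flag has an Output Name; any name returned needs a row added to the Mapping CSV.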

Journal Archive to Flat PST, MSG or EML
1. In the Legacy Archive dropdown box, select Veritas Enterprise Vault
2. Select Journal from the Archive Type radio button selections
3. Select Flat from the Output Type radio button selections
4. Select PST, MSG or EML as the Lightspeed Extraction Output

Note

It is highly recommended that you select PST when your Output
Type is set to User. PSTs are much easier to manage since
emails will be grouped by User on the file system, and long
file path name issues will be avoided.

5. If a date filter is requested, enter dates in the From Date: and To Date:
date chooser fields.
6. Choose whether the Source Data exists in Folders, Files or on Centera.
a. If Folders or Files, use the navigation pane to select the drive letter
where the source data exists. You may also choose to Compute Batch
Size. This will allow you to obtain additional metrics while the job
is processing, like Percentage Completed and Total Bytes to process.

Tip

It is recommended that each Job target no more than 500
GB to 1 TB of compressed source data. This is for
optimal performance as well as an advantage in system
failure scenarios. If a system crashes mid-processing, it
would be more efficient to restart a Job that will take 24
hours to complete, versus a Job that may take 7 days to
complete.

b. If Centera is selected, select the Clip List (*.CLP) you want to
process, browse to the PEA File (*.PEA) and browse to the IP File
(*.IPF).

Tip

It is recommended that each Job target no more than
10,000 Clips of compressed data. This is for optimal
performance as well as an advantage in system failure
scenarios. If a system crashes mid-processing, it would be
more efficient to restart a Job that will take 24 hours to
complete, versus a Job that may take 7 days to complete.

7. Enter the SQL connection string details in the SQL Connection Info settings.
Be sure to click Test SQL Connection to ensure you can properly connect to
SQL.

Warning

Testing the SQL connection is critical, as you will want to
ensure that you can correctly query the SQL databases for
additional information.

Tip

It is recommended to use a built-in SQL service account
(instead of Windows authentication) with READ access to ALL
of the Enterprise Vault databases.

8. The Enterprise Vault Settings will default to:

Name: Skip Additional SQL Lookups
Default Value: True
Description: Using the connected EV SQL database, Nuix will not perform any
additional EV SQL queries other than the basic queries required for
reconstituting single-instanced attachments (SIS). When set to False,
additional lookups will be performed and will slow down item processing
substantially. It is recommended to ALWAYS set this to True unless otherwise
noted.

Name: Use FileTransactionID over ParentTransactionID
Default Value: False
Description: When enabled, EV SQL lookups will use the FileTransactionID
column name instead of the usual ParentTransactionID.

Name: User List:
Default Value: empty
Description: Not applicable to a Journal Archive workflow.

Warning

For emails that were archived from a journal, Nuix will add
the distribution list recipients to a new metadata field
called “Expanded-DL”. However, based on your Output Type
(PST, MSG or EML), the metadata may not be preserved unless
Add Distribution List Recipients is enabled under MAPI
Export Options or EML Export Options on the Lightspeed
Settings tab in Global Settings.

9. Review the settings you’ve selected for this job and click the Add Batch to
Grid button.
a. If you have more mailbox archives you need to process, select a new
list of users and click Add Batch to Grid, continuing this until all
users have been added to a Batch on the Grid.

Warning

Performance may vary, especially based on the settings
configured on the Lightspeed Settings tab in Global
Settings. Be sure to ALWAYS review settings prior to
starting Jobs.

10. When you are ready to begin processing, click on the Start Job button.

Working with EMC EmailXtender/SourceOne
When working with EMC EmailXtender/SourceOne source data, you can use NEAMM in
several ways to target the data directly on the file system in its proprietary
.EMX file format (or on Centera when applicable) and export it to disk in PST,
MSG or EML format. Below are several workflows that NEAMM can handle.

Overview
EMC archiving solutions have two components:

• Files on disk
• SQL database

Nuix must have access to the database while processing EMC files on disk in order
to expand distribution lists.

Prerequisites
The following prerequisites must be met before EMC data can be processed using
Nuix:

EMC database restoration [3]
• A copy of the production EMC database(s) must be restored to a local MSSQL
instance on the Nuix server.

Source data identified with a specific EMC database [4]

Source Data
Physical Files
EMC data is written to the file system in the form of .EMX files. These files are
compressed containers that store individual messages with attachments. Each EMX
file can contain a few messages or up to several thousand messages.
File Structure
EMX files are highly compressed files that contain potentially thousands of emails,
calendar entries, or contacts. The first item under each EMX file is a single .VOL
file that contains metadata about the container itself and holds very little value.
Below that, each top-level item will be nested within the EMX wrapper. This wrapper
is used by EMC itself for identification purposes and contains no value. Below the
EMC wrapper, the top-level item is nested. If ingesting into a case, these files
should be excluded.

[3] EMC data can be processed without a database connection; however, the database is
required for expansion of distribution lists.

[4] Many large EMC archive deployments include multiple archiving servers, each of
which will have a separate database.

Note

The .VOL file is not critical to email migrations and will
be automatically skipped by Lightspeed.

Distribution List Expansion
Nuix is able to expand distribution lists (DL) at time of processing if connected
to the EMC database. This is often an important component of an archive extraction
and should be discussed with the client prior to beginning any work. There are
several options for where to inject the expanded addresses in the email metadata.
Designate the Field for Expanded Addresses
The default Nuix behavior is to expand distribution lists and write the addresses to
an ‘Expanded-DL’ metadata field. It is also default behavior to push these addresses
into the ‘To’ field. Therefore, it is necessary to adjust system properties in your
Nuix startup file either to redirect DL expansion from the ‘To’ field to the ‘BCC’
field, or to avoid this altogether.

File Distribution and Naming
Files are named in the following date-based format:
YYYYMMDDXXXXXX.emx
For example, 20100316181524.emx
It is common for multiple containers to be created on the same date, so the
additional characters XXXXXX are necessary. It is believed that these are also
timestamp based (HHMMSS), but it is uncertain precisely how these are calculated.
Typically, each EMC archive server will write all EMX files to a single root
directory. It is also common for the root directory to include date-based
subfolders (2010, 2011, 2012), or to write the journal mail to a distinct folder.
However the data is laid out on disk, the database provides a map. You can view
the path to each individual EMX file in the database at dbo.Volume.CurrentUNCPath.
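The naming convention above can be checked with a short script when taking inventory of source data. The following is a minimal sketch, not part of NEAMM: the function and type names are hypothetical, and because this guide is uncertain how the trailing XXXXXX characters are calculated, the sketch treats them as an opaque six-character value (assumed alphanumeric) instead of parsing them as a time.

```python
import re
from datetime import date
from typing import NamedTuple, Optional

# YYYYMMDD followed by six further characters, then the .emx extension.
# Assumption: the six trailing characters are alphanumeric.
EMX_NAME = re.compile(r"^(\d{4})(\d{2})(\d{2})([0-9A-Za-z]{6})\.emx$", re.IGNORECASE)


class EmxName(NamedTuple):
    day: date      # date portion of the file name
    suffix: str    # opaque six-character discriminator


def parse_emx_name(filename: str) -> Optional[EmxName]:
    """Return the date and suffix of an EMX file name, or None if it does not fit."""
    match = EMX_NAME.match(filename)
    if match is None:
        return None
    year, month, day_of_month, suffix = match.groups()
    try:
        return EmxName(date(int(year), int(month), int(day_of_month)), suffix)
    except ValueError:
        # Eight digits that do not form a real calendar date.
        return None


if __name__ == "__main__":
    print(parse_emx_name("20100316181524.emx"))
```

Files that return None do not follow the convention and may be worth reviewing by hand before a Job is built.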

Note

Each .EMX file can range in size, but they are generally
80 to 100 MB each. More detail on this is provided in the
EMC Database section.

Tip

Typical best practice is to break up processing of EMC data
into logical subsets based on server, date, and/or overall
volume. It is possible to move EMC data for processing
without requiring any edits to the database.
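Splitting by overall volume can be planned before any data is moved. The sketch below is illustrative only and is not a NEAMM feature: it assumes you already have a list of (file name, size in bytes) pairs, for example from a directory listing, and it groups them in order into batches that stay under a chosen limit. The 1 TB default mirrors the per-Job guidance given in the workflow Tips in this guide; the function name is hypothetical.

```python
from typing import Iterable, List, Tuple

TERABYTE = 1024 ** 4


def plan_batches(files: Iterable[Tuple[str, int]],
                 limit_bytes: int = TERABYTE) -> List[List[str]]:
    """Group files, in the order given, into batches no larger than limit_bytes.

    A single file larger than the limit is placed in a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, size in files:
        # Close the current batch if adding this file would exceed the limit.
        if current and current_size + size > limit_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    sample = [("a.emx", 60), ("b.emx", 50), ("c.emx", 40), ("d.emx", 100)]
    print(plan_batches(sample, limit_bytes=100))
```

Sorting the input by server or by date before calling the function keeps each batch aligned with the other logical subsets mentioned above.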

Centera
It is common that EMC data is archived to an EMC Centera storage deployment. In
order to target data on a Centera device, you must produce a list of the C-Clips
that correspond to your data. For EmailXtender this must be done by running a
utility from the original archive server; for SourceOne, you can retrieve the Clips
from the database.

Note

A C-Clip in Centera is equivalent to one .EMX file on the
file system. Each Clip ranges from 80 to 100 MB.

Prerequisites
The following prerequisites must be met before EMC Centera data can be processed
using Nuix:

Pool Entry Authorization (PEA) File (if applicable)
• If the EMC Centera device has been configured with additional storage
pools, a PEA file will need to be obtained and placed in a dedicated
location on the Nuix server. A PEA file is a file used to
communicate and distribute authentication credentials to Centera and
contains the default key, key, and credential.

Environment User Variable (if applicable)
• If a PEA file is necessary, it must be obtained and placed in a dedicated
location on the Nuix server. Next, an Environment Variable must be created
on the Nuix server with a value of the PEA file path.

List of Centera Access Nodes (IPs)
• The Centera Access Nodes are simply the IP addresses of the nodes
available on Centera. This is typically a list of two or four IP
addresses. If replication is enabled, more addresses may be available.

List of C-Clips
• C-Clips, also known as Clips, are unique alphanumeric strings that
reference specific source data within Centera. The source data varies
depending on the archive; however, the concept is always the same. A list
of C-Clips can be generated using several techniques and passed in to Nuix
in order to process data.

Pool Entry Authorization
A Pool Entry Authorization (PEA) file, generated while creating or updating an
access profile, is a clear-text, XML-formatted, non-encrypted file that can be used
by system administrators to communicate and distribute authentication credentials
to application administrators. A PEA file is optional for profiles with non-encoded
secrets (created using the File and Prompt options) but is mandatory for profiles
with base-64 encoded secrets (created with the Generate option).

Note

A PEA file is not necessary in every environment. It is
needed only when Centera has been specifically set up and
configured to require one. For example, a Centera setup
with only the default pool may not require a PEA file for
authentication. If a PEA file is not needed for
authentication, the Environment Variable is not necessary.

Configuration
If the EMC Centera device has been configured with additional storage pools, a PEA
file will need to be obtained and placed in a dedicated location on the Nuix
server. An Environment Variable will also need to be created for Nuix to reference
the PEA file.
1. Obtain the PEA file from the customer.
2. Place the PEA file in a dedicated directory on the Nuix server.
3. Create an Environment Variable that references the location of the PEA file.

Warning

It is critical that the Environment Variable is configured
properly. Refer to the image below for more information.

• The Variable name must be: CENTERA_PEA_LOCATION
• The Variable value must be the absolute path to the
physical location of the PEA file on the Nuix server
• Click OK to save the variable
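Because a misconfigured variable is easy to miss, it can help to verify it from a script before starting a Job. The following is a minimal sketch and not part of NEAMM: the function name is hypothetical, and the only detail taken from this guide is the variable name CENTERA_PEA_LOCATION and the requirement that its value be an absolute path to the PEA file.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

PEA_VARIABLE = "CENTERA_PEA_LOCATION"


def check_pea_location(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "ok" or a short description of what is wrong with the variable."""
    env = os.environ if env is None else env
    value = env.get(PEA_VARIABLE)
    if not value:
        return f"{PEA_VARIABLE} is not set"
    path = Path(value)
    if not path.is_absolute():
        return f"{PEA_VARIABLE} is not an absolute path: {value}"
    if not path.is_file():
        return f"{PEA_VARIABLE} points to a missing file: {value}"
    return "ok"


if __name__ == "__main__":
    print(check_pea_location())
```

Run the script under the same Windows account that runs Nuix, since a user-level variable is visible only to that account.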

Access Node IP Addresses
An EMC Centera access node is a node that has the access role applied to it. Access
nodes are gateways to the data stored in Centera. These nodes have IP addresses on
the network and are responsible for authentication. If you successfully connect to
one such node, you have access to the entire cluster. However, to connect faster,
you can specify several available nodes with the access role.

Configuration
The Centera access nodes are simply the IP addresses of the nodes available on
Centera. This is typically a list of two or four IP addresses. If replication is
enabled on Centera, more addresses may be available.
After a list of Centera access nodes has been obtained from the customer, the next
steps are:
1. Create an .IPF file (“IP Address File”, a standard text file) with each line
containing the IP address of one Centera access node.
2. Place this .IPF file in a dedicated directory on the Nuix server.

This .IPF file will be required when processing data on Centera.
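Since the IP address file is plain text with one address per line, it can be generated and validated with a few lines of script. This is a hedged sketch, not a NEAMM utility: the function name and the example addresses (taken from the reserved documentation range 192.0.2.0/24) are assumptions made for illustration.

```python
import ipaddress
from pathlib import Path
from typing import Iterable


def write_ipf(path: Path, addresses: Iterable[str]) -> int:
    """Write one validated IP address per line and return how many were written.

    Raises ValueError if any non-blank entry is not a valid IP address.
    """
    lines = []
    for raw in addresses:
        text = raw.strip()
        if not text:
            continue  # ignore blank entries
        ipaddress.ip_address(text)  # raises ValueError for invalid input
        lines.append(text)
    path.write_text("".join(f"{line}\n" for line in lines), encoding="ascii")
    return len(lines)


if __name__ == "__main__":
    # Hypothetical access node addresses, for illustration only.
    count = write_ipf(Path("centera.ipf"), ["192.0.2.10", "192.0.2.11"])
    print(f"wrote {count} addresses")
```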

C-Clip List
Centera-based data is referenced by C-Clips, which must be passed in to Nuix at
time of processing. Unless the client provides these Clips, you will need to
generate them using the methods outlined in this section.
Generating the Clip List(s) for EmailXtender
The EmailXtender deployment contains everything you need to retrieve the Centera
Clips, but you will require access to the EmailXtender archive server to perform
the necessary steps. This is typically performed in conjunction with a client
technician who can facilitate the necessary access. The following is a walkthrough
of the process:
1. Identify the DxDmChk.exe utility.
2. Identify the source data for Clip retrieval. [5]
3. Execute the configured DxDmChk utility.
4. Repeat Steps 2 and 3 as necessary to retrieve all Clips.
Generating the Clip List(s) for SourceOne
The SourceOne database contains a list of all Centera Clips in the archive. This
list can be found in the Volume table in the VolStubXML column. Each row of the
Volume table corresponds to a specific EMX file, each of which will be archived to
a single Centera Clip. If you wish to scope lists of Clips based on date range or
archive server, you can reference the other columns in this table to do so.

Note

Generating a list of C-Clips is dependent on the number of
pools that exist in Centera. If only the default pool
exists, the list of C-Clips can only be generated for the
default pool. Every additional pool that is created will
have its own set of C-Clips associated with it.

[5] You will often want to retrieve the Clips in batches according to date range or
volume, as opposed to retrieving all Clips at once.

Warning

Understanding the differences between legacy archive file
types is critical when processing the same archive data
on Centera. For example, processing EMC EmailXtender or
EMC SourceOne C-Clips is fundamentally different from
processing Symantec EV C-Clips.

A single EmailXtender C-Clip is equivalent to one .EMX
file, which could contain hundreds or thousands of
top-level emails, whereas a single Enterprise Vault C-Clip
is equivalent to one .DVS file, which can only contain a
single top-level email.

Processing 10,000 EmailXtender C-Clips would take
substantially longer than processing 10,000 Enterprise
Vault C-Clips.



The clip list can now be carved up into manageable sets of .CLP files.

For EMC archives, a common workflow is to create Clip Lists containing around
10,000 clips per file. This should average out to around 1 TB of compressed source
data.
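Carving up a large list can be scripted. The sketch below is an illustration only: it assumes a .CLP file is a plain text file with one Clip ID per line, which this guide does not state explicitly, and the function name and output file naming are hypothetical. Confirm the expected format against a Clip List that is known to work before relying on it.

```python
from pathlib import Path
from typing import List, Sequence


def split_clip_list(clips: Sequence[str], out_dir: Path,
                    per_file: int = 10_000, stem: str = "cliplist") -> List[Path]:
    """Write clips into numbered .clp files holding at most per_file entries each."""
    if per_file <= 0:
        raise ValueError("per_file must be positive")
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for index, start in enumerate(range(0, len(clips), per_file), start=1):
        chunk = clips[start:start + per_file]
        target = out_dir / f"{stem}_{index:04d}.clp"
        # Assumption: one Clip ID per line.
        target.write_text("".join(f"{clip}\n" for clip in chunk), encoding="ascii")
        written.append(target)
    return written


if __name__ == "__main__":
    demo = [f"CLIP{n:05d}" for n in range(25)]  # placeholder IDs, not real Clips
    for path in split_clip_list(demo, Path("clip_lists"), per_file=10):
        print(path)
```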

Database
EMC email archive deployments require at least a single database and often have
multiple associated databases (one per email archive server). Nuix can process EMC
data without these databases; however, a database is required for expanding
distribution lists. Additionally, if the archive is SourceOne and the data is
stored on a Centera device, database access will be required for retrieval of
Centera Clips.
Best practice is to restore a copy of all EMC archive databases locally on the Nuix
server(s). However, it is possible to use production SQL instances if this is
desired, since, unlike with other archives, EMC databases require no editing for
use with Nuix.

Tip

Always restore local database copies for best performance.

Authentication
The credentials used to access SQL will be passed in via the Nuix startup file. It
is recommended that the account provided use built-in SQL authentication. Since
the SQL instance is local and used only for Nuix processing, we recommend simply
using the ‘sa’ (system administrator) account for Nuix access.
Configuration
EMC databases do not require any special configuration for use with Nuix.
EmailXtender and SourceOne databases use a nearly identical structure; however, there
are differences. The only requirement for the Nuix engineer is to ensure that, in
the case of SourceOne data, the Nuix startup includes a switch [6] to tell Nuix to use
the SourceOne schema.

[6] Dnuix.data.xtender.addressDbSchema=sourceOne

Critical Tables
It is important to understand the data contained in several tables and the way Nuix
uses these tables to interpret EMC data.
dbo.Volume
The Volume table contains several columns of interest. Number of messages, date
range, and volume can all be tracked in this table on a per-EMX file basis.

Additionally, for SourceOne data on Centera, the Centera Clip listing is generated
from the VolStubXML column in this table.

dbo.EmailAddresses
The EmailAddresses table contains a listing of all individual email addresses in
the archive. The majority of these entries are SMTP, but SYS, EX, and PST addresses
are also stored here.


• SYS: These entries are system-generated addresses by EMC.
o Example: SYS:"ES1ExchJournal"

• EX: These entries are objects from the Exchange Server. They can include the
display name of a user, email address, or a full object value (LegDN).
o Example: EX:"Alex Chatzistamatis"

• SMTP: These entries are the SMTP addresses that were on the messages at the
time the item was sent/received.
o Example: SMTP:"Alex Chatzistamatis"

• PST: These entries are items imported from a PST.
o Example: PST:"archive.pst"\\\\nuixfs01\\pst\\achatz01\\archive.pst
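The four prefixes make it straightforward to tell entry types apart when reviewing rows exported from dbo.EmailAddresses. The following sketch only illustrates that idea and is not how Nuix implements its Address Filtering setting: the function names are hypothetical, and it assumes each entry is a single string that starts with the type prefix followed by a colon, as in the examples above.

```python
from typing import Iterable, List, Sequence, Tuple

KNOWN_TYPES = ("SYS", "EX", "SMTP", "PST")


def split_entry(entry: str) -> Tuple[str, str]:
    """Split an entry into (type, remainder); unrecognised entries get type UNKNOWN."""
    prefix, separator, remainder = entry.partition(":")
    prefix = prefix.strip().upper()
    if not separator or prefix not in KNOWN_TYPES:
        return ("UNKNOWN", entry)
    return (prefix, remainder)


def drop_filtered(entries: Iterable[str],
                  filtered: Sequence[str] = ("SYS", "EX", "PST")) -> List[str]:
    """Keep only entries whose type is not in the filtered set."""
    return [entry for entry in entries if split_entry(entry)[0] not in filtered]


if __name__ == "__main__":
    rows = ['SYS:"ES1ExchJournal"', 'EX:"Alex Chatzistamatis"',
            'SMTP:"Alex Chatzistamatis"']
    print(drop_filtered(rows))
```

Entries with an unrecognised prefix are kept by this sketch so that nothing is silently discarded during review.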

Supported Workflows
Journal Archive to User PST
1. In the Legacy Archive dropdown box, select EMC EmailXtender or EMC SourceOne
2. Select Journal from the Archive Type radio button selections
3. Select User from the Output Type radio button selections
4. Select PST as the Lightspeed Extraction Output

Note

It is highly recommended that you select PST when your Output
Type is set to User. PSTs are much easier to manage since
emails will be grouped by User on the file system, and long
file path name issues will be avoided.

5. If a date filter is requested, enter dates in the From Date: and To Date:
date chooser fields.
6. Choose whether the Source Data exists in Folders, Files or on Centera.
a. If Folders or Files, use the navigation pane to select the drive letter
where the source data exists. You may also choose to Compute Batch
Size. This will allow you to obtain additional metrics while the job
is processing, like Percentage Completed and Total Bytes to process.

Tip

It is recommended that each Job target no more than 500 GB
to 1 TB of compressed source data. This gives optimal
performance and is an advantage in system failure
scenarios. If a system crashes mid-processing, it is more
efficient to restart a Job that takes 24 hours to complete
than a Job that may take 7 days to complete.

b. If Centera is selected, select the Clip List (*.CLP) you want to
process, browse to the PEA File (*.PEA) and browse to the IP File
(*.IPF).

Tip

It is recommended that each Job target no more than 10,000
Clips of compressed data. This gives optimal performance
and is an advantage in system failure scenarios. If a
system crashes mid-processing, it is more efficient to
restart a Job that takes 24 hours to complete than a Job
that may take 7 days to complete.

7. Enter the SQL connection string details in the SQL Connection Info settings.
Be sure to click Test SQL Connection to ensure you can properly connect to
SQL.

Warning

Testing the SQL connection is critical, as you will want to
ensure that you can correctly query the SQL databases for
additional information.

Tip

It is recommended to use a built-in SQL service account
(instead of Windows authentication) with READ access to ALL
of the EmailXtender/SourceOne databases.

8. The EMC Settings will default to:

Name: Address Filtering:
Default Value: SYS, EX, PST
Description: With all three (3) enabled, non-SMTP addresses will be
filtered from metadata. Non-SMTP addresses include addresses EMC may have
added, including system addresses and PST ingestion information, as well as
X400/X500 addresses. Examples of addresses are listed below:
SYS:"ES1ExchJournal"
EX:"Alex Chatzistamatis"
PST:"archive.pst"<\\\\nuixfs01\\pst\\achatz01\\archive.pst>

Name: Expand DL to:
Default Value: “Expanded-DL”
Description: Using the connected EMC SQL database, Nuix will query for
any distribution list recipients on every email. Any responsive recipients
will be added depending on the value selected in this dropdown.
To: + “Expanded-DL” = distribution list recipients will be
added to the To: field AND the “Expanded-DL” field.
Bcc: + “Expanded-DL” = distribution list recipients will be
added to the Bcc: field AND the “Expanded-DL” field.
“Expanded-DL” = distribution list recipients will be added
to the “Expanded-DL” metadata field.

Warning

The Expand DL to: selection will only ADD the metadata to a
new metadata field called “Expanded-DL”. However, based on
your Output Type (PST, MSG or EML), the metadata may not be
preserved unless Add Distribution List Recipients is enabled
under MAPI Export Options or EML Export Options on the
Lightspeed Settings tab in Global Settings.

9. The Worker Side Script (WSS) settings will allow you to:

Name: Exclude Unresponsive Items
Default Value: True
Description: When enabled, Nuix will not export any items that do not
respond to your Search Terms and Mapping CSV. If you set this to False, you
will need to adjust your Mapping CSV to include an entry for:
unresponsive,unresponsive.pst

Name: Verbose Logging
Default Value: False
Description: When set to False, Nuix will not include any verbose logging at
the WSS level for troubleshooting purposes. Setting this to True allows for
easier troubleshooting; however, the logs will be substantially larger.

Name: Content Filtering
Default Value: Email, RSS Feed, Calendar, Contact
Description: When enabled, all top-level emails (RSS Feeds included),
Calendar and Contact items will be extracted. If you want to filter any of
these kinds out, simply de-select the item kind you wish to filter.

Name: Search Terms CSV
Default Value: empty
Description: You must select a list of Search Terms in CSV format. Search
Terms is a 2-column CSV that includes a Flag in Column A and a Search Term
in Column B. You do not need a header row. Each line should include an SMTP,
X400 or X500 address as the Search Term. You can also add multiple Search
Terms to the same Flag, for example:
Alex.Chatzistamatis,alex.chatzistamatis@nuix.com
Alex.Chatzistamatis,alex@nuix.com
Nuix will scan for these search terms across Communication Metadata (From,
To, Cc, Bcc) and the Expanded-DL metadata field.

Name: Mapping CSV
Default Value: empty
Description: You must select a Mapping in CSV format. Like Search Terms,
Mapping is a 2-column CSV that includes a Flag Name in Column A and an
Output Name in Column B. You do not need a header row. Each line should
include the Flag from your Search Terms CSV and the Output Name. If you add
multiple Search Terms to a single Flag, you do not need to add multiple
Flags in the Mapping CSV, for example:
Alex.Chatzistamatis,Alex.Chatzistamatis.pst
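Because every Flag in the Search Terms CSV needs a matching line in the Mapping CSV, it can be worth cross-checking the two files before a Job is started. The following is a minimal sketch and not a NEAMM feature: the function names are hypothetical, and it assumes both files are two-column CSVs without a header row, as described above.

```python
import csv
import io
from typing import Dict, List


def read_search_terms(text: str) -> Dict[str, List[str]]:
    """Map each Flag (column A) to the list of its Search Terms (column B)."""
    terms: Dict[str, List[str]] = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        flag, term = row[0].strip(), row[1].strip()
        terms.setdefault(flag, []).append(term)
    return terms


def read_mapping(text: str) -> Dict[str, str]:
    """Map each Flag Name (column A) to its Output Name (column B)."""
    mapping: Dict[str, str] = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) == 2:
            mapping[row[0].strip()] = row[1].strip()
    return mapping


def unmapped_flags(terms: Dict[str, List[str]], mapping: Dict[str, str]) -> List[str]:
    """Return the Flags that appear in Search Terms but have no Mapping entry."""
    return sorted(set(terms) - set(mapping))


if __name__ == "__main__":
    search_csv = ("Alex.Chatzistamatis,alex.chatzistamatis@nuix.com\n"
                  "Alex.Chatzistamatis,alex@nuix.com\n")
    mapping_csv = "Alex.Chatzistamatis,Alex.Chatzistamatis.pst\n"
    print(unmapped_flags(read_search_terms(search_csv), read_mapping(mapping_csv)))
```

An empty result means every Flag has an Output Name. To check files on disk, read them with `open(path, newline="", encoding="utf-8").read()` and pass the text to the functions.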

10. Review the settings you’ve selected for this job and click the Add
Batch to Grid button.
a. If you have more mailbox archives you need to process, select a new
list of users and click Add Batch to Grid, continuing this until all
users have been added to a Batch on the Grid.

Warning

Performance may vary, especially based on the settings
configured on the Lightspeed Settings tab in Global
Settings. Be sure to ALWAYS review settings prior to
starting Jobs.

11. When you are ready to begin processing, click on the Start Job button.

Journal Archive to Flat PST, MSG or EML
1. In the Legacy Archive dropdown box, select EMC EmailXtender or EMC SourceOne
2. Select Journal from the Archive Type radio button selections
3. Select Flat from the Output Type radio button selections
4. Select PST, MSG or EML as the Lightspeed Extraction Output

Note

It is highly recommended that you select PST when your Output
Type is set to User. PSTs are much easier to manage since
emails will be grouped by User on the file system, and long
file path name issues will be avoided.

5. If a date filter is requested, enter dates in the From Date: and To Date:
date chooser fields.
6. Choose whether the Source Data exists in Folders, Files or on Centera.
a. If Folders or Files, use the navigation pane to select the drive letter
where the source data exists. You may also choose to Compute Batch
Size. This will allow you to obtain additional metrics while the job
is processing, like Percentage Completed and Total Bytes to process.

Tip

It is recommended that each Job target no more than 500 GB
to 1 TB of compressed source data. This gives optimal
performance and is an advantage in system failure
scenarios. If a system crashes mid-processing, it is more
efficient to restart a Job that takes 24 hours to complete
than a Job that may take 7 days to complete.

b. If Centera is selected, select the Clip List (*.CLP) you want to
process, browse to the PEA File (*.PEA) and browse to the IP File
(*.IPF).

Tip

It is recommended that each Job target no more than 10,000
Clips of compressed data. This gives optimal performance
and is an advantage in system failure scenarios. If a
system crashes mid-processing, it is more efficient to
restart a Job that takes 24 hours to complete than a Job
that may take 7 days to complete.

7. Enter the SQL connection string details in the SQL Connection Info settings.
Be sure to click Test SQL Connection to ensure you can properly connect to
SQL.

Warning

Testing the SQL connection is critical, as you will want to
ensure that you can correctly query the SQL databases for
additional information.

Tip

It is recommended to use a built-in SQL service account
(instead of Windows authentication) with READ access to ALL
of the EmailXtender/SourceOne databases.

8. The EMC Settings will default to:

Name: Address Filtering:
Default Value: SYS, EX, PST
Description: With all three (3) enabled, non-SMTP addresses will be
filtered from metadata. Non-SMTP addresses include addresses EMC may have
added, including system addresses and PST ingestion information, as well as
X400/X500 addresses. Examples of addresses are listed below:
SYS:"ES1ExchJournal"
EX:"Alex Chatzistamatis"
PST:"archive.pst"<\\\\nuixfs01\\pst\\achatz01\\archive.pst>

Name: Expand DL to:
Default Value: “Expanded-DL”
Description: Using the connected EMC SQL database, Nuix will query for
any distribution list recipients on every email. Any responsive recipients
will be added depending on the value selected in this dropdown.
To: + “Expanded-DL” = distribution list recipients will be
added to the To: field AND the “Expanded-DL” field.
Bcc: + “Expanded-DL” = distribution list recipients will be
added to the Bcc: field AND the “Expanded-DL” field.
“Expanded-DL” = distribution list recipients will be added
to the “Expanded-DL” metadata field.

Warning

The Expand DL to: selection will only ADD the metadata to a
new metadata field called “Expanded-DL”. However, based on
your Output Type (PST, MSG or EML), the metadata may not be
preserved unless Add Distribution List Recipients is enabled
under MAPI Export Options or EML Export Options on the
Lightspeed Settings tab in Global Settings.

9. Review the settings you’ve selected for this job and click the Add Batch to
Grid button.
a. If you have more mailbox archives you need to process, select a new
list of users and click Add Batch to Grid, continuing this until all
users have been added to a Batch on the Grid.

Warning

Performance may vary, especially based on the settings
configured on the Lightspeed Settings tab in Global
Settings. Be sure to ALWAYS review settings prior to
starting Jobs.

10. When you are ready to begin processing, click on the Start Job button.

Working with HP/Autonomy Zantaz EAS
When working with HP/Autonomy Zantaz EAS source data, you can use NEAMM in several
ways to target the data directly on the file system in its proprietary .EAS
file format (or on Centera when applicable) and export it to disk in PST, MSG
or EML format. Below are several workflows that NEAMM can handle.

Overview
The Zantaz EAS archiving solution has two components:

• Files on disk
• SQL Database

Nuix must have access to the database while processing EAS files on disk in order
to correctly parse the uniquely formatted EAS data.

Prerequisites
The following prerequisites must be met before EAS data can be processed using
Nuix:

EAS database restoration
• A copy of the production EAS database must be restored to a local MSSQL
instance on the Nuix server.

EAS database configuration
• The restored EAS database must be configured per the instructions in the
“EAS Database” section of this document.

Source data identified with a specific Docstore
• If source data has been moved from the original archive storage location,
it must be clearly marked as belonging to a specific Docstore.

Source Data
Physical Files
At the storage locations indicated for each Docstore, Zantaz EAS data is written to
the file system in the form of .EAS files. These files are compressed containers
which house individual messages and attachments.

File Structure
Typical EAS files contain single copies of each message with attachments, as seen
in (2) below. If single-instancing has been enabled, loose attachments may also be
found in the container, such as (3) below.

Single-Instancing
It is possible for EAS admins to enable true single-instancing in their archive. A
quick look at the ContainsAttachment table in the database will confirm whether
this option has been enabled at any point in the history of the archive. If this
table contains values, single-instancing was previously enabled. If the table is
blank, it was not.
As of Nuix 6.2.7, support for single-instancing is in place. Processing this data
correctly requires that ALL EAS Docstores are correctly listed in
dbo.DocumentServer and that they remain available to the Nuix server at all times.
This is due to the fact that single-instancing can occur across Docstores. In other
words, the attachment for an email in file 20120103.eas on Docserver 1 might reside
in file 20110904.eas on Docserver 5, or it might live in 20121205.eas on Docserver
3. There is no way to know in advance.
As of Nuix 6, EAS data is automatically presented as individual user-based email.
Default Nuix behavior is to process user-based copies for each email in an EAS
container. This is necessary to ensure all metadata can be maintained at export,
such as user folder structure or read/unread flags.

File Distribution & Naming
Files are named in a date-based format YYYYMMDD.eas—for example, 20150108.eas. If
multiple containers are created on the same date, an additional character will be
added to subsequent files. So you may see something like the following on the
filesystem: 20150108.eas, 20150108A.eas, 20150108B.eas.
Each active Docstore will archive one of these containers each day; therefore, it
is expected that multiple individual EAS files will have the same filename. This is
why Nuix can only process data from one Docstore in a single run and why we must
provide the DocServerID of the target Docstore at the time of processing.
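The naming rule above can be expressed as a small parser, which is useful for sorting or sanity-checking a Docstore directory listing. This is a sketch under stated assumptions rather than NEAMM behavior: the function name is hypothetical, and the additional character is assumed to be a single letter, as in the 20150108A.eas example.

```python
import re
from datetime import date
from typing import Optional, Tuple

# YYYYMMDD, an optional single letter for same-day containers, then .eas.
# Assumption: the additional character is a single letter.
EAS_NAME = re.compile(r"^(\d{4})(\d{2})(\d{2})([A-Za-z]?)\.eas$", re.IGNORECASE)


def parse_eas_name(filename: str) -> Optional[Tuple[date, str]]:
    """Return (container date, same-day suffix) or None if the name does not fit."""
    match = EAS_NAME.match(filename)
    if match is None:
        return None
    year, month, day_of_month, suffix = match.groups()
    try:
        return (date(int(year), int(month), int(day_of_month)), suffix.upper())
    except ValueError:
        return None  # digits that do not form a real calendar date


if __name__ == "__main__":
    names = ["20150108B.eas", "20150108.eas", "20150108A.eas"]
    print(sorted(names, key=lambda n: parse_eas_name(n) or (date.min, "")))
```

Because the same file name can exist in more than one Docstore, keep the DocServerID alongside each parsed name when building an inventory.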
Note

Typical best practice is to process all EAS data from the
original storage locations. However, it is possible to move
the data, provided the following conditions are met:

• Data from individual Docstores is always kept
separate.
• Any moved files are identified with a specific
Docstore.
• If single-instancing is in place, ALL data must be
moved and presented to the Nuix server to ensure all
email can be fully rehydrated.

Centera
It is common for EAS deployments to be archived on Centera storage. In order to
target data on a Centera device, you must produce a list of the C-Clips which
correspond to your data.
Prerequisites
The following prerequisites must be met before EMC Centera data can be processed
using Nuix:

Pool Entry Authorization (PEA) File (if applicable)
• If the EMC Centera device has been configured with additional storage
pools, a PEA file will need to be obtained and placed in a dedicated
location on the Nuix server. A PEA file is a file used to
communicate and distribute authentication credentials to Centera and
contains the default key, key, and credential.

Environment User Variable (if applicable)
• If a PEA file is necessary, it must be obtained and placed in a dedicated
location on the Nuix server. Next, an Environment Variable must be created
on the Nuix server with a value of the PEA file path.

List of Centera Access Nodes (IPs)
• The Centera Access Nodes are simply the IP addresses of the nodes
available on Centera. This is typically a list of two or four IP
addresses. If replication is enabled, more addresses may be available.

List of C-Clips
• C-Clips, also known as Clips, are unique alphanumeric strings that
reference specific source data within Centera. The source data varies
depending on the archive; however, the concept is always the same. A list
of C-Clips can be generated using several techniques and passed in to Nuix
in order to process data.

Pool Entry Authorization
A Pool Entry Authorization (PEA) file, generated while creating or updating an
access profile, is a clear-text, XML-formatted, non-encrypted file that can be used
by system administrators to communicate and distribute authentication credentials
to application administrators. A PEA file is optional for profiles with non-encoded
secrets (created using the File and Prompt options) but is mandatory for profiles
with base-64 encoded secrets (created with the Generate option).

Note

A PEA file is necessary only if Centera has been
specifically set up and configured to require one. For
example, a Centera set up with only the default pool may not
require a PEA file for authentication. If a PEA file is not
needed for authentication, the Environment Variable is not
necessary.

Configuration
If the EMC Centera device has been configured with additional storage pools, a PEA
file will need to be obtained and placed in a dedicated location on the Nuix
server. An Environment Variable will also need to be created for Nuix to reference
the PEA file.
Obtain the PEA file from the customer.

Place the PEA file in a dedicated directory on the Nuix server.
Create an Environment Variable that references the location of the PEA file.
Warning

It is critical that the Environment Variable is configured
properly. Refer to the image below for more information.

The Variable name must be: CENTERA_PEA_LOCATION

The Variable value must be the absolute path to the
physical location of the PEA file on the Nuix server



Click OK to save the variable
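Before starting a Job, the variable can be sanity-checked from a script. The following is a minimal sketch, not part of NEAMM: the function name check_pea_variable is hypothetical, and the only facts taken from this guide are the variable name and the requirement that its value be an absolute path to the PEA file.

```python
import os
from pathlib import Path

PEA_VARIABLE = "CENTERA_PEA_LOCATION"  # variable name required by this guide


def check_pea_variable(environ=os.environ):
    """Return a list of problems found with the PEA environment variable.

    Hypothetical helper: an empty list means the variable is set, holds an
    absolute path, and that path points to an existing file.
    """
    value = environ.get(PEA_VARIABLE)
    if not value:
        return [f"{PEA_VARIABLE} is not set"]
    problems = []
    path = Path(value)
    if not path.is_absolute():
        problems.append(f"{value!r} is not an absolute path")
    if not path.is_file():
        problems.append(f"{value!r} does not point to an existing file")
    return problems


if __name__ == "__main__":
    for problem in check_pea_variable():
        print("WARNING:", problem)
```

Run it on the Nuix server under the same user account that launches NEAMM, since user-level environment variables are visible only to that account.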

Access Node IP Addresses
An EMC Centera access node is a node that has the access role applied to it. Access
nodes are gateways to the data stored in Centera. These nodes have IP addresses on
the network and are responsible for authentication. If you successfully connect to
one such node, you have access to the entire cluster. However, to connect faster,
you can specify several available nodes with the access role.
Configuration
The Centera access nodes are simply the IP addresses of the nodes available on
Centera. This is typically a list of two or four IP addresses. If replication is
enabled on Centera, more addresses may be available.
After a list of Centera access nodes has been obtained from the customer, the next
steps are:
Create an .IPF file (“IP Address File”, a standard text file) with each line
containing the IP address of one Centera access node.

Place this .IPF file in a dedicated directory on the Nuix server.


This .IPF file will be required when processing data on Centera.
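As an illustration only, the IP Address File can be produced with a short script that rejects malformed addresses before they reach a Job. The helper name ipf_text and the second sample address are assumptions; the one-address-per-line layout is the only detail taken from this guide.

```python
import ipaddress


def ipf_text(addresses):
    """Return the contents of an IP Address File: one validated IP per line.

    ipaddress.ip_address raises ValueError for anything that is not a
    well-formed IPv4 or IPv6 address, so typos fail early.
    """
    lines = [str(ipaddress.ip_address(raw.strip())) for raw in addresses]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 10.10.13.203 appears elsewhere in this guide; 10.10.13.204 is invented.
    print(ipf_text(["10.10.13.203", "10.10.13.204"]), end="")
```

The returned text can then be saved with any text editor or with pathlib.Path.write_text to a file ending in .IPF in the dedicated directory.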

C-Clip List
Centera-based data is referenced by C-Clips, which must be passed in to Nuix at
time of processing. Unless the client provides these Clips, you will need to
generate them using the methods outlined in this section.
Generating the Clip List(s) for Zantaz EAS
Install the jre-6-27-windows-i586-s.exe instance of Java on a machine



It can be a Nuix machine, but it doesn't have to be
Install the package to a distinct path: C:\Nuix\Java

Unzip the JCASScript-win32-3.2.35.zip into the C:\Nuix\Java\bin folder
Launch the command prompt


Start | Run | Cmd 

Move to the root:


cd \

Change to the Nuix\Java\bin directory:


cd nuix\java\bin

Launch the EMC Centera API


java -jar JCASScript.jar

If it launches, it will return something like:


CASScript>

Connect to the desired Centera Pool. The following example uses a PEA file:


poolOpen 10.10.13.203?C:\Nuix\Dxprofile.pea

If you are connected you will get a response that says:


Connected to: 10.10.13.203?C:\Nuix\dxprofile.pea

Run the following command:



queryToFile  
If a date filter needs to be applied, the querySetLowerBound and
querySetUpperBound commands can be set first, followed by the queryToFile
command.

Note

Generating a list of C-Clips is dependent on the number of
pools that exist in Centera. If only the default pool
exists, the list of C-Clips can only be generated for the
default pool. Every additional pool that is created will
have its own set of C-Clips associated with it.

Warning

Understanding the differences between legacy archive file
types is critical when processing the same archive data
on Centera. For example, processing EMC EmailXtender or
EMC SourceOne C-Clips is fundamentally different from
processing Symantec EV C-Clips.

A single EmailXtender C-Clip is equivalent to one .EMX
file, which could contain hundreds or thousands of top-level
emails, whereas a single Enterprise Vault C-Clip is
equivalent to one .DVS file, which can contain only a
single top-level email. Processing 10,000 EmailXtender
C-Clips would therefore take substantially longer than
processing 10,000 Enterprise Vault C-Clips.



The clip list can now be carved up into manageable sets of .CLP files.
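One possible way to carve up the list is sketched below. It assumes the file produced by queryToFile holds one C-Clip per line, which this guide does not state, and the output naming scheme (clips_0001.clp and so on) is invented for illustration. The default of 10,000 Clips per file follows the per-Job guidance given in the workflow tips of this guide.

```python
from pathlib import Path


def chunk_clips(clips, clips_per_file=10_000):
    """Split a list of C-Clip strings into consecutive chunks."""
    if clips_per_file < 1:
        raise ValueError("clips_per_file must be at least 1")
    return [clips[i:i + clips_per_file] for i in range(0, len(clips), clips_per_file)]


def write_clp_files(source, out_dir, clips_per_file=10_000, stem="clips"):
    """Read one clip per line from source and write numbered .clp files."""
    lines = Path(source).read_text().splitlines()
    clips = [line.strip() for line in lines if line.strip()]  # drop blank lines
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for number, chunk in enumerate(chunk_clips(clips, clips_per_file), start=1):
        target = out_dir / f"{stem}_{number:04d}.clp"
        target.write_text("\n".join(chunk) + "\n")
        written.append(target)
    return written
```

For EmailXtender or SourceOne data, where a single Clip can hold thousands of emails, a smaller clips_per_file value may be more practical.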

Database
A Zantaz EAS installation requires only a single database. For Nuix to interact
correctly with the EAS database, several conditions must be met: certain tables
require a specific schema, and it is likely that at least one table will need to be
edited. For this reason [7], a COPY of the EAS database (restored locally on the
Nuix server) must be used rather than the production SQL instance.

Authentication
The credentials used to access SQL will be passed in via the Nuix startup file. It
is recommended that the account provided uses built-in SQL authentication. Since
the SQL instance is local and used only for Nuix processing, we recommend simply
using the ‘sa’ (SQL Administrator) account for Nuix access.

[7] It is also best practice to avoid interacting with the production database to
minimize impact on the client environment and maximize performance of Nuix
processes.

Configuration
The DocumentServer table must be edited to correctly reflect the path to all source
data.
Schema
Ensure the following tables use the dbo schema:

Distlist
DistlistRef
DocumentServer
EmailAddresses
EmailMessages
Folder
Refer
Users

Ensure the following tables use the easadmin schema:

DataArchive
ProfileLocation

Critical Tables
It is important to understand the contents of several tables and the way Nuix uses
these tables to interpret EAS data.
easadmin.DataArchive
The DataArchive table contains three columns: DataArchiveID, Filename, and
DocServerID. In order for Nuix to correctly parse an EAS file, it must retrieve the
correct DataArchiveID for each EAS file being processed from this table. Since
filenames can be repeated between Docstores, the DocServerID passed in at startup
provides the cross-reference to ensure Nuix retrieves the correct DataArchiveID.
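The cross-reference can be pictured as a lookup on both Filename and DocServerID. The sketch below uses an in-memory SQLite database as a stand-in for the restored SQL copy; the sample rows and DataArchiveID values are invented, and the query Nuix actually issues is not documented here.

```python
import sqlite3


def build_demo_database():
    """Create an in-memory stand-in for the restored EAS database copy."""
    conn = sqlite3.connect(":memory:")
    # SQLite has no schemas, so an attached database named "easadmin"
    # lets the queries keep the easadmin.DataArchive table name.
    conn.execute("ATTACH DATABASE ':memory:' AS easadmin")
    conn.execute(
        "CREATE TABLE easadmin.DataArchive ("
        "DataArchiveID INTEGER, Filename TEXT, DocServerID INTEGER)"
    )
    # Invented rows: the same filename exists under two different Docstores.
    conn.executemany(
        "INSERT INTO easadmin.DataArchive VALUES (?, ?, ?)",
        [(101, "20150108.eas", 1), (202, "20150108.eas", 2), (203, "20150108A.eas", 2)],
    )
    return conn


def data_archive_id(conn, filename, doc_server_id):
    """Return the DataArchiveID for a container in one Docstore, or None."""
    row = conn.execute(
        "SELECT DataArchiveID FROM easadmin.DataArchive "
        "WHERE Filename = ? AND DocServerID = ?",
        (filename, doc_server_id),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = build_demo_database()
    print(data_archive_id(conn, "20150108.eas", 1))
    print(data_archive_id(conn, "20150108.eas", 2))
```

Looking up the filename alone would match two rows in this sample, which is why a DocServerID must accompany every run.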

easadmin.ProfileLocation
The ProfileLocation table is one of two tables in the EAS database that contain an item-level listing of every message in the archive (dbo.Refer is the other).

The ProfileLocation table is a critical path element for Nuix to process EAS data.
The StartingPos and CompressedSize columns tell Nuix where each individual email
(or attachment) begins and ends in the compressed EAS container. Without this, only
the first message in any given EAS file can be processed.

The UncompressedSize column is useful for queries that seek to retrieve volume
totals for user archives. This can be helpful in estimating export volume prior to
beginning a Nuix run.
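The role of these three columns can be illustrated with a small sketch. It assumes StartingPos is a zero-based byte offset and both size columns are byte counts, which this guide does not confirm, and it does not attempt to decompress the item because the EAS compression format is not described here. The row values are invented.

```python
from collections import namedtuple

# Only the three ProfileLocation columns discussed above are modelled.
ProfileRow = namedtuple("ProfileRow", "StartingPos CompressedSize UncompressedSize")


def slice_item(container, row):
    """Return the still-compressed bytes of one item from an EAS container."""
    end = row.StartingPos + row.CompressedSize
    return container[row.StartingPos:end]


def total_uncompressed(rows):
    """Sum UncompressedSize to estimate export volume before a run."""
    return sum(row.UncompressedSize for row in rows)


if __name__ == "__main__":
    container = b"AAAABBBBBBCC"  # stand-in for the bytes of a .eas file
    rows = [ProfileRow(0, 4, 10), ProfileRow(4, 6, 20), ProfileRow(10, 2, 5)]
    for row in rows:
        print(slice_item(container, row))
    print(total_uncompressed(rows), "bytes uncompressed")
```

Without the offsets, a reader of the container has no way to tell where the second item starts, which matches the limitation described above.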

Supported Workflows
Journal Archive to User PST
1. In the Legacy Archive dropdown box, select HP/Autonomy EAS
2. Select Journal from the Archive Type radio button selections
3. Select User from the Output Type radio button selections
4. Select PST as the Lightspeed Extraction Output

Note

It is highly recommended that you select PST when your Output
Type is set to User. PSTs are much easier to manage since
emails will be grouped by User on the file system, and long
file path name issues will be avoided.

5. If a date filter is requested, enter dates in the From Date: and To Date:
date chooser fields.
6. Choose whether the Source Data exists in Folders, Files or on Centera.
a. If Folders or Files, use the navigation pane to select the drive letter
where the source data exists. You may also choose to Compute Batch
Size. This will allow you to obtain additional metrics while the job
is processing, like Percentage Completed and Total Bytes to process.

Tip

It is recommended that each Job target no more than 500
GB – 1 TB of compressed source data. This gives optimal
performance and limits the cost of a system failure: if a
system crashes mid-processing, it is more efficient to
restart a Job that takes 24 hours to complete than one
that may take 7 days to complete.

b. If Centera is selected, select the Clip List (*.CLP) you want to
process, browse to the PEA File (*.PEA) and browse to the IP File
(*.IPF).

Tip

It is recommended that each Job target no more than
10,000 Clips of compressed data. This gives optimal
performance and limits the cost of a system failure: if a
system crashes mid-processing, it is more efficient to
restart a Job that takes 24 hours to complete than one
that may take 7 days to complete.

7. Enter the SQL connection string details in the SQL Connection Info settings.
Be sure to click Test SQL Connection to ensure you can properly connect to
SQL.

Warning

Testing the SQL connection is critical, as you need to
ensure that you can correctly query the SQL databases for
additional information.

Tip

It is recommended to use a built-in SQL service account
(instead of Windows authentication) with READ access to ALL
of the EAS databases.

8. The EAS Settings will default to:

Name: Doc Server ID
Default Value: none
Description: An integer value must be entered which correlates to the EAS
Docstore that is being selected for each Job.

Warning

For emails that were archived from a journal, Nuix will add
the distribution list recipients to a new metadata field
called “Expanded-DL”. However, depending on your Output Type
(PST, MSG or EML), the metadata may not be preserved unless
Add Distribution List Recipients is enabled under MAPI
Export Options or EML Export Options on the Lightspeed
Settings tab in Global Settings.

9. The Worker Side Script (WSS) section will allow you to configure:

Name: Exclude Unresponsive Items
Default Value: True
Description: When enabled, Nuix will not export any items that do not respond to
your Search Terms and Mapping CSV. If you set this to False, you will need to
adjust your Mapping CSV to include an entry for:
unresponsive,unresponsive.pst

Name: Verbose Logging
Default Value: False
Description: When disabled, Nuix will not include any verbose logging at the
WSS level for troubleshooting purposes. Setting this to True allows for easier
troubleshooting; however, the logs will be substantially larger.

Name: Content Filtering
Default Value: Email, RSS Feed, Calendar, Contact
Description: When enabled, all top-level emails (RSS Feeds included), Calendar
and Contact items will be extracted. To filter out any of these kinds, de-select
the item kind you wish to filter.

Name: Search Terms CSV
Default Value: empty
Description: You must select a list of Search Terms in CSV format. Search Terms
is a 2-column CSV that includes a Flag in Column A and a Search Term in Column B.
You do not need a header row. Each Search Term should be an SMTP, X400 or X500
address. You can also add multiple Search Terms to the same Flag, for example:
Alex.Chatzistamatis,alex.chatzistamatis@nuix.com
Alex.Chatzistamatis,alex@nuix.com
Nuix will scan for these search terms across Communication Metadata (From, To,
Cc, Bcc) and the Expanded-DL metadata field.

Name: Mapping CSV
Default Value: empty
Description: You must select a Mapping in CSV format. Like Search Terms, Mapping
is a 2-column CSV that includes a Flag Name in Column A and an Output Name in
Column B. You do not need a header row. Each line should include the Flag from
your Search Terms CSV and the Output Name. If you add multiple Search Terms to a
single Flag, you do not need to add multiple Flags in the Mapping CSV, for example:
Alex.Chatzistamatis,Alex.Chatzistamatis.pst
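Because every Flag used in the Search Terms CSV needs an Output Name in the Mapping CSV, it can be worth cross-checking the two files before adding a batch. The following is a hypothetical pre-flight check, not a NEAMM feature; the function names are invented, and it assumes plain comma-separated files without a header row, as described above.

```python
import csv
import io


def load_pairs(text):
    """Parse a header-less 2-column CSV into (column A, column B) tuples."""
    pairs = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) >= 2:  # ignore blank or short lines
            pairs.append((row[0].strip(), row[1].strip()))
    return pairs


def unmapped_flags(search_terms_csv, mapping_csv):
    """Return Flags present in Search Terms but missing from the Mapping."""
    flags = {flag for flag, _term in load_pairs(search_terms_csv)}
    mapped = {flag for flag, _output in load_pairs(mapping_csv)}
    return sorted(flags - mapped)


if __name__ == "__main__":
    search_terms = (
        "Alex.Chatzistamatis,alex.chatzistamatis@nuix.com\n"
        "Alex.Chatzistamatis,alex@nuix.com\n"
    )
    mapping = "Alex.Chatzistamatis,Alex.Chatzistamatis.pst\n"
    print(unmapped_flags(search_terms, mapping))
```

An empty result means every Flag has a destination; any Flag it reports needs a line added to the Mapping CSV.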

10. Review the settings you’ve selected for this job and click the Add
Batch to Grid button.
a. If you have more mailbox archives you need to process, select a new
list of users and click Add Batch to Grid, continuing this until all
users have been added to a Batch on the Grid.

Warning

Performance may vary, especially based on the settings
configured on the Lightspeed Settings tab in Global
Settings. Be sure to ALWAYS review settings prior to
starting Jobs.

11. When you are ready to begin processing, click on the Start Job button.

Journal Archive to Flat PST, MSG or EML
1. In the Legacy Archive dropdown box, select HP/Autonomy EAS
2. Select Journal from the Archive Type radio button selections
3. Select Flat from the Output Type radio button selections
4. Select PST, MSG or EML as the Lightspeed Extraction Output

Note

It is highly recommended that you select PST when your Output
Type is set to User. PSTs are much easier to manage since
emails will be grouped by User on the file system, and long
file path name issues will be avoided.

5. If a date filter is requested, enter dates in the From Date: and To Date:
date chooser fields.
6. Choose whether the Source Data exists in Folders, Files or on Centera.
a. If Folders or Files, use the navigation pane to select the drive letter
where the source data exists. You may also choose to Compute Batch
Size. This will allow you to obtain additional metrics while the job
is processing, like Percentage Completed and Total Bytes to process.

Tip

It is recommended that each Job target no more than 500
GB – 1 TB of compressed source data. This gives optimal
performance and limits the cost of a system failure: if a
system crashes mid-processing, it is more efficient to
restart a Job that takes 24 hours to complete than one
that may take 7 days to complete.

b. If Centera is selected, select the Clip List (*.CLP) you want to
process, browse to the PEA File (*.PEA) and browse to the IP File
(*.IPF).

Tip

It is recommended that each Job target no more than
10,000 Clips of compressed data. This gives optimal
performance and limits the cost of a system failure: if a
system crashes mid-processing, it is more efficient to
restart a Job that takes 24 hours to complete than one
that may take 7 days to complete.

7. Enter the SQL connection string details in the SQL Connection Info settings.
Be sure to click Test SQL Connection to ensure you can properly connect to
SQL.

Warning

Testing the SQL connection is critical, as you need to
ensure that you can correctly query the SQL databases for
additional information.

Tip

It is recommended to use a built-in SQL service account
(instead of Windows authentication) with READ access to ALL
of the EAS databases.

8. The EAS Settings will default to:

Name: Doc Server ID
Default Value: none
Description: An integer value must be entered which correlates to the EAS
Docstore that is being selected for each Job.

Warning

For emails that were archived from a journal, Nuix will add
the distribution list recipients to a new metadata field
called “Expanded-DL”. However, depending on your Output Type
(PST, MSG or EML), the metadata may not be preserved unless
Add Distribution List Recipients is enabled under MAPI
Export Options or EML Export Options on the Lightspeed
Settings tab in Global Settings.

9. Review the settings you’ve selected for this job and click the Add Batch to
Grid button.
a. If you have more mailbox archives you need to process, select a new
list of users and click Add Batch to Grid, continuing this until all
users have been added to a Batch on the Grid.

Warning

Performance may vary, especially based on the settings
configured on the Lightspeed Settings tab in Global
Settings. Be sure to ALWAYS review settings prior to
starting Jobs.

10. When you are ready to begin processing, click on the Start Job button.

Working with Daegis AXS-One
When working with Daegis AXS-One source data, you can use NEAMM in several ways
to target the data directly on the file system in its proprietary .PGI file format
and export it to disk in PST, MSG or EML format. Below are several workflows
that NEAMM can handle.

Overview
The Daegis AXS-One archiving solution has two components:


DATA files on disk



SIS files on disk

Nuix must have access to the DATA and SIS store pairs while processing AXS-One
files on disk in order to correctly parse the uniquely formatted AXS-One data.
AXS-One does not store any critical information related to the archive in a
relational database, like SQL.

Prerequisites
The following prerequisites must be met before AXS-One data can be processed using
Nuix:
Source data identified with a specific DATA and SIS store pair.


If source data has been moved from the original archive storage location,
it must be clearly marked as belonging to a specific DATA and SIS store
pair.

Physical Files
It is common for an AXS-One installation to archive emails to designated partitions
called “DATA” and “SIS” stores. Together, these stores can be referred to as a
“pair” or “DATA/SIS pair.” Each DATA and SIS store created by the administrators
must have a unique storage location and can have unique settings applied to it.
Generally, the DATA partition contains the emails themselves, while the SIS
partition contains any single-instanced data. DATA and SIS stores can be “paired”
with each other, or multiple DATA stores can have one or more SIS stores.

In order to correctly process AXS-One data, the paths for each SIS store must be
presented to Nuix. Therefore, one or more SIS stores must be available locally or
mapped to the Nuix server.
The path to each SIS store must be passed to Nuix at startup via the following
switch (if more than one SIS volume exists, separate the paths with a pipe
character (|)):
nuix.data.axsone.sisFolders="T:\AXSONE\SIS99|U:\AXSONE\SIS22"
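When several SIS volumes are involved, the switch value can be assembled programmatically to avoid quoting or separator mistakes. This is a small sketch: the helper name sis_folders_switch is hypothetical, and the output format simply mirrors the example line above.

```python
def sis_folders_switch(folders):
    """Build the sisFolders switch from a list of SIS store paths."""
    if not folders:
        raise ValueError("at least one SIS folder is required")
    for folder in folders:
        # A pipe inside a path would be read as a separator.
        if "|" in folder:
            raise ValueError(f"path contains a pipe character: {folder!r}")
    return 'nuix.data.axsone.sisFolders="' + "|".join(folders) + '"'


if __name__ == "__main__":
    print(sis_folders_switch([r"T:\AXSONE\SIS99", r"U:\AXSONE\SIS22"]))
```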
At the storage locations indicated for each DATA store, Daegis AXS-One data is
written to the file system in the form of a “set” of files. In this “set,” the .PGI
and the .DCM are the most critical for processing. The .PGI file is essentially a
pointer or map to the .DCM file. The .DCM file is a compressed container that
houses one or more messages and their attachments. Multiple “sets” may exist for
data that has archived content. A “set” is easily identifiable when files all share
the same base name.
There are other files created by AXS-One that may exist in the “set” such as in the
screenshot below; however, they are not relevant.


._HDR files contain header information from all messages within the .DCM
file.



.00n files are companion files to the .DCM.



.LCZ files, if present, are small Lucene text indexes



.SIS files contain single-instanced data, all associated to the data stored
within the .DCM

File Structure
Typical AXS-One files contain single copies of each message with attachments, as
seen below. If single-instancing has been enabled, loose attachments may also be
found in the container, such as below.

Single Instancing
It is possible for AXS-One admins to enable single-instancing in their archive.
When enabled, a SIS store will exist alongside the DATA store.
As of Nuix 6.0, support for single-instancing is available. Processing this data
correctly requires that ALL AXS-One SIS stores are presented to the Nuix server and
that their locations have been passed into Nuix using the Nuix startup switch that
was previously discussed in the “AXS-One Source Data” section of this document.
The DATA folder is where the archived messages primarily are stored. The SIS folder
is where the single-instanced attachments are stored. Nuix reads the hash of the
attachment from the email and attempts to find it in the paired SIS folder.



The SIS folder has a structure where the first few characters of the
hash of the attachment are used to nest the files.
»
For example: “W665MEFS6RUYEWYN3IHYHP2UNS.sis” would be found in
..\SISxx\2014\W6\65\ME.



The .sis_log file, located in the directory mentioned above, will
indicate the documents that reference this single-instanced item and
their location in the DATA store.
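The nesting can be reproduced from the example above, where the first six characters of the hash form three two-character folder levels. The sketch below generalizes from that single example, so the number of levels and their width are assumptions, as is the helper name. The guide does not say how the year folder is chosen, so it is passed in explicitly.

```python
from pathlib import PureWindowsPath


def sis_item_path(sis_root, year, sis_filename, levels=3, width=2):
    """Return the expected location of a .sis file inside a SIS store."""
    stem = PureWindowsPath(sis_filename).stem  # hash without the .sis extension
    parts = [stem[i * width:(i + 1) * width] for i in range(levels)]
    return PureWindowsPath(sis_root, str(year), *parts, sis_filename)


if __name__ == "__main__":
    print(sis_item_path(r"T:\AXSONE\SIS99", 2014, "W665MEFS6RUYEWYN3IHYHP2UNS.sis"))
```

A path built this way can be used to spot-check that a moved SIS store kept its original folder structure.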

As long as the switch is set with the SIS locations when Nuix is loaded, Nuix
has the necessary logic built in to automatically find any single-instanced
attachments and rehydrate them with the appropriate message.

File Distribution & Naming
As of Nuix 6, AXS-One data is automatically presented as it was archived. Default
Nuix behavior is to process a single copy of each message in an AXS-One container.
This is necessary to ensure all relevant metadata can be maintained at export.
The mailbox that the message was archived in, as well as other relevant
information, can be identified using the AXS-One metadata which Nuix gathers
from the AXS-One wrappers [Unnamed Container].

Folders are created in a date-based format (YYYYMMDD). AXS-One file “sets” are
named in a hash-based format. For example, a file path could look like
C:\DATA99\20150201\00840119.pgi. If multiple “sets” are created on the same date,
all subsequent files will have a uniquely named hash. If multiple DATA stores
exist, multiple “sets” may get created, each uniquely named.
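For inventory purposes, files can be grouped into “sets” by their date folder and shared base name. The sketch below is illustrative only: the function name is hypothetical, and it assumes every file sits directly inside a YYYYMMDD folder as in the example path above.

```python
from collections import defaultdict
from datetime import datetime
from pathlib import PureWindowsPath


def group_sets(paths):
    """Group AXS-One files into sets keyed by (archive date, base name)."""
    sets = defaultdict(list)
    for raw in paths:
        path = PureWindowsPath(raw)
        # The parent folder name is the archive date, e.g. 20150201.
        archived_on = datetime.strptime(path.parent.name, "%Y%m%d").date()
        sets[(archived_on, path.stem)].append(path.suffix.lower())
    return dict(sets)


if __name__ == "__main__":
    listing = [
        r"C:\DATA99\20150201\00840119.pgi",
        r"C:\DATA99\20150201\00840119.dcm",
    ]
    for (archived_on, base), extensions in group_sets(listing).items():
        print(archived_on, base, extensions)
```

A set that lacks either its .pgi or its .dcm entry is worth investigating before the folder is added to a Job.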

Note

Typical best practice is to process all AXS-One data from
the original storage locations. However, it is possible to
move the data, provided the following conditions are met:



Data from individual DATA stores is always kept
separate.



Any moved files are identified with their specific
DATA store and folder names are maintained.



If single-instancing is in place, SIS stores can
either be left in their original location or ALL data
must be moved. The exact folder structure must be
maintained to ensure all single-instanced data is
rehydrated correctly.

Supported Workflows
Mailbox Archive to User PST
1. In the Legacy Archive dropdown box, select Daegis AXS-One
2. Select Journal from the Archive Type radio button selections
3. Select User from the Output Type radio button selections
4. Select PST as the Lightspeed Extraction Output

Note

It is highly recommended that you select PST when your Output
Type is set to User. PSTs are much easier to manage since
emails will be grouped by User on the file system, and long
file path name issues will be avoided.

5. If a date filter is requested, enter dates in the From Date: and To Date:
date chooser fields.
6. Choose whether the Source Data exists in Folders or Files.
a. If Folders or Files, use the navigation pane to select the drive letter
where the source data exists. You may also choose to Compute Batch
Size. This will allow you to obtain additional metrics while the job
is processing, like Percentage Completed and Total Bytes to process.

Tip

It is recommended that each Job target no more than 500
GB – 1 TB of compressed source data. This gives optimal
performance and limits the cost of a system failure: if a
system crashes mid-processing, it is more efficient to
restart a Job that takes 24 hours to complete than one
that may take 7 days to complete.

Warning

Daegis AXS-One is currently not supported on EMC Centera
storage platforms.
Please see your Nuix representative for more details.

7. Select the location of the AXS-One SIS folders:
a. One or more SIS folders can be selected.

Note

Since Daegis AXS-One does not utilize a relational database
like SQL, there is no requirement for connecting to a SQL
database. However, it is critical to select the correct SIS
folders that match the appropriate DATA folders.

8. The AXS-One Settings will default to:

Name: Skip SIS Lookups
Default Value: False
Description: If enabled, Nuix will skip SIS lookups and attempt to process the
.PGI file directly. This should be enabled if SIS folders are not enabled or
there are issues retrieving attachments from the SIS folders.

9. The Worker Side Script (WSS) section will allow you to configure:

Name: Exclude Unresponsive Items
Default Value: True
Description: When enabled, Nuix will not export any items that do not respond to
your Search Terms and Mapping CSV. If you set this to False, you will need to
adjust your Mapping CSV to include an entry for:
unresponsive,unresponsive.pst

Name: Verbose Logging
Default Value: False
Description: When disabled, Nuix will not include any verbose logging at the
WSS level for troubleshooting purposes. Setting this to True allows for easier
troubleshooting; however, the logs will be substantially larger.

Name: Content Filtering
Default Value: Email, RSS Feed, Calendar, Contact
Description: When enabled, all top-level emails (RSS Feeds included), Calendar
and Contact items will be extracted. To filter out any of these kinds, de-select
the item kind you wish to filter.

Name: Search Terms CSV
Default Value: empty
Description: You must select a list of Search Terms in CSV format. Search Terms
is a 2-column CSV that includes a Flag in Column A and a Search Term in Column B.
You do not need a header row. Each Search Term should be an SMTP, X400 or X500
address. You can also add multiple Search Terms to the same Flag, for example:
Alex.Chatzistamatis,alex.chatzistamatis@nuix.com
Alex.Chatzistamatis,alex@nuix.com
Nuix will scan for these search terms across Communication Metadata (From, To,
Cc, Bcc) and the Expanded-DL metadata field.

Name: Mapping CSV
Default Value: empty
Description: You must select a Mapping in CSV format. Like Search Terms, Mapping
is a 2-column CSV that includes a Flag Name in Column A and an Output Name in
Column B. You do not need a header row. Each line should include the Flag from
your Search Terms CSV and the Output Name. If you add multiple Search Terms to a
single Flag, you do not need to add multiple Flags in the Mapping CSV, for example:
Alex.Chatzistamatis,Alex.Chatzistamatis.pst

10. Review the settings you’ve selected for this job and click the Add
Batch to Grid button.
a. If you have more mailbox archives you need to process, select a new
list of users and click Add Batch to Grid, continuing this until all
users have been added to a Batch on the Grid.

Warning

Performance may vary, especially based on the settings
configured on the Lightspeed Settings tab in Global
Settings. Be sure to ALWAYS review settings prior to
starting Jobs.

11. When you are ready to begin processing, click on the Start Job button.

Mailbox Archive to Flat PST, MSG or EML
1. In the Legacy Archive dropdown box, select Daegis AXS-One
2. Select Journal from the Archive Type radio button selections
3. Select Flat from the Output Type radio button selections
4. Select PST, MSG or EML as the Lightspeed Extraction Output

Note

It is highly recommended that you select PST when your Output
Type is set to User. PSTs are much easier to manage since
emails will be grouped by User on the file system, and long
file path name issues will be avoided.

5. If a date filter is requested, enter dates in the From Date: and To Date:
date chooser fields.
6. Choose whether the Source Data exists in Folders or Files.
a. If Folders or Files, use the navigation pane to select the drive letter
where the source data exists. You may also choose to Compute Batch
Size. This will allow you to obtain additional metrics while the job
is processing, like Percentage Completed and Total Bytes to process.

Tip

It is recommended that each Job target no more than 500
GB – 1 TB of compressed source data. This gives optimal
performance and limits the cost of a system failure: if a
system crashes mid-processing, it is more efficient to
restart a Job that takes 24 hours to complete than one
that may take 7 days to complete.

Warning

Daegis AXS-One is currently not supported on EMC Centera
storage platforms.
Please see your Nuix representative for more details.

7. Select the location of the AXS-One SIS folders:
a. One or more SIS folders can be selected.

Note

Since Daegis AXS-One does not utilize a relational database
like SQL, there is no requirement for connecting to a SQL
database. However, it is critical to select the correct SIS
folders that match the appropriate DATA folders.

8. The AXS-One Settings will default to:

Name: Skip SIS Lookups
Default Value: False
Description: If enabled, Nuix will skip SIS lookups and attempt to process the
.PGI file directly. This should be enabled if SIS folders are not enabled or
there are issues retrieving attachments from the SIS folders.

9. Review the settings you’ve selected for this job and click the Add Batch to
Grid button.
a. If you have more mailbox archives you need to process, select a new
list of users and click Add Batch to Grid, continuing this until all
users have been added to a Batch on the Grid.

Warning

Performance may vary, especially based on the settings
configured on the Lightspeed Settings tab in Global
Settings. Be sure to ALWAYS review settings prior to
starting Jobs.

10. When you are ready to begin processing, click on the Start Job button.

Converting Legacy Email Data
Interface Overview
Upon launching the Email Conversion module, you will see the interface as shown
below. Many of the options in this interface will be enabled/disabled based on
selections made. This is by design, as several options may not be relevant or
necessary for specific source archives or workflows.

1. Source Data Type – Select the format of the source data (NSF).
2. Output Data Type – Select the format of the final conversion (PST, MSG or EML).
3. Perform Top-Level Item Deduplication – Use Redis to perform a global
deduplication based on the MD5 hash of the top-level item.
4. From/To Date – Used to filter the email data by date.
5. Custodian Source Location – Select the location of the Source Data.
6. Mapping File – Select the Mapping File to be used by Lightspeed.
7. Select / Select By Group ID – Select all jobs in the Grid or select by Group ID.
8. Grid – Where added jobs will be displayed.
9. Start Job – Start the selected job in the grid for processing.
10. Export Grid to CSV – Exports the current grid view to CSV format.
11. Lightspeed Exporter Report Consolidator – Consolidates all Lightspeed
Exporter Metrics and Exporter Errors into a single report.
12. Reload Grid – Reloads jobs in the grid from previous migrations.
13. Global Settings – View/Change previously configured Global Settings.

Tip

Before attempting to perform any migration work, be sure to
check Global Settings and make sure that the Nuix
Directories, Lightspeed Settings and Database tabs are
configured correctly.

Performing a User NSF to User PST Conversion
1. In the Source Data Type dropdown box, select NSF
2. Select PST as the Output Data Type
3. If top-level item deduplication is necessary, enable the Perform Top-Level
Item Deduplication checkbox.
Note

Performing Top-Level Item Deduplication requires the use of
Redis. Redis is not installed or configured by NEAMM; it must be
set up prior to using NEAMM. Please consult with your Nuix
representative to understand if this is necessary for your
project.

4. Select the directory where your NSFs are stored in the Custodian Source
Location field and click Load Source Info.
5. Select the appropriate Mapping File and click Load Mapping Data to map your
NSF to PST output.
6. If a date filter is requested, enter dates in the From Date: and To Date:
date chooser fields.
a. You may add multiple date ranges using the “+” sign.
7. To select all of the jobs in the Grid, click Select All;
otherwise, use Select By Group ID from the dropdown.

Warning

Performance may vary, especially based on the settings
configured on the Lightspeed Settings tab in Global
Settings. Be sure to ALWAYS review settings prior to
starting Jobs.

8. If a date filter is requested, enter dates in the From Date: and To Date:
date chooser fields.
a. You may add multiple date ranges using the “+” sign.
b. Click Add to Selected to add your date ranges to the selected batches.
9. When you are ready to begin processing, click on the Start Job button.
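
The top-level item deduplication offered in step 3 can be pictured with a short sketch. This is only an illustration of the idea of keeping the first item seen for each MD5 hash; it is not how NEAMM or the Nuix Engine implements it. A plain Python set stands in for the shared Redis store, and hashing the raw item bytes is an assumption, since this guide does not say exactly which bytes are hashed. The names TopLevelDeduplicator and deduplicate are hypothetical.

```python
import hashlib


class TopLevelDeduplicator:
    """Tracks MD5 digests of top-level items that have already been seen."""

    def __init__(self):
        # Assumption: a local set stands in for the shared Redis store.
        self._seen = set()

    def is_first_occurrence(self, item_bytes: bytes) -> bool:
        """Return True the first time a digest is seen, False for duplicates."""
        digest = hashlib.md5(item_bytes).hexdigest()
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True


def deduplicate(items):
    """Keep only the first copy of each top-level item, preserving order."""
    dedup = TopLevelDeduplicator()
    return [item for item in items if dedup.is_first_occurrence(item)]
```

Because the store is shared across all jobs, the deduplication is global: a message already converted for one custodian would be skipped when it is encountered again for another.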

Ingesting Email Data into Exchange
Interface Overview
Upon launching the EWS Ingestion module, you will see the interface as shown below.
Many of the options in this interface will be enabled or disabled based on
the selections you make. This is by design, since several options may not be relevant or
necessary for specific source archives or workflows.

No.  Name                                      Description
1    PST Location                              Select the PSTs from the File System
2    Mapping File                              Select the Lightspeed Mapping File
3    Select / Select By Group ID               Select all jobs in Grid or Select by Group ID
4    Quick Filter                              Search filter on any column in the grid
5    Clear                                     Clears the grid
6    Grid                                      Where added jobs will be displayed
7    Start Job                                 Start the selected job in the grid for processing
8    Export Exceptions to PST                  Export items not uploaded to EWS to PST format
9    Export Grid to CSV                        Exports the current grid view to CSV format
10   Show Processing Details                   Shows success and exception details per custodian
11   Lightspeed Exporter Report Consolidator   Will consolidate all Lightspeed Exporter Metrics and Exporter Error into a single report
12   Reload Grid                               Reloads jobs in the grid from previous migrations
13   Global Settings                           View/Change previously configured Global Settings

Tip

Before attempting to perform any migration work, be sure to
check Global Settings and make sure that the Nuix
Directories and Exchange Web Services tabs are configured
correctly.

Working with Exchange Web Services
Overview
Microsoft’s Exchange email solution gives software vendors access through its
Exchange Web Services (EWS) API. Using this API, Microsoft provides the ability to
either extract data from or ingest data into EWS. An Exchange mailbox consists of two primary
components:

• Mailbox
  o Information Store
  o Recoverable Items
    – Purges
    – Deletions
• Archive
  o Information Store
  o Recoverable Items
    – Purges
    – Deletions

In order to connect to an EWS mailbox, you must first connect with a user or
service account. After successful authentication, Nuix will be able to extract or
ingest data.


Prerequisites
The following prerequisites should be met in order for Nuix to interact with EWS:
• EWS Service Account created
  o Username (SMTP address) and Password
• EWS environment prepared
  o A valid EWS SMTP address must be available for each mailbox that will be
    targeted
  o If targeting an EWS personal archive, be sure that it is online and
    available to the mailbox

Source Data
An Exchange mailbox or archive includes two different “partitions”:
• The first “partition” is the standard mailbox folder structure that is exposed to
  users, such as Inbox, Deleted Items, Sent Items, Drafts, and any custom folders
  a user creates.
• The second “partition” is the Recoverable Items folder, which may
  include: Deletions, Versions, Purges, Audits, DiscoveryHolds and Calendar
  Logging.
Only Exchange Administrators can access and/or view all of the different folders in
Recoverable Items. The Recoverable Items folder is hidden from users. The only
exception to this rule is a specific folder in Recoverable Items called
“Deletions”. This “Deletions” folder does not appear in the standard mailbox
folder structure that the mailbox owner can see; however, the mailbox owner can access the
content of this folder by using the “Recover Deleted Items” option in Outlook.
The Recoverable Items folder contains the following subfolders:
• Deletions - this subfolder contains all items deleted from the Deleted Items
  folder. (In Outlook, a user can permanently delete an item by pressing
  Shift+Delete.) This subfolder is exposed to users through the Recover Deleted
  Items feature in Outlook and Outlook Web App.
• Versions - if In-Place Hold or Litigation Hold is enabled, this subfolder
  contains the original and modified copies of the deleted items. This folder
  isn't visible to end users.
• Purges - if either Litigation Hold or single item recovery is enabled, this
  subfolder contains all items that are purged. This folder isn't visible to
  end users.
• Audits - if mailbox audit logging is enabled for a mailbox, this subfolder
  contains the audit log entries. To learn more, see Microsoft's documentation
  on mailbox audit logging.
• DiscoveryHolds - if In-Place Hold is enabled, this subfolder contains all
  items that meet the hold query parameters and are purged.
• Calendar Logging - this subfolder contains calendar changes that occur within
  a mailbox. This folder isn’t available to users.


Access Requirements
In order for Nuix to ingest data to EWS mailboxes, the following access
requirements will need to be provisioned:
• EWS fully qualified tenant name (FQTN), such as example.onmicrosoft.com.
• EWS service accounts, such as NuixAppImp@example.com (UPN, internal/external
  addresses, and passwords).
• EWS test accounts, such as TestUser@example.com (UPN, internal/external
  addresses, and passwords).
  o These should be standard mailboxes that mirror production, with a
    main mailbox and/or archive, if available.
• Configuration of service accounts
  o These accounts should be given delegate access to, or application
    impersonation rights over, the production mailboxes.
  o Multiple accounts may need to be made available depending on final
    architecture specifications.

Connect to EWS via PowerShell
$cred = Get-Credential
$Session = New-PSSession -ConfigurationName Microsoft.Exchange -ConnectionUri https://ps.outlook.com/powershell/ -Credential $cred -Authentication Basic -AllowRedirection
Import-PSSession $Session


Assigning Mailbox Delegation Rights to Service Accounts
Add-MailboxPermission -Identity Test1@example.com -User NuixDelegate1@example.com -AccessRights FullAccess

Note

Nuix will require the following PowerShell command to be
executed, which will give each upload account delegate
access to all of the mailboxes in the environment.
Get-Mailbox | Add-MailboxPermission -User NuixDelegate1@example.com -AccessRights FullAccess

Assigning Application Impersonation to Service Accounts
New-ManagementRoleAssignment -Name NuixImpersonation -Role ApplicationImpersonation -User NuixAppImp@example.com

Tip

Nuix strongly recommends the usage of service accounts with
the Application Impersonation role when running multiple
Nuix instances across multiple systems. Service accounts
with Mailbox Delegation access will be throttled at a rate
much higher than service accounts with impersonation
enabled.
Using Impersonation also removes the need for providing
Service Accounts with FULL access to all of the Exchange
mailboxes.

Supported Workflows
Ingesting PST Data into an EWS Mailbox/Archive
1. Browse to your PSTs using the Custodian PST Location field.
2. When prompted to consolidate your PSTs, be sure to do so if you used Nuix to
create your PSTs. If your PSTs were provided in user folders already, there
is no need to consolidate.
3. Select the Mapping CSV which maps your PSTs to an EWS mailbox or archive.

Tip

You must select a CSV file that maps the Custodian PSTs to
the destination EWS mailbox/archive. The Mapping CSV is a 5
column CSV that includes the Custodian Name in Column A, a
folder you want to place the data into in Column B, the
destination EWS partition in Column C, the custodian’s
Exchange SMTP Address in Column D, and Group ID in Column E.
You do not need a header row.
Each line should look similar to this:
Alex Chatzistamatis,ARCHIVEDATA,alex.chatzistamatis@nuix.com,archive,1
The possible locations for Column C are:
• mailbox
• archive
• purges
• archive_purges

4. To select all of the jobs in the Grid, click Select All;
otherwise, use Select By Group ID from the dropdown.

Warning

Performance may vary, especially based on the settings
configured on the Exchange Web Services tab in Global
Settings. Be sure to ALWAYS review settings prior to
starting Jobs.

5. When you are ready to begin processing, click on the Start Job button.
Reprocessing EWS Exceptions
Below is a list of common errors when working with EWS and proposed solutions:

• microsoft.exchange.webservices.data.ServiceRequestException: The request
  failed. Server responded with 500 - Internal Server Error
  o The Server 500 error is the standard throttling error.
  o Throttling can be controlled with the switches mentioned above.
• java.lang.IllegalArgumentException: FTS data is too large
  o This error occurs when the data that is being pushed into EWS is larger
    than the allotted size for the tenancy.
  o Size limits can be controlled with the switch mentioned above.
  o The default size will need to be increased in the tenancy; however,
    Microsoft needs to approve this.
• java.io.IOException: Upload failed [Mailbox has exceeded maximum mailbox size
  o This error occurs when an individual’s mailbox has exceeded its maximum
    size (usually 50GB).
  o If this occurs, the processing run should be considered invalidated,
    and a full run of the same exact data should occur again, AFTER the
    previously ingested data has been deleted.
• Skipping item for deactivated exporter for destination
  “user.name@mailbox.com”
  o This error occurs when the mailbox has not been set up correctly, does
    not exist, or the admin account does not have delegate access to push
    information into it.
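
When reviewing a large exception export, it can help to bucket each message by the causes listed above. The sketch below is a hypothetical helper, not part of NEAMM: the substrings are copied from the error messages in this section, while the short labels (throttling, item_too_large, and so on) and the function name classify_ews_error are made up for illustration.

```python
# Substrings taken from the error messages listed in this section, paired
# with a short label for the cause described for each. Labels are hypothetical.
KNOWN_EWS_ERRORS = [
    ("Server responded with 500", "throttling"),
    ("FTS data is too large", "item_too_large"),
    ("Mailbox has exceeded maximum mailbox size", "mailbox_full"),
    ("Skipping item for deactivated exporter", "mailbox_unavailable"),
]


def classify_ews_error(message: str) -> str:
    """Return the label of the first known error found in the message."""
    for needle, label in KNOWN_EWS_ERRORS:
        if needle in message:
            return label
    return "unknown"
```

Items labelled mailbox_full call for the most attention, since this section advises deleting the previously ingested data and rerunning the whole batch rather than retrying individual items.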


Extracting Email Data from Exchange
Interface Overview
Upon launching the EWS Extraction module, you will see the interface as shown
below. Many of the options in this interface will be enabled or disabled based on
the selections you make. This is by design, since several options may not be relevant or
necessary for specific source archives or workflows.

No.  Name                                      Description
1    Lightspeed Extraction Output              Select the format of the extracted data (PST, EML or MSG)
2    Custodian SMTP CSV                        The list of custodian EWS mailboxes to be processed
3    From/To Date                              Used to filter the email data by date
4    Select / Select By Group ID               Select all jobs in Grid or Select by Group ID
5    Grid                                      Where added jobs will be displayed
6    Start Job                                 Start the selected job in the grid for processing
7    Export Grid to CSV                        Exports the current grid view to CSV format
8    Reload Grid                               Reloads jobs in the grid from previous migrations
9    Global Settings                           View/Change previously configured Global Settings


Tip

Before attempting to perform any migration work, be sure to
check Global Settings and make sure that the Nuix
Directories and Exchange Web Services tabs are configured
correctly.

Working with Exchange Web Services
Overview
Microsoft’s Exchange email solution gives software vendors access through its
Exchange Web Services (EWS) API. Using this API, Microsoft provides the ability to
either extract data from or ingest data into EWS. An Exchange mailbox consists of two primary
components:

• Mailbox
  o Information Store
  o Recoverable Items
    – Purges
    – Deletions
• Archive
  o Information Store
  o Recoverable Items
    – Purges
    – Deletions

In order to connect to an EWS mailbox, you must first connect with a user or
service account. After successful authentication, Nuix will be able to extract or
ingest data.

Prerequisites
The following prerequisites should be met in order for Nuix to interact with EWS:
• EWS Service Account created
  o Username (SMTP address) and Password
• EWS environment prepared
  o A valid EWS SMTP address must be available for each mailbox that will be
    targeted
  o If targeting an EWS personal archive, be sure that it is online and
    available to the mailbox

Source Data
An Exchange mailbox or archive includes two different “partitions”:
• The first “partition” is the standard mailbox folder structure that is exposed to
  users, such as Inbox, Deleted Items, Sent Items, Drafts, and any custom folders
  a user creates.
• The second “partition” is the Recoverable Items folder, which may
  include: Deletions, Versions, Purges, Audits, DiscoveryHolds and Calendar
  Logging.
Only Exchange Administrators can access and/or view all of the different folders in
Recoverable Items. The Recoverable Items folder is hidden from users. The only
exception to this rule is a specific folder in Recoverable Items called
“Deletions”. This “Deletions” folder does not appear in the standard mailbox
folder structure that the mailbox owner can see; however, the mailbox owner can access the
content of this folder by using the “Recover Deleted Items” option in Outlook.
The Recoverable Items folder contains the following subfolders:
• Deletions - this subfolder contains all items deleted from the Deleted Items
  folder. (In Outlook, a user can permanently delete an item by pressing
  Shift+Delete.) This subfolder is exposed to users through the Recover Deleted
  Items feature in Outlook and Outlook Web App.
• Versions - if In-Place Hold or Litigation Hold is enabled, this subfolder
  contains the original and modified copies of the deleted items. This folder
  isn't visible to end users.
• Purges - if either Litigation Hold or single item recovery is enabled, this
  subfolder contains all items that are purged. This folder isn't visible to
  end users.
• Audits - if mailbox audit logging is enabled for a mailbox, this subfolder
  contains the audit log entries. To learn more, see Microsoft's documentation
  on mailbox audit logging.
• DiscoveryHolds - if In-Place Hold is enabled, this subfolder contains all
  items that meet the hold query parameters and are purged.
• Calendar Logging - this subfolder contains calendar changes that occur within
  a mailbox. This folder isn’t available to users.


Access Requirements
In order for Nuix to extract data from EWS mailboxes, the following access
requirements will need to be provisioned:
• EWS fully qualified tenant name (FQTN), such as example.onmicrosoft.com.
• EWS service accounts, such as NuixAppImp@example.com (UPN, internal/external
  addresses, and passwords).
• EWS test accounts, such as TestUser@example.com (UPN, internal/external
  addresses, and passwords).
  o These should be standard mailboxes that mirror production, with a
    main mailbox and/or archive, if available.
• Configuration of service accounts
  o These accounts should be given delegate access to, or application
    impersonation rights over, the production mailboxes.
  o Multiple accounts may need to be made available depending on final
    architecture specifications.

Connect to EWS via PowerShell
$cred = Get-Credential
$Session = New-PSSession -ConfigurationName Microsoft.Exchange -ConnectionUri https://ps.outlook.com/powershell/ -Credential $cred -Authentication Basic -AllowRedirection
Import-PSSession $Session

Assigning Mailbox Delegation Rights to Service Accounts

Add-MailboxPermission -Identity Test1@example.com -User NuixDelegate1@example.com -AccessRights FullAccess

Note

Nuix will require the following PowerShell command to be
executed, which will give each upload account delegate
access to all of the mailboxes in the environment.
Get-Mailbox | Add-MailboxPermission -User NuixDelegate1@example.com -AccessRights FullAccess

Assigning Application Impersonation to Service Accounts
New-ManagementRoleAssignment -Name NuixImpersonation -Role ApplicationImpersonation -User NuixAppImp@example.com

Tip

Nuix strongly recommends the usage of service accounts with
the Application Impersonation role when running multiple
Nuix instances across multiple systems. Service accounts
with Mailbox Delegation access will be throttled at a rate
much higher than service accounts with impersonation
enabled.
Using Impersonation also removes the need for providing
Service Accounts with FULL access to all of the Exchange
mailboxes.

Supported Workflows
Extracting EWS Mailbox/Archive data to a PST
1. In the Lightspeed Extraction Output dropdown box, select PST, MSG or EML

Note

Due to the complex nature of mailbox folder structures that
may exist, we recommend extracting to PST. PSTs are much
easier to manage since emails will be grouped by User on the
file system, and long file path names issues will be
avoided.

2. Browse and select a Custodian SMTP CSV. Click Load SMTP Info after you’ve
selected the file.

Tip

You must select a CSV file that contains a list of Custodian
SMTP mailboxes and the locations you are looking to target.
The Custodian SMTP CSV file is a 3 column CSV that includes the
Exchange SMTP Address in Column A, a list of locations to
target in Column B, and Group ID in Column C. You do not
need a header row.
Each line should look similar to this:
alex.chatzistamatis@nuix.com,mailbox/mailbox_recoverable/archive/archive_recoverable,1
marty.mcfly@nuix.com,mailbox/mailbox_recoverable/archive/archive_recoverable,1
The possible locations for Column B are:
• root_mailbox
• root_archive
• mailbox_purges
• archive_purges
• mailbox_recoverable
• archive_recoverable
• public folders
• mailbox/archive
• mailbox/mailbox_recoverable
• archive/archive_recoverable
• mailbox/mailbox_recoverable/archive/archive_recoverable
3. If a date filter is requested, enter dates in the From Date: and To Date:
date chooser fields.
4. To begin extracting all of the selected EWS Mailboxes, click Select All,
otherwise, Select By Group ID from the dropdown.

Warning

Performance may vary, especially based on the settings
configured on the Exchange Web Services tab in Global
Settings. Be sure to ALWAYS review settings prior to
starting Jobs.

5. When you are ready to begin processing, click on the Start Job button.
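
Before loading a Custodian SMTP CSV in step 2, it may be worth checking the file against the layout described in the Tip. The following is a minimal sketch under stated assumptions, not a NEAMM feature: the function name validate_smtp_csv is hypothetical, the location values are copied from the list in this section, and the only check made on Group ID is that it is not empty, because this guide does not state which values are allowed.

```python
import csv
import io

# Location values copied from the list in this section.
VALID_LOCATIONS = {
    "root_mailbox", "root_archive", "mailbox_purges", "archive_purges",
    "mailbox_recoverable", "archive_recoverable", "public folders",
    "mailbox/archive", "mailbox/mailbox_recoverable",
    "archive/archive_recoverable",
    "mailbox/mailbox_recoverable/archive/archive_recoverable",
}


def validate_smtp_csv(text: str):
    """Return (line_number, problem) tuples; an empty list means no problems found."""
    problems = []
    for line_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # ignore blank lines
        if len(row) != 3:
            problems.append((line_number, f"expected 3 columns, found {len(row)}"))
            continue
        address, location, group_id = (cell.strip() for cell in row)
        if "@" not in address:
            problems.append((line_number, "Column A is not an SMTP address"))
        if location not in VALID_LOCATIONS:
            problems.append((line_number, f"unknown location: {location}"))
        if not group_id:
            problems.append((line_number, "Column C (Group ID) is empty"))
    return problems
```

Calling validate_smtp_csv(open(path, newline="").read()) on a real file would list every row that does not fit the three-column layout, which is easier to correct before the grid is populated than after a job has failed.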


Appendix I: Backup/Restore SQL Databases
The SQL database contains critical information associated with the email archive
software. Nuix uses information from the SQL DB to maintain single instancing as
well as extract Distribution List and BCC recipient information, otherwise
unattainable from the source data. Once the SQL database is restored on the Nuix
server, the source data and SQL DB will be targeted together during the processing
phase.

SQL Backup Workflow
Accessing SQL Server Management Studio
Targeting Desired Archive Database(s)
Configuring Desired Archive Database(s)
Assigning Database Location and Confirming Drive Space
Appropriately Naming Database and Selecting Proper File Extension

Accessing SQL Server Management Studio
• Open Microsoft SQL Server Management Studio via the program menu.
• At the Connect to Server window:
  – Confirm Authentication selection is “SQL Server Authentication”.
  – Type in Login\Password credentials.
  – Select OK.

Targeting Desired Archive Database(s)
• In Microsoft SQL Server Management Studio:
  – Expand Databases.
  – Right-click the database you want to back up and select Tasks\Back Up…

Configuring Desired Archive Database(s)


• At the Back Up Database window:
  – Confirm the “Source Database” name.
  – Confirm “Backup Type” is Full.
  – Confirm the “Backup Set” name is the Source Database name followed by
    “-Full Database Backup”.
  – Leave “Backup set will expire:” at the default, which is 0.
  – Select “Add” in the “Destination” section.


Assigning Database Location and Confirming Drive Space


• At the Select Backup Destination screen:
  – Select File name and then the ellipsis to set the desired location.
• At the Locate Database Files screen:
  – Expand the appropriate drive and choose the location that will store
    the backup.


Correctly Naming the Database and Choosing the Proper File Extension
• Name the file after the database and append the .bak extension to the
  file name.
• Select OK.
• Select OK at the Select Backup Destination screen.
• Select OK at the Back Up Database screen.
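
For environments with many archive databases, the dialog choices above can also be expressed as a T-SQL BACKUP DATABASE statement and run as a query. The sketch below only builds the statement text; it is an illustration under assumptions and not a Nuix-supplied script. The function name build_backup_statement, the database name EVDirectory and the D:\Backups folder are hypothetical, and the statement keeps the server defaults for every option not named in the steps above.

```python
import ntpath


def build_backup_statement(database: str, backup_dir: str) -> str:
    """Build a full-backup T-SQL statement mirroring the dialog choices above."""
    # Escape the closing bracket so the name is safe as a bracketed identifier.
    safe_name = database.replace("]", "]]")
    # File is named after the database with a .bak extension (Windows path rules).
    backup_file = ntpath.join(backup_dir, database + ".bak")
    # Double any single quotes so the values are safe inside string literals.
    safe_path = backup_file.replace("'", "''")
    set_name = (database + "-Full Database Backup").replace("'", "''")
    return (
        f"BACKUP DATABASE [{safe_name}] "
        f"TO DISK = N'{safe_path}' "
        f"WITH NAME = N'{set_name}';"
    )


if __name__ == "__main__":
    # Hypothetical database name and folder, for illustration only.
    print(build_backup_statement("EVDirectory", "D:\\Backups"))
```

Confirm that the target drive has enough free space before running the generated statement, just as with the dialog-based workflow.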

SQL Restore Workflow
Accessing SQL Server Management Studio
Restoring Desired Archive Database(s)

Accessing SQL Server Management Studio
• Open Microsoft SQL Server Management Studio via the program menu.
• At the Connect to Server window:
  – Confirm Authentication selection is “SQL Server Authentication”.
  – Type in Login\Password credentials.
  – Select OK.

Restoring Desired Archive Database(s)


• In Microsoft SQL Server Management Studio:
  – Expand Databases.
  – Right-click the database you want to restore and select
    Tasks\Restore\Database…

Configuring Desired Archive Database(s)


• At the Restore Database window:
  – Select the database you wish to restore in the “From device:” section.
  – Select the database name you wish to apply to this database in the “To
    database:” section.
  – Confirm the database restore by selecting the checkbox.
• You can configure Restore Options by choosing “Options” in the left pane.
• Click OK.


Appendix II: Archive SQL Queries
Veritas Enterprise Vault
The following queries are hard-coded directly into the Nuix Engine when handling
Enterprise Vault data.
Main SQL queries used during Nuix processing of Symantec Enterprise Vault data to
confirm that the Vault Store databases exist and the Vault Store Partitions have a
valid path on the system:
"SELECT DatabaseDSN, VaultStoreIdentity, VaultStoreEntryId " +
"FROM [" + directoryDatabase + "].[dbo].[VaultStoreEntry] " +
"ORDER BY VaultStoreIdentity";
"SELECT IdPartition, VaultStoreEntryId, PartitionRootPath " +
"FROM [" + directoryDatabase + "].[dbo].[PartitionEntry] " +
"WHERE VaultStoreEntryId = ?";
SQL query used during Nuix processing of Symantec Enterprise Vault data to find all
associated parent Transaction IDs and SIS Parts (DVSSP files) needed to reconstitute the
message and all single-instanced attachments:
"SELECT sp.ParentTransactionId, sp.IdPartition, sp.VaultStoreIdentity, " +
"s.CollectionIdentity, sp.ArchivedDateUTC " +
"FROM Saveset s, Saveset_SISPart ssp, SISPart sp " +
"WHERE s.SavesetIdentity = ssp.SavesetIdentity AND ssp.SISPartIdentity = " +
"sp.SISPartIdentity " +
" AND sp.FPDistinctionByte = ? AND sp.FPHashPart1 = ? AND s.IdTransaction = ? " +
"ORDER BY s.CollectionIdentity DESC";
SQL query used during Nuix processing of Symantec Enterprise Vault data to find the
folder name and path that the item is archived in:
"SELECT af.FolderName, af.FolderPath " +
"FROM" +
" [" + directoryDatabase + "].dbo.ArchiveFolder af," +
" [" + directoryDatabase + "].dbo.[Root] r," +
" [" + vaultStoreEntry.databaseName + "].dbo.Vault v," +
" [" + vaultStoreEntry.databaseName + "].dbo.Saveset s " +
"WHERE" +
" s.VaultIdentity = v.VaultIdentity AND" +
" v.VaultID = r.VaultEntryId AND" +
" r.RootIdentity = af.RootIdentity AND" +
" s.IdTransaction = ?";
SQL query used during Nuix processing of Symantec Enterprise Vault data to find the
user Exchange/AD details for each transaction:
"SELECT eme.mbxNtUser, eme.ADMbxDN " +
"FROM " +
" [" + directoryDatabase + "].[dbo].[ExchangeMailboxEntry] eme, " +
" [" + vaultStoreEntry.databaseName + "].[dbo].[view_Saveset_Archive_Vault] sav " +
"WHERE " +
" eme.DefaultVaultId = sav.VaultId AND " +
" sav.IdTransaction = ?";
SQL query used during Nuix processing of Symantec Enterprise Vault data to find the
archive details (Archived Date, Vault Store Entry ID, Archive Name, Archive
Description, Archive Point ID, Saveset ID) for each transaction:
"SELECT " +
" [" + vaultStoreEntry.databaseName +
"].dbo.view_Saveset_Archive_Vault.ArchivedDate, " +
" [" + directoryDatabase + "].dbo.ArchiveView.VaultStoreEntryId, " +
" [" + directoryDatabase + "].dbo.ArchiveView.ArchiveName, " +
" [" + directoryDatabase + "].dbo.ArchiveView.ArchiveDescription, " +
" [" + directoryDatabase + "].dbo.ArchiveView.[SID] " +
"FROM" +
" [" + directoryDatabase + "].dbo.ArchiveView " +
"INNER JOIN " +
" [" + vaultStoreEntry.databaseName + "].dbo.view_Saveset_Archive_Vault " +
" ON" +
" [" + vaultStoreEntry.databaseName +
"].dbo.view_Saveset_Archive_Vault.ArchivePointId = [" + directoryDatabase +
"].dbo.ArchiveView.VaultentryId " +
"WHERE" +
" [" + vaultStoreEntry.databaseName +
"].dbo.view_Saveset_Archive_Vault.IdTransaction = ?";

EMC EmailXtender
Main SQL query used during Nuix processing of EMC EmailXtender data to look up
distribution list recipients and append them to the ‘Expanded-DL’ metadata property:
"SELECT [EmailAddress]" +
" FROM [" + databaseName + "].[dbo].[EmailAddress] " +
" WHERE [EmailId] IN (" +
" SELECT [EmailId] " +
" FROM [" + databaseName + "].[dbo].[Route]" +
" WHERE [MD5HashKey] = ? AND [RouteTypeId] = ?) ";
Additional notes:
• ‘MD5HashKey’ is equal to the value of the ‘Xtender Hash Key’ located in an
  email’s metadata.
• Here is a list of all the ‘RouteTypeId’ options that are available in
  EmailXtender:
  – 0 = All Recipients
  – 1 = To Recipients
  – 2 = From (sender)
  – 4 = CC Recipients
  – 8 = BCC Recipients
  – 16 = Distribution List
  – 32 = Discovered
  – 64 = Routeable (DL Recipients)
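
To see how the two bound parameters drive the lookup, the query can be exercised against a small stand-in database. This is a sketch only: it uses Python's built-in sqlite3 module rather than SQL Server, drops the "[database].[dbo]." prefix, and invents a minimal schema and sample rows containing just the columns the query names. The function names, the sample hash key ABC123, and the default RouteTypeId of 64 are assumptions for illustration; this guide does not state which RouteTypeId value the Nuix Engine binds.

```python
import sqlite3

# The lookup from this section, with the "[database].[dbo]." prefix removed
# so it can run against an in-memory SQLite database.
DL_LOOKUP = (
    "SELECT [EmailAddress] "
    "FROM [EmailAddress] "
    "WHERE [EmailId] IN ("
    " SELECT [EmailId] "
    " FROM [Route] "
    " WHERE [MD5HashKey] = ? AND [RouteTypeId] = ?) "
)


def build_demo_db():
    """Create a hypothetical, minimal copy of the two tables the query touches."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE EmailAddress (EmailId INTEGER, EmailAddress TEXT);
        CREATE TABLE Route (EmailId INTEGER, MD5HashKey TEXT, RouteTypeId INTEGER);
        INSERT INTO EmailAddress VALUES (1, 'a@example.com');
        INSERT INTO EmailAddress VALUES (2, 'b@example.com');
        INSERT INTO EmailAddress VALUES (3, 'c@example.com');
        INSERT INTO Route VALUES (1, 'ABC123', 64);
        INSERT INTO Route VALUES (2, 'ABC123', 64);
        INSERT INTO Route VALUES (3, 'ABC123', 1);
        """
    )
    return conn


def expanded_dl(conn, hash_key, route_type_id=64):
    """Return the sorted addresses routed to a message for one route type."""
    rows = conn.execute(DL_LOOKUP, (hash_key, route_type_id)).fetchall()
    return sorted(row[0] for row in rows)
```

With the sample rows, route type 64 returns the two distribution list members, while route type 1 returns only the direct To recipient of the same message.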


EMC SourceOne
Main SQL query used during Nuix processing of EMC SourceOne data to look up
distribution list recipients and append them to the ‘Expanded-DL’ metadata property:
"SELECT [EmailAddress]" +
" FROM [" + databaseName + "].[dbo].[EmailAddress] " +
" WHERE [EmailId] IN (" +
" SELECT [EmailId] " +
" FROM [" + databaseName + "].[dbo].[Route]" +
" WHERE [MessageId] = ? AND [RouteType] = ?) ";
Additional notes:
• ‘MessageId’ is equal to the value of the ‘0x’ + ‘Xtender Hash Key’ located
  in an email’s metadata.
• Here is a list of all the ‘RouteType’ options that are available in
  SourceOne:
  – 1 = To Recipients
  – 2 = From (sender)
  – 3 = CC Recipients
  – 4 = BCC Recipients
  – 5 = Distribution List
  – 6 = Routeable (DL Recipients)

HP/Autonomy Zantaz EAS
Main SQL query used during Nuix processing of Zantaz EAS data to get the Archive
ID:
String like = filename.contains("%") ? "LIKE " : "= ";
"SELECT TOP 1 DATAARCHIVEID " +
"FROM easadmin.dataarchive " +
"WHERE filename " + like + "? " +
"AND docserverid = ?";
SQL query used during Nuix processing of Zantaz EAS to get embedded Centera IDs:
"SELECT pl.msgid, pl.dataarchiveid, r.userid, r.folderid " +
"FROM easadmin.dataarchive da " +
"JOIN easadmin.profilelocation pl " +
" ON da.dataarchiveid = pl.dataarchiveid " +
"JOIN dbo.refer r " +
" ON pl.msgid = r.msgid " +
"WHERE da.filename = ? AND docserverid = ?";
SQL query used during Nuix processing of Zantaz EAS to get embedded IDs:
"SELECT pl.msgid, r.userid, r.folderid " +
"FROM easadmin.profilelocation pl " +
"JOIN dbo.refer r" +
" ON pl.msgid = r.msgid " +
"WHERE pl.dataarchiveid = ?";
SQL query used during Nuix processing of Zantaz EAS to get MsgId:
"SELECT pl.msgid, pl.dataarchiveid, r.userid, r.folderid " +
"FROM easadmin.profilelocation pl " +
"LEFT JOIN dbo.refer r" +
" ON pl.msgid = r.msgid " +
"LEFT JOIN easadmin.dataarchive da" +
" ON pl.dataarchiveid = da.dataarchiveid " +
"WHERE pl.msgid = ? AND da.docserverid = ?";
SQL query used during Nuix processing of Zantaz EAS to get MAPI metadata:
"SELECT " + "msgread, categories, pr_importance, msgflagtext, flagstatus, " +
"flagcomplete, contacts, pr_expiry_time, replytime " +
// ",flagdueby, flagduebynext, remindersettings " +
"FROM dbo.refer " +
"WHERE msgid = ? AND userid = ? AND folderid = ?";
SQL query used during Nuix processing of Zantaz EAS to get folder structure:
"SELECT foldername FROM dbo.FOLDER WHERE folderid = ?";
SQL query used during Nuix processing of Zantaz EAS to get user information:
"SELECT OBJDISTNAME, USERNAME " +
"FROM dbo.USERS " +
"WHERE userid = ?";
SQL query used during Nuix processing of Zantaz EAS to get recipients:
"SELECT a.emailaddress, m.typefld " +
"FROM dbo.EMAILMESSAGES m, dbo.EMAILADDRESSES a " +
"WHERE a.emailid = m.emailid AND m.msgid = ?";
SQL query used during Nuix processing of Zantaz EAS to get message region:
"SELECT * FROM easadmin.profilelocation WHERE msgid = ? AND dataarchiveid = ?";


Appendix III: Archive Metadata
Veritas Enterprise Vault

EMC EmailXtender/Source

HP/Autonomy Zantaz EAS


Daegis AXS-One


Appendix IV: EWS Best Practices
Leveraging Azure Virtual Machines
Whether extracting data from O365 or ingesting data into O365, it is highly
recommended to provision Azure servers for the Nuix workflow. Azure and O365 are
essentially hosted in the same Microsoft data centers, which removes many of the
complications that would be present if performing the extraction or ingestion from an
on-premises Nuix server. These complications include server-level and network-level
throttling (mentioned below), as well as bandwidth limitations and the network
distance data will have to travel in order for it to be extracted or ingested.
Multiple Azure servers can be used in order to scale Nuix vertically on one
machine and horizontally on multiple machines to achieve the desired throughput.

Note

Consult the Azure pricing guide to determine the appropriate
Azure environment size:
https://azure.microsoft.com/en-us/pricing/calculator/

Reduced Number of Nuix Workers
Keeping the number of Nuix workers to a minimum during extractions or ingestions is
essential to your overall performance. Too many workers can create excess
connections and trigger throttling.

Tip

Nuix strongly recommends using 1, 2 or 4 workers when
ingesting to or extracting from EWS. More than 4 workers
per Nuix instance may cause increased levels of throttling
which will cause the ingestion/extraction to take longer and
may also lead to a higher than normal number of failed
items.

EWS Throttling Workarounds
It is well documented that Microsoft will throttle connection uploads and downloads
into a particular environment based on:

• IP Address
• Mailbox
• Connection account
• Number of connections / amount of data being pushed at once

To work around this throttling, it is recommended to:

• Adjust the Exchange Web Services Settings in your Global Settings
  accordingly.
• Use single item downloads or uploads over bulk downloads or uploads.
• Use Application Impersonation service accounts over Delegate accounts.
• Scale Nuix vertically on a single machine and horizontally across multiple
  machines in Azure.
• Consolidate your data per custodian if ingesting data, and push this data
  into O365 from a single Nuix instance.
• Never overlap data across multiple Nuix instances. This overlap will cause
  unnecessary throttling, which in turn will increase the likelihood of
  exceptions.
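
The last two recommendations (one Nuix instance per custodian, no overlap between instances) can be planned ahead of an ingestion. The sketch below is one possible approach, not a Nuix feature: it assigns each custodian to exactly one instance, placing the largest custodians first on the least-loaded instance. The custodian names, sizes, instance names and function names are all hypothetical.

```python
def assign_custodians(custodian_sizes, instance_names):
    """Assign every custodian to exactly one instance.

    custodian_sizes: dict mapping custodian name -> data size (any unit).
    instance_names: list of Nuix instance labels.
    Greedy strategy: largest custodian first, onto the least-loaded instance.
    """
    if not instance_names:
        raise ValueError("at least one instance is required")
    load = {name: 0 for name in instance_names}
    plan = {name: [] for name in instance_names}
    # Sort by size descending, then by name so the result is deterministic.
    ordered = sorted(custodian_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for custodian, size in ordered:
        target = min(instance_names, key=lambda n: load[n])
        plan[target].append(custodian)
        load[target] += size
    return plan


def has_overlap(plan):
    """Return True if any custodian appears on more than one instance."""
    seen = set()
    for custodians in plan.values():
        for custodian in custodians:
            if custodian in seen:
                return True
            seen.add(custodian)
    return False


if __name__ == "__main__":
    sizes = {"alice": 50, "bob": 30, "carol": 20, "dave": 10}
    plan = assign_custodians(sizes, ["nuix-01", "nuix-02"])
    print(plan)
    print("overlap:", has_overlap(plan))
```

Because each custodian is placed once, the resulting plan never sends the same mailbox to two instances; `has_overlap` is included only as a check for plans that were edited by hand.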

About Nuix
Nuix (www.nuix.com) protects, informs, and empowers society in the knowledge age.
Leading organizations around the world turn to Nuix when they need fast, accurate
answers for investigation, cybersecurity incident response, insider threats,
litigation, regulation, privacy, risk management, and other essential challenges.



Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.5
Linearized                      : No
Page Count                      : 104
Language                        : en-US
Tagged PDF                      : Yes
Title                           : Email Archive Migration Manager
Author                          : Michael Fowler
Creator                         : Microsoft® Word 2016
Create Date                     : 2019:01:24 14:48:14-05:00
Modify Date                     : 2019:01:24 14:48:14-05:00
Producer                        : Microsoft® Word 2016