Guide To AccessData EDiscovery_MapsEDRM_3 4 10[1] AD E Discovery Implementing In House Solution Maps EDRM

2012-09-27

: Pdf Guide To Ad Ediscovery Implementing In-House Solution Maps Edrm Guide_to_AD_eDiscovery_Implementing_In-House_Solution_Maps_EDRM

Open the PDF directly: View PDF .
Page Count: 12

AccessData Corporation

Implementing an In-house

eDiscovery Solution that Maps

to the EDRM:

A Guide to AccessData eDiscovery, Its Components

and How It Addresses the Entire eDiscovery Lifecycle

White Paper

Contents

Introduction .................................................................................................................................................................. 1

Requirements for an Effective In-house Solution ..................................................................................................... 2

Walking Organizations through the EDRM Lifecycle, while Eliminating Excessive Costs ................................... 3

Information Management ........................................................................................................................................... 3

Litigation Hold Notifications and Tracking .................................................................................................................. 4

Identification and Pre- and Post-Collection Early Case Assessment ......................................................................... 4

Collection and Preservation ....................................................................................................................................... 5

Processing and Deduplication ................................................................................................................................... 6

First –pass Review Facilitates Early Case Assessment and Accuracy of Data Sets Produced ................................. 7

Producing Load Files and Reduced Data Sets for Third-party Review Tools ............................................................ 8

Technical Overview...................................................................................................................................................... 8

The Components ....................................................................................................................................................... 9

Conclusion ................................................................................................................................................................. 10



1|Page



Introduction

Traditionally, electronic document production “eDiscovery” was considered a manual, time-consuming task—and an expensive one.

During the eDiscovery process, companies must identify, collect, preserve, process, review, analyze, produce and present any and all

electronically stored information (ESI) that might be relevant to a specific matter. The Electronic Discovery Reference Model (EDRM)1

was developed to guide law firms, corporations and consultants through this process. While service providers and various software

companies have come out with offerings that address pieces of this process, very few have introduced a solution that addresses the

entire eDiscovery lifecycle.

Figure 1

Some companies hire consulting firms to address eDiscovery, costing hundreds of dollars per hour and thousands of dollars per

gigabyte. In other cases, companies unable to handle the cost burden associated with eDiscovery have had untrained IT employees

manually collect the data. This exposes those organizations to a great deal of risk, because if collection is done incorrectly, companies

can be fined heavily, and in extreme cases, opposing counsel is given full access to the entire network.

Others try to cut costs by using in-house solutions, which can generate significant cost savings. With an in-house solution, typically

litigation support personnel, from within the IT department, will perform the initial collections and cull the data down using search

criteria provided by the attorneys. These solutions can be very difficult to use and only IT personnel are able to perform any of the

operations, which does nothing to address the IT- Legal collaboration challenges faced by organizations today. However, the in-house,

end-to-end solution can be the most cost-effective and operationally effective option for organizations looking to gain control over their

ESI.

The goal of the AccessData (AD) eDiscovery solution is to allow both IT and Legal to work together to perform the eDiscovery process

in a simple, easy-to-use and automated fashion. In order for an in-house solution to effectively address the operational, organizational

and cost issues associated with eDiscovery, it must meet several requirements.

This paper will review those requirements, discuss how AccessData eDiscovery meets those requirements, and provide a technical

overview of the solution’s architecture.





1 www.edrm.net



2|Page



Requirements for an Effective In-house Solution

In order for an organization to successfully address the entire eDiscovery lifecycle, while significantly reducing cost and risk, it is

necessary to implement an in-house solution that meets several requirements. In addition to, of course, being forensically sound

technology with a history of court acceptance, the solution must also offer the following:

The solution must integrate with and/or support the organization’s existing IT infrastructure.

When an in-house solution integrates with the existing infrastructure it facilitates the authentication of users and the identification and

selection of custodians. In addition, the solution must support an organization’s encryption technologies in order to automatically

access encrypted machines, as well as to identify and collect password-protected and encrypted documents. Finally, it must be

capable of searching and collecting not only the unstructured data, but data from the organization’s semi-structured and structured data

repositories, such as Exchange servers, SharePoint, Enterprise Vault , Documentum or Lotus Domino.

A Word About Unstructured and Structured Data

Unstructured Data

Unstructured data has no model or form to it and is free flowing. In the eDiscovery world this would be the most common form

of data. It would include all data stored on desktops, laptops, file shares, external hard drives, thumb drives and even optical

media.

Structured Data

Structured data is data contained within a data model, such as an email within an Exchange server or a credit card number

stored within an Oracle database or documents stored in a document repository. These types of data sets can be difficult to

collect as they require extraction and preservation of the usable data from the data store, making a defensible and repeatable

process expensive and difficult to accomplish.

The solution must be automated.

An automated, in-house solution facilitates a repeatable, defensible eDiscovery process, reducing the opportunity for human error and

enabling consistent compliance with internal policies and the Federal Rules of Civil Procedure. The solution must be capable of

performing automated network-wide audits/collections with the option of scheduling recurring collections for a specific ongoing matter.

Due to the possibility of connectivity issues and the fact that custodians are not always present on the network, the solution must also

have an auto-restart function, which will automatically pick up a collection where it left off, while identifying any files or emails that have

been altered since the last collection attempt. The solution must provide meaningful workflow that drives users from one step of the

process to the next in an automated fashion. Finally, all operations must be automatically logged for chain of custody auditing purposes

and the reporting must be extensive to illustrate every action taken and every result.

The solution must provide sophisticated search options.

In order to carve out the most accurate set of potentially relevant data while reducing risk, a more sophisticated approach to searching

is required. In addition to simple keyword and date range searches, an organization can further refine its set of data by using more

advanced options, such as fuzzy, proximity, synonym or stemming searches. In addition, for the sake of expediency and accuracy, the

solution must provide relevancy ranking, as well as the ability to “test” search criteria.

The solution must have extensive email and file support .

It is critical for an in-house eDiscovery solution to provide broad file support and for it to be powered by a forensic engine that can

handle extremely large data sets, while dealing with the difficulties different types of data present.

The solution must be easy to use with real-world workflow built in.

Every player in the engagement should be able to utilize the solution to perform his or her specific tasks—IT personnel, in-house

counsel, paralegals, and outside counsel, if necessary. If the solution relies on scripts to operate, it may be more difficult to use and

likely cannot be used by anyone other than an IT professional. Organizations can benefit from “wizard-driven” solutions that guide

users through each step of the EDRM process, without the use of scripts. However, the workflow provided by the solution must map to

the way the organization actually approaches eDiscovery. In addition, a web interface can streamline the overall process, enabling all

parties involved to review and comment on the data from any location.



3|Page



Figure 2: AD eDiscovery Dashboard

Furthermore, a solution that is easy to use requires minimal training, in order for the organization to get up and running effectively with

the technology. Many in-house solutions are simply too difficult for everybody in an organization to use effectively. In many cases the

organization finds itself paying for professional services from the vendor who sold them the in-house solution, because nobody at the

organization is able to operate the solution on his or her own.

The solution must provide extensive reporting.

The following are examples of the types of reports useful to an organization performing its own eDiscovery in house:

• Matter summaries: to see statistics of all collections across a matter with the ability to drill down into individual collection metrics

• Collection summaries: to see values per collection and per custodian with rates, file breakout and collected files with metadata

• Deduplication reporting: so users can view detailed deduping information for all documents and email, even on massive data sets

• File breakout and processing exceptions: to show processed and encrypted data with metrics per custodian and across the matters,

flagging any documents that failed to process

• Search summaries: showing when the search took place, who performed it, and what items were searched per custodian/per asset

• Detailed search reporting: to see the names of files that had hits, how many total hits per file, hit count by category, display keywords

(terms) and search options, what filters were applied, variations of keywords searched for

The solution must offer multiple exporting options.

In addition to being able to export load files for popular review platforms, (i.e. Concordance and Summation) the solution must allow for

easy export of data in native format and other commonly required formats.

Walking Organizations through the EDRM Lifecycle, while Eliminating Excessive Costs

AD eDiscovery meets the above requirements, walking

the user through the eDiscovery lifecycle with an easy-to-

use, wizard-driven interface. From early case

assessment to producing the relevant data set, users

can address electronic discovery in house, without the

costly help of service providers.

Utilizing a hybrid of on-demand searching and a pre-

indexed approach, AD eDiscovery enables the

identification and collection of responsive ESI, including

unstructured, semi-structured and structured data.

The solution allows in-house counsel and litigation

support personnel to work together to effectively

address eDiscovery in a streamlined, automated

manner. AD eDiscovery is easiest enough to use that

every player in the eDiscovery engagement can utilize

the technology to perform his or her specific tasks.

For example, litigation support can perform collection and processing operations, while, via a secure web interface, in-house counsel

reviews and comments on the results, and a paralegal logs in to track the status of all operations. In fact, the interface is so intuitive

that, if policies allow, in-house counsel and/or paralegals can test search criteria and perform processing and analysis operations

without the aid of more technical IT personnel. The solution is designed for ease of use, in order to facilitate collaboration among all

parties involved.

Information Management

Proactive information management is critical to accomplishing an efficient electronic discovery process—and to mitigating an

organization’s exposure. Ensuring that records retention policies are enforced is prudent and easy with AccessData eDiscovery’s

automated auditing capabilities. Organizations can periodically search assets for documents and email that are outside the parameters

of the retention policy and remediate from a central location. In fact, organizations can schedule periodic audits with AccessData

eDiscovery, thereby implementing a consistent, documented approach to records retention enforcement.



4|Page



Litigation Hold Notifications and Tracking

Automated litigation hold capabilities are integrated into AccessData eDiscovery. Users can send litigation hold alerts to relevant

custodians and track which custodians have viewed the alerts. The process is wizard driven and allows the user to attach the

organization's standard litigation hold documents to the email notification. This integrated capability not only makes the entire

eDiscovery process more efficient, but it dramatically reduces error, as the litigation hold wizard requires the user to complete a series

of set tasks before he or she is able to send the litigation hold notifications.

• Add/remove custodians from an existing hold.

• View real-time status of hold notifications.

• Single sign-on portal for custodians and IT to confirm holds and view hold status.

• Interview custodians and data owners.

Identification and Pre- and Post-Collection Early Case Assessment

AD eDiscovery’s early case assessment auditing capability allows litigation support personnel and/or in-house counsel to actually test

search criteria to see what the results will be prior to collecting. This allows in-house counsel to refine the search criteria to ensure that

only the optimal data set will be collected and to better understand the data prior to collection.

However, AD eDiscovery also enables an organization to immediately, upon receiving a litigation hold notice, collect and preserve all

user created documents, email, etcetera, based on a variety of criteria in preparation for possible litigation. This means that the

organization can test search criteria prior to collecting, or it can collect everything and test search criteria on a preserved and indexed

data set, using more sophisticated search techniques. Either way, the organization eliminates hourly service provider fees by handling

these steps in house. (See Figure 3 on next page.)

AccessData eDiscovery provides a number of sophisticated search methods that will greatly enhance an organization’s ability to

perform effective early case assessment, while greatly reducing the amount of irrelevant data that is produced. For more information on

AccessData’s search and early case assessment functionality read “Gaining the Advantage through Early Case Assessment:

Sophisticated Searching Techniques to Assess Exposure and Save Money”.

Fuzzy

Fuzzy searches enable the attorney find documents with a keyword even if that keyword has been misspelled. Note, AccessData

technology even allows the attorney to adjust the "fuzziness" of the search to reduce false positives.

Stemming

A search with stemming finds grammatical variations of keywords by reducingi words to their stem, base or root form. For example,

toggling the stemming option with the keyword "apply~" will match any document that contains the words "apply", "applies" or "applied".

Phonic

Phonic searches find keywords that sound like—but don't necessarily have the same meaning as—the designated keyword. For example,

a phonic-enabled search for "fryer" would return all documents containing the word "fryer" and "friar".

Synonym

Synonym searches find documents with words that match the keyword and words that have a similar meaning. For example, a synonym-

enabled search for the keyword "idle" would return all documents containing the words "jobless", "unemployed" and so on.

Proximity

Proximity searches are those that identify documents with one keyword within a certain number of words from another keyword. For

example, a proximity search for "patent w/5 infringement" are much more likely to return documents with "patent infringement" and

"infringement of a patent", but likely exclude documents containing "infringement" and "patently wrong".

Regular Expressions

Regular expressions provide a concise and flexible means for identifying strings of text of interest, such as particular characters, words or

patterns of characters (such as a social security number). The expression itself is written in a formal language that enables very precise

searching of text-based documents. For example, all Visa credit card numbers start with a "4" but new cards have 16 digits and old cards

have 13. Thus, a regular expression that would find Visa credit card numbers in a document would be:

\<4\d\d\d[\-\. ](\d\d\d\d[\-\. ]){2}\d\d\d\d\>



5|Page



Figure 3: Estimated service provider fees for each phase of the EDRM

Collection and Preservation

Once the search criteria have been finalized, the designated authority approves the collection and executes the collection operation. It

is important to note that AD eDiscovery allows for identification and collection of both unstructured and structured data. AccessData’s

connectors for structured data, such as Lotus Notes, Microsoft Exchange, Symantec Vault, Documentum and Microsoft SharePoint,

allow users to search for, collect and preserve structured data in a defensible, repeatable manner, without it being a cumbersome and

expensive task.

Distributed collection capabilities allow the organization to utilize multiple computers and assign specific collection operations in various

locations to increase the speed and efficiency of collections in complex environments. Users can see real-time status of a collection by

asset, including percent complete status, time to complete statistics, what file is a being collected, how many files have been collected

and total bytes collected by asset or across the entire collection.

Collecting from custodians who work outside the office or travel frequently can prove to be quite challenging, as their laptops are not

always logged into the network. With AD eDiscovery, laptops with agents can be dynamically routed to optimal collection points. For

example, with dynamic locality, users as they move from New York to London are automatically routed to the optimal collection point,

and collections will resume at the point of failure without having to start over. When that custodian’s machine has logged off the New

York network and then comes online again in London, it will automatically restart the collection in London, picking up where it left off

with no user intervention. Any files that were altered since the last collection attempt will also be collected.

Figure 4: Collection Details



6|Page



Processing and Deduplication

AccessData eDiscovery supports Lotus Notes NSF, Outlook PST/OST, Exchange EDB, Outlook Express DBX, Eudora, Netscape,

AOL, RFC 833, and EML formats, such as Microsoft Internet mail, Earthlink, Thunderbird, QuickMail, as well as 400 different file types,

providing the broadest support when it comes to handling of ESI. The wizard-driven processing functionality gives users control of how

data is processed. The processing engine opens up compound documents, email archives, deduplicates, categorizes, identifies

protected files, and indexes the data in preparation for review and export. Powered by FTK®, the processing engine has been “battle-

tested” to handle the largest and most complex datasets.

Figure 6: Processing options available via FTK platform when using FTK mode.



Figure 5: eDiscovery processing dialog providing options from “eDiscovery mode” with full indexing, or “field mode” for faster processing.



7|Page



First –pass Review Facilitates Early Case Assessment and Accuracy of Data Sets Produced

Without the need for expensive consultants, organizations are able to automatically identify, collect and preserve ESI, then process the

data for analysis and what is called “first pass review”. Unlike most in-house solutions, AD eDiscovery allows for in-house counsel and

/or litigation support personnel to engage in a first-pass review of the culled data to ensure that the most accurate set of potentially

relevant data is produced.

FIGURE 7: Silverlight, web-based review platform

AD eDiscovery’s third-generation, Web-based Silverlight review platform delivers advanced analytics and collaborative review of

electronically stored information for the purposes of early case assessment and/or final approval of the data set to be exported.

— Custom Data Views: Enables users to define exactly what reviewers can see. For example, only email from Custodian # 4 can be seen by

Reviewers A and B.

— Email Discussion Threads: links together related messages into chronological threads which display entire discussions, including all

replies, carbon copies and forwards. By following the thread, you can quickly identify all the participants, and determine who knew what. In

addition, it has thread suppression support and the ability to tag any items up or down the chain.

— Detailed Reporting

o Matter, collection and search summaries

o Deduplication

o File breakouts and processing exceptions

o Detailed search reports

— Tagging/Labeling Options for Custom and Bulk Tagging: with annotation pop-ups and three-way (include/exclude/neutral) label filters

to quickly toggle between coded and un-coded documents.

— Bookmark Items into Categories and Include Comments

— Hit Highlighting: Highlights search terms in files, emails and attachments, allowing you to quickly and easily find what you are looking for

without having to read every single word. Users can view hit highlights in the 400+ different data types we support in both natural and text

view.

— Split-screen Support: users can view item on the left and file contents on the right



8|Page



— Enhanced Search Capabilities:

o Relevancy ranking, based on automatic term weighting.

o Fuzzy searching with a slider.

o Stemming, related words, phonic, wildcard, proximity, and concept.

o Nearly all search types can be combined to make your search as complex as you want.

o Easy –to-use search help panel.

— Automatic Data Categorization/Classification: per custodian or across the entire matter

Producing Load Files and Reduced Data Sets for Third-party Review Tools

The solution provides organizations the ability to create reduced (culled) data sets in a number of different formats, including native

format organized by custodian or as a single instance with reduced PST or NSF. Also, user have the ability to create load files for

export to popular third-party review tools, including Concordance, EDRM XML, ICONECT, Summation and Introspect.

Technical Overview

AccessData eDiscovery is a designed to be a distributed environment, with multiple components able to handle unstructured, semi-

structured and structured sources.

Figure 7: eDiscovery Workflow



9|Page



Figure 8: AD eDiscovery Architecture

The Components

Application Server

— manages workflow and eDiscovery operations within the application

Web Server

— provides web services for users to drive eDiscovery operations within the application

— hosts website for first pass review data modeling

Collection Worker(s)

— The work administrator service that does the actual search and forensic-level collection from data sources (structured/unstructured/semi-

structured)

— designed to scale up and out

Processing Worker(s) (FTK)

— performs the post-collection processing of data

— expands archives (PSTs/NSFs), indexes, de-duplication, file identification, secondary culling/filtering, and production

— scales up and out

Processing Database (Oracle)

— facilitates secondary culling/filtering, data modeling, searching, deduplication and production

— scales up to handle massive amounts of data

Orchestration and Logging Database (SQL)

— tracks all eDiscovery matters, workflows and operations

IDOL Worker

— index engine and structured data connectors from Autonomy for file identification

Agent

— a service that runs on target nodes, providing secure forensic-level access to and preservation of ESI

External Storage

— stores collected/preserved and culled data



10|Page



Conclusion

AccessData eDiscovery facilitates a project management approach to electronic discovery and streamlines eDiscovery processes from

the point of identification all the way to production. By enabling all players in the eDiscovery engagement to utilize the same platform,

the collaboration challenges IT and Legal normally face can be easily addressed.

AD eDiscovery enables automated, targeted searches and the forensic collection of only potentially relevant data from multiple

custodians simultaneously, all from a central location. With state of the art workflow, AD eDiscovery provides absolute control, matter

delivery, and custodian and collection management capabilities. It gives visibility into all ESI across the enterprise and allows for

proactive enforcement of records retention policies via automated auditing for easy identification, flagging and/or deleting of

noncompliant data. These capabilities are key to optimal litigation preparedness and proactive enforcement, and result in massive time

reduction and cost savings.

About AccessData®

AccessData has pioneered digital investigations for more than twenty years. Its automated, in-house eDiscovery solution allows users

to identify, collect, forensically preserve, process, deduplicate and produce electronically stored data. It allows users to collect from

laptops, desktops, databases, servers, email and more than 30 popular structured data repositories. Additionally, AccessData

eDiscovery enables large-scale auditing to detect data leakage and files that aren’t in compliance with an organization’s retention

policies. For more information visit: www.eDiscoveryWithAccessData.com

written permission. All other marks and brands may be claimed as the property of their respective owners.





Guide To AccessData EDiscovery_MapsEDRM_3 4 10[1] AD E Discovery Implementing In House Solution Maps EDRM

Navigation menu

Versions of this User Manual:

Views

Navigation