Big Data Protector Guide 6.6.5
User Manual
- Copyright
- Contents
- 1 Introduction to this Guide
- 2 Overview of the Big Data Protector
- 3 Installing and Uninstalling Big Data Protector
- 3.1 Installing Big Data Protector on a Cluster
- 3.1.1 Verifying Prerequisites for Installing Big Data Protector
- 3.1.2 Extracting Files from the Installation Package
- 3.1.3 Updating the BDP.config File
- 3.1.4 Installing Big Data Protector
- 3.1.5 Applying Patches
- 3.1.6 Installing the HDFSFP Service
- 3.1.7 Configuring HDFSFP
- 3.1.8 Configuring HBase
- 3.1.9 Configuring Impala
- 3.1.10 Configuring HAWQ
- 3.1.11 Configuring Spark
- 3.2 Installing or Uninstalling Big Data Protector on Specific Nodes
- 3.3 Utilities
- 3.4 Uninstalling Big Data Protector from a Cluster
- 3.4.1 Verifying the Prerequisites for Uninstalling Big Data Protector
- 3.4.2 Removing the Cluster from the ESA
- 3.4.3 Uninstalling Big Data Protector from the Cluster
- 3.4.3.1 Removing HDFSFP Configuration for YARN (MRv2)
- 3.4.3.2 Removing HDFSFP Configuration for MapReduce v1 (MRv1)
- 3.4.3.3 Removing Configuration for Hive Protector if HDFSFP is not Installed
- 3.4.3.4 Removing Configurations for Hive Support in HDFSFP
- 3.4.3.5 Removing the Configuration Properties when HDFSFP is not Installed
- 3.4.3.6 Removing HBase Configuration
- 3.4.3.7 Removing the Defined Impala UDFs
- 3.4.3.8 Removing the Defined HAWQ UDFs
- 3.4.3.9 Removing the Spark Protector Configuration
- 3.4.3.10 Running the Uninstallation Script
- 4 Hadoop Application Protector
- 4.1 Using the Hadoop Application Protector
- 4.2 Prerequisites
- 4.3 Samples
- 4.4 MapReduce APIs
- 4.4.1 openSession()
- 4.4.2 closeSession()
- 4.4.3 getVersion()
- 4.4.4 getCurrentKeyId()
- 4.4.5 checkAccess()
- 4.4.6 getDefaultDataElement()
- 4.4.7 protect()
- 4.4.8 protect()
- 4.4.9 protect()
- 4.4.10 unprotect()
- 4.4.11 unprotect()
- 4.4.12 unprotect()
- 4.4.13 bulkProtect()
- 4.4.14 bulkProtect()
- 4.4.15 bulkProtect()
- 4.4.16 bulkUnprotect()
- 4.4.17 bulkUnprotect()
- 4.4.18 bulkUnprotect()
- 4.4.19 reprotect()
- 4.4.20 reprotect()
- 4.4.21 reprotect()
- 4.4.22 hmac()
- 4.5 Hive UDFs
- 4.5.1 ptyGetVersion()
- 4.5.2 ptyWhoAmI()
- 4.5.3 ptyProtectStr()
- 4.5.4 ptyUnprotectStr()
- 4.5.5 ptyReprotect()
- 4.5.6 ptyProtectUnicode()
- 4.5.7 ptyUnprotectUnicode()
- 4.5.8 ptyReprotectUnicode()
- 4.5.9 ptyProtectInt()
- 4.5.10 ptyUnprotectInt()
- 4.5.11 ptyReprotect()
- 4.5.12 ptyProtectFloat()
- 4.5.13 ptyUnprotectFloat()
- 4.5.14 ptyReprotect()
- 4.5.15 ptyProtectDouble()
- 4.5.16 ptyUnprotectDouble()
- 4.5.17 ptyReprotect()
- 4.5.18 ptyProtectBigInt()
- 4.5.19 ptyUnprotectBigInt()
- 4.5.20 ptyReprotect()
- 4.5.21 ptyProtectDec()
- 4.5.22 ptyUnprotectDec()
- 4.5.23 ptyProtectHiveDecimal()
- 4.5.24 ptyUnprotectHiveDecimal()
- 4.5.25 ptyReprotect()
- 4.6 Pig UDFs
- 5 HDFS File Protector (HDFSFP)
- 5.1 Overview of HDFSFP
- 5.2 Features of HDFSFP
- 5.3 Protector Usage
- 5.4 File Recovery Utility
- 5.5 HDFSFP Commands
- 5.6 Ingesting Files Securely
- 5.7 Extracting Files Securely
- 5.8 HDFSFP Java API
- 5.9 Developing Applications using HDFSFP Java API
- 5.10 Quick Reference Tasks
- 5.11 Sample Demo Use Case
- 5.12 Appliance Components of HDFSFP
- 5.13 Access Control Rules for Files and Folders
- 5.14 Using the DFS Cluster Management Utility (dfsdatastore)
- 5.15 Using the ACL Management Utility (dfsadmin)
- 5.15.1 Adding an ACL Entry for Protecting Directories in HDFS
- 5.15.2 Updating an ACL Entry
- 5.15.3 Reprotecting Files or Folders
- 5.15.4 Deleting an ACL Entry to Unprotect Files or Directories
- 5.15.5 Activating Inactive ACL Entries
- 5.15.6 Viewing the ACL Activation Job Progress Information in the Interactive Mode
- 5.15.7 Viewing the ACL Activation Job Progress Information in the Non-Interactive Mode
- 5.15.8 Searching ACL Entries
- 5.15.9 Listing all ACL Entries
- 5.16 HDFS Codec for Encryption and Decryption
- 6 HBase
- 6.1 Overview of the HBase Protector
- 6.2 HBase Protector Usage
- 6.3 Adding Data Elements and Column Qualifier Mappings to a New Table
- 6.4 Adding Data Elements and Column Qualifier Mappings to an Existing Table
- 6.5 Inserting Protected Data into a Protected Table
- 6.6 Retrieving Protected Data from a Table
- 6.7 Protecting Existing Data
- 6.8 HBase Commands
- 6.9 Ingesting Files Securely
- 6.10 Extracting Files Securely
- 6.11 Sample Use Cases
- 7 Impala
- 7.1 Overview of the Impala Protector
- 7.2 Impala Protector Usage
- 7.3 Impala UDFs
- 7.3.1 pty_GetVersion()
- 7.3.2 pty_WhoAmI()
- 7.3.3 pty_GetCurrentKeyId()
- 7.3.4 pty_GetKeyId()
- 7.3.5 pty_StringEnc()
- 7.3.6 pty_StringDec()
- 7.3.7 pty_StringIns()
- 7.3.8 pty_StringSel()
- 7.3.9 pty_UnicodeStringIns()
- 7.3.10 pty_UnicodeStringSel()
- 7.3.11 pty_IntegerEnc()
- 7.3.12 pty_IntegerDec()
- 7.3.13 pty_IntegerIns()
- 7.3.14 pty_IntegerSel()
- 7.3.15 pty_FloatEnc()
- 7.3.16 pty_FloatDec()
- 7.3.17 pty_FloatIns()
- 7.3.18 pty_FloatSel()
- 7.3.19 pty_DoubleEnc()
- 7.3.20 pty_DoubleDec()
- 7.3.21 pty_DoubleIns()
- 7.3.22 pty_DoubleSel()
- 7.4 Inserting Data from a File into a Table
- 7.5 Protecting Existing Data
- 7.6 Unprotecting Protected Data
- 7.7 Retrieving Data from a Table
- 7.8 Sample Use Cases
- 8 HAWQ
- 8.1 Overview of the HAWQ Protector
- 8.2 HAWQ Protector Usage
- 8.3 HAWQ UDFs
- 8.3.1 pty_GetVersion()
- 8.3.2 pty_WhoAmI()
- 8.3.3 pty_GetCurrentKeyId()
- 8.3.4 pty_GetKeyId()
- 8.3.5 pty_VarcharEnc()
- 8.3.6 pty_VarcharDec()
- 8.3.7 pty_VarcharHash()
- 8.3.8 pty_VarcharIns()
- 8.3.9 pty_VarcharSel()
- 8.3.10 pty_UnicodeVarcharIns()
- 8.3.11 pty_UnicodeVarcharSel()
- 8.3.12 pty_IntegerEnc()
- 8.3.13 pty_IntegerDec()
- 8.3.14 pty_IntegerHash()
- 8.3.15 pty_IntegerIns()
- 8.3.16 pty_IntegerSel()
- 8.3.17 pty_DateEnc()
- 8.3.18 pty_DateDec()
- 8.3.19 pty_DateHash()
- 8.3.20 pty_DateIns()
- 8.3.21 pty_DateSel()
- 8.3.22 pty_RealEnc()
- 8.3.23 pty_RealDec()
- 8.3.24 pty_RealHash()
- 8.3.25 pty_RealIns()
- 8.3.26 pty_RealSel()
- 8.4 Inserting Data from a File into a Table
- 8.5 Protecting Existing Data
- 8.6 Unprotecting Protected Data
- 8.7 Retrieving Data from a Table
- 8.8 Sample Use Cases
- 9 Spark
- 9.1 Overview of the Spark Protector
- 9.2 Spark Protector Usage
- 9.3 Spark APIs
- 9.3.1 getVersion()
- 9.3.2 getCurrentKeyId()
- 9.3.3 checkAccess()
- 9.3.4 getDefaultDataElement()
- 9.3.5 hmac()
- 9.3.6 protect()
- 9.3.7 protect()
- 9.3.8 protect()
- 9.3.9 protect()
- 9.3.10 protect()
- 9.3.11 protect()
- 9.3.12 protect()
- 9.3.13 protect()
- 9.3.14 protect()
- 9.3.15 protect()
- 9.3.16 protect()
- 9.3.17 protect()
- 9.3.18 protect()
- 9.3.19 unprotect()
- 9.3.20 unprotect()
- 9.3.21 unprotect()
- 9.3.22 unprotect()
- 9.3.23 unprotect()
- 9.3.24 unprotect()
- 9.3.25 unprotect()
- 9.3.26 unprotect()
- 9.3.27 unprotect()
- 9.3.28 unprotect()
- 9.3.29 unprotect()
- 9.3.30 unprotect()
- 9.3.31 unprotect()
- 9.3.32 reprotect()
- 9.3.33 reprotect()
- 9.3.34 reprotect()
- 9.3.35 reprotect()
- 9.3.36 reprotect()
- 9.3.37 reprotect()
- 9.3.38 reprotect()
- 9.4 Displaying the Cleartext Data from a File
- 9.5 Protecting Existing Data
- 9.6 Unprotecting Protected Data
- 9.7 Retrieving the Unprotected Data from a File
- 9.8 Spark APIs and Supported Protection Methods
- 9.9 Sample Use Cases
- 9.10 Spark SQL
- 9.11 Spark Scala
- 10 Data Node and Name Node Security with File Protector
- 11 Appendix: Return Codes
- 12 Appendix: Samples
- 12.1 Roles in the Samples
- 12.2 Data Elements in the Security Policy
- 12.3 Role-based Permissions for Data Elements in the Sample
- 12.4 Data Used by the Samples
- 12.5 Protecting Data using MapReduce
- 12.6 Protecting Data using Hive
- 12.7 Protecting Data using Pig
- 12.8 Protecting Data using HBase
- 12.9 Protecting Data using Impala
- 12.10 Protecting Data using HAWQ
- 12.11 Protecting Data using Spark
- 13 Appendix: HDFSFP Demo
- 13.1 Roles in the Demo
- 13.2 HDFS Directories Used in the Demo
- 13.3 User Permissions for HDFS Directories
- 13.4 Prerequisites for the Demo
- 13.5 Running the Demo
- 13.5.1 Protecting Existing Data in HDFS
- 13.5.2 Ingesting Data into a Protected Directory
- 13.5.3 Ingesting Data into an Unprotected Public Directory
- 13.5.4 Reading the Data by Authorized Users
- 13.5.5 Reading the Data by Unauthorized Users
- 13.5.6 Copying Data from One Directory to Another by Authorized Users
- 13.5.7 Copying Data from One Directory to Another by Unauthorized Users
- 13.5.8 Deleting Data by Authorized Users
- 13.5.9 Deleting Data by Unauthorized Users
- 13.5.10 Copying Data to a Public Directory by Authorized Users
- 13.5.11 Running a MapReduce Job by Authorized Users
- 13.5.12 Reading Data for Analysis by Authorized Users
- 14 Appendix: Using Hive with HDFSFP
- 14.1 Data Used by the Samples
- 14.2 Ingesting Data into a Hive Table
- 14.3 Tokenization and Detokenization with HDFSFP
- 14.3.1 Verifying Prerequisites for Using Hadoop Application Protector
- 14.3.2 Ingesting Data from HDFSFP Protected External Hive Table to HDFSFP Protected Internal Hive Table in Tokenized Form
- 14.3.3 Ingesting Detokenized Data from HDFSFP Protected Internal Hive Table to HDFSFP Protected External Hive Table
- 14.3.4 Ingesting Data from HDFSFP Protected External Hive Table to Internal Hive Table Not Protected by HDFSFP in Tokenized Form
- 14.3.5 Ingesting Detokenized Data from Internal Hive Table Not Protected by HDFSFP to HDFSFP Protected External Hive Table
- 15 Appendix: Configuring Talend with HDFSFP
- 15.1 Verifying Prerequisites before Configuring Talend with HDFSFP
- 15.2 Verifying the Talend Packages
- 15.3 Configuring Talend with HDFSFP
- 15.4 Starting a Project in Talend
- 15.5 Configuring the Preferences for Talend
- 15.6 Ingesting Data in the Target HDFS Directory in Protected Form
- 15.7 Accessing the Data from the Protected Directory in HDFS
- 15.8 Configuring Talend Jobs to Run with HDFSFP with Target Exec as Remote
- 15.9 Using Talend with HDFSFP and MapReduce
- 16 Appendix: Migrating Tokenized Unicode Data from and to a Teradata Database