Data Warehouse & ETL Offload Code Samples
Overview and Install Guide

Copyright © 2018, Oracle and/or its affiliates. All rights reserved.
The Universal Permissive License (UPL), Version 1.0

The Data Warehouse & ETL Offload Code Samples provide sample code artifacts to support data warehousing and ETL offload solution patterns in the Oracle Public Cloud and in an Oracle Cloud at Customer deployment. The code artifacts and sample dataset provided are described below by folder, along with installation and execution guidelines.

Code Samples & Installation Content
- Pre-Requisites and Assumptions
- Global Files - SampleSourceFiles
- Loading Autonomous Data Warehouse
  - GoldenGate_Parameter_Samples
  - ODISampleAutonomousDataWarehouseCloudLoad
- BigDataCloudETLOffloadSparkNotebook
- CloudAtCustomerODIETLOffload

Pre-Requisites and Assumptions

These code samples are built using Oracle Cloud services (either Oracle Public Cloud or Oracle Cloud at Customer) and are provided as-is with no expressed warranty. Solution documentation, along with step-by-step guides for service integration and configuration of the Data Warehouse and ETL Offload solution patterns, is available online at docs.oracle.com and should be read prior to implementing these samples:

- Learn how to load the data warehouse for business analytics
- Load the data warehouse for business analytics on Oracle Cloud
- Load the data warehouse for business analytics on Cloud at Customer

The solution documentation provides step-by-step guides to configuring the services and should be followed prior to installation and execution of the sample workloads. To implement these samples you will need to have provisioned the following services:

- Oracle Autonomous Data Warehouse Cloud
- Oracle Data Integration Platform Cloud
- Oracle Analytics Cloud
- Oracle Big Data Cloud (for Oracle Public Cloud ETL Offload)
- Oracle Big Data Cloud at Customer (for Cloud at Customer ETL Offload)

Global Files - SampleSourceFiles

The SampleSourceFiles folder provides a set of .csv files that are used throughout the solution. These files are used in both ODI Smart Export samples (loading ADWC and ETL Offload for Big Data Cloud at Customer) as well as in the sample Big Data Cloud notebook for ETL Offload / Spark processing in the Oracle Public Cloud.

CUSTOMER_SRC_FILE.csv provides a set of made-up customer data (customer names are random letters and numbers). It is used to load the CUSTOMER dimension and to provide REGION data for the ETL Offload Spark sample.

Column Ordinal | Column Name | DataType
1              | CUSTOMER_ID | NUMERIC
2              | FIRST_NAME  | VARCHAR
3              | LAST_NAME   | VARCHAR
4              | REGION      | VARCHAR
5              | CITY        | VARCHAR
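For reference, the layout above maps onto a relational table such as the following. This is only an illustrative sketch: the table name CUSTOMER_SRC_STAGE and the VARCHAR2 lengths are assumptions, and the actual target DDL is created by the CreateODI_USER_and_TABLES.sql script described later in this guide.

    -- Illustrative only: a staging table mirroring the CUSTOMER_SRC_FILE.csv layout.
    -- The table name and column lengths are assumptions, not taken from the sample scripts.
    CREATE TABLE customer_src_stage (
      customer_id  NUMBER,         -- column 1: CUSTOMER_ID (NUMERIC)
      first_name   VARCHAR2(100),  -- column 2: FIRST_NAME (VARCHAR)
      last_name    VARCHAR2(100),  -- column 3: LAST_NAME (VARCHAR)
      region       VARCHAR2(100),  -- column 4: REGION (VARCHAR)
      city         VARCHAR2(100)   -- column 5: CITY (VARCHAR)
    );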
SRC_PRODUCT.csv provides a set of product data. It is used to load the PRODUCT dimension in ADWC and to provide product Category / Family data for the ETL Offload Spark sample.

Column Ordinal | Column Name  | DataType
1              | PRODUCT_ID   | NUMERIC
2              | PRODUCT_NAME | VARCHAR
3              | PRICE        | NUMERIC
4              | FAMILY       | VARCHAR

ORDERS_FILE.csv provides a set of order data. It is used to load the SALES_FACT fact table in ADWC (and the subsequent SALES_ANALYSIS table), and it provides the dataset for the SALES_ANALYSIS ETL Offload Spark notebook and ODI project.

Column Ordinal | Column Name       | DataType
1              | ORDER_ID          | NUMERIC
2              | CUSTOMER_ID       | NUMERIC
3              | ORDER_DATE        | DATE
4              | PRODUCT_ID        | NUMERIC
5              | AMOUNT            | NUMERIC
6              | QTY               | NUMERIC
7              | ORDER_LINE_NUMBER | NUMERIC

Loading Autonomous Data Warehouse

This folder provides artifacts demonstrating how to replicate a source table to ADWC for reporting with GoldenGate and how to load a star schema data warehouse with ODI. (Note: this project can also be used against an Exadata Cloud at Customer solution, but that may require changing the Knowledge Modules used in the solution.)

GoldenGate_Parameter_Samples

Artifacts in this folder are sample GoldenGate parameter files for real-time replication of data from an Oracle database source (on-premises or DBCS) to an ADWC target. These files can be used with any GoldenGate for Oracle 12.3 implementation, including the Data Integration Platform Cloud remote agent.

- exadwc.prm is a sample GoldenGate EXTRACT parameter file to capture from an Oracle database source. To configure this file, replace the placeholder entry with your Oracle database SID or CDB name. This file is configured to capture from the schema ADWC_SRC.
- padwc.prm is a sample GoldenGate EXTRACT PUMP parameter file to pump the transactions captured by the extract to the Data Integration Platform Cloud instance for delivery to Autonomous Data Warehouse Cloud. To configure this file, replace the placeholder entries.
- radwc.prm is a sample GoldenGate REPLICAT parameter file for a Classic replicat. It applies the transactions to the Autonomous Data Warehouse Cloud instance. To configure this file, replace the placeholder entry with your SID or CDB name. It applies data from the ADWC_SRC schema to the adwc_repl schema in your ADWC instance.

Place these files in the dirprm directory of the GoldenGate installation: exadwc.prm and padwc.prm on the source, and radwc.prm on your DIPC host / DIPC remote agent installation. Follow the instructions in the Data Warehouse Solution Pattern for information on configuring GoldenGate within the DIPC remote agent and where to leverage these files.
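Once the extract, pump, and replicat are running, a quick way to confirm that data is arriving in ADWC is to check the target schema from SQL Developer. The query below is a sketch rather than part of the sample scripts; it assumes you are connected to the ADWC instance with a user that can see the adwc_repl schema.

    -- Sketch of a sanity check after starting the replicat (not part of the samples):
    -- list the tables that now exist in the ADWC_REPL target schema.
    SELECT table_name
    FROM   all_tables
    WHERE  owner = 'ADWC_REPL'
    ORDER  BY table_name;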
ODISampleAutonomousDataWarehouseCloudLoad

These artifacts provide create-user / DDL scripts to create a small star schema in the Autonomous Data Warehouse Cloud and an ODI Smart Export to load that star schema. To import and execute these loads, perform the following steps:

- Connect to your ADWC instance via SQL Developer.
- Run the CreateODI_USER_and_TABLES.sql script to create the ODI_USER and the target tables for the ODI project. Note: replace the password placeholder in the first line of the script with the password you wish to use for the ODI user.
- SSH into your Data Integration Platform Cloud instance as the opc user.
- Create a directory on your DIPC instance:
      mkdir /tmp/ADWCSample
- Copy the source files ORDERS_FILE.csv, SRC_PRODUCT.csv, and CUSTOMER_SRC_FILE.csv, along with ODIADWCSample.xml, to the /tmp/ADWCSample directory you just created on your DIPC instance.
- Change permissions on the directory and files to ensure the oracle user can read them:
      sudo chmod 755 -R /tmp/ADWCSample
- VNC into your DIPC instance.
- Run ODI Studio.
- Use Smart Import to import the ODI processes.
- Navigate to the Topology Manager in ODI Studio and alter the ADWC connection strings (review the Solution Documentation for more details on how to obtain and configure your ODI / ADWC connection).
- Open the ODI project and review the mappings.
- To run the processes, execute the ODI package PKG ODI_User DW Sample Load.
- Using SQL Developer, query the tables in the ADWC ODI_USER schema (a sample verification query follows this section):
  - CUST_DIM
  - DATE_DIM
  - PRODUCT_DIM
  - SALES_FACT
  - SALES_FACT_ANALYSIS

Review the ODI mappings. Note that the two folders provide two methods for executing dimension and fact loads in ADWC:

- Loads Using Mappings and KM's provides mappings that leverage certified Knowledge Modules for ADWC. These are appropriate for Type 1 dimension-style loads.
- Loads using Dimensions and Cubes objects provides mappings that leverage the Dimensions and Cubes objects within ODI to load the same set of tables into the ADWC ODI_USER schema. The Dimension objects are appropriate for Type 2 or Type 3 dimension loads.
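The verification query referenced in the steps above can be as simple as a row count per target table. The following is a sketch you can run in SQL Developer against the ADWC ODI_USER schema after the package completes; it is an assumed reasonable check, not a script shipped with the samples.

    -- Sketch: confirm the ODI package populated the ODI_USER star schema (row counts only).
    SELECT 'CUST_DIM'            AS table_name, COUNT(*) AS row_count FROM odi_user.cust_dim
    UNION ALL
    SELECT 'DATE_DIM',            COUNT(*) FROM odi_user.date_dim
    UNION ALL
    SELECT 'PRODUCT_DIM',         COUNT(*) FROM odi_user.product_dim
    UNION ALL
    SELECT 'SALES_FACT',          COUNT(*) FROM odi_user.sales_fact
    UNION ALL
    SELECT 'SALES_FACT_ANALYSIS', COUNT(*) FROM odi_user.sales_fact_analysis;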
BigDataCloudETLOffloadSparkNotebook

This artifact is a sample Big Data Cloud notebook that demonstrates using Spark to load data from files stored in Oracle Object Storage, perform an ETL routine leveraging Spark SQL, and then store the result in multiple file formats back in Object Storage (all running in the Oracle Public Cloud). To configure and execute the Spark notebook:

- Log in to your Oracle Public Cloud account.
- Create a container within your Oracle Object Storage service (for example, BDCETL).
- Navigate to your Big Data Cloud instance and the Notebook tab.
- Click Import to import the notebook (point to the JSON file and import).
- Open the notebook and click the run / play button to execute the Spark script.

The Spark notebook generates a SALES_FACT_ANALYSIS result similar to that of the ODI ADWC processes. To review the results, simply execute the %sql section of the notebook, which queries the resulting SALES_ANALYSIS table in Big Data Cloud.

CloudAtCustomerODIETLOffload

These artifacts demonstrate ODI performing ETL offload in a Big Data Cloud at Customer (Cloudera) environment. They leverage the SourceFiles sample dataset to ingest data into the Cloudera cluster (via SQOOP or Spark), after which ODI executes Spark workloads to generate the SALES_ANALYSIS data set. To configure and execute the ODI jobs:

- Follow the documentation on configuring your Big Data Cloud at Customer machine for ODI workloads. Do not create your topology objects yet, however.
- SSH / VNC into the ODI agent node in your cluster.
- Create a directory in your cluster's local file system:
      mkdir /tmp/sourcefiles
- Copy the ODISmartExport_ETLOffload_BigDataCloud@Customer.zip file to the ODI agent node in your cluster.
- Copy the source files ORDERS_FILE.csv, SRC_PRODUCT.csv, and CUSTOMER_SRC_FILE.csv to the /tmp/sourcefiles directory you just created.
- Run ODI Studio.
- Smart Import the ODI project into your environment.
- Configure your topology connections based on the Solution documentation step-by-step guide.
- Edit the ORCL_SRC connection to point to an Oracle database that you have (the ingestion mappings pull from an Oracle database to demonstrate how to leverage either SQOOP or Spark to ingest data into your cluster).
- Follow the Solution documentation instructions on configuring your Hadoop credential store for the ORCL_SRC connection if you plan on using Spark to ingest the data.
- Review the ODI project. It is organized into the following steps:

0. Create Oracle Source Objects
   - The package PKG Create Oracle Objects and Load Data creates tables in your source Oracle database and loads the SourceFiles into that database.
1. Ingest Data to BigData HDFS – SQOOP & Spark
   - The package PKG Ingest Data to Hive leverages ODI mappings to import data into your Big Data cluster leveraging SQOOP.
   - Open one of the mappings and navigate to the Physical tab. Note that there are physical designs for both SQOOP and Spark ingestion; either can be leveraged to move data into the cluster.
2. ETL Offload – Spark
   - The mapping MAP Spark ETL Offload… demonstrates how to create a mapping in ODI that leverages Spark to perform ETL offload on the cluster and create a SALES_ANALYSIS data set (an illustrative Spark SQL sketch appears at the end of this guide).

Execute the packages / mappings listed above in order to:

- Create the Oracle source objects
- Ingest data into your cluster
- Generate a SALES_ANALYSIS dataset leveraging ODI's Spark capabilities.
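To give a sense of what a Spark-based SALES_ANALYSIS step computes, the following Spark SQL sketch joins the three source datasets and aggregates order amounts by customer region and product family. It is illustrative only: the table names (orders, customers, products) and the exact columns of the SALES_ANALYSIS output are assumptions and will differ from the logic in the provided notebook and ODI mappings.

    -- Illustrative Spark SQL sketch of a SALES_ANALYSIS-style aggregation (not the shipped logic).
    -- Assumes the three .csv files have been ingested as the Hive/Spark tables
    -- orders, customers, and products with the column layouts listed earlier in this guide.
    CREATE TABLE sales_analysis AS
    SELECT c.region,
           p.family,
           SUM(o.amount) AS total_amount,
           SUM(o.qty)    AS total_qty
    FROM   orders o
           JOIN customers c ON o.customer_id = c.customer_id
           JOIN products  p ON o.product_id  = p.product_id
    GROUP  BY c.region, p.family;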