Oracle Big Data Fundamentals training helps you learn Oracle's integrated big data solution to acquire, process, integrate, and analyze big data. The course shows you how to leverage a complete big data solution and provides coaching to help you understand its massively scalable infrastructure.
Learn To:
- Define Big Data.
- Describe Oracle's Integrated Big Data Solution and its components.
- Use the Hadoop Distributed File System (HDFS).
- Acquire big data using the Command Line Interface, Flume, and Oracle NoSQL Database.
- Process big data using MapReduce, YARN, Hive, Pig, Oracle XQuery for Hadoop, Solr, and Spark.
- Integrate big data and warehouse data using Sqoop, Oracle Big Data Connectors, Copy to BDA, Oracle Big Data SQL, Oracle Data Integrator, and Oracle GoldenGate.
- Analyze big data using Oracle Big Data SQL, Oracle Advanced Analytics technologies, and Oracle Big Data Discovery.
- Use and manage Oracle Big Data Appliance.
Course Contents
Day 1
Introduction
Lesson Objectives
Questions About You
Course Objectives
Course Road Map
Practice Environment
Connecting to the Course Environment (Oracle Big Data Lite Virtual Machine) Using VNC
Starting the Oracle Big Data Lite Virtual Machine Version 4.0.1
Introducing the Movieplex
Big Data and the Oracle Information Management System
Big Data Opportunities and Challenges
Oracle Information Management Architecture
Optimizing/Simplifying Architecture with Engineered Systems
Using Oracle Big Data Lite Virtual Machine
Overview of the Big Data product stack
Access methods
Review the Oracle Big Data Virtual Machine Home page
Deep dive into the Oracle case study
Identify the data structures used
Understand the importance of filtering the data
Identify the Hadoop Command Guide URL, and review the fs and version commands that are used in the practice
Introduction to the Big Data Ecosystem
Lesson Objectives
Computer Clusters
Distributed Computing
The Hadoop Ecosystem
Hadoop Core Components
Choosing a Hadoop Distribution and Version
Types of Analysis That Use Hadoop
Cloudera’s Distribution Including Apache Hadoop (CDH) Architecture
Day 2
Introduction to the Hadoop Distributed File System (HDFS)
Lesson Objectives
Hadoop Distributed File System (HDFS)
Acquire Data using CLI, Fuse-DFS, and Flume
Introducing the CLI
Examining Fuse DFS
Using Flume
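The Flume topics above revolve around agent configuration. A minimal sketch of a Flume 1.x agent properties file is shown below; the agent and component names (a1, r1, c1, k1), port, and HDFS path are illustrative assumptions, not values from the course.

```properties
# Illustrative Flume agent: read events from a network port, buffer them
# in memory, and write them to HDFS. Names and paths are assumptions.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sources.r1.channels = c1

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /user/oracle/flume/events
a1.sinks.k1.channel = c1
```

Each agent wires sources to sinks through a channel; the channel type (memory vs. file) trades throughput against durability.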
Acquire and Access Data Using Oracle NoSQL Database
Define Oracle NoSQL Database
List Benefits
Load data into the DB
Access NoSQL Data
Primary Administrative Tasks for Oracle NoSQL Database
Plan an Oracle NoSQL Database installation and Node configuration
Configure and Deploy a KVStore
Using the GUI web interface (monitoring the KVStore)
Use the NoSQL Database Table Model (both CLI and Java API)
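The Oracle NoSQL Database topics above center on its key-value model, in which records are addressed by major and minor key paths. The toy class below is a plain-Python sketch of that idea only; it is not the real Java API or CLI, and all names and sample records are invented.

```python
# Conceptual sketch of Oracle NoSQL Database's major/minor key-value
# model using a plain Python dict. Illustrates the addressing scheme
# only; the real KVStore client is the Java API or the CLI.

class ToyKVStore:
    def __init__(self):
        self._data = {}

    def put(self, major, minor, value):
        # In the real KVStore, records sharing a major key path live on
        # the same partition, which makes multi-get by major key cheap.
        self._data[(tuple(major), tuple(minor))] = value

    def get(self, major, minor):
        return self._data.get((tuple(major), tuple(minor)))

    def multi_get(self, major):
        # All records under one major key path.
        return {k[1]: v for k, v in self._data.items() if k[0] == tuple(major)}

store = ToyKVStore()
store.put(["users", "u42"], ["profile"], {"name": "Pat"})
store.put(["users", "u42"], ["history"], ["m001", "m007"])
print(store.multi_get(["users", "u42"]))
```

Grouping related records under one major key is the KVStore's main data-modeling lever, analogous to choosing a partition key.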
Day 3
Introduction to MapReduce (Generic, Not BDA-Specific)
Lesson Objectives
MapReduce
Interacting with MapReduce
MapReduce Daemons (Services), Updated for YARN
Fault Tolerance
MapReduce Examples
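The MapReduce topics above follow the classic map, shuffle, reduce data flow. The sketch below simulates that flow in-process with plain Python for the standard word-count example; real jobs run distributed under YARN, so this only illustrates the shape of the computation.

```python
# Word-count sketch of the MapReduce model in plain Python.
# map emits (key, value) pairs, shuffle groups them by key, and
# reduce aggregates each group; in Hadoop these phases run across
# a cluster rather than in one process.
from collections import defaultdict

def map_phase(lines):
    for line in lines:
        for word in line.split():
            yield (word, 1)

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data big insight", "data moves fast"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'big': 2, 'data': 2, 'insight': 1, 'moves': 1, 'fast': 1}
```

Fault tolerance in Hadoop comes from re-running failed map or reduce tasks; because each phase is a pure function of its input, tasks can be retried safely.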
Resource Management Using YARN
Lesson Objectives
YARN Overview
YARN: Theme 1
YARN: Theme 2
Job Submission in YARN
YARN Features
MapReduce 2.0: Overview
YARN Services
Overview of Apache Hive and Apache Pig
Apache Hive
Apache Pig
Overview of Cloudera Impala
Hadoop: Analysis Options
Examining Cloudera Impala
Integrating Hadoop and Oracle
Day 4
Using Oracle XQuery for Hadoop
Extensible Markup Language (XML)
Simple XML Document
XML Elements
Markup Rules for Elements
XML Attributes
What is the XML Path Language (XPath)?
XPath Terminology: Node Types
XPath Terminology: Family Relationships
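The XPath topics above (node types, parent/child relationships, predicates) can be tried out with Python's standard library, which supports a subset of XPath. The course applies XPath in the context of Oracle XQuery for Hadoop; the XML snippet below is invented for illustration.

```python
# Small XPath illustration using Python's standard library.
# Demonstrates the child axis and an attribute predicate on a
# made-up movie catalog document.
import xml.etree.ElementTree as ET

doc = """
<catalog>
  <movie genre="drama"><title>A</title></movie>
  <movie genre="comedy"><title>B</title></movie>
</catalog>
"""
root = ET.fromstring(doc)

# Child axis: all <movie> children of the root element.
titles = [m.find("title").text for m in root.findall("movie")]
print(titles)  # ['A', 'B']

# Attribute predicate: movies whose genre attribute is 'comedy'.
comedies = root.findall("movie[@genre='comedy']")
print(comedies[0].find("title").text)  # 'B'
```

The same family-relationship vocabulary (parent, child, sibling, attribute) carries over directly to full XPath engines like the one in Oracle XQuery for Hadoop.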
Overview of Solr
What is Apache Solr (Cloudera Search)?
Cloudera Search: Key Capabilities
Cloudera Search: Features
Cloudera Search: Tasks
Indexing in Cloudera Search
Types of Indexing
Schema.xml
Creating a Solr Collection
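The indexing topics above rest on one core data structure: the inverted index, which maps each term to the documents containing it. Solr builds this with Lucene; the sketch below is a conceptual Python stand-in with an invented tokenizer and documents, not Solr's actual implementation.

```python
# Conceptual sketch of the inverted index behind Solr/Cloudera Search.
# A real index also stores positions, frequencies, and field metadata;
# here we keep only term -> set of document ids.
from collections import defaultdict

def tokenize(text):
    # Trivial analyzer: lowercase and split on whitespace.
    return text.lower().split()

def build_index(docs):
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

docs = {1: "big data appliance", 2: "big data SQL", 3: "NoSQL database"}
index = build_index(docs)
print(sorted(index["big"]))   # [1, 2]
print(sorted(index["nosql"])) # [3]
```

In Solr, the analyzer chain declared in schema.xml plays the role of `tokenize` here, which is why schema design determines what queries can match.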
Apache Spark
What is Apache Spark?
Introduction to Spark
Resilient Distributed Datasets (RDD)
Directed Acyclic Graph (DAG) Execution Engine
Spark: Architecture
Overview of Scala Language
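A key Spark idea listed above is that RDD transformations are lazy and only an action triggers execution. That single property can be mimicked with plain Python generators, as in the sketch below; real Spark additionally distributes the work and recovers from failures via the DAG scheduler and RDD lineage, none of which is shown here.

```python
# Sketch of Spark's lazy-evaluation model using Python generators.
# Transformations build a pipeline without running it; the action
# (sum) pulls data through the whole chain in one pass.

data = range(1, 6)                           # "parallelize" a collection
squared = (x * x for x in data)              # transformation: nothing runs yet
evens = (x for x in squared if x % 2 == 0)   # chained transformation

result = sum(evens)                          # action: pipeline executes now
print(result)  # 4 + 16 = 20
```

In Spark the equivalent chain would be `sc.parallelize(...).map(...).filter(...)` followed by an action such as `reduce` or `collect`, typically written in Scala or Python.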
Options for Integrating Your Big Data
Main Themes of Data Integration
Typical use cases: The World of Fast Exploration of Big Data
Which technology do I use to integrate data?
Introducing Data Integration Options
Installation Details
Current Hadoop Certification Matrix
Day 5
Overview of Apache Sqoop
Define Sqoop
Architecture
Dataflow in Sqoop
Importance of Sqoop
Role of Sqoop in Moving Data
Comparison of Sqoop
Sqoop Connectors
Importing Data into Hive Using Sqoop
Using Oracle Loader for Hadoop (OLH)
What is OLH?
Installation Prerequisites
Installation Steps
OLH Architecture
OLH New Feature: Selective Load of Hive Partitions
Input Formats
Output Modes
OLH Modes
Using Copy To BDA
Define Copy to BDA
Where it fits in (bundled under the Big Data SQL license)
Sample Scenario
Installation Prerequisites
Using Copy to BDA feature
Datatype Conversion list (tabular format)
Best practices
Advantages
Using Oracle SQL Connector for HDFS (OSCH)
Oracle SQL Connector for HDFS
Software Prerequisites
Installation Setup: Oracle Database
Installation Setup: Hadoop Cluster
Granting User Access to OSCH
OSCH: Features
OSCH: New Feature in 3.0
OSCH: Three Simple Steps
Day 6
Using Oracle Big Data SQL
Context: Exadata and Big Data Appliance
What is Big Data SQL?
Configuring Oracle Big Data SQL
Create Oracle Tables over HDFS data
Leverage the Hive Metastore to Access Data in Hadoop
Apply Oracle Database Security Policies Over Data in Hadoop
Combine HDFS and Oracle data for analysis (SQL Pattern Matching)
Using Oracle Data Integrator and Oracle GoldenGate with Hadoop
Overview of ODI
Overview of OGG
Using ODI
Using OGG