
The Big Data Hadoop Analyst Online Training course teaches students to build robust data processing applications using Apache Hadoop. Students will learn debugging, Hadoop development, and the implementation of workflows and common algorithms. Students will also learn how to leverage Hive, Sqoop, Oozie, Flume, Pig, YARN, and Hadoop Testing.


By the end of this training you will receive Hadoop certification and will learn:

 

The features that Pig, Hive, and Impala offer for data acquisition, storage, and analysis.

The fundamentals of data ETL (extract, transform, load), ingestion, and processing with Hadoop tools.

How Pig, Hive, and Impala improve productivity for typical analysis tasks.

Joining diverse datasets to gain valuable business insight.

Performing real-time, complex queries on datasets.

Course Contents

Day 1

Introduction

Hadoop Fundamentals
The Motivation for Hadoop
Hadoop Overview
Data Storage: HDFS

Distributed Data Processing: YARN, MapReduce, and Spark
Data Processing and Analysis: Pig, Hive, and Impala
Data Integration: Sqoop
Other Hadoop Data Tools
Exercise Scenarios Explanation

Day 2

Introduction to Pig

What Is Pig?
Pig’s Features
Pig Use Cases
Interacting with Pig
Basic Data Analysis with Pig
Pig Latin Syntax
Loading Data
Simple Data Types
Field Definitions
Data Output
Viewing the Schema
Filtering and Sorting Data
Commonly-Used Functions

Processing Complex Data with Pig

Storage Formats
Complex/Nested Data Types
Grouping
Built-In Functions for Complex Data
Iterating Grouped Data

Day 3

Multi-Dataset Operations with Pig

Techniques for Combining Data Sets
Joining Data Sets in Pig
Set Operations
Splitting Data Sets

Pig Troubleshooting and Optimization

Troubleshooting Pig
Logging
Using Hadoop’s Web UI
Data Sampling and Debugging
Performance Overview
Understanding the Execution Plan
Tips for Improving the Performance of Your Pig Jobs

Day 4

Introduction to Hive and Impala
What Is Hive?
What Is Impala?
Schema and Data Storage
Comparing Hive to Traditional Databases
Hive Use Cases

Querying with Hive and Impala

Databases and Tables
Basic Hive and Impala Query Language Syntax
Data Types
Differences Between Hive and Impala Query Syntax
Using Hue to Execute Queries
Using the Impala Shell

Day 5

Data Management

Data Storage
Creating Databases and Tables
Loading Data
Altering Databases and Tables
Simplifying Queries with Views
Storing Query Results

Data Storage and Performance

Partitioning Tables
Choosing a File Format
Managing Metadata
Controlling Access to Data

Relational Data Analysis with Hive and Impala

Joining Datasets
Common Built-In Functions
Aggregation and Windowing

Day 6

Working with Impala

How Impala Executes Queries
Extending Impala with User-Defined Functions
Improving Impala Performance

Analyzing Text and Complex Data with Hive
Complex Values in Hive
Using Regular Expressions in Hive
Sentiment Analysis and N-Grams
Conclusion

Hive Optimization

Understanding Query Performance
Controlling Job Execution Plan
Bucketing
Indexing Data

Extending Hive

SerDes
Data Transformation with Custom Scripts
User-Defined Functions
Parameterized Queries

Choosing the Best Tool for the Job

Comparing MapReduce, Pig, Hive, Impala, and Relational Databases

Which to Choose?

Conclusion

Training Hours

Time: 12:00 NOON GMT | 7:00 AM EST | 4:00 AM PST | 6:00 AM CST | 5:00 AM MST | 5:30 PM IST | 1:00 PM GMT+1

Audience

1. This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators. Knowledge of SQL is assumed, as is basic Linux command-line familiarity.
2. Knowledge of at least one scripting language (e.g., Bash scripting, Perl, Python, Ruby) would be helpful but is not essential.
3. Prior knowledge of Apache Hadoop is not required.



Sangeetha


About the Author

Sangeetha
