Impala Online Training explains the basics of Impala: what Impala is and how it relates to Hive and Pig. Impala brings scalable parallel database technology to Hadoop, enabling users to issue low-latency SQL queries against data stored in HDFS and Apache HBase without requiring data movement or transformation. Impala is integrated with Hadoop and uses the same file and data formats, metadata, security, and resource management frameworks as Apache Hive, Apache Pig, and other Hadoop software.
Preview
By the end of this training, you will learn:
The features that Pig, Hive, and Impala offer for data acquisition, storage, and analysis
The fundamentals of Apache Hadoop and data ETL (extract, transform, load), ingestion, and processing with Hadoop tools
How Pig, Hive, and Impala improve productivity for typical analysis tasks
How to perform real-time, complex queries on datasets
Course Contents
Day 1
Introduction to Impala
Introduction to Impala
Objectives
What is Impala
Benefits of Impala
Exploratory Business Intelligence
Impala Installation
Starting and Stopping Impala
Data Storage
Managing Metadata
Controlling Access to Data
Impala Shell Commands and Interface
Conclusion
Day 2
Querying with Hive and Impala
Querying with Hive and Impala
Objectives
SQL Language Statements
DDL Statements
DML Statements
CREATE DATABASE
CREATE TABLE
CREATE TABLE - Examples
Internal and External Tables
Loading Data into Impala Table
ALTER TABLE
DROP TABLE
DROP DATABASE
DESCRIBE Statement
EXPLAIN Statement
SHOW TABLES Statement
INSERT Statement
INSERT Statement - Examples
SELECT Statement
Data Types
Operators
Functions
CREATE VIEW in Impala
Hive and Impala Query Syntax