The Big Data and Hadoop Administration online training equips you to take up Hadoop Administrator responsibilities: provisioning, installing, configuring, monitoring, maintaining, and securing Hadoop and Hadoop ecosystem components. You will also learn how to build a Hadoop cluster, back it up, secure it, and integrate associated applications and tools.
Preview
After completing the Hadoop Administration certification course, you should be able to:
1. Get a clear understanding of Apache Hadoop, HDFS, Hadoop Cluster and Hadoop Administration
2. Gain insight into Hadoop, HDFS Federation, YARN, and MapReduce v2
3. Plan and Deploy a Hadoop Cluster
4. Load Data and Run Applications
5. Configure a Hadoop Cluster and Tune its Performance
6. Manage, Maintain, Monitor and Troubleshoot a Hadoop Cluster
7. Secure a deployment and understand Backup and Recovery
Course Contents
Day 1
Introduction to Hadoop
What Hadoop is and why it is important
Hadoop comparison with traditional systems
Hadoop history
Hadoop main components and architecture
Hadoop Distributed File System (HDFS)
HDFS overview and design
HDFS architecture
HDFS file storage
Component failures and recoveries
Block placement
Balancing the Hadoop cluster
Planning your Hadoop cluster
Planning a Hadoop cluster and its capacity
Hadoop software and hardware configuration
HDFS block replication and rack awareness
Network topology for a Hadoop cluster
Day 2
Hadoop Deployment
Different Hadoop deployment types
Hadoop distribution options
Hadoop competitors
Hadoop installation procedure
Distributed cluster architecture
Working with HDFS
Ways of accessing data in HDFS
Common HDFS operations and commands
Different HDFS commands
Internals of a file read in HDFS
Data copying with ‘distcp’
MapReduce Abstraction
What MapReduce is and why it is popular
The big picture of MapReduce
MapReduce process and terminology
MapReduce component failures and recoveries
Working with MapReduce
Day 3
Hadoop Cluster Configuration
Hadoop configuration overview and important configuration files
Configuration parameters and values
HDFS parameters
MapReduce parameters
Hadoop environment setup
‘Include’ and ‘Exclude’ configuration files
Hadoop Administration and Maintenance
NameNode/DataNode directory structures and files
File system image and Edit log
The Checkpoint Procedure
NameNode failure and recovery procedure
Safe Mode
Metadata and Data backup
Potential problems and solutions / what to look for
Adding and removing nodes
Day 4
Hadoop Monitoring and Troubleshooting
Best practices of monitoring a Hadoop cluster
Using logs and stack traces for monitoring and troubleshooting
Using open-source tools to monitor a Hadoop cluster
Job Scheduling
How to schedule Hadoop jobs on the same cluster
Default Hadoop FIFO Scheduler
Fair Scheduler and its configuration
Hadoop Multi-Node Cluster Setup and Running MapReduce Jobs on Amazon EC2
Hadoop multi-node cluster setup using Amazon EC2: creating a 4-node cluster
Running MapReduce jobs on the cluster
High Availability, Federation, YARN, and Security
Working with MapReduce, Hive, and Sqoop
Multi-node cluster setup