
Hadoop Training

With the increase in the use of technology, data is being generated at an exponential pace. All of this data can provide enormous insights for taking a business to new heights. Everywhere we look today, enterprises are embracing Big Data solutions to improve customer relationships and to build innovative products based on insights gained from data. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few. This is big data.

Storing all of this data is a challenge in itself, and once we add the need for analytics, the situation becomes even more demanding for state-of-the-art infrastructure. The open source community has stepped up to this challenge and developed Apache Hadoop, a framework to store and process this huge amount of data. Alongside it, many open source projects integrate seamlessly with Hadoop to provide a variety of solutions, from data processing to data visualization and data analytics, forming the Hadoop ecosystem. Apache Hadoop provides an economical and fast storage and processing system for all types of data: structured (relational tables), semi-structured (XML, JSON, etc.) and unstructured (images, chat logs, tweets, etc.).

With the growing adoption of Apache Hadoop by enterprises, the demand for skilled Hadoop professionals is also increasing steadily and is expected to keep growing. Here are some findings from surveys conducted by well-recognised organisations:

  • Big Data & Hadoop Market is expected to reach $99.31B by 2022 growing at a CAGR of 42.1% from 2015 - Forbes
  • McKinsey predicts that by 2018 there will be a shortage of 1.5M data experts - McKinsey Report
  • The average salary of Big Data Hadoop developers is $135K - Indeed.com salary data

Keeping all of the above facts in mind, the Hadoop Developer program at QSHORE is designed to help you gain detailed insight into Apache Hadoop by providing:

  • Detailed Architecture Discussion of Apache Hadoop and Map Reduce
  • Detailed coverage (Basic + Advanced) of the Hadoop Ecosystem components like Hive, Pig, SQOOP, Flume, HBase, Oozie etc.
  • A good head start on Spark and Scala
  • Detailed coverage of integrating the various ecosystem components with Hadoop and with each other
  • Implementation of Hadoop best practices
  • Performance Tuning techniques
  • Real-time Design discussions
  • Exposure to a real-time project on Big Data Analytics
  • Highly customised study materials
  • A fully hands-on training approach
  • Detailed Interview Question Discussion and Support for Resume Preparation

WHO SHOULD TAKE THIS COURSE

The market for Big Data analytics is growing across the world, and this strong growth translates into a great opportunity for IT professionals.
Here are a few groups of IT professionals who are enjoying the benefits of moving into the Big Data domain:

  • Developers and Architects
  • BI /ETL/DW professionals
  • Senior IT Professionals
  • Testing professionals
  • Mainframe professionals
  • Freshers

 

Hadoop practitioners are among the highest-paid IT professionals today, with salaries of around $85K (source: Indeed job portal), and market demand for them is growing rapidly.

 

PRE-REQUISITES FOR THIS COURSE

As such, there are no pre-requisites for learning Hadoop. Knowledge of Core Java and SQL is beneficial, but certainly not mandatory. If you wish to brush up your Core Java skills, QSHORE offers you a free Core Java course when you enrol for the Hadoop Developer program.

Curriculum

You can download the course content and schedule for the Hadoop training here: Download PDF
  • In-depth understanding of the entire Big Data Hadoop ecosystem
  • A real-time view of Hadoop development
  • Basic Hadoop administration knowledge
  • Detailed course materials
  • Free Core Java and UNIX fundamentals
  • Real-time project implementation
  • Interview-oriented discussions
  • Help with resume preparation

  • UNIX/LINUX Basic Commands
  • Basic UNIX Shell Scripting
  • Basic Java Programming – Core JAVA OOPS Concepts
  • Introduction to Big Data and Hadoop
  • Working with HDFS
  • Hadoop Map Reduce Concepts & Features
  • Developing Map Reduce Applications
  • Hadoop Ecosystem Components:
    o HIVE
    o PIG
    o HBASE
    o FLUME
    o SQOOP
    o OOZIE
  • Introduction to SPARK & SCALA
  • Real-Time Tools like Putty, WinSCP, Eclipse, Hue, Cloudera Manager
  • Hadoop Installation & Configuration
  • Real-Time Projects

         

  • What is Big Data?
  • Challenges in Processing Big Data
  • What is Hadoop?
  • Why Hadoop?
  • History of Hadoop
  • Hadoop Components Overview: HDFS & Map Reduce
  • Hadoop Ecosystem Introduction
  • NoSQL Database Introduction

  • Hadoop 2.x Architecture
  • Hadoop Daemons
  • Introduction to YARN
  • YARN Architecture
    o Resource Manager
    o Node Manager
    o Application Master

  • Rack Awareness
  • HDFS Daemons
  • Writing Files to HDFS
    o Blocks & Splits
    o Input Splits
    o Data Replication
  • Reading Files from HDFS
  • Introduction to HDFS Configuration Files
  • HDFS Commands
  • Accessing HDFS
    o CLI Approach
    o JAVA Approach [Introducing the HDFS JAVA API] (see the sketch below)
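
To make the JAVA approach concrete, here is a minimal sketch that writes and reads a file through the HDFS FileSystem API; the file path is a placeholder, not part of the course material.

```java
// Minimal sketch of the HDFS JAVA API approach listed above.
// The path "/user/demo/sample.txt" is a placeholder.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsJavaApiDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();    // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);        // handle to the configured file system

        // Write a small file to HDFS
        Path path = new Path("/user/demo/sample.txt");
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }

        // Read it back
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
            System.out.println(reader.readLine());
        }
    }
}
```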

  • What is Map Reduce?
  • Detailed Map Reduce Flow
    o Introduction to the Key/Value Approach
    o Detailed Mapper Functionality
    o Detailed Reducer Functionality
    o Details of the Partitioner
    o Shuffle & Sort Process
  • Understanding the Map Reduce Flow with a Word Count Example

Basic Map Reduce Programming
  • Introduction to the Map Reduce API [New Map Reduce API]
  • Map Reduce Data Types
  • File Formats
  • Input Formats – Input Splits & Records, Text Input, Binary Input
  • Output Formats – Text Output, Binary Output
  • Configuring the Development Environment – Eclipse
  • Developing a Map Reduce Application Using Default Functionality
    o Identity Mapper
    o Identity Reducer
    o ToolRunner API Introduction
  • Developing the Word Count Application (see the sketch after this list)
    o Writing the Mapper, Reducer & Driver Code
    o Building the Application
    o Deploying the Application
  • Running the Map Reduce Application
    o Local Mode of Execution
    o Cluster Mode of Execution
  • Monitoring the Map Reduce Application
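
For reference, here is a minimal sketch of the word-count Mapper, Reducer and Driver using Hadoop's newer org.apache.hadoop.mapreduce API; the class names and the use of command-line arguments for the input and output paths are illustrative choices, not the course's own code.

```java
// Minimal word-count job: Mapper, Reducer and Driver in one class.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);          // emit (word, 1) for every token
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();                   // add up the counts for this word
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);  // combiner reuses the reducer logic
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Packaged into a jar, a job like this is typically launched with the hadoop jar command, which corresponds to the cluster mode of execution listed above.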


  • Map Reduce Combiner
  • Map Reduce Counters
  • Map Reduce Partitioner
  • Map Reduce Distributed Cache
  • Writing a Custom Partitioner (see the sketch after this list)
  • Writing a Custom Record Reader & Record Writer [Custom Input & Output Formats]
    o Sample Program with a PDF Input File
  • Custom Writables & Writable Comparables
  • Map Reduce Compression
  • File Merge Utility
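
As a pointer for the custom Partitioner topic above, here is a minimal sketch; the routing rule (partitioning by the first letter of the key) is illustrative only.

```java
// Minimal custom Partitioner: routes each key to a reducer based on its first character.
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        String k = key.toString();
        if (numPartitions == 0 || k.isEmpty()) {
            return 0;
        }
        // Lower-case the first character so "Apple" and "apple" land on the same reducer.
        return Character.toLowerCase(k.charAt(0)) % numPartitions;
    }
}
```

A driver would register it with job.setPartitionerClass(FirstLetterPartitioner.class) and choose the number of reducers with job.setNumReduceTasks(...).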


  • Introduction to HIVE
  • Hive Architecture
  • Types of Metastore
  • Introduction to Hive Configuration Files
  • Hive Data Types
    o Simple Data Types
    o Collection Data Types
  • Types of Hive Tables
    o Managed Table
    o External Table
  • Hive Query Language (HQL or Hive QL)
    o Creating Databases
    o Creating Tables
    o Joins in Hive
    o Group By and Distinct Operations
  • Partitioning
    o Static Partitioning
    o Dynamic Partitioning
  • Bucketing
  • Lateral View & Explode
  • Introduction to Hive UDFs [UDF, UDAF & UDTF]
  • XML Processing in HIVE
  • JSON Processing in HIVE
  • URL Processing in HIVE
  • Hive File Formats [Introduction to Hive SERDE]
    o Parquet
    o ORC
    o AVRO
  • Introduction to HIVE Query Optimizations
  • Developing Hive UDFs in JAVA (see the sketch after this list)
  • Hive JDBC Client
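
To illustrate the "Developing Hive UDFs in JAVA" item, here is a minimal sketch using the classic org.apache.hadoop.hive.ql.exec.UDF base class; the class name and the upper-casing behaviour are illustrative only.

```java
// Minimal Hive UDF: upper-cases a string column.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class ToUpperUDF extends UDF {
    // Hive resolves evaluate() by reflection for this UDF base class.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;            // propagate NULLs unchanged
        }
        return new Text(input.toString().toUpperCase());
    }
}
```

Compiled into a jar, such a function is typically registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in a query.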

  • Introduction to PIG
  • PIG Architecture
  • Introduction to PIG Configuration Files
  • PIG vs. HIVE vs. Map Reduce
  • Introduction to Data Flow Language
  • Pig Data Types
  • Pig Programming Modes
  • Pig Access Modes
  • Detailed PIG Latin Programming
  • PIG UDFs & UDF Development in JAVA
  • PIG Macros
  • Hive - PIG Integration
  • Introduction to HCATALOG
  • Processing XML Data in PIG
  • Introduction to PIG Optimization

  • Introduction to NoSQL Databases
  • Types of NoSQL Databases
  • Introduction to HBASE
  • HBASE Architecture
  • HBASE Shell Interface
  • Creating Databases and Tables
  • Inserting Data into Tables
  • Accessing Data from Tables
  • HBase Filters
  • Hive & HBASE Integration
  • PIG & HBASE Integration
  • HBASE JAVA API (see the sketch below)
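
As a companion to the HBASE JAVA API item, here is a minimal sketch of a put and a get with the HBase client; the table name, column family, qualifier and row key are placeholders.

```java
// Minimal HBase client usage: write one cell and read it back.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseClientDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("employee"))) {

            // Insert one cell: row "emp1", column family "personal", qualifier "name"
            Put put = new Put(Bytes.toBytes("emp1"));
            put.addColumn(Bytes.toBytes("personal"), Bytes.toBytes("name"), Bytes.toBytes("Raj"));
            table.put(put);

            // Read the same cell back
            Result result = table.get(new Get(Bytes.toBytes("emp1")));
            byte[] value = result.getValue(Bytes.toBytes("personal"), Bytes.toBytes("name"));
            System.out.println(Bytes.toString(value));
        }
    }
}
```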

  • Introduction to Streaming
  • Introduction to FLUME
  • FLUME Architecture
  • Flume Agent Setup
  • Types of Sources, Channels & Sinks
  • Developing Sample Flume Applications

KAFKA

  • Introduction to Kafka
  • Kafka Installation
  • Kafka Cluster Architecture & API (see the producer sketch after this list)
    o Producer
    o Consumer
    o Broker
  • Integrating Kafka with Various Hadoop Systems
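
To give a flavour of the Producer API listed above, here is a minimal sketch; the broker address, topic name, key and message are placeholders.

```java
// Minimal Kafka producer: sends a single string message to a topic.
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KafkaProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // Kafka broker(s) to connect to
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Send one record to the "events" topic with key "user1".
            producer.send(new ProducerRecord<>("events", "user1", "hello kafka"));
        }
    }
}
```

The matching Consumer API follows the same pattern, with a KafkaConsumer that subscribes to the topic and polls for records.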

SQOOP

  • Introduction to SQOOP
  • Connecting to an RDBMS Using SQOOP
  • SQOOP Import
    o Import to HDFS
    o Import to HIVE
    o Import to HBASE
    o Bulk Import (Full Table, Subset of a Table, All Tables in a DB)
    o Incremental Import
  • SQOOP Export

Oozie

  • Oozie Fundamentals
  • Oozie Architecture
  • Oozie XML File Specifications
  • Workflow Creation
  • Job Submission, Monitoring & Debugging
  • Job Coordinators & Bundles

  • Introduction to Spark
  • Spark vs. Map Reduce
  • Concepts of Transformations & Actions
  • Sample Word Count Program in Spark with Scala (see the sketch below)
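
The course covers the Scala version of this program; for consistency with the other sketches here, this is an equivalent word count using Spark's Java API, with the input and output paths taken from command-line arguments as an illustrative choice.

```java
// Minimal Spark word count: flatMap/mapToPair/reduceByKey are transformations,
// saveAsTextFile is the action that triggers execution.
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class SparkWordCount {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("word count");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> lines = sc.textFile(args[0]);          // lazy: nothing runs yet
        JavaPairRDD<String, Integer> counts = lines
                .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                .mapToPair(word -> new Tuple2<>(word, 1))
                .reduceByKey(Integer::sum);

        counts.saveAsTextFile(args[1]);                        // action: the job runs here
        sc.stop();
    }
}
```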

  • Set Up a Single-Node Hadoop Cluster
  • Hadoop Configuration Files
  • HIVE Installation (Hands-on Installation on Laptops)
  • PIG Installation (Hands-on Installation on Laptops)
  • SQOOP Installation (Hands-on Installation on Laptops)
  • HBase Installation (Hands-on Installation on Laptops)
  • OOZIE Installation (Hands-on Installation on Laptops)
  • Introduction to Name Node Federation

About Instructors

Mr. Raj

Offline Course
Duration: 45 Hours
Material: Yes
Live Project: Yes
Software: Yes
3000 Students Enrolled
Course Completion Certificate
