- Introduction to Big Data & Big Data Challenges
- Limitations & Solutions of Big Data Architecture
- Hadoop & its Features
- Hadoop Ecosystem
- Hadoop 2.x Core Components
- Hadoop Storage: HDFS (Hadoop Distributed File System)
- Hadoop Processing: MapReduce Framework
- Different Hadoop Distributions
- Home
- Simpliv Learning
- Courses
- Big Data Hadoop Certification Training
Online
35 Hours
Quick facts
particular | details | |
---|---|---|
Medium of instructions
English
|
Mode of learning
Self study, Virtual Classroom
|
Mode of Delivery
Video and Text Based
|
Course and certificate fees
certificate availability
Yes
certificate providing authority
Simpliv Learning
The syllabus
Understanding Big Data and Hadoop
Hadoop Architecture and HDFS
- Hadoop 2.x Cluster Architecture Preview
- Federation and High Availability Architecture Preview
- Typical Production Hadoop Cluster
- Hadoop Cluster Modes
- Common Hadoop Shell Commands Preview
- Hadoop 2.x Configuration Files
- Single Node Cluster & Multi-Node Cluster set up
- Basic Hadoop Administration
Hadoop MapReduce Framework
- Traditional way vs MapReduce way
- Why MapReduce
- YARN Components
- YARN Architecture
- YARN MapReduce Application Execution Flow
- YARN Workflow
- Anatomy of MapReduce Program
- Input Splits, Relation between Input Splits and HDFS Blocks
- MapReduce: Combiner & Partitioner
Advanced Hadoop MapReduce
- Counters
- Distributed Cache
- MRunit
- Reduce Join
- Custom Input Format
- Sequence Input Format
- XML file Parsing using MapReduce
Apache Pig
- Introduction to Apache Pig
- MapReduce vs Pig
- Pig Components & Pig Execution
- Pig Data Types & Data Models in Pig
- Pig Latin Programs
- Shell and Utility Commands
- Pig UDF & Pig Streaming
- Testing Pig scripts with Punit
- Aviation use-case in PIG
Apache Hive
- Introduction to Apache Hive
- Hive vs Pig
- Hive Architecture and Components
- Hive Metastore
- Comparison with Traditional Database
- Hive Data Types and Data Models
- Hive Partition
- Hive Bucketing
- Hive Tables (Managed Tables and External Tables)
- Importing Data
- Querying Data & Managing Outputs
- Hive Script & Hive UDF
- Retail use case in Hive
Advanced Apache Hive and HBase
- Hive QL: Joining Tables, Dynamic Partitioning
- Custom MapReduce Scripts
- Hive Indexes and views
- Hive Query Optimizers
- Hive Thrift Server
- Hive UDF
- Apache HBase: Introduction to NoSQL Databases and HBase
- HBase v/s RDBMS
- HBase Components
- HBase Architecture
- HBase Run Modes
- HBase Configuration
- HBase Cluster Deployment
Advanced Apache HBase
- HBase Data Model
- HBase Shell
- HBase Client API
- Hive Data Loading Techniques
- Apache Zookeeper Introduction
- ZooKeeper Data Model
- Zookeeper Service
- HBase Bulk Loading
- Getting and Inserting Data
- HBase Filters
Processing Distributed Data with Apache Spark
- What is Spark
- Spark Ecosystem
- Spark Components
- What is Scala
- Why Scala
- SparkContext
- Spark RDD
Oozie
- Oozie
- Oozie Components
- Oozie Workflow
- Scheduling Jobs with Oozie Scheduler
- Demo of Oozie Workflow
- Oozie Coordinator
- Oozie Commands
- Oozie Web Console
- Oozie for MapReduce
- Combining flow of MapReduce Jobs
- Hive in Oozie
- Hadoop Talend Integration