Online
₹ 649 (list price ₹ 1,699)
Quick facts
Particular | Details
---|---
Medium of instruction | English
Mode of learning | Self study
Mode of delivery | Video and text based
Course overview
The Hadoop ecosystem is a collection of frameworks and tools for solving big data problems. It comprises Apache projects as well as a variety of commercial services and analytics tools. The Learn Big Data: The Hadoop Ecosystem Masterclass certification course is designed by Edward Viaene, a DevOps, Cloud & Big Data specialist, and delivered by Udemy. It is intended for individuals who want to master big data concepts using Hadoop's functionalities.
The Learn Big Data: The Hadoop Ecosystem Masterclass online classes comprise 6 hours of video-based learning supported by an article and a downloadable resource, giving individuals the foundational understanding needed to discuss real-world challenges and solutions with industry experts. The online training covers Pig, Hive, Spark, HBase, HDFS, the Hadoop stack, Hadoop security, high availability, and query optimization, and shows how to use the software currently employed in the big data industry for both batch and real-time processing.
The highlights
- Certificate of completion
- Self-paced course
- 6 hours of pre-recorded video content
- 1 article
- 1 downloadable resource
Program offerings
- Online course
- Learning resources
- 30-day money-back guarantee
- Unlimited access
- Accessible on mobile devices and TV
What you will learn
The Learn Big Data: The Hadoop Ecosystem Masterclass online certification introduces individuals to the fundamentals of big data and big data analytics using Hadoop. In this Hadoop certification, individuals will explore the concepts behind the Hadoop architecture, the Hadoop stack, and Hadoop security, and will learn the functionalities of Spark, Hive, Pig, HBase, Kerberos, Ranger, SPNEGO, Phoenix, and HDP. Individuals will also learn the strategies and methodologies involved in high availability, HDFS encryption, scheduling, query optimization, and real-time data processing.
The syllabus
Introduction
What is Big Data and Hadoop
- What is Big Data
- Examples of Big Data
- What is Data Science
- What is Hadoop
- Hadoop Distributions
- What is Big Data Quiz
Introduction to Hadoop
- Hadoop Installation
- Demo: Hortonworks Sandbox
- Demo: Hadoop Installation - Part 1
- Demo: Hadoop Installation - Part 2
- Introduction to HDFS
- DataNode Communications
- Demo: HDFS - Part 1
- Demo: HDFS - Part 2 - Using Ambari
- MapReduce WordCount Example
- Demo: MapReduce WordCount
- Lines that span blocks
- Introduction to Yarn
- Demo: Yarn and ResourceManager UI
- Ambari API and Blueprints
- Demo: Ambari API and Blueprints
- ETL Processing in Hadoop
- Introduction Quiz
Pig
- Introduction to Pig
- Demo: Part 1 - Pig Installation
- Demo: Part 2 - Pig Commands
- Demo: Part 3 - More Pig Commands
Apache Spark
- Introduction to Apache Spark
- Spark WordCount
- Demo: Spark installation and WordCount
- RDDs
- Demo: RDD Transformations and Actions
- Overview of RDD Transformations and Actions
- Spark MLLib
Hive
- Introduction to Hive
- Hive Queries
- Demo: Hive Installation and Hive Queries
- Hive Partitioning, Buckets, UDFs, and SerDes
- The Stinger Initiative
- Hive in Spark
Real Time Processing
- Introduction to Realtime Processing
Kafka
- Introduction to Kafka
- Kafka Topics
- Kafka Messages and Log Compaction
- Kafka Use Cases and Usage
- Demo: Kafka Installation and Usage
Storm
- Introduction to Storm
- A Storm Topology
- Demo: Storm installation and Example Topology
- Storm Message Processing and Reliability
- Trident
Spark Streaming
- Introduction to Spark Streaming
- Spark Streaming Architecture
- Spark Receivers and WordCount Streaming Example
- Demo: Spark Streaming with Kafka
- Spark Streaming State and Checkpointing
- Demo: Stateful Spark Streaming
- More Spark Streaming Features
HBase
- Introduction to HBase
- HBase Tables
- The HBase Meta Table
- HBase Writes
- HBase Reads
- Compactions
- Crash Recovery
- Region Splits
- Hotspotting
- Demo: HBase Install
- Demo: HBase Shell
- Demo: Spark HBase
Phoenix
- Introduction to Phoenix
- Salting, Compression, and Indexes in Phoenix
- JOINs, VIEWs, and Phoenix in Spark
- Demo: Phoenix
Hadoop Security
- Introduction to Kerberos
- Kerberos on Hadoop
- Kerberos Terminology
- Demo: Enabling Kerberos
- Introduction to SPNEGO
- Demo: SPNEGO
- Introduction to Knox
Ranger
- Introduction to Ranger
- Demo: Ranger Installation
- Demo: Ranger with Hive
HDFS Encryption
- Introduction to HDFS Transparent Encryption
- Demo: HDFS Encryption using Ranger KMS
Advanced Topics
- Yarn Schedulers
- Demo: Capacity Scheduler
- Label based scheduling
- Yarn Sizing
- Hive Query Optimizations
- Join Strategies
- Spark Optimizations
- NameNode High Availability
- Demo: NameNode High Availability Setup
- Database High Availability
Thank You
- Thank You!
- Bonus Lecture: My Other Courses
Instructors
Mr Edward Viaene
Big Data Specialist
Udemy