Learn By Example: Hadoop, MapReduce for Big Data problems
Quick Facts
Particular | Details
---|---
Medium of instructions | English
Mode of learning | Self study
Mode of delivery | Video and text based
Course overview
The Learn By Example: Hadoop, MapReduce for Big Data problems certification course is designed by Loony Corn, a global e-learning provider whose team includes former Google, Stanford, and Flipkart members, and is made available through Udemy for individuals who want to build sophisticated distributed computing applications that process large amounts of data using Hadoop and MapReduce. The Learn By Example: Hadoop, MapReduce for Big Data problems online course aims to give participants a hands-on introduction to Hadoop from the very beginning.
Learn By Example: Hadoop, MapReduce for Big Data problems online classes include more than 13.5 hours of video lessons, an article, and 112 downloadable resources that cover topics such as parallel thinking, performance tuning, natural language processing, cluster management, serial and distributed computing, collaborative filtering, and k-means clustering, and that teach techniques for building Hadoop clusters on VMs and in the cloud.
The highlights
- Certificate of completion
- Self-paced course
- 13.5 hours of pre-recorded video content
- 1 article
- 112 downloadable resources
Program offerings
- Online course
- Downloadable learning resources
- 30-day money-back guarantee
- Unlimited access
- Accessible on mobile devices and TV
Course and certificate fees
Fees information

Particular | Details
---|---
Certificate availability | Yes
Certificate providing authority | Udemy
Who it is for
What you will learn
After completing the Learn By Example: Hadoop, MapReduce for Big Data problems online certification, participants will be introduced to the methodologies and techniques of MapReduce and Hadoop for big data operations. Participants will explore how YARN, MapReduce, and HDFS interact, and will learn the principles of parallel thinking. Participants will study performance tuning, collaborative filtering, natural language processing, k-means clustering, serial and distributed computing, and cluster management. Additionally, participants will learn how to express SQL Select and Group By operations in MapReduce and how to build inverted indices.
The syllabus
Introduction
- You, this course and Us
Why is Big Data a Big Deal
- The Big Data Paradigm
- Serial vs Distributed Computing
- What is Hadoop?
- HDFS or the Hadoop Distributed File System
- MapReduce Introduced
- YARN or Yet Another Resource Negotiator
Installing Hadoop in a Local Environment
- Hadoop Install Modes
- Hadoop Standalone mode Install
- Hadoop Pseudo-Distributed mode Install
The MapReduce "Hello World"
- The basic philosophy underlying MapReduce
- MapReduce - Visualized And Explained
- MapReduce - Digging a little deeper at every step
- "Hello World" in MapReduce
- The Mapper
- The Reducer
- The Job
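To make the Mapper, Reducer, and Job components above concrete, here is a minimal word-count sketch against Hadoop's standard org.apache.hadoop.mapreduce API. It mirrors the classic Hadoop "Hello World"; the course's own example code may differ in its details.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // The Mapper: emits (word, 1) for every token in a line of input.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // The Reducer: sums the counts for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // The Job: wires the Mapper and Reducer together and submits the work.
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // safe: summing is associative and commutative
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```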
Run a MapReduce Job
- Get comfortable with HDFS
- Run your first MapReduce Job
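As a taste of the HDFS interaction this section builds comfort with, here is a small Java sketch that copies a local file into HDFS and lists a directory programmatically. The paths are hypothetical placeholders, and the same operations are usually done first with the hdfs dfs command-line tool.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsQuickTour {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();  // picks up core-site.xml from the classpath
    FileSystem fs = FileSystem.get(conf);

    // Copy a local file into HDFS (paths here are placeholders).
    fs.copyFromLocalFile(new Path("input.txt"), new Path("/user/me/input.txt"));

    // List the directory, printing each file's size.
    for (FileStatus status : fs.listStatus(new Path("/user/me"))) {
      System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
    }
    fs.close();
  }
}
```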
Juicing your MapReduce - Combiners, Shuffle and Sort and The Streaming API
- Parallelize the reduce phase - use the Combiner
- Not all Reducers are Combiners
- How many mappers and reducers does your MapReduce have?
- Parallelizing reduce using Shuffle And Sort
- MapReduce is not limited to the Java language - Introducing the Streaming API
- Python for MapReduce
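"Not all Reducers are Combiners" deserves a concrete illustration. The plain-Java sketch below shows why: a sum can be computed from partial sums, but an average of per-mapper averages is not the global average, so an "average" reducer must not be reused as a combiner.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;

public class CombinerCaveat {
  public static void main(String[] args) {
    double[] mapper1 = {1, 2};  // values for one key seen by one mapper
    double[] mapper2 = {6};     // values for the same key seen by another mapper

    // The true global average: (1 + 2 + 6) / 3 = 3.0
    double trueAvg = DoubleStream
        .concat(Arrays.stream(mapper1), Arrays.stream(mapper2))
        .average().orElse(0);

    // An "average" combiner would pre-average per mapper, and the reducer
    // would then average those: avg(1.5, 6.0) = 3.75, which is wrong.
    double avg1 = Arrays.stream(mapper1).average().orElse(0);  // 1.5
    double avg2 = Arrays.stream(mapper2).average().orElse(0);  // 6.0
    double combinedAvg = (avg1 + avg2) / 2;

    System.out.println(trueAvg + " vs " + combinedAvg);  // 3.0 vs 3.75

    // A "sum" combiner is safe: partial sums compose into the same global sum,
    // because addition is associative and commutative.
  }
}
```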
HDFS and Yarn
- HDFS - Protecting against data loss using replication
- HDFS - Name nodes and why they're critical
- HDFS - Checkpointing to backup name node information
- Yarn - Basic components
- Yarn - Submitting a job to Yarn
- Yarn - Plug in scheduling policies
- Yarn - Configure the scheduler
MapReduce Customizations For Finer Grained Control
- Setting up your MapReduce to accept command line arguments
- The Tool, ToolRunner and GenericOptionsParser
- Configuring properties of the Job object
- Customizing the Partitioner, Sort Comparator, and Group Comparator
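As a sketch of the Tool/ToolRunner pattern this section covers: ToolRunner passes the command line through GenericOptionsParser, so generic flags like -D key=value land in the Configuration before run() sees the remaining arguments. The job name and paths below are placeholders, and mapper/reducer setup is omitted, so Hadoop's identity defaults would apply.

```java
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJobDriver extends Configured implements Tool {

  @Override
  public int run(String[] args) throws Exception {
    // getConf() already contains any -D key=value options that
    // GenericOptionsParser extracted inside ToolRunner.
    Job job = Job.getInstance(getConf(), "my job");
    job.setJarByClass(MyJobDriver.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new MyJobDriver(), args));
  }
}
```

Invoked as, say, `hadoop jar myjob.jar MyJobDriver -D mapreduce.job.reduces=4 in out`, the -D flag tunes the job without recompiling.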
The Inverted Index, Custom Data Types for Keys, Bigram Counts and Unit Tests!
- The heart of search engines - The Inverted Index
- Generating the inverted index using MapReduce
- Custom data types for keys - The Writable Interface
- Represent a Bigram using a WritableComparable
- MapReduce to count the Bigrams in input text
- Setting up your Hadoop project
- Test your MapReduce job using MRUnit
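A sketch of what a bigram key implementing WritableComparable might look like (the class and field names are illustrative). Hadoop calls write/readFields for serialization, compareTo during the sort phase, and hashCode when the default HashPartitioner routes keys to reducers.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

public class Bigram implements WritableComparable<Bigram> {
  private String first = "";
  private String second = "";

  public Bigram() {}  // no-arg constructor required: Hadoop instantiates keys reflectively

  public Bigram(String first, String second) {
    this.first = first;
    this.second = second;
  }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeUTF(first);
    out.writeUTF(second);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    first = in.readUTF();
    second = in.readUTF();
  }

  @Override
  public int compareTo(Bigram other) {
    int cmp = first.compareTo(other.first);
    return cmp != 0 ? cmp : second.compareTo(other.second);
  }

  @Override
  public int hashCode() { return first.hashCode() * 31 + second.hashCode(); }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof Bigram)) return false;
    Bigram b = (Bigram) o;
    return first.equals(b.first) && second.equals(b.second);
  }

  @Override
  public String toString() { return first + " " + second; }
}
```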
Input and Output Formats and Customized Partitioning
- Introducing the File Input Format
- Text And Sequence File Formats
- Data partitioning using a custom partitioner
- Make the custom partitioner real in code
- Total Order Partitioning
- Input Sampling, Distribution, Partitioning and configuring these
- Secondary Sort
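To illustrate the custom-partitioner idea, here is a hypothetical partitioner that routes keys to reducers by their first letter, so each reducer's output file covers a contiguous alphabetical range.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
  @Override
  public int getPartition(Text key, IntWritable value, int numPartitions) {
    if (key.getLength() == 0) {
      return 0;
    }
    char c = Character.toLowerCase(key.toString().charAt(0));
    int bucket = (c >= 'a' && c <= 'z') ? (c - 'a') : 25;  // lump non-letters in with 'z'
    return bucket * numPartitions / 26;                    // scale 26 buckets onto the reducers
  }
}
```

It is wired in with job.setPartitionerClass(FirstLetterPartitioner.class) alongside job.setNumReduceTasks(...).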
Recommendation Systems using Collaborative Filtering
- Introduction to Collaborative Filtering
- Friend recommendations using chained MR jobs
- Get common friends for every pair of users - the first MapReduce
- Top 10 friend recommendation for every user - the second MapReduce
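Chaining MapReduce jobs, as in the friend-recommendation pair above, is usually just a driver that runs the second job on the first job's output directory. A sketch, with hypothetical paths and the per-job mapper/reducer wiring elided:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainedJobsDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // First job: compute common friends for every pair of users.
    Job commonFriends = Job.getInstance(conf, "common friends");
    commonFriends.setJarByClass(ChainedJobsDriver.class);
    FileInputFormat.addInputPath(commonFriends, new Path("friends/input"));
    FileOutputFormat.setOutputPath(commonFriends, new Path("friends/intermediate"));
    if (!commonFriends.waitForCompletion(true)) {
      System.exit(1);  // stop the chain if the first job fails
    }

    // Second job: rank and keep the top 10 recommendations per user,
    // reading the first job's output as its input.
    Job topTen = Job.getInstance(conf, "top 10 recommendations");
    topTen.setJarByClass(ChainedJobsDriver.class);
    FileInputFormat.addInputPath(topTen, new Path("friends/intermediate"));
    FileOutputFormat.setOutputPath(topTen, new Path("friends/output"));
    System.exit(topTen.waitForCompletion(true) ? 0 : 1);
  }
}
```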
Hadoop as a Database
- Structured data in Hadoop
- Running an SQL Select with MapReduce
- Running an SQL Group By with MapReduce
- A MapReduce Join - The Map Side
- A MapReduce Join - The Reduce Side
- A MapReduce Join - Sorting and Partitioning
- A MapReduce Join - Putting it all together
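As a flavor of "Hadoop as a database": a SELECT with a WHERE clause maps naturally onto a map-only job in which each mapper emits only the rows that pass the predicate. A sketch assuming hypothetical CSV rows of the form name,dept,salary:

```java
import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Roughly: SELECT * FROM employees WHERE salary > 50000.
// Configure the job with job.setNumReduceTasks(0) to make it map-only.
public class SelectMapper extends Mapper<Object, Text, Text, NullWritable> {
  @Override
  protected void map(Object key, Text value, Context context)
      throws IOException, InterruptedException {
    String[] fields = value.toString().split(",");  // hypothetical schema: name,dept,salary
    if (fields.length == 3 && Double.parseDouble(fields[2]) > 50000) {
      context.write(value, NullWritable.get());     // pass the matching row through unchanged
    }
  }
}
```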
K-Means Clustering
- What is K-Means Clustering?
- A MapReduce job for K-Means Clustering
- K-Means Clustering - Measuring the distance between points
- K-Means Clustering - Custom Writables for Input/Output
- K-Means Clustering - Configuring the Job
- K-Means Clustering - The Mapper and Reducer
- K-Means Clustering - The Iterative MapReduce Job
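The geometric core of the k-means mapper is finding each point's nearest centroid, so the point can be emitted keyed by that centroid's index. A plain-Java sketch (squared Euclidean distance suffices, since dropping the square root preserves the ordering):

```java
public class NearestCentroid {

  // Squared Euclidean distance between two points of equal dimension.
  static double squaredDistance(double[] p, double[] q) {
    double sum = 0;
    for (int i = 0; i < p.length; i++) {
      double d = p[i] - q[i];
      sum += d * d;
    }
    return sum;  // no sqrt needed: squared distance preserves nearest-neighbor ordering
  }

  // Index of the centroid closest to the given point.
  static int nearest(double[] point, double[][] centroids) {
    int best = 0;
    double bestDist = Double.MAX_VALUE;
    for (int i = 0; i < centroids.length; i++) {
      double d = squaredDistance(point, centroids[i]);
      if (d < bestDist) {
        bestDist = d;
        best = i;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    double[][] centroids = { {0, 0}, {10, 10} };
    System.out.println(nearest(new double[] {1, 2}, centroids));  // prints 0
  }
}
```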
Setting up a Hadoop Cluster
- Manually configuring a Hadoop cluster (Linux VMs)
- Getting started with Amazon Web Services
- Start a Hadoop Cluster with Cloudera Manager on AWS
Appendix
- Setup a Virtual Linux Instance (For Windows users)
- [For Linux/Mac OS Shell Newbies] Path and other Environment Variables