- Introduction to Big Data
- What is Big Data?
- Impact of Big Data
- Parallel Processing, Scaling, and Data Parallelism
- Tools of Big Data
- Beyond the Hype
- Big Data Use Cases
- Viewpoints about Big Data
Beginner
Online
6 Weeks
Free
Quick facts
Particular | Details
---|---
Medium of instruction | English
Mode of learning | Self study
Mode of delivery | Video and Text Based
Learning efforts | 2-3 Hours Per Week
Course overview
Organizations increasingly need skilled big data professionals who can draw insight into customer behaviour from complex, unstructured data such as pictures, posts, tweets, audio files, videos and satellite images. edX offers this foundational course for learners who want to get started with big data and build a career in the industry. Big Data, Hadoop, and Spark Basics is an introductory computer science course that gives a basic understanding of big data and its tools.
The Big Data, Hadoop, and Spark Basics Certification is a beginner-level programme offered by IBM on edX. It explores big data processing tools such as Apache Spark, Hive and Hadoop. The programme runs for 6 weeks and asks learners to devote 2-3 hours per week. It is self-paced and free to audit, and learners can upgrade to the verified track by paying a fee.
The highlights
- Available on edX
- Offered by IBM
- 6 Week-long Course
- Complete Online Course
- Audit and Verified Tracks Available
- Shareable Certificate Upon Completion
- Self-paced Programme
Program offerings
- Graded assignments and exams
- edX support
- English medium
- Beginner level course
- Video transcript in English
Course and certificate fees
Type of course
edX provides two enrollment options for the Big Data, Hadoop, and Spark Basics online course: the audit track and the verified track. In the audit track, candidates can attend the course without paying any fee but are given limited access. In the verified track, learners pay the Big Data, Hadoop, and Spark Basics certification fee and receive graded assignments and a certificate.
Big Data, Hadoop, and Spark Basics Certification Fees
Course Title | Total Fee in INR
---|---
Big Data Analytics (Verified Tracks) | INR 8,097
Eligibility criteria
Academic Qualifications
edX specifies a few prerequisites for joining the Big Data, Hadoop, and Spark Basics Certification Course: learners should be computer and IT literate and interested in learning about data management.
What you will learn
The Big Data, Hadoop, and Spark Basics Certification Syllabus will help learners:
- Understand the basics of big data, its processing tools, and its use cases.
- Learn about Apache Spark development and runtime environment options.
- Become familiar with the fundamental concepts of Spark programming, SparkSQL, DataFrames, and Datasets.
- Get to know the Hadoop ecosystem, its practices, architecture, and applications, including MapReduce, HBase, Spark, and the Hadoop Distributed File System (HDFS).
Who it is for
Big Data, Hadoop, and Spark Basics Classes is an ideal programme for the professionals such as
Admission details
Join the Big Data, Hadoop, and Spark Basics Training through the following steps:
Step 1 - Visit the official course page at https://www.edx.org/course/big-data-hadoop-and-spark-basics
Step 2 - Choose the ‘Enroll’ option to get started with the programme.
The syllabus
Module 1 – What is Big Data?
Module 2 – Introduction to the Hadoop Ecosystem
- What is Hadoop
- An introduction to MapReduce
- The Hadoop Ecosystem/Common components: Introducing HDFS, Hive, HBase, and Spark, other modules
- Working with HDFS
- Working with HBase
- Lab: MapReduce
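To give a feel for the MapReduce model introduced in this module, here is a minimal sketch of a word count written in plain Python. It is not taken from the course materials and does not use Hadoop; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`) and the sample lines are illustrative assumptions that only mimic the map, shuffle, and reduce steps on a single machine.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


# Hypothetical input, chosen only for illustration.
sample_lines = ["big data tools", "big data use cases"]
counts = word_count(sample_lines)
print(counts)
```

In a real Hadoop job the map and reduce functions would run on many machines and the shuffle would move data across the network; the sketch keeps everything in one process so the flow of key-value pairs is easy to follow.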
Module 3 – Introduction to Apache Spark
- Why use Apache Spark?
- Functional Programming Basics
- Parallel Programming using Resilient Distributed Datasets
- Scale-out / Data Parallelism in Apache Spark
- DataFrames and SparkSQL
- Lab: Practical examples with PySpark
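The ideas of functional programming and data parallelism listed in this module can be sketched without Spark. The example below is an assumption-laden illustration, not course code: it splits a list into partitions, applies a pure function to each partition concurrently using Python's standard `concurrent.futures` thread pool, and then combines the partial results. The helper names and the choice of four partitions are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_partitions(data, num_partitions):
    """Split a list into contiguous chunks of roughly equal size."""
    if not data:
        return []
    size = -(-len(data) // num_partitions)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(partition):
    """Pure function applied independently to each partition."""
    return sum(x * x for x in partition)


def parallel_sum_of_squares(data, num_partitions=4):
    """Map a pure function over partitions, then reduce the partial results."""
    partitions = split_into_partitions(data, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(sum_of_squares, partitions))
    return reduce(lambda a, b: a + b, partials, 0)


# Hypothetical input, chosen only for illustration.
numbers = list(range(1, 11))
total = parallel_sum_of_squares(numbers)
print(total)
```

Because `sum_of_squares` has no side effects, each partition can be processed in any order or on any worker, which is the property that lets a framework such as Spark scale the same pattern out across a cluster. Threads are used here only to show the structure; they are not expected to speed up this small computation.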
Module 4 – DataFrames and SparkSQL
- Introduction to DataFrames & SparkSQL
- RDDs in Parallel Programming and Spark
- DataFrames and Datasets
- Catalyst and Tungsten
- ETL with DataFrames
- Lab: ETL with DataFrames
- Real-world usage of SparkSQL
- Lab: SparkSQL
Module 5 – Development and Runtime Environment options
- Apache Spark architecture
- Overview of Apache Spark Cluster Modes
- How to Run an Apache Spark Application
- Using Apache Spark on IBM Cloud
- Lab: Scale-out on IBM Spark Environment in Watson Studio
- Setting Apache Spark Configuration
- Running Spark on Kubernetes
- Lab: Spark on Kube
Module 6 – Monitoring & Tuning
- The Apache Spark User Interface
- Monitoring Jobs
- Debugging of parallel jobs
- Understanding Memory resources
- Understanding Processor resources
- Lab: Monitoring and Performance tuning
Module 7 – Final Quiz
How it helps
By enrolling in the programme, learners gain the benefits of the Big Data, Hadoop, and Spark Basics certification, including a thorough understanding of big data and its management tools. In addition, edX awards a certificate of completion to paying students enrolled in the verified track.
Instructors
Mr Karthik Muthuraman
Software Engineer
IBM
Other Bachelors, Other Masters
Ms Aije Egwaikhide
Senior Data Scientist
IBM
Other Bachelors, Other Masters
FAQs
The online certification programme is a collaboration between edX and IBM.
The online programme runs for 6 weeks, and candidates are expected to spend 2-3 hours a week.
The online course on big data is intended for introductory level learners.
The prerequisites for the online certification programme are computer and IT literacy and an understanding of data management.
The online certification programme is taught by Karthik Muthuraman, a Software Engineer (Machine Learning), and Aije Egwaikhide, a Senior Data Scientist, both at IBM.