- The Course Overview
- Core Concepts in Spark and PySpark
- Setting Up Spark on Windows and PySpark
- SparkContext, SparkConf and Spark Shell
Quick Facts
Particular | Details
---|---
Medium of instruction | English
Mode of learning | Self study
Mode of delivery | Video and text based
Course and certificate fees
Particular | Details
---|---
Fees information | ₹449 ₹3,499
Certificate availability | Yes
Certificate providing authority | Udemy
Who it is for
What you will learn
Knowledge of big data
The syllabus
Install PySpark and Setup Your Development Environment
Getting Your Big Data into the Spark Environment Using RDDs
- Loading Data onto Spark RDDs
- Parallelization with Spark RDDs
- RDD Operation Basics
Big Data Cleaning and Wrangling with Spark Notebooks
- Using Spark Notebooks for Quick Iteration of Ideas
- Sampling/Filtering RDDs to Pick-Out Relevant Data Points
- Splitting Datasets and Creating New Combinations with Set Operations
Aggregating and Summarizing Data into Useful Reports
- Calculating Averages with Map and Reduce
- Faster Average Computation with Aggregate
- Pivot Tabling with Key-Value Paired Data Points
Powerful Exploratory Data Analysis with MLlib
- Computing Summary Statistics with MLlib
- Using Pearson and Spearman to Discover Correlations
- Testing Your Hypotheses on Large Datasets
Putting Structure on Your Big Data with SparkSQL
- Manipulating DataFrames with SparkSQL Schemas
- Using the Spark DSL to Build Queries for Structured Data Operations