- Introduction to Spark
- (Old) Free Account creation in Databricks
- (New) Free Account creation in Databricks
- Provisioning a Spark Cluster
- Basics about Notebooks
- Why should we learn Apache Spark?
- Spark Architecture Components
- Driver
- Partitions
- Executors
Beginner
Online
₹ 449 799
Quick facts
Particular | Details
---|---
Medium of instructions | English
Mode of learning | Self study
Mode of delivery | Video and text based
The syllabus
Introduction
Download Resources
Introduction to Spark and Spark Architecture Components
Spark Execution
- Spark Jobs
- Spark Stages
- Spark Tasks
- Practical Demonstration of Jobs, Tasks and Stages
Spark SQL, Dataframes and Datasets
- Spark RDD (Create and Display Practical)
- Spark Dataframe (Create and Display Practical)
- Anonymous Functions in Scala
- Extra (Optional on Spark DataFrame)
- Extra (Optional on Spark DataFrame) in Details
- Spark Datasets (Create and Display Practical)
- Caching
- Notes on reading files with Spark
- Data Source CSV File
- Data Source JSON File
- Data Source LIBSVM File
- Data Source Image File
- Data Source Avro File
- Data Source Parquet File
- Untyped Dataset Operations (aka DataFrame Operations)
- Running SQL Queries Programmatically
- Global Temporary View
- Creating Datasets
- Scalar Functions (Built-in Scalar Functions) Part 1
- Scalar Functions (Built-in Scalar Functions) Part 2
- Scalar Functions (Built-in Scalar Functions) Part 3
- User Defined Scalar Functions
Spark RDD
- Operation in Apache Spark
- Transformations
- map(function)
- filter(function)
- flatMap(function)
- mapPartitions(func)
- mapPartitionsWithIndex(func)
- sample(withReplacement, fraction, seed)
- union(otherDataset)
- intersection(otherDataset)
- distinct([numPartitions])
- groupBy(func)
- groupByKey([numPartitions])
- reduceByKey(func, [numPartitions])
- aggregateByKey(zeroValue)(seqOp, combOp, [numPartitions])
- sortByKey([ascending], [numPartitions])
- join(otherDataset, [numPartitions])
- cogroup(otherDataset, [numPartitions])
- cartesian(otherDataset)
- coalesce(numPartitions)
- repartition(numPartitions)
- repartitionAndSortWithinPartitions(partitioner)
- Wide vs. Narrow Transformations
- Actions
- reduce(func)
- collect()
- count()
- first()
- take(n)
- takeSample(withReplacement, num, [seed])
- takeOrdered(n, [ordering])
- countByKey()
- foreach(func)
- Shuffling
- Persistence (Cache)
- Unpersist
- Broadcast Variables
- Accumulators
- Important Lecture
- Bonus