- Introduction to Spark
- (Old) Free Account creation in Databricks
- (New) Free Account creation in Databricks
- Provisioning a Spark Cluster
- Basics about Notebooks
- Why should we learn Apache Spark?
- Spark Architecture Components
- Driver
- Partitions
- Executors
Apache Spark with Scala useful for Databricks Certification
Quick Facts
Particular | Details
---|---
Medium of instructions | English
Mode of learning | Self study
Mode of Delivery | Video and Text Based
|
Course and certificate fees
Fees information
Certificate availability | Certificate providing authority
---|---
Yes | Udemy
The syllabus
Introduction
Download Resources
Introduction to Spark and Spark Architecture Components
Spark Execution
- Spark Jobs
- Spark Stages
- Spark Tasks
- Practical Demonstration of Jobs, Tasks and Stages
Spark SQL, Dataframes and Datasets
- Spark RDD (Create and Display Practical)
- Spark Dataframe (Create and Display Practical)
- Anonymous Functions in Scala
- Extra (Optional on Spark DataFrame)
- Extra (Optional on Spark DataFrame) in Details
- Spark Datasets (Create and Display Practical)
- Caching
- Notes on reading files with Spark
- Data Source CSV File
- Data Source JSON File
- Data Source LIBSVM File
- Data Source Image File
- Data Source Avro File
- Data Source Parquet File
- Untyped Dataset Operations (aka DataFrame Operations)
- Running SQL Queries Programmatically
- Global Temporary View
- Creating Datasets
- Scalar Functions (Built-in Scalar Functions) Part 1
- Scalar Functions (Built-in Scalar Functions) Part 2
- Scalar Functions (Built-in Scalar Functions) Part 3
- User Defined Scalar Functions
Spark RDD
- Operations in Apache Spark
- Transformations
- map(function)
- filter(function)
- flatMap(function)
- mapPartitions(func)
- mapPartitionsWithIndex(func)
- sample(withReplacement, fraction, seed)
- union(otherDataset)
- intersection(otherDataset)
- distinct([numPartitions])
- groupBy(func)
- groupByKey([numPartitions])
- reduceByKey(func, [numPartitions])
- aggregateByKey(zeroValue)(seqOp, combOp, [numPartitions])
- sortByKey([ascending], [numPartitions])
- join(otherDataset, [numPartitions])
- cogroup(otherDataset, [numPartitions])
- cartesian(otherDataset)
- coalesce(numPartitions)
- repartition(numPartitions)
- repartitionAndSortWithinPartitions(partitioner)
- Wide vs. Narrow Transformations
- Actions
- reduce(func)
- collect()
- count()
- first()
- take(n)
- takeSample(withReplacement, num, [seed])
- takeOrdered(n, [ordering])
- countByKey()
- foreach(func)
- Shuffling
- Persistence (Cache)
- Unpersist
- Broadcast Variables
- Accumulators
- Important Lecture
- Bonus