Apache PySpark Fundamentals
Quick Facts
Particulars | Details
---|---
Medium of instruction | English
Mode of learning | Self study
Mode of delivery | Video and text based
Course overview
PySpark brings together Python and Apache Spark: Python is a general-purpose, high-level programming language, while Apache Spark is an open-source cluster computing framework focused on speed, ease of use, and streaming analytics. The Apache PySpark Fundamentals certification course is developed by Johnny F. - Programmer & Instructor and is made available by Udemy for learners who want to become professional PySpark developers.
The Apache PySpark Fundamentals online training is a comprehensive program comprising 1.5 hours of video lessons supported by a downloadable learning resource. It teaches the basics of Apache Spark with Python and equips students with the knowledge they need to create Spark applications using PySpark. The course covers topics such as Spark functions and resilient distributed datasets, and it builds an understanding of Apache Spark and how it supports data science and big data analytics.
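To give a flavour of what such a Spark application looks like, here is a minimal sketch of a first PySpark program; it is not taken from the course, and the app name and sample data are purely illustrative:

```python
from pyspark.sql import SparkSession

# Start a local SparkSession: the entry point of any PySpark application.
spark = (SparkSession.builder
         .appName("FirstPySparkApp")   # illustrative app name
         .master("local[*]")           # run locally using all available cores
         .getOrCreate())

# Build a small DataFrame from in-memory data (column names are made up).
df = spark.createDataFrame(
    [("Alice", 34), ("Bob", 45), ("Cathy", 29)],
    ["name", "age"],
)

df.printSchema()   # inspect the inferred schema
df.show()          # print the rows to the console

spark.stop()
```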
The highlights
- Certificate of completion
- Self-paced course
- 1.5 hours of pre-recorded video content
- 1 downloadable resource
Program offerings
- Online course
- Learning resources
- 30-day money-back guarantee
- Unlimited access
- Accessible on mobile devices and TV
Course and certificate fees
Fees information
- Certificate availability: Yes
- Certificate providing authority: Udemy
Who it is for
What you will learn
After completing the Apache PySpark Fundamentals online certification, learners will have a foundational understanding of the principles and concepts behind Apache PySpark as well as the wider Apache Spark ecosystem. In this Apache PySpark course, learners explore the functionality of resilient distributed datasets and acquire the skills to work with rows, columns, and the DataFrame API. Learners also study strategies for leveraging built-in Spark functions and learn to create their own functions in Spark.
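As a rough illustration of how these pieces fit together, the sketch below (with made-up records) promotes an RDD to a DataFrame and then uses column expressions to filter rows and derive a new column:

```python
from pyspark.sql import SparkSession, Row
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("LearningOutcomes").master("local[*]").getOrCreate()

# An RDD of Row objects (the records here are invented for illustration) ...
rdd = spark.sparkContext.parallelize([
    Row(name="Alice", dept="IT", salary=5200),
    Row(name="Bob",   dept="HR", salary=4100),
])

# ... can be promoted to a DataFrame, which adds a schema and the column-based API.
df = spark.createDataFrame(rdd)

# Row and column operations: filter rows, then derive and select columns.
(df.filter(F.col("dept") == "IT")
   .select("name", (F.col("salary") * 12).alias("annual_salary"))
   .show())

spark.stop()
```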
The syllabus
Introduction
Intro to Apache Spark
- Getting started
- What is PySpark?
- Spark components: partitions, transformations, and actions (see the sketch after this list)
- Tech setup
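The following sketch (illustrative numbers, not course material) shows the three ideas named above: data split across partitions, lazy transformations, and actions that trigger execution:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("SparkComponents").master("local[*]").getOrCreate()
sc = spark.sparkContext

# Distribute a Python range across 4 partitions (the partition count is arbitrary).
numbers = sc.parallelize(range(1, 101), numSlices=4)

# Transformations are lazy: nothing is computed yet.
squares = numbers.map(lambda n: n * n)
evens = squares.filter(lambda n: n % 2 == 0)

# Actions trigger execution across the partitions.
print(evens.getNumPartitions())  # 4
print(evens.count())             # 50
print(evens.take(3))             # [4, 16, 36]

spark.stop()
```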
DataFrames
- Working with the DataFrame API
- Schemas
- Working with columns and rows (see the sketch after this list)
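A small illustrative sketch of these DataFrame ideas, using an explicit schema and a few column and row operations (field names and data are made up):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.appName("DataFramesDemo").master("local[*]").getOrCreate()

# An explicit schema instead of relying on schema inference.
schema = StructType([
    StructField("name", StringType(), nullable=False),
    StructField("city", StringType(), nullable=True),
    StructField("age",  IntegerType(), nullable=True),
])

df = spark.createDataFrame(
    [("Alice", "Lisbon", 34), ("Bob", "Porto", 45)],
    schema=schema,
)

# Column operations: derive a new column; row operations: filter and sort.
(df.withColumn("age_next_year", F.col("age") + 1)
   .filter(F.col("age") > 30)
   .orderBy(F.col("age").desc())
   .show())

spark.stop()
```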
Functions
- Built-in functions
- Working with dates
- User-defined functions
- Join (see the sketch after this list)
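The sketch below (made-up tables and column names) touches each topic in this section: built-in date functions, a user-defined function, and a join between two DataFrames:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("FunctionsDemo").master("local[*]").getOrCreate()

orders = spark.createDataFrame(
    [(1, "2024-01-15", 250.0), (2, "2024-02-03", 90.0)],
    ["customer_id", "order_date", "amount"],
)
customers = spark.createDataFrame(
    [(1, "alice"), (2, "bob")],
    ["customer_id", "name"],
)

# Built-in functions: parse a date string and extract the month.
enriched = (orders
            .withColumn("order_date", F.to_date("order_date"))
            .withColumn("order_month", F.month("order_date")))

# A user-defined function (runs in Python, row by row).
title_case = F.udf(lambda s: s.title(), StringType())

# Join the two DataFrames on the shared key column.
(enriched.join(customers, on="customer_id", how="inner")
         .withColumn("name", title_case(F.col("name")))
         .show())

spark.stop()
```

Built-in functions execute inside Spark's engine and are generally faster than Python user-defined functions, which is worth keeping in mind when choosing between the two.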
Resilient Distributed Datasets
- RDDs
- Working with RDDs (see the sketch after this list)
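As an illustration of the RDD API, here is the classic word-count pattern, assuming a small in-memory list of lines rather than a real dataset:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("RDDDemo").master("local[*]").getOrCreate()
sc = spark.sparkContext

# A classic word count on an RDD (input lines are illustrative).
lines = sc.parallelize([
    "spark makes big data simple",
    "pyspark brings spark to python",
])

counts = (lines.flatMap(lambda line: line.split())   # split lines into words
               .map(lambda word: (word, 1))          # (word, 1) pairs
               .reduceByKey(lambda a, b: a + b))     # sum the counts per word

print(counts.collect())

spark.stop()
```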
Conclusion