
    Quick Facts

    Medium Of Instruction: English
    Mode Of Learning: Self Study
    Mode Of Delivery: Video and Text Based

    Course Overview

    The Distributed Computing with Spark SQL certification course is a 14-hour course offered on the Coursera platform, with a syllabus developed by the University of California, Davis. It is part of the Learn SQL Basics for Data Science Specialization.

    This Distributed Computing with Spark SQL training course is designed for candidates who already have some familiarity with SQL and want to take the next step in their data journey. Its 4 modules build on one another. By the end of the fourth module, candidates will have learned several ways of working with Spark SQL and of building reliable data pipelines.

    The Highlights

    • Online course
    • Shareable certificate
    • 14 hours for completion
    • Course available in English

    Programme Offerings

    • Flexible Deadlines
    • Short Programme
    • Different Subtitles

    Courses and Certificate Fees

    Certificate Availability: Yes
    Certificate Providing Authority: Coursera

    The Distributed Computing with Spark SQL certification fee is based on monthly subscription plans, mainly of 1 month, 3 months, or 6 months. Each plan states the number of learning hours it covers and includes a shareable certificate at the end of the course.

    Distributed Computing with Spark SQL Fee Details

    Description    Amount in INR
    1 Month        Rs. 3,257/month


    Eligibility Criteria

    Certification Qualifying Details

    • The Distributed Computing with Spark SQL certification by Coursera is awarded once candidates have completed all the required coursework.

    What you will learn

    SQL knowledge

    Here are some things that will be learnt from the Distributed Computing with Spark SQL certification syllabus:

    • Using a collaborative workspace to write and run Spark SQL.
    • Inspecting the Spark UI to analyze query performance and identify bottlenecks.
    • Building end-to-end pipelines that read data, transform it, and save the result.
    • Building a medallion architecture with bronze, silver, and gold tables to ensure performance, scalability, and reliability.
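
    The read, transform, and save flow and the bronze, silver, and gold layering listed above can be sketched in a few lines of SQL. This is a minimal illustration only, not course material: it uses Python's built-in sqlite3 module as a stand-in for Spark SQL so that it runs without a cluster, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3


def build_medallion(conn, raw_rows):
    """Load raw rows into bronze, clean them into silver, aggregate into gold."""
    cur = conn.cursor()

    # Bronze: raw data stored exactly as ingested, including bad records.
    cur.execute("CREATE TABLE bronze_orders (order_id TEXT, city TEXT, amount TEXT)")
    cur.executemany("INSERT INTO bronze_orders VALUES (?, ?, ?)", raw_rows)

    # Silver: drop rows with a missing amount, normalize city names, cast types.
    cur.execute("""
        CREATE TABLE silver_orders AS
        SELECT CAST(order_id AS INTEGER) AS order_id,
               UPPER(TRIM(city))         AS city,
               CAST(amount AS REAL)      AS amount
        FROM bronze_orders
        WHERE amount IS NOT NULL AND TRIM(amount) <> ''
    """)

    # Gold: business-level aggregate that downstream users would query.
    cur.execute("""
        CREATE TABLE gold_city_totals AS
        SELECT city, SUM(amount) AS total_amount, COUNT(*) AS order_count
        FROM silver_orders
        GROUP BY city
    """)
    conn.commit()

    return cur.execute(
        "SELECT city, total_amount, order_count FROM gold_city_totals ORDER BY city"
    ).fetchall()


# Hypothetical sample data: inconsistent city spelling and one missing amount.
raw = [
    ("1", " davis ", "10.5"),
    ("2", "Davis", "4.5"),
    ("3", "Sacramento", None),
    ("4", "sacramento", "20"),
]

conn = sqlite3.connect(":memory:")
gold = build_medallion(conn, raw)
print(gold)  # [('DAVIS', 15.0, 2), ('SACRAMENTO', 20.0, 1)]
```

    In Spark the same three steps would typically be written as Spark SQL statements against tables in a lakehouse; the layering idea, with raw data kept untouched in bronze and progressively refined in silver and gold, is what the sketch is meant to show.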

    Who it is for

    Distributed Computing with Spark SQL is well suited to data scientists and computer programmers.


    Admission Details

    To get admission to the Distributed Computing with Spark SQL classes, the students can follow these steps: 

    Step 1: Follow the official URL: https://www.coursera.org/learn/spark-sql#.

    Step 2: Click the ‘Enroll Now’ button.

    Step 3: Create an account and log in, then choose either the free mode or the paid mode.

    Step 4: Admission to the course then proceeds according to the mode chosen.

    The Syllabus

    Module 1
    Videos
    • Course Introduction
    • Why Distributed Computing?
    • Spark DataFrames
    • The Databricks Environment
    • SQL in Notebooks
    • Import Data
    Readings
    • A Note From UC Davis
    • Readings and Resources
    • Assignment #1 - Queries in Spark SQL
    Practice Exercises
    • Assignment #1 Quiz - Queries in Spark SQL
    • Module 1 Quiz

    Module 2
    Videos
    • Module Introduction
    • Spark Terminology
    • Caching
    • Shuffle Partitions
    • Spark UI
    • Adaptive Query Execution (AQE)
    Readings
    • Readings
    • Assignment #2 - Spark Internals
    Practice Exercises
    • Assignment #2 Quiz - Spark Internals
    • Module 2 Quiz

    Module 3
    Videos
    • Module Introduction
    • Spark as a Connector
    • Accessing Data
    • File Formats
    • JSON, Schemas and Types
    • Writing Data
    • Tables and Views
    Readings
    • Readings
    • Assignment #3 - Engineering Data Pipelines
    Practice Exercises
    • Assignment #3 Quiz - Engineering Data Pipelines
    • Module 3 Quiz

    Module 4
    Videos
    • Module Introduction
    • Data Lakes vs. Data Warehouses
    • What is a Lakehouse?
    • Delta Lake
    • Delta Lake (Demo)
    • Delta Advanced Features (Demo)
    • Continuing with Spark and Data Science
    • Course Summary
    Readings
    • Readings
    • Assignment #4 - Lakehouse
    Practice Exercises
    • Assignment #4 Quiz - Lakehouse
    • Module 4 Quiz

    Instructors

    UC Davis

    Frequently Asked Questions (FAQs)

    1: The Distributed Computing with Spark SQL online course is part of which main programme?

    The main programme is the ‘Learn SQL Basics for Data Science Specialization’.

    2: Are the deadlines mentioned on the Coursera platform flexible?

    Yes, the deadlines can be adjusted on the Coursera platform.

    3: What’s the Distributed Computing with Spark SQL online course’s level?

    The level is intermediate.

    4: Who are the tutors for this Distributed Computing with Spark SQL course?

    There are 2 tutors: Brooke Wenig and Conor Murphy.

    5: Which institution supports the online course on Distributed Computing with Spark SQL?

    UC Davis is the partnering institution.
