- Welcome to the Course
- Browsing Tables with Hue
- Browsing Tables with SQL Utility Statements
- Browsing HDFS with the Hue File Browser
- Browsing HDFS from the Command Line
- Understanding S3 and Other Cloud Storage Platforms
- Browsing S3 Buckets from the Command Line
Beginner
Online
5 Weeks
Quick facts
Particular | Details |
---|---|
Medium of instruction | English |
Mode of learning | Self study |
Mode of delivery | Video and text based |
Course overview
Ian Cook and Glynn Durham of Cloudera teach the Managing Big Data in Clusters and Cloud Storage online course. The course takes approximately 21 hours to complete and includes a verified certificate of completion. It is taught in English, with subtitles available in nine languages. Managing Big Data in Clusters and Cloud Storage is a five-week online course offered as part of the “Modern Big Data Analysis with SQL Specialization” programme. Offered through Coursera, it is a flexible, beginner-level course that provides practical experience with SQL-based engines such as Apache Hive and Apache Impala.
The highlights
- Completely online program
- Offered by Cloudera
- Approximately 21 hours of coursework
- Shareable and verified certificate
- Medium of instruction in English
- Five-week coursework
- Nine language subtitles
- Beginner difficulty level
- Part of the Modern Big Data Analysis with SQL Specialization
- Instructed by Ian Cook and Glynn Durham
- Flexible deadline coursework
Program offerings
- Graded quizzes
- Practice quizzes
- Reading materials
- Practice exercises
- Video lectures
Course and certificate fees
Managing Big Data in Clusters and Cloud Storage Fee Structure
Particulars | Fee Amount in INR |
Managing Big Data in Clusters and Cloud Storage - Audit course | Free |
Managing Big Data in Clusters and Cloud Storage - 1 month | Rs.4,015/- |
Managing Big Data in Clusters and Cloud Storage - 3 months | Rs.8,031/- |
Managing Big Data in Clusters and Cloud Storage - 6 months | Rs.12,046/- |
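To compare the paid plans in the table above, it can help to work out the effective cost per month. The short sketch below does that arithmetic, assuming each listed fee is the total price for the stated duration (the table does not say so explicitly). The names `fees_inr` and `monthly_cost` are illustrative only.

```python
# Fees copied from the table above, in INR. Keys are plan lengths in months.
# Assumption: each fee is the total for the whole plan, not a per-month price.
fees_inr = {1: 4015, 3: 8031, 6: 12046}


def monthly_cost(fees):
    """Return the effective cost per month for each plan, rounded to 2 decimals."""
    return {months: round(total / months, 2) for months, total in fees.items()}


for months, per_month in monthly_cost(fees_inr).items():
    print(f"{months}-month plan: Rs.{per_month:,.2f} per month")
```

Under that assumption, the longer plans cost less per month than the 1-month plan.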
Eligibility criteria
Education
No prior educational qualification is required to enroll in and complete the coursework for the Managing Big Data in Clusters and Cloud Storage certification.
Certification Qualification Details
Students must be able to complete the necessary coursework, quizzes, and materials to earn the Managing Big Data in Clusters and Cloud Storage certification.
What you will learn
The Managing Big Data in Clusters and Cloud Storage programme covers the following:
- The syllabus focuses on how to handle large datasets, how to load them into clusters, and how to store them in the cloud.
- The candidates will learn how to use different tools to browse existing databases and tables in big data systems.
- The candidates will learn how to use different tools to explore files in distributed big data file systems and cloud storage.
- The candidates will get hands-on practice with Apache Hive and Apache Impala to create and manage big data databases and tables.
- The candidates will be able to describe and choose among different data types and file formats for big data systems.
Admission details
Filling the form
To enroll in the Managing Big Data in Clusters and Cloud Storage online course and earn a verified certificate, follow the steps below.
Step 1: The applicant visits the course website to start an application for the programme.
Step 2: After selecting "Enroll" from the menu, the applicant clicks "Next."
Step 3: The applicant fills out and submits the registration or application form with all the required details.
Step 4: The applicant pays the course fee to complete the enrollment.
The syllabus
Week 1: Orientation to Data in Clusters and Cloud Storage
Videos
Readings
- Review and Preparation
- Instructions for Downloading and Installing the Exercise Environment
- Troubleshooting the VM
Practice Exercise
- Week 1 Graded Quiz
Week 2: Defining Databases, Tables, and Columns
Videos
- Week 2 Introduction
- Introduction to the CREATE TABLE Statement
- Using Different Schemas on the Same Data
- Specifying TBLPROPERTIES
- Examining, Modifying, and Removing Tables
- Hive and Impala Interoperability
- Impala Metadata Refresh
Readings
- Creating Databases and Tables with Hue
- Creating Databases and Tables with SQL
- Permissions to Create Databases and Tables
- The ROW FORMAT Clause
- The STORED AS Clause
- The LOCATION Clause
- CREATE TABLE Shortcuts
- Using Hive SerDes
- Working with Unstructured and Semi-Structured Data
- Examining Table Structure
- Dropping Databases and Tables
- Modifying Existing Tables
Practice Exercises
- Week 2 Practice Quiz
- Week 2 Graded Quiz
Week 3: Data Types and File Types
Videos
- Week 3 Introduction
- Overview of Data Types
- Choosing the Right Data Types
- Overview of File Types
- Choosing the Right File Types
Readings
- Integer Data Types
- Decimal Data Types
- Character String Data Types
- Other Data Types
- Examining Data Types
- Out-of-Range Values
- Text Files
- Avro Files
- Parquet Files
- ORC Files
- Other File Types
- Creating Tables with Avro and Parquet Files
Practice Exercises
- Week 3 Practice Quiz
- Week 3 Graded Quiz
Week 4: Managing Datasets in Clusters and Cloud Storage
Videos
- Week 4 Introduction
- Refresh Impala's Metadata Cache after Loading Data
- Loading Files into HDFS with Hue's Table Browser
- Loading Files into HDFS with Hue's File Browser
- Loading Files into HDFS from the Command Line
- Loading Files into S3 from the Command Line
- Using Hive and Impala to Load Data into Tables
- Conclusion
Readings
- More about HDFS Shell Commands
- Chaining and Scripting with HDFS Commands
- HDFS Permissions
- Other Ways to Load Files into S3
- S3 Permissions
- Missing Values
- Character Sets
- Using Sqoop to Import Data
- More Sqoop Import Options
- Using Sqoop to Export Data
- SQL LOAD DATA Statements
- SQL INSERT Statements
- SQL INSERT ... SELECT and CTAS Statements
Practice Exercises
- Week 4 Practice Quiz
- Week 4 Graded Quiz
Week 5: Optimizing Hive and Impala (Honors)
Videos
- Week 5 Introduction
- What to Do When Queries Are Too Complex
- What to Do When Queries Take Too Long
- When to Use Table Partitioning
- When to Use Complex Columns
- File Systems versus Storage Engines
Readings
- Creating and Querying Views
- Modifying and Removing Views
- Materialized and Non-Materialized Views
- The ORDER BY Clause in Views
- Choosing Which Query Engine to Use
- Understanding Map Tasks and Reduce Tasks
- Hive Query Performance Patterns
- Understanding Execution Plans
- Table and Column Statistics
- Other Strategies for Query Optimization
- Creating Partitioned Tables
- Loading Data with Dynamic Partitioning
- Loading Data with Static Partitioning
- Risks of Using Partitioning
- Complex Data Types
- Creating Tables with Complex Data
- Querying Complex Data with Hive
- Querying Complex Data with Impala
- Complex Data in Practice
- Overview of Apache Kudu
Practice Exercises
- Week 5 Practice Quiz
- Week 5 Graded Quiz
Scholarship Details
Coursera provides financial assistance to students who cannot afford the course fee. Candidates can apply by opening the drop-down menu to the left of the "Enroll" tab and clicking "Financial Aid." After applications have been submitted, approved applicants are notified.
How it helps
The Managing Big Data in Clusters and Cloud Storage certification benefits candidates who are starting at a beginner level, offering flexible coursework in big data and SQL. Candidates will sharpen their skills and learn to run queries through SQL engines. These abilities, backed by hands-on experience with the tools, can help a candidate build a career in big data analytics and SQL.
Ian Cook and Glynn Durham of Cloudera teach the coursework and sign, approve, and authenticate the certification, making it an internationally recognised certificate. With this credential, an applicant can connect with potential employers and recruiters on online professional networking portals such as LinkedIn. It can also help the applicant partner with like-minded colleagues or experts on future projects. Furthermore, the applicant is more likely to be hired in specialised roles that require knowledge of SQL engines and big data implementation.
Instructors
Mr Glynn Durham
Senior Instructor
Cloudera
Other Masters
Mr Ian Cook
Staff Curriculum Developer
Cloudera
FAQs
Yes, candidates who apply for the Managing Big Data in Clusters and Cloud Storage training programme can access it free for one week.
Managing Big Data in Clusters and Cloud Storage is self-paced, so candidates can learn at their own speed without following a rigid schedule.
The system requirements are a 64-bit operating system (Windows, macOS, or Linux), 25 GB of free disk space, 8 GB of RAM or more, AMD-V or Intel VT-x virtualization support, and, on Windows XP, an unzip utility such as 7-Zip or WinZip.
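Two of these requirements, free disk space and a 64-bit machine, can be checked quickly from Python before downloading the exercise environment. The following is a minimal sketch, not an official checker: the function name `check_requirements` is hypothetical, 1 GB is treated as 1024³ bytes, and the list of 64-bit machine names is an assumption limited to x86 systems. It does not check RAM or virtualization support.

```python
import platform
import shutil

MIN_FREE_DISK_GB = 25  # free disk space figure from the requirements above

# Assumption: common machine names reported by 64-bit x86 systems.
X86_64_NAMES = {"x86_64", "amd64"}


def check_requirements(free_disk_bytes, machine):
    """Return a pass/fail flag for each requirement this sketch can check."""
    free_gb = free_disk_bytes / 1024**3
    return {
        "free_disk": free_gb >= MIN_FREE_DISK_GB,
        "64_bit": machine.lower() in X86_64_NAMES,
    }


if __name__ == "__main__":
    # Check the drive that holds the current working directory.
    usage = shutil.disk_usage(".")
    print(check_requirements(usage.free, platform.machine()))
```

Keeping the check as a pure function of its inputs makes it easy to try with example values, such as `check_requirements(30 * 1024**3, "AMD64")`.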
Applicants need to visit the official website to register for the programme and submit the application.
The verified certificate for the Managing Big Data in Clusters and Cloud Storage online course can be added to a candidate's profile, resume, or CV, and shared on social media.
The course is taught only in English, so subtitles in nine languages are provided to support candidates' learning.
The coursework is completed entirely online and takes about 21 hours in total.
The applicant does not require any special credentials to apply for and learn about the Managing Big Data in Clusters and Cloud Storage certification.
Yes. To obtain financial aid for the Managing Big Data in Clusters and Cloud Storage certificate, students must apply through the "Financial Aid" option after choosing the "Enroll" option on the course page.