Google Cloud Big Data & Machine Learning Fundamentals (GC-GCBDMLF)

This course will introduce you to Google Cloud's big data and machine learning functions. You'll begin with a quick overview of Google Cloud and then dive deeper into its data processing capabilities.


Prerequisites

Roughly one year of experience with one or more of the following:

  • A common query language such as SQL. 
  • Extract, transform, and load activities. 
  • Data modeling. 
  • Machine learning and/or statistics. 
  • Programming in Python.


Objectives

  • Identify the purpose and value of the key Big Data and Machine Learning products in Google Cloud.
  • Use Cloud SQL and Dataproc to migrate existing MySQL and Hadoop/Pig/Spark/Hive workloads to Google Cloud. 
  • Employ BigQuery and Cloud SQL to carry out interactive data analysis. 
  • Choose between different data processing products in Google Cloud. 
  • Create ML models with BigQuery ML, ML APIs, and AutoML.


Audience

  • Data analysts, data scientists, and business analysts who are getting started with Google Cloud.
  • Individuals responsible for designing pipelines and architectures for data processing, creating and maintaining machine learning and statistical models, querying datasets, visualizing query results, and creating reports. 
  • Executives and IT decision makers evaluating Google Cloud for use by data scientists.


Course Outline

The course includes presentations, demonstrations, and hands-on labs.


Module 1: Introducing Google Cloud Platform

  • Google Cloud Platform Fundamentals Overview.
  • Google Cloud Platform Big Data Products.
  • Lab: Sign up for Google Cloud Platform.


Module 2: Compute and Storage Fundamentals

  • CPUs on demand (Compute Engine).
  • A global file system (Cloud Storage).
  • Cloud Shell.
  • Lab: Set up an Ingest-Transform-Publish data processing pipeline.


Module 3: Data Analytics on the Cloud

  • Stepping stones to the cloud.
  • Cloud SQL: your SQL database on the cloud.
  • Lab: Importing data into Cloud SQL and running queries.
  • Spark on Dataproc.
  • Lab: Machine Learning Recommendations with Spark on Dataproc.


Module 4: Scaling Data Analysis

  • Fast random access.
  • Datalab.
  • BigQuery.
  • Lab: Build a Machine Learning Dataset.


Module 5: Machine Learning

  • Machine Learning with TensorFlow.
  • Lab: Carry out ML with TensorFlow.
  • Pre-built models for common needs.
  • Lab: Employ ML APIs.


Module 6: Data Processing Architectures

  • Message-oriented architectures with Pub/Sub.
  • Creating pipelines with Dataflow.
  • Reference architecture for real-time and batch data processing.


Module 7: Summary

  • Why GCP?
  • Where to go from here.
  • Additional Resources.