Fundamentals of Accelerated Data Science with RAPIDS (NFADSWR)

Whether you work at a software company that needs to improve customer retention, a financial services company that needs to mitigate risk, or a retail company interested in predicting customer purchasing behavior, your organization is tasked with preparing, managing, and gleaning insights from large volumes of data without wasting critical resources. Traditional CPU-driven data science workflows can be cumbersome, but with the power of GPUs, your teams can make sense of data quickly to drive business decisions.


In this workshop, you’ll learn how to build and execute end-to-end GPU-accelerated data science workflows that enable you to quickly explore, iterate, and get your work into production. Using the RAPIDS™-accelerated data science libraries, you’ll apply a wide variety of GPU-accelerated machine learning algorithms, including XGBoost, cuGraph’s single-source shortest path, and cuML’s KNN, DBSCAN, and logistic regression, to perform data analysis at scale.


Learning Objectives

By participating in this workshop, you’ll:

  • Implement GPU-accelerated data preparation and feature extraction using cuDF and Apache Arrow data frames
  • Apply a broad spectrum of GPU-accelerated machine learning tasks using XGBoost and a variety of cuML algorithms
  • Execute GPU-accelerated graph analysis with cuGraph routines, completing massive-scale graph analytics quickly


Prerequisites:

Experience with Python, ideally including pandas and NumPy

Suggested resources to satisfy prerequisites: Kaggle's pandas Tutorials, Kaggle's Intro to Machine Learning, Accelerating Data Science Workflows with RAPIDS


Technologies:

RAPIDS, cuDF, XGBoost, cuML, cuGraph, Dask, CuPy, pandas, NumPy, Bokeh

Certificate: Upon successful completion of the assessment, participants will receive an NVIDIA DLI certificate to recognize their subject matter competency and support professional career growth.


Hardware Requirements:

Desktop or laptop computer capable of running the latest version of Chrome or Firefox. Each participant will be provided with dedicated access to a fully configured, GPU-accelerated server in the cloud.


Workshop Outline

Introduction

  • Meet the instructor.
  • Create an account at courses.nvidia.com/join


GPU-Accelerated Data Manipulation

Ingest and prepare several datasets (some larger-than-memory) for use in multiple machine learning exercises later in the workshop:

  • Read data directly to single and multiple GPUs with cuDF and Dask cuDF.
  • Prepare population, road network, and clinic information for machine learning tasks on the GPU with cuDF.
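As a rough illustration of the kind of data preparation described above, the following minimal sketch filters and aggregates a small table with a pandas-style API. It is not taken from the workshop materials: the column names, the sample records, and the to_host helper are hypothetical, and the fallback to pandas is an assumption added so the sketch also runs on a machine without a GPU.

```python
try:
    import cudf as xdf  # GPU DataFrames (needs an NVIDIA GPU and a RAPIDS install)
except ImportError:
    import pandas as xdf  # assumed CPU fallback so the sketch runs anywhere


def to_host(obj):
    """Return a pandas object whether obj lives on the GPU (cuDF) or the CPU."""
    return obj.to_pandas() if hasattr(obj, "to_pandas") else obj


# Hypothetical population records; the workshop datasets are far larger.
people = xdf.DataFrame({
    "county": ["Kent", "Kent", "Devon", "Devon", "Devon"],
    "age": [34, 71, 12, 45, 68],
})

# Boolean-mask filtering and groupby use the same syntax in cuDF and pandas.
seniors = people[people["age"] >= 65]
seniors_per_county = to_host(seniors.groupby("county")["age"].count()).to_dict()

print(seniors_per_county)
```

Because cuDF mirrors much of the pandas API, the filtering and grouping lines are identical in both cases; only the import decides whether the work runs on the GPU.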


GPU-Accelerated Machine Learning

Apply several essential machine learning techniques to the data that was prepared in the first section:

  • Use supervised and unsupervised GPU-accelerated algorithms with cuML.
  • Train XGBoost models with Dask on multiple GPUs.
  • Create and analyze graph data on the GPU with cuGraph.
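To make the graph portion more concrete, here is a small CPU-only reference sketch of what a single-source shortest path computation returns. It uses Dijkstra's algorithm from the Python standard library and is not cuGraph's API or implementation; the function name, the road segments, and the distances are hypothetical values chosen for illustration.

```python
import heapq


def single_source_shortest_path(edges, source):
    """Dijkstra over an undirected weighted edge list; returns {vertex: distance}."""
    adjacency = {}
    for src, dst, weight in edges:
        adjacency.setdefault(src, []).append((dst, weight))
        adjacency.setdefault(dst, []).append((src, weight))

    distances = {source: 0.0}
    frontier = [(0.0, source)]  # min-heap of (distance so far, vertex)
    while frontier:
        dist, vertex = heapq.heappop(frontier)
        if dist > distances.get(vertex, float("inf")):
            continue  # stale heap entry; a shorter route was already found
        for neighbor, weight in adjacency.get(vertex, []):
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(frontier, (candidate, neighbor))
    return distances


# Hypothetical road segments: (junction_a, junction_b, length_km)
roads = [("A", "B", 4.0), ("A", "C", 1.0), ("C", "B", 2.0), ("B", "D", 5.0)]
distances = single_source_shortest_path(roads, "A")
print(distances)
```

A GPU graph library performs the same kind of computation over edge lists with many millions of rows, which is where acceleration matters.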


Project: Data Analysis to Save the UK

Apply new GPU-accelerated data manipulation and analysis skills with population-scale data to help stave off a simulated epidemic affecting the entire UK population:

  • Use RAPIDS to integrate multiple massive datasets and perform real-world analysis.
  • Pivot and iterate on your analysis as the simulated epidemic provides new data for each simulated day.