Deploying a Model for Inference at Production Scale (NDMIPS-OD)

At scale, machine learning models can interact with millions of users a day. As usage grows, the cost in both money and engineering time can prevent models from reaching their full potential. These challenges inspired the creation of Machine Learning Operations (MLOps).

Learning Objectives

Practice Machine Learning Operations by:

- Deploying neural networks from a variety of frameworks onto a live Triton Server
- Measuring GPU usage and other metrics with Prometheus
- Sending asynchronous requests to maximize throughput

Upon completion, learners will be able to deploy their own machine learning models on a GPU server.
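
To illustrate the objective of sending asynchronous requests to maximize throughput, here is a minimal sketch. It does not use the Triton client library: `fake_infer` is a hypothetical stand-in that simulates a server round trip with a fixed delay, and `max_in_flight` is an assumed tuning parameter. The idea is only that issuing requests concurrently, rather than waiting for each response before sending the next, lets total time approach the latency of a few round trips instead of the sum of all of them.

```python
import asyncio
import time


async def fake_infer(request_id: int, latency_s: float = 0.05) -> dict:
    """Hypothetical stand-in for one round trip to an inference server."""
    await asyncio.sleep(latency_s)  # simulates waiting on the server
    return {"id": request_id, "output": request_id * 2}


async def send_all(n_requests: int, max_in_flight: int = 8) -> list:
    """Issue n_requests concurrently, capping how many are in flight at once."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def bounded(request_id: int) -> dict:
        async with semaphore:
            return await fake_infer(request_id)

    # gather returns results in the order the coroutines were passed in
    return await asyncio.gather(*(bounded(i) for i in range(n_requests)))


def run_demo(n_requests: int = 16) -> tuple:
    """Run the concurrent requests and report wall-clock time."""
    start = time.perf_counter()
    results = asyncio.run(send_all(n_requests))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_demo()
    print(f"{len(results)} responses in {elapsed:.2f}s")
```

Sent one at a time, 16 requests at 50 ms each would take about 0.8 s. With up to 8 in flight, the same work completes in roughly two round trips. Against a real server, the right cap depends on the model, batch settings, and hardware.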

Prerequisites:

Familiarity with at least one Machine Learning framework such as:
- PyTorch
- TensorFlow *
- ONNX
- TensorRT

* Covered in Getting Started with Deep Learning

Familiarity with Docker is recommended but not required.

Tools, libraries, frameworks used: NVIDIA Triton

Schedule

* The listed price does not include VAT, which will be applied on the invoice

4.00 Hours
