AI-461
Distributed AI Computing
Master Ray for distributed AI computing, optimizing scalable machine learning, data processing, and model serving in cloud-based AI pipelines.
Details
Ray is an open-source distributed computing framework designed to simplify building and scaling machine learning (ML) and Python applications. Positioned as an AI compute engine, it manages, executes, and optimizes compute across AI workloads, unifying infrastructure under a single, flexible framework that covers data processing, model training, model serving, and more. This course explores distributed computing with Ray in depth. Students will learn to develop, deploy, and optimize distributed systems using Ray, with applications in machine learning, data processing, and reinforcement learning.
What you will learn in this course
- Develop and deploy distributed AI systems using Ray for scalable workloads.
- Optimize AI pipelines for machine learning, data processing, and reinforcement learning with Ray.
- Scale Python applications across multiple nodes using Ray’s distributed computing framework.
- Implement Ray Serve for high-throughput model serving in production environments.
- Enhance performance and cost-efficiency in AI workloads using Ray’s unified framework.
- Apply real-world distributed computing techniques inspired by OpenAI, AWS, and Uber use cases.
- Design robust, scalable distributed AI systems for cloud environments.
Prerequisites
- AI-101 - Modern AI Python Programming