Machine learning (ML) has become one of the most sought-after skills in the modern tech landscape, driving innovation in AI, data science, finance, healthcare, robotics, and beyond. For learners who want to go beyond surface-level tutorials, the Applied Machine Learning (CS 5785) course from Cornell Tech stands out as a comprehensive, rigorous, and practical resource.

This course not only introduces the mathematical foundations of machine learning but also ensures learners get hands-on experience by coding algorithms in Python with NumPy and Scikit-Learn. With 23 lectures, detailed Jupyter notebooks, slides, and 30+ hours of YouTube videos, it offers a complete learning roadmap for anyone interested in ML.
Below is a lecture-by-lecture overview of the course to help you understand its scope and value.
Lecture 1: Introduction to Machine Learning
The course begins with an overview of machine learning as a field of study that enables computers to learn without being explicitly programmed. Learners are introduced to the three main approaches:
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
This lecture sets the foundation by discussing real-world applications and outlining what will be covered throughout the course.
Lecture 2: Anatomy of Supervised Machine Learning
This lecture breaks down the components of supervised learning: datasets, learning algorithms, and predictive models. A running example of predicting diabetes risk from BMI illustrates how supervised algorithms work. It also introduces optimization and the Scikit-Learn library, one of the most popular Python ML packages.
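To make that anatomy concrete, here is a minimal sketch in the spirit of the lecture's running example. It uses scikit-learn's bundled diabetes dataset as a stand-in (not necessarily the course's exact data); its third column is a standardized BMI feature.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Dataset: scikit-learn's bundled diabetes data; column 2 is standardized BMI
X, y = load_diabetes(return_X_y=True)
bmi = X[:, [2]]  # keep the 2-D (n_samples, 1) shape scikit-learn expects

# Learning algorithm + dataset -> predictive model
model = LinearRegression().fit(bmi, y)
print(model.predict(bmi[:3]))  # predicted disease progression for three patients
```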
Lecture 3: Optimization and Linear Regression
Here, learners are introduced to ordinary least squares (OLS), the first supervised learning algorithm in the course; a short NumPy sketch follows the list below. Topics include:
- Gradient descent optimization
- Normal equations
- Polynomial feature expansion
- Extensions of linear regression
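For instance, both the normal equations and gradient descent can fit the same linear model; in this sketch the synthetic data and step size are my own choices, not the course's:

```python
import numpy as np

# Toy data: y = 2x + 1 plus noise
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=50)
y = 2 * x + 1 + 0.1 * rng.normal(size=50)

X = np.column_stack([np.ones_like(x), x])  # design matrix with a bias column

# Normal equations: solve (X^T X) theta = X^T y in closed form
theta_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent on the mean squared error
theta = np.zeros(2)
for _ in range(500):
    theta -= 0.1 * (2 / len(y)) * X.T @ (X @ theta - y)

print(theta_closed, theta)  # both should land near [1, 2]
```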
Lecture 4: Foundations of Supervised Learning
This lecture covers the mathematical underpinnings of supervised learning, including:
- Data distributions
- Hypothesis classes
- Bayes optimality
- Overfitting vs. underfitting
- Regularization techniques
It sets the stage for understanding classification models.
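To make the overfitting-versus-regularization contrast concrete before moving on, here is a hedged sketch (the toy data, polynomial degree, and penalty strength are my own choices) that fits a high-degree polynomial with and without an L2 penalty:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 1, size=20)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + 0.2 * rng.normal(size=20)

# A degree-15 polynomial easily overfits 20 noisy points;
# ridge regression's L2 penalty shrinks coefficients and smooths the fit.
overfit = make_pipeline(PolynomialFeatures(15), LinearRegression()).fit(X, y)
ridge = make_pipeline(PolynomialFeatures(15), Ridge(alpha=1e-3)).fit(X, y)

# Training R^2 alone flatters the overfit model; held-out data reveals the difference
print(overfit.score(X, y), ridge.score(X, y))
```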
Lecture 5: Maximum Likelihood Learning
The focus shifts to maximum likelihood estimation (MLE), Bayesian ML, and maximum a posteriori (MAP) learning. Students also learn about common failure modes in supervised learning and how regularization can address them.
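As a tiny illustration of MLE (my own example, not from the lecture): for i.i.d. Gaussian data, maximizing the log-likelihood in closed form yields the sample mean and the biased sample variance:

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(loc=5.0, scale=2.0, size=10_000)

mu_mle = data.mean()                     # MLE of the mean = sample mean
var_mle = ((data - mu_mle) ** 2).mean()  # MLE of the variance (divides by n, not n - 1)

print(mu_mle, var_mle)  # approximately 5.0 and 4.0
```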
Lecture 6: Classification Algorithms
This lecture dives into essential classification models:
- K-Nearest Neighbors (KNN)
- Logistic Regression
- Softmax Regression
It emphasizes how these algorithms are applied in tasks like text classification.
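As a quick sketch (using the Iris dataset as a stand-in; the course's examples may differ), both classifiers take only a few lines in scikit-learn, and with its default lbfgs solver LogisticRegression handles the three classes with a multinomial (softmax) formulation:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# KNN: classify by majority vote among the 5 nearest training points
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)

# Multiclass logistic regression defaults to a multinomial (softmax) model
logreg = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

print(knn.score(X_te, y_te), logreg.score(X_te, y_te))
```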
Lecture 7: Generative Algorithms
Applied Machine Learning – CS 5785 introduces generative models, particularly Gaussian Discriminant Analysis (GDA). The differences between generative and discriminative models are explained through intuitive classification problems.
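scikit-learn has no estimator literally named "GDA," but the same model family is available: a shared covariance matrix gives LinearDiscriminantAnalysis, and a per-class covariance gives QuadraticDiscriminantAnalysis. A minimal sketch (Iris again, as a stand-in):

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)

X, y = load_iris(return_X_y=True)

# Both fit a Gaussian per class and classify with Bayes' rule
lda = LinearDiscriminantAnalysis().fit(X, y)     # shared covariance
qda = QuadraticDiscriminantAnalysis().fit(X, y)  # per-class covariance

print(lda.score(X, y), qda.score(X, y))
```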
Lecture 8: Naive Bayes
This lecture explores Naive Bayes, a simple yet effective generative model often used in text classification and spam detection. Concepts like the Bag-of-Words model are also introduced.
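A minimal spam-filter sketch (the four-document corpus is invented purely for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["win cash now", "meeting at noon", "cheap pills now", "lunch tomorrow?"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham

# Bag-of-words: each document becomes a vector of word counts
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

model = MultinomialNB().fit(X, labels)
print(model.predict(vectorizer.transform(["win cheap cash"])))  # likely [1]
```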
Lecture 9: Support Vector Machines (SVMs)
SVMs are covered in detail, including:
- Margins and max-margin classifiers
- Hinge loss
- Sub-gradient descent
SVMs are shown as powerful classifiers that work well in high-dimensional spaces.
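To tie those pieces together, here is a simplified NumPy sketch of the hinge loss and sub-gradient descent (my own toy version: no bias term and no regularization, unlike a full SVM):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)  # labels in {-1, +1}

w = np.zeros(2)
for _ in range(100):
    margins = y * (X @ w)
    # Sub-gradient of the hinge loss: -y_i * x_i for points inside the margin, else 0
    active = (margins < 1).astype(float)
    w -= 0.1 * (-(active * y) @ X / len(y))

mean_hinge = np.maximum(0, 1 - y * (X @ w)).mean()
print(w, mean_hinge)  # w should point roughly along (1, 1); mean loss should be small
```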
Lecture 10: Dual Formulation of SVMs
Here, the mathematical depth increases with topics like:
- Lagrange duality
- Dual formulations of SVM
- Sequential Minimal Optimization (SMO) algorithm
Lecture 11: Kernels
Learners explore kernel methods, an essential concept in ML; a short sketch follows the list below. Topics include:
- Mercer’s theorem
- Radial Basis Function (RBF) kernels
- The “kernel trick,” which lets algorithms operate in high-dimensional feature spaces without ever computing the feature transformation explicitly
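For example, the RBF kernel assigns every pair of points a similarity exp(-gamma * ||x - x'||^2), and a kernelized SVM only ever needs these pairwise values, never an explicit feature map. A sketch on toy data with a circular decision boundary (the data and gamma are my choices):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1).astype(int)  # not linearly separable

# Pairwise similarities: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
K = rbf_kernel(X, X, gamma=1.0)

# An RBF-kernel SVM fits the circular boundary without explicit feature expansion
clf = SVC(kernel="rbf", gamma=1.0).fit(X, y)
print(K.shape, clf.score(X, y))
```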
Lecture 12: Decision Trees
This lecture introduces decision trees and techniques to improve their performance (a short example follows the list), including:
- Classification and Regression Trees (CART)
- Bagging
- Random forests
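A short comparison of a single tree against a random forest (the dataset and hyperparameters are my own choices):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

# A random forest bags many decorrelated trees, typically reducing variance
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

print(tree.score(X_te, y_te), forest.score(X_te, y_te))
```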
Lecture 13: Boosting
Boosting is introduced as a method that combines many weak learners into a strong ensemble (a short example follows the list):
- AdaBoost
- Gradient Boosting
- Additive models
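Both flavors of boosting are one-liners in scikit-learn; this sketch (the dataset and settings are mine) cross-validates each:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Both build an additive model out of weak learners (shallow trees)
ada = AdaBoostClassifier(n_estimators=100, random_state=0)
gbm = GradientBoostingClassifier(n_estimators=100, random_state=0)

print(cross_val_score(ada, X, y, cv=5).mean())
print(cross_val_score(gbm, X, y, cv=5).mean())
```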
Lecture 14: Neural Networks
This lecture covers the basics of artificial neural networks (a minimal example follows the list), including:
- Perceptrons
- Multilayer networks
- Backpropagation
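As a minimal example (using scikit-learn's MLP rather than a from-scratch network; the architecture is my choice), a one-hidden-layer network trained by backpropagation:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)  # 8x8 digit images, flattened to 64 features
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# One hidden layer of 64 units; weights are learned via backpropagation
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
mlp.fit(X_tr, y_tr)
print(mlp.score(X_te, y_te))
```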
Lecture 15: Deep Learning
A follow-up to neural networks, this lecture introduces deep learning with a focus on:
- Convolutional Neural Networks (CNNs)
- Practical applications in computer vision and beyond
Lecture 16: Introduction to Unsupervised Learning
Learners explore the foundations of unsupervised learning, including terminology, practical examples, and real-world applications where labels are not available.
Lecture 17: Density Estimation
This lecture discusses density estimation and probabilistic models (a short sketch follows the list), including:
- Kernel density estimation
- K-nearest neighbors for density estimation
- Latent variable models
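A small kernel density estimation sketch (the bimodal toy data and the bandwidth are my choices):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(5)
# Bimodal 1-D data: two Gaussian bumps centered at -2 and +2
data = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(2, 0.5, 300)]).reshape(-1, 1)

# KDE places a small Gaussian kernel on every data point and sums them
kde = KernelDensity(kernel="gaussian", bandwidth=0.3).fit(data)

grid = np.array([[-4.0], [-2.0], [0.0], [2.0], [4.0]])
print(np.exp(kde.score_samples(grid)))  # density should peak near -2 and +2
```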
Lecture 18: Clustering
Clustering is one of the most widely used unsupervised methods; a short sketch follows the list below. Topics include:
- K-means clustering
- Gaussian mixture models
- Expectation-Maximization (EM)
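A quick sketch contrasting the two approaches (synthetic blobs; the cluster counts are given rather than learned):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# K-means: hard assignment of each point to the nearest centroid
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Gaussian mixture fit by EM: soft, probabilistic assignments
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)

print(kmeans.cluster_centers_)
print(gmm.predict_proba(X[:3]).round(2))
```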
Lecture 19: Dimensionality Reduction
This lecture focuses on reducing the complexity of datasets (a PCA example follows the list):
- Principal Component Analysis (PCA)
- Independent Component Analysis (ICA)
- Practical implications for visualization and computational efficiency
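For instance, PCA can compress scikit-learn's 64-dimensional digit images down to two coordinates for plotting (the dataset choice is mine):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)  # 1797 images, 64 features each

# Project onto the top two principal components, e.g. for a 2-D scatter plot
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)                     # (1797, 2)
print(pca.explained_variance_ratio_)  # variance captured by each component
```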
Lecture 20: Evaluating Machine Learning Models
Evaluation is key to ML success; a cross-validation sketch follows the list. This lecture covers:
- Dataset splits
- Cross-validation techniques
- Performance measures for regression and classification
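A minimal 5-fold cross-validation sketch (the model and dataset are my own stand-ins):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# 5-fold CV: train on four folds, score on the held-out fold, rotate, average
scores = cross_val_score(LogisticRegression(max_iter=5000), X, y, cv=5)
print(scores.mean(), scores.std())
```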
Lecture 21: Model Iteration and Improvement
This lecture focuses on improving models through:
- Error diagnosis
- Bias/variance tradeoff
- Baselines and learning curves
Lecture 22: Tools for Diagnosing Model Performance
Building on the previous lecture, this session explores diagnostic tools (a learning-curve sketch follows the list), such as:
- Error analysis
- Data integrity checks
- Learning and validation curves
- Comparing to human-level performance
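Here is one way to compute a learning curve with scikit-learn (the model and dataset are illustrative stand-ins):

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = load_breast_cancer(return_X_y=True)

sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=5000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5,
)

# A large, persistent train/validation gap suggests high variance (overfitting);
# two low, converged curves suggest high bias (underfitting).
for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:4d}  train={tr:.3f}  val={va:.3f}")
```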
Lecture 23: Course Overview
The final lecture wraps up with key ML concepts such as:
- Bias/variance tradeoff
- Empirical risk minimization
- Basics of learning theory
Although a video for this final lecture isn’t available, the slides and notes provide a strong summary of the course.
Why Applied Machine Learning – CS 5785 Is a Game-Changer for ML Learners
The Applied Machine Learning – CS 5785 course from Cornell Tech is unique because it:
- Balances mathematics and coding
- Uses real-world Python implementations
- Covers a wide range of ML algorithms
- Provides free and open access to notes, slides, and videos
With this curriculum, learners don’t just memorize algorithms—they gain the ability to analyze, implement, and improve them.
Conclusion
Machine learning is one of the most transformative fields today, and Cornell Tech’s Applied Machine Learning – CS 5785 provides one of the most complete and accessible paths to mastering it. Whether you’re a beginner looking for a structured introduction or an advanced learner seeking mathematical rigor, this course equips you with the skills to succeed in data science, AI engineering, and research.
Ready to start learning? The course’s notes, slides, and lecture videos are freely available online.
Related Reads
- NLP Text Preprocessing Cheatsheet 2025: The Ultimate Powerful Guide
- Plotly Cheatsheet 2025: Powerful Techniques from Beginner to Advanced
- Matplotlib Cheatsheet 2025: From Beginner to Advanced
- Scikit-learn Cheatsheet 2025: From Beginner to Advanced
- NumPy Cheatsheet 2025: From Basics to Advanced in One Guide