When most people think of learning machine learning (ML), they imagine coding in Python, experimenting with TensorFlow or PyTorch, and building neural networks. While hands-on coding is essential, it only scratches the surface. True mastery of machine learning comes from understanding the mathematics and theoretical concepts that power these algorithms. Without that foundation, models may work, but the “why” behind their behavior often remains a mystery.

That’s where Michael U. Gutmann’s Pen and Paper Exercises in Machine Learning shines. This open-source collection is freely available on GitHub and as a compiled PDF on arXiv. It provides a comprehensive set of exercises with detailed, step-by-step solutions designed to strengthen mathematical intuition and problem-solving skills in ML.
In this blog, we’ll explore why this resource is so valuable, what topics it covers, and how you can use it to accelerate your journey into the world of machine learning.
Why Pen-and-Paper Exercises Matter in Machine Learning
In most ML courses and textbooks, exercises are either too simplistic or come without proper solutions. This leaves many learners frustrated, skipping over the very problems that could solidify their understanding. Gutmann addresses this gap by offering fully worked-out solutions, ensuring that learners can follow each step and truly grasp the concepts.
Here are a few reasons why pen-and-paper practice is essential in ML learning:
- Strengthens Mathematical Foundations – Concepts like eigenvalue decomposition, probability distributions and optimisation techniques are better understood when worked out manually.
- Bridges Theory and Practice – By deriving results by hand, you gain insight into why algorithms behave as they do, which helps when implementing them in code.
- Prepares for Advanced Study – Graduate-level ML, probabilistic modeling and research often demand a deep command of the math behind the models.
- Boosts Interview Readiness – Many AI/ML interviews involve theoretical questions. Being comfortable with proofs and derivations can give you an edge.
Gutmann, in the preface, emphasizes that while coding is crucial, pen-and-paper work builds intuition and confidence that code alone cannot provide.
What Topics Does the Collection Cover?
The PDF spans ten chapters, each focusing on a critical area of machine learning theory. Here’s an overview:
1. Linear Algebra
Covers orthogonalisation, eigenvalue decomposition, determinants, and the power method – key tools for dimensionality reduction and PCA.
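To get a feel for the material, here is a minimal NumPy sketch of the power method for finding a dominant eigenvalue; the example matrix, tolerance, and iteration budget are illustrative choices of ours, not taken from the book:

```python
import numpy as np

def power_method(A, num_iters=1000, tol=1e-10):
    """Estimate the dominant eigenvalue/eigenvector of a square matrix A."""
    rng = np.random.default_rng(0)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    eigval = 0.0
    for _ in range(num_iters):
        y = A @ x                      # apply the matrix
        x = y / np.linalg.norm(y)      # renormalise to avoid overflow
        new_eigval = x @ A @ x         # Rayleigh quotient estimate
        if abs(new_eigval - eigval) < tol:
            return new_eigval, x
        eigval = new_eigval
    return eigval, x

A = np.array([[2.0, 1.0], [1.0, 3.0]])    # small symmetric example
val, vec = power_method(A)
print(val)                    # ≈ 3.618, the largest eigenvalue of A
print(np.linalg.eigvalsh(A))  # NumPy's full eigendecomposition, for comparison
```

Variants of this iteration underpin scalable PCA routines, where a full eigendecomposition would be too expensive.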
2. Optimisation
Introduces gradients, Newton’s method, matrix calculus and log-determinant derivatives. These are core to training ML models efficiently.
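For a taste of the optimisation chapter, here is a small sketch of Newton’s method minimising a one-dimensional function; the function and starting point are our own illustrative choices, not an exercise from the book:

```python
def newton_minimise(grad, hess, x0, num_iters=20, tol=1e-10):
    """Newton's method: iterate x <- x - f'(x) / f''(x) to find a stationary point."""
    x = x0
    for _ in range(num_iters):
        step = grad(x) / hess(x)   # Newton step from a local quadratic model
        x -= step
        if abs(step) < tol:
            break
    return x

# Minimise f(x) = x^4 - 3x^2 + 2, starting from x0 = 2.
grad = lambda x: 4 * x**3 - 6 * x   # f'(x)
hess = lambda x: 12 * x**2 - 6      # f''(x)
print(newton_minimise(grad, hess, x0=2.0))  # converges to sqrt(3/2) ≈ 1.2247
```

Near a minimum with positive curvature, each Newton step roughly doubles the number of correct digits, which is why Newton-type updates show up throughout ML optimisation.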
3. Directed Graphical Models
Explains concepts like d-separation, Markov properties, and hidden Markov models (HMMs), which are vital for sequence modeling.
4. Undirected Graphical Models
Focuses on Gibbs distributions, Markov blankets and Restricted Boltzmann Machines (RBMs).
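To make the RBM’s conditional structure concrete, below is a minimal sketch of one block-Gibbs sweep for a tiny binary RBM; the layer sizes and random parameters are illustrative assumptions of ours, not values from the book:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

# Tiny binary RBM: 4 visible units, 3 hidden units, small random weights.
W = 0.1 * rng.standard_normal((4, 3))   # visible-hidden weights
b = np.zeros(4)                         # visible biases
c = np.zeros(3)                         # hidden biases

v = rng.integers(0, 2, size=4).astype(float)   # a random visible configuration

# Because the RBM graph is bipartite, the hidden units are conditionally
# independent given the visibles (and vice versa), so each layer can be
# sampled in a single block:
p_h = sigmoid(c + v @ W)                 # p(h_j = 1 | v)
h = (rng.random(3) < p_h).astype(float)
p_v = sigmoid(b + W @ h)                 # p(v_i = 1 | h)
v = (rng.random(4) < p_v).astype(float)
print(p_h, p_v)
```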
5. Expressive Power of Graphical Models
Discusses I-equivalence, minimal I-maps, moralisation and triangulation, helping learners understand the representational limits of different model classes.
6. Factor Graphs and Message Passing
Explores sum-product and max-sum algorithms as well as elimination order strategies for efficient inference.
7. Inference for Hidden Markov Models
Covers predictive distributions, the Viterbi algorithm, forward–backward sampling and Kalman filtering.
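As a concrete example of this chapter’s themes, here is a compact sketch of the Viterbi algorithm in log space; the toy two-state HMM at the end is our own illustrative construction, not one of Gutmann’s exercises:

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most probable hidden-state path of an HMM, computed in log space.

    log_pi: (S,)  initial log-probabilities
    log_A:  (S,S) transition log-probs, log_A[i, j] = log p(z_t = j | z_{t-1} = i)
    log_B:  (S,O) emission log-probs,   log_B[j, o] = log p(x_t = o | z_t = j)
    obs:    list of observation indices
    """
    S, T = len(log_pi), len(obs)
    delta = np.zeros((T, S))           # best log-prob of a path ending in each state
    psi = np.zeros((T, S), dtype=int)  # back-pointers to the best predecessor
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[i, j]: come from i, go to j
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = np.max(scores, axis=0) + log_B[:, obs[t]]
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):      # follow back-pointers to recover the path
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

# Toy two-state HMM with two observation symbols (all numbers illustrative).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
print(viterbi(np.log(pi), np.log(A), np.log(B), obs=[0, 0, 1, 1]))
```

Working in log space avoids the numerical underflow that plagues naive products of probabilities over long sequences.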
8. Model-Based Learning
Includes maximum likelihood estimation (MLE), Bayesian inference, factor analysis, independent component analysis (ICA) and unnormalised models.
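As a small illustration of maximum likelihood estimation, the sketch below recovers the closed-form MLE of a univariate Gaussian from synthetic data; the true parameters (2.0 and 1.5) are arbitrary choices of ours:

```python
import numpy as np

# Maximising the Gaussian log-likelihood in mu and sigma^2 gives closed-form
# estimators: the sample mean and the (biased, divide-by-n) sample variance.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)   # synthetic data

mu_hat = x.mean()                       # argmax of the log-likelihood in mu
var_hat = ((x - mu_hat) ** 2).mean()    # note: divides by n, not n - 1
print(mu_hat, np.sqrt(var_hat))         # close to the true values (2.0, 1.5)
```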
9. Sampling and Monte Carlo Integration
Introduces importance sampling, rejection sampling, MCMC methods and Bayesian Poisson regression.
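To show the core idea behind importance sampling, here is a self-contained sketch that estimates E[x²] under a standard normal using samples from a wider proposal; both distributions are our own illustrative choices:

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(0.0, 2.0, size=n)    # samples from the proposal q = N(0, 2^2)
w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)   # weights p(x) / q(x)
print(np.mean(w * x**2))            # ≈ 1.0 = E[x^2] under the target p = N(0, 1)
```

The wider proposal keeps the weights well behaved; a proposal with lighter tails than the target can give the estimator unbounded variance.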
10. Variational Inference
Explains mean-field methods and variational posterior approximations, techniques widely used in modern Bayesian deep learning.
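To hint at what mean-field inference looks like in practice, here is a minimal sketch of coordinate-ascent mean field applied to a correlated two-dimensional Gaussian, a classic textbook example (see, e.g., Bishop’s PRML, Section 10.1); the precision matrix and initial values below are illustrative:

```python
import numpy as np

# Target: p(x) = N(mu, inv(L)) with precision matrix L, approximated by a
# fully factorised q(x1) q(x2). Coordinate ascent on the ELBO gives the
# updates below (each q factor is Gaussian with variance 1 / L[i, i]).
mu = np.array([0.0, 0.0])
L = np.array([[2.0, 1.2], [1.2, 2.0]])   # positive-definite precision matrix

m = np.array([1.0, -1.0])                # initial mean-field means
for _ in range(50):
    m[0] = mu[0] - (L[0, 1] / L[0, 0]) * (m[1] - mu[1])
    m[1] = mu[1] - (L[1, 0] / L[1, 1]) * (m[0] - mu[0])

print(m)                          # converges to the true mean (0, 0)
print(1 / L[0, 0], 1 / L[1, 1])   # mean-field variances (0.5 each), smaller than
                                  # the true marginal variances from inv(L) (~0.78)
```

The factorised approximation recovers the correct means but under-estimates the marginal variances, a well-known limitation of mean-field methods.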
Each exercise is followed by a detailed solution, making the resource both a problem book and a guided tutorial.
How to Use the Resource
The collection is flexible and can be tailored for different needs:
- As a Student: Use it to practice ML theory alongside textbooks or online courses.
- As a Researcher: Reference it to strengthen your grasp on inference methods and probabilistic modeling.
- As a Job Seeker: Work through the exercises to sharpen your problem-solving skills before interviews.
Access here:
- GitHub Repository: https://github.com/michaelgutmann/ml-pen-and-paper-exercises
- Compiled PDF: https://arxiv.org/abs/2206.13446
Who Should Use This Resource?
This collection is designed for anyone serious about building a strong ML foundation:
- Undergraduate and graduate students preparing for advanced ML courses.
- Self-learners who want to go beyond coding tutorials.
- AI researchers working on probabilistic modeling, Bayesian inference or generative models.
- Professionals and interviewees looking to refresh theoretical knowledge for career growth.
Final Thoughts
Machine learning is a rapidly evolving field, but the mathematical foundations remain timeless. Michael U. Gutmann’s Pen and Paper Exercises in Machine Learning offers a rare opportunity to practice these foundations with clarity and rigor.
Unlike many resources, this isn’t just about definitions and theorems—it’s about active learning through practice. Whether you’re aiming for academic research, a career in AI, or simply want to deepen your understanding, this free resource deserves a place in your study plan.
Start today by downloading the compiled PDF on arXiv or exploring the GitHub repository.
References
- Michael U. Gutmann, Pen and Paper Exercises in Machine Learning
  - GitHub: https://github.com/michaelgutmann/ml-pen-and-paper-exercises
  - arXiv PDF: https://arxiv.org/abs/2206.13446
- Introduction to Machine Learning (6.036) – MIT OpenCourseWare: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-036-introduction-to-machine-learning-fall-2020/
- Mathematics for Machine Learning – Deisenroth, Faisal, and Ong (Cambridge University Press): https://mml-book.github.io/
- Probabilistic Graphical Models – Daphne Koller, Stanford University (Coursera): https://www.coursera.org/learn/probabilistic-graphical-models
- Pattern Recognition and Machine Learning – Christopher M. Bishop: https://www.microsoft.com/en-us/research/people/cmbishop/prml-book/
- Deep Learning – Ian Goodfellow, Yoshua Bengio, and Aaron Courville: https://www.deeplearningbook.org/
- Probabilistic Machine Learning – Kevin Murphy: https://probml.github.io/pml-book/
- Mathematics for Machine Learning Specialization – Imperial College London (Coursera): https://www.coursera.org/specializations/mathematics-machine-learning
- CS229: Machine Learning – Stanford University (lecture notes): https://cs229.stanford.edu/