
- Author: Jeff Calder and Peter J. Olver
- Series: Springer Undergraduate Texts in Mathematics and Technology
- Publisher: Springer
- Publication Date: 08/26/2025
- Number of Pages: 629
- Format: Hardcover
- Price: $79.99
- ISBN: 978-3-031-93763-7
- Category: textbook
[Reviewed by Bill Satzer, on 10/14/2025]
This introduction to modern methods of machine learning and data analysis is a mathematically rigorous and challenging but highly readable account with a remarkably wide scope. It is largely self-contained and accessible at the upper undergraduate level. The authors emphasize the importance of understanding how and why the algorithms work, what they do, and what their limitations are. The primary goal is to introduce the reader to modern machine learning methods and to enable their use on real problems.
Almost a third of the book focuses on aspects of linear algebra; this is significant and not at all surprising given the importance it has in so many aspects of data analysis and machine learning. Optimization, another crucial tool, appears next with practical algorithms for minimizing nonlinear functions. It’s almost halfway through the book before the basics of machine learning are introduced, but by then the reader is well-prepared for the succeeding developments.
After introducing the basics of machine learning and data analysis, the authors describe practical algorithms for the three main types of machine learning (supervised, unsupervised, and semi-supervised), along with a discussion of how and why data is split into training, testing, and validation subsets. This is the core of the book.
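The kind of data splitting discussed here can be sketched in a few lines of NumPy. This is an illustrative sketch on synthetic data, not code from the authors' notebooks; the 60/20/20 proportions are an arbitrary but common choice.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))        # 100 synthetic data points, 3 features
y = rng.integers(0, 2, size=100)     # synthetic binary labels

# Shuffle the indices, then carve out 60% train / 20% validation / 20% test.
idx = rng.permutation(len(X))
n_train, n_val = 60, 20
train = idx[:n_train]
val   = idx[n_train:n_train + n_val]
test  = idx[n_train + n_val:]

X_train, y_train = X[train], y[train]
X_val,   y_val   = X[val],   y[val]
X_test,  y_test  = X[test],  y[test]
```

The point of the shuffle is that the three subsets are disjoint and drawn from the same distribution: the model is fit on the training set, tuned on the validation set, and evaluated once on the held-out test set.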
The authors then consider principal component analysis (PCA), which uses the singular value decomposition to simplify data by looking for linear relationships among the measurements at different data points, and which can enable useful visualization of data. Their intention here is to prove the optimality of PCA for linear approximation of a data set by a low-dimensional affine subspace. Also included is a brief treatment of statistical data analysis. However, throughout the book the authors specifically exclude statistical machine learning and mostly avoid probabilistic interpretations.
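The SVD-based construction behind PCA can be sketched briefly: center the data, compute the SVD, and project onto the leading right singular vectors. This is a minimal illustration on synthetic data, not the authors' code, and the one-dimensional example is an assumption made for clarity.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data lying near a 1-dimensional affine subspace (a line) in R^3.
t = rng.normal(size=(200, 1))
X = t @ np.array([[2.0, -1.0, 0.5]]) + 0.05 * rng.normal(size=(200, 3))

# Center the data, then take the SVD; the rows of Vt are the principal
# directions, ordered by decreasing singular value.
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)

# Rank-1 PCA approximation: project the centered data onto the leading
# principal direction and shift back by the mean.
X1 = mean + (X - mean) @ Vt[:1].T @ Vt[:1]
```

The optimality result the review mentions is exactly what this computes: among all one-dimensional affine subspaces, the one through the mean spanned by the leading singular vector minimizes the total squared distance to the data.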
Two subsequent chapters address graph theory in the context of data science and graph-based learning, and then offer an abbreviated treatment of neural networks and deep learning, addressing some of the questions raised by the very large number of parameters such networks can contain. A final chapter returns to optimization, examining more sophisticated algorithms appropriate for very large-scale problems.
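A standard first step in the graph-based learning the review alludes to is to build a weighted similarity graph on the data and form its Laplacian. The following is a generic sketch of that construction (the Gaussian weights and the bandwidth `eps` are common illustrative choices, not necessarily the book's):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(10, 2))   # 10 synthetic data points in the plane

# Gaussian similarity weights between all pairs of data points.
eps = 1.0
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
W = np.exp(-D2 / eps)
np.fill_diagonal(W, 0.0)       # no self-loops

# Graph Laplacian L = D - W, with D the diagonal degree matrix.
# L is symmetric positive semidefinite; its small eigenvalues and
# eigenvectors drive clustering and semi-supervised learning on the graph.
L = np.diag(W.sum(axis=1)) - W
eigvals = np.linalg.eigvalsh(L)
```

The smallest eigenvalue is zero (with the constant eigenvector) whenever the graph is connected, which is the algebraic fact underlying spectral clustering and Laplacian-based semi-supervised methods.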
Even readers who are comfortable with linear algebra should at least review that section carefully. The authors' approach is concrete rather than abstract: it works with real vector spaces and emphasizes elements that support robust computation, especially solving systems of linear equations.
Computational work associated with the text requires only the power of a good laptop computer. Very good, easily usable software is provided by the authors. It uses the Python programming language, and it is accessible to anyone with even basic programming skills. Python notebooks appear throughout the text, and an introduction to Python as well as a considerable amount of Python code is publicly available through a GitHub repository.
Every section of the book has exercises. These often begin with relatively easy questions that can be solved by hand without computational tools, but progress to practical results of interest, theoretical results, and sometimes proofs that were not included in the text. Larger computational problems are also offered, ones that can draw on the software in the accompanying collection of Python notebooks. Solutions are provided in two sets: one for students, a second for instructors.
Bill Satzer (bsatzer@gmail.com), now retired from 3M Company, spent most of his career as a mathematician working in industry in diverse applications. He did his PhD work in dynamical systems and celestial mechanics.