
Linear Algebra and Optimization for Machine Learning

Charu Aggarwal

The Basic Library List Committee suggests that undergraduate mathematics libraries consider this book for acquisition.

[Reviewed by Brian Borchers]
With the recent growth in undergraduate and graduate degree programs in data science and machine learning, a new niche has developed for courses that cover the mathematics used in data science, including applied linear algebra, vector calculus, optimization, probability, and statistics.  Students coming to data science, particularly from undergraduate programs in computer science, are unlikely to have a strong background in all of these areas.  Furthermore, how this mathematics is used in data science is sometimes quite different from how it is applied in other areas of science and engineering.  Thus a specialized textbook on mathematics for data science may be needed.
Linear Algebra and Optimization for Machine Learning is a textbook that covers applied linear algebra and optimization with a focus on topics of importance to machine learning.  The book uses many applications from machine learning as examples.
Although the coverage of linear algebra begins with a review of basic operations on matrices and vectors, it quickly moves on to more advanced topics that go beyond what is covered in the typical sophomore-level introductory course, including QR factorization, trace inner product and Frobenius norm, the singular value decomposition, and the Laplacian matrix of a graph. Machine learning applications of linear algebra discussed in the book include principal components analysis using the SVD and spectral clustering using the graph Laplacian.
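For readers unfamiliar with the first of these applications, the following is a minimal sketch of principal components analysis computed from the SVD of the centered data matrix. It is not taken from the book; the function name `pca_via_svd`, the synthetic data, and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np

def pca_via_svd(X, k):
    """Return scores, principal directions, and variances for the top k components.

    Hypothetical helper for illustration; rows of X are observations.
    """
    Xc = X - X.mean(axis=0)                      # center each feature (column)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                          # rows are principal directions
    variances = s[:k] ** 2 / (X.shape[0] - 1)    # sample variance along each direction
    scores = Xc @ components.T                   # coordinates in the reduced space
    return scores, components, variances

# Synthetic data with very different spread along each axis (an assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ np.diag([3.0, 1.0, 0.1])
scores, components, variances = pca_via_svd(X, 2)
```

The variances returned here should agree with the two largest eigenvalues of the sample covariance matrix, which is the connection between the SVD and the eigendecomposition view of PCA.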
Optimization topics include matrix calculus and the backpropagation method for computing gradients, gradient descent and stochastic gradient descent, Newton and quasi-Newton methods, and subgradient and proximal gradient methods for nondifferentiable functions.  The optimization methods are applied to machine learning problems in linear regression, logistic regression, support vector machines, and neural networks.
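As a small illustration of the simplest of these methods, here is a sketch of batch gradient descent applied to least-squares linear regression with one feature. It is not drawn from the book; the function name `fit_line_gd`, the step size, the iteration count, and the toy data are all assumptions chosen so that the iteration converges.

```python
def fit_line_gd(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by gradient descent on the mean squared error.

    Hypothetical helper for illustration; lr and steps are assumed values.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = 2.0 / n * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = 2.0 / n * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data lying exactly on the line y = 2x + 1 (an assumption).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line_gd(xs, ys)
```

Stochastic gradient descent would replace the sums over all points with a single randomly chosen point (or a small batch) at each step.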
The book has numerous exercises.  Some exercises are sprinkled throughout the text with the intention that the reader will solve them before reading further.  Additional exercises are included at the end of each chapter.  I was impressed both by the number of exercises and the range from simple hand computations to very challenging problems.
Compare this book with other recent textbooks on mathematics for data science, including Linear Algebra and Learning from Data by Gilbert Strang, Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, and Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares by Stephen Boyd and Lieven Vandenberghe.  Mathematics for Machine Learning and Introduction to Applied Linear Algebra are both written at a somewhat lower level than Aggarwal's book.  Based on the topics covered and the excellent presentation, I would recommend Aggarwal's book over these other books for an advanced undergraduate or beginning graduate course on mathematics for data science.


Brian Borchers is a professor of mathematics at New Mexico Tech and the editor of MAA Reviews.