Topological Data Analysis with Applications

Gunnar Carlsson and Mikael Vejdemo-Johansson
Cambridge University Press
Publication Date: 
Number of Pages: 
[Reviewed by Liz Munch]
Topological Data Analysis with Applications provides an introduction to the burgeoning field of topological data analysis (TDA), which encompasses the work to bring tools from algebraic topology to data analysis. This field of research now combines ideas from many disparate directions: algebraic topology of course, but also statistics, algorithms, graph theory, machine learning, category theory, and linear algebra, to name a few.
Thus the most difficult aspect of working with any introductory text in such an interdisciplinary field is determining what background is required of the reader. In the introduction, the authors state that the intended audience consists of both data scientists and topologists. To that end, Chapter 2 provides an overview of common notions in data science for the topologist reader, and Chapter 3 gives an overview of common notions of topology for the data science reader. While neither chapter will give its target audience a complete understanding of the other field, each covers enough of the basic ideas to get the reader started, and there are plentiful pointers to resources for more information. In particular, I appreciated the careful treatment of possible options for input data. So many TDA textbooks simply start from the viewpoint that the input data is a space with a real-valued function; here Chapter 2 gives a clear view of how different forms of data can be transformed into inputs amenable to TDA computations.
Chapter 4 brings in some of the more modern ideas from TDA. It starts with a discussion of clustering from a topological viewpoint. Standard simplicial complex constructions, such as the Čech and Rips complexes, are introduced. There is a brief discussion of graph-based representations of data such as mapper graphs and Reeb graphs. However, the majority of the chapter is focused on the most widely used algebraic representation of data: persistent homology. There are further discussions of generalizations of this construction, including zigzag and multiparameter persistence, persistent cohomology, and the Euler characteristic curve.
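For readers who have not seen how clustering and persistent homology connect, the following is a minimal sketch, not taken from the book. It computes the finite degree-0 persistence bars of a Rips filtration with a union-find structure, which amounts to single-linkage clustering. The function name `h0_barcode` is hypothetical, and the sketch assumes the convention that an edge enters the filtration at a scale equal to the distance between its endpoints (some authors use half that distance).

```python
from itertools import combinations
from math import dist


def h0_barcode(points):
    """Return the finite H0 bars (birth, death) of a Rips filtration.

    Assumed convention: every point is born at scale 0, and an edge
    appears at a scale equal to the Euclidean distance between its
    endpoints. One further component never dies and is omitted here.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the root, halving paths as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process all pairwise edges in order of increasing length.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge at this scale, so one of them dies.
            parent[root_i] = root_j
            bars.append((0.0, length))
    return bars


sample_bars = h0_barcode([(0, 0), (1, 0), (5, 0)])
print(sample_bars)
```

On these three collinear points, the two nearby points merge at scale 1 and the distant point joins at scale 4, so the long bar records the well-separated cluster.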
Chapter 5 gives a brief introduction to using the output of persistent homology in machine learning settings. It focuses on converting barcodes into vectors in a way that respects the metric space structure. This setting is discussed further in Section 6.6.
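To give a sense of what converting a barcode into a vector can look like, here is a minimal sketch of one simple choice, the Betti curve, which counts the bars alive at each value of a fixed grid. This is an illustration only and is not necessarily one of the vectorizations the book covers; the function name `betti_curve`, the sample barcode, and the grid are all assumptions made for the example.

```python
def betti_curve(bars, grid):
    """Count, for each grid value t, the bars (birth, death) with birth <= t < death.

    The result is a fixed-length vector that can be passed to a
    standard machine learning method.
    """
    return [sum(1 for birth, death in bars if birth <= t < death) for t in grid]


# Hypothetical barcode with two bars, sampled on a five-point grid.
example_bars = [(0.0, 1.0), (0.0, 4.0)]
example_grid = [0.0, 0.5, 1.0, 2.0, 4.0]
print(betti_curve(example_bars, example_grid))
```

Because the grid is fixed in advance, barcodes with different numbers of bars all map to vectors of the same length.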
Finally, Chapter 6 provides a series of case studies where tools from TDA have been used in practice. These are largely focused on the work of the authors, and include analysis of image data, viral evolution, time series analysis, and sensor network coverage problems. This chapter would be particularly useful for the data scientist trying to figure out how to frame a problem in a way that can make use of TDA methods.
Overall, the book provides a solid introduction to the subject. Although it does not include homework problems, this text could be used in a graduate-level course with a cohort of students from mixed backgrounds. However, I expect it to be most accessible to students with at least a semester of algebraic topology.


Liz Munch is an Associate Professor at Michigan State University with a joint appointment in the Department of Computational Mathematics, Science and Engineering and the Department of Mathematics. Her research focuses on topological data analysis with an emphasis on connecting the tools to problems arising in plant morphology and signal processing.