You are here

Markov Chains and Mixing Times

David A. Levin and Yuval Peres
American Mathematical Society
Publication Date: 
Number of Pages: 
[Reviewed by
Rick Durrett
, on
Discrete time Markov chains with a finite state space are the first stochastic process that one encounters in probability. At each time \( X_{n} \) is in some state \( x \), and it jumps to state \( y \) at time \( n+1 \) with probability \( p(x,y) \). The Markov property says that given the current state the rest of the past is irrelevant for predicting the future. You can think of the system as a board game where on each turn you “roll a die” and use it to determine where you move to a new square. 
The importance of Markov chains comes from the fact that many systems have this property and there is a rich theory for determining the asymptotic behavior of the system. In most cases of interest, the asymptotic behavior is that the system converges to equilibrium. In an introductory stochastic process course, which usually covers topic such as Poisson processes, renewal theory, continuous-time Markov chains, and perhaps Brownian motion, this is all there is time for. 
However, convergence theory for Markov chains is far from the end of the story. In many situations, such as randomized algorithms that arise from computer science and Markov chain Monte Carlo (basically a long winded way of saying simulation) it is important to understand the amount of time it takes for the system to reach equilibrium because that will dictate whether the algorithm is useful or not.
Chapter 4 introduces some of the many metrics that are used to quantify the time to reach equilibrium. Chapters 5 and 6 introduce two methods (coupling and strong stationary times) that are used to upper bound the time to equilibrium. Coupling is a method for defining two processes on the same space so they agree with each other “as much as possible.” When one process is a Markov chain started at and the second is the chain started in equilibrium then one minus the probability they agree is an upper bound on the “total variation distance” between the current state and equilibirium. The second method called strong stationary times constructs random times at which the process is exactly in equilibrium.
Chapter 7 discusses the typically harder problem of giving lower bounds on the mixing times. The book then turns to important examples: 8. Shuffling cards, 9. Random walks on Networks. Chapters 10 considers the problem of the time it takes to first visit a specified state, the hitting time, while Chapter 11 discusses the cover time, i.e., the time to visit all of the states.
Chapter 12 which closes Part I discusses eigenvalues. When the transition matrix can be diagonalized then the convergence rate can be computed explicitly. However, in many examples one uses the size of the spectral gap between 1 and the next largest eigenvalue. 
Part II considers more advanced material. There are a number of famous examples: the Ising model of magnetism from physics, gene rearrangement from biology, the simple exclusion processes which is one of the first interacting particle systems, and the Lamplighter walk from mathematics – an individual walks along an infinite sequence of lamps. On each step he/she either jumps or changes the state of the lamp. The system may sound like a joke but its analysis is no laughing matter and the answers are exotic. 
Other special techniques are introduced in Part II, but perhaps the most intriguing notion here is the cutoff time. For an example consider the random walk on a random d-regular graph with \( d \geq 3 \). The time to converge to equilibrium is asymptotically \( c_{d} \log n \) but the total variation distance goes from 1 to 0 in a window of size \( \sqrt{ \log n} \) around  \( c_{d} \log n \). The phenomenon that the total variation distance goes from 1 to 0 in a window that is much smaller than the mixing time is “cutoff.” Determining when this happens is a very interesting problem.
In theory, Part I of the book can be used for a course. Two flow charts show how one might go about this. This could work for a collection of graduate students, even ones from a variety of application fields, but I personally wouldn’t try it on Duke undergraduates.


Rick Durrett is a professor in the Mathematics Department at Duke University.  

See the table of contents in the publisher's webpage.