
Regression Estimators: A Comparative Study

Marvin H. J. Gruber
Johns Hopkins
Reviewed by Jonathan Groves

Although the least-squares estimator is the best-known estimator in multivariate linear regression, other regression estimators have been derived to help resolve problems with multicollinearity, small or zero eigenvalues of the matrix XᵀX from the normal equations, and additional observations based on prior information. Although the mathematical literature usually treats regression estimators from either a Bayesian or a frequentist view of probability alone, Marvin Gruber’s Regression Estimators (second edition) approaches linear regression from both views, which allows the book to appeal to mathematical and applied statisticians who favor either one. With its wealth of information, its references to the literature, and over 150 exercises and problems, Gruber’s book can be used as a textbook for a course on regression estimators, as a survey for the reader who may be interested in pursuing research in this area, and as a reference book for statisticians. Much of Chapters 3–7 contains results by Gruber and his doctoral advisor Poduri Rao (not to be confused with C. R. Rao, who developed the Rao distance and several other important ideas discussed in this book), and many of the later chapters contain recent developments. Because of its focus on theory and proofs, this book will appeal more to mathematical statisticians than to applied statisticians, though applied statisticians working in regression analysis may appreciate the information and applications given and the references to the literature.

Reading Gruber’s book requires a good background in the fundamentals of multivariate and matrix calculus, multivariate probability distributions, and matrix theory. From differential geometry, which is not needed until the final chapter, the reader should be familiar with the definitions of manifolds, tangent spaces and bundles, and the Riemannian metric. None of the required background on multivariate and matrix calculus is discussed in the book. At least a working understanding of the general theory of multivariate probability distributions (expected value, dispersion, bias of estimators, and so on) is required, since little of the general theory is covered in the book. Some background in Bayesian probability and statistics is also useful. From matrix theory, the reader should be familiar with the singular value decomposition, positive definite and positive semidefinite matrices, and generalized matrix inverses. All these ideas are used heavily throughout the book.

Chapters 1–2 contain a brief but comprehensive history of the study of regression estimators and some of the required background on statistics and matrix theory. Chapter 3 derives some properties of the classical least-squares estimator and of alternative estimators such as ridge estimators, mixed estimators, the linear minimax estimator, and the Bayes estimator. Chapter 4 explores some of the relationships between these estimators, such as the equivalence of the Bayes and generalized ridge estimators, the equivalence of the Bayes and mixed estimators, and the ridge estimator as a special case of the Bayes, minimax, and mixed estimators. The classical Gauss-Markov theorem and one of its extensions are discussed as well.
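For readers who have not met these estimators, the following standard textbook forms may help fix ideas. The notation here (k, G, σ²) is my own and is not necessarily Gruber’s, and his definition of the generalized ridge estimator may be stated in a different but related form. The final line is a sketch of why a Bayes estimator under a zero-mean normal prior has the ridge form.

```latex
% Least-squares and ridge-type estimators for the model y = X\beta + \varepsilon
\hat{\beta}_{\mathrm{LS}} = (X^{T}X)^{-1}X^{T}y
\qquad
\hat{\beta}_{\mathrm{ridge}}(k) = (X^{T}X + kI)^{-1}X^{T}y, \quad k > 0
\qquad
\hat{\beta}_{G} = (X^{T}X + G)^{-1}X^{T}y, \quad G \text{ positive definite}

% Assumed setup: y \mid \beta \sim N(X\beta, \sigma^{2}I), \; \beta \sim N(0, \sigma^{2}G^{-1})
E[\beta \mid y]
  = \left(\tfrac{1}{\sigma^{2}}X^{T}X + \tfrac{1}{\sigma^{2}}G\right)^{-1}\tfrac{1}{\sigma^{2}}X^{T}y
  = (X^{T}X + G)^{-1}X^{T}y
```

Taking G = kI recovers the ordinary ridge estimator, and adding kI to XᵀX is what keeps the inverse well conditioned when XᵀX has small or zero eigenvalues.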

The mean square error (MSE) and various measures of risk are commonly used to assess the efficiency of estimators. Chapters 5–8 define these measures of efficiency with emphasis on the MSE averaging over prior assumptions, the MSE without averaging over prior assumptions, and the MSE with incorrect prior assumptions. Chapters 12–13 discuss risk with respect to Zellner’s balanced loss function and several asymmetric measures of risk, such as risk with respect to the LINEX loss function, the frequentist risk, and the Bayes risk. Asymmetric risks assume that errors in estimation in one direction are more serious than errors in the other direction.
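To illustrate what asymmetry means here, the sketch below evaluates the LINEX loss in its commonly quoted form, b(exp(aΔ) − aΔ − 1), where Δ is the estimation error. The function name and the parameter values are my own illustrative choices and are not taken from the book.

```python
import math


def linex_loss(error, a=1.0, b=1.0):
    """LINEX loss b * (exp(a*error) - a*error - 1), with error = estimate - truth.

    The sign of `a` picks which direction of error is penalized more heavily;
    `b` > 0 is only a scale factor. Both defaults are illustrative assumptions.
    """
    return b * (math.exp(a * error) - a * error - 1.0)


if __name__ == "__main__":
    # Compare an underestimate and an overestimate of the same size.
    for err in (-1.0, 0.0, 1.0):
        print(f"error={err:+.1f}  loss={linex_loss(err):.4f}")
```

With a > 0, overestimation is penalized roughly exponentially and underestimation roughly linearly; reversing the sign of a reverses the roles.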

Gruber’s book includes several applications of regression estimators in other settings. For example, the Kalman Filter (which is based on a linear regression model with parameters varying at discrete times and with a stochastic linear relationship between the parameters at consecutive times) is introduced in Chapter 9 along with its relationships with regression estimators. Chapter 10 applies these regression estimators to the one-way and two-way ANOVA models. Chapter 11 discusses the relationships between penalized splines and regression estimators.
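As a rough illustration of the model described in parentheses above, here is a minimal scalar Kalman filter step. It assumes a state equation x_t = a·x_{t−1} + w_t with Var(w_t) = q and an observation equation y_t = h·x_t + v_t with Var(v_t) = r. All names and default values are hypothetical, and the book’s treatment is multivariate and more general.

```python
def kalman_step(mean, var, y, a=1.0, q=0.0, h=1.0, r=1.0):
    """One predict/update cycle of a scalar Kalman filter.

    mean, var : current estimate of the parameter and its variance
    y         : new observation
    a, q      : assumed state transition coefficient and state noise variance
    h, r      : assumed observation coefficient and observation noise variance
    """
    # Predict: push the estimate through the stochastic linear relationship.
    pred_mean = a * mean
    pred_var = a * a * var + q
    # Update: weigh the prediction against the new observation.
    gain = pred_var * h / (h * h * pred_var + r)
    new_mean = pred_mean + gain * (y - h * pred_mean)
    new_var = (1.0 - gain * h) * pred_var
    return new_mean, new_var


if __name__ == "__main__":
    mean, var = 0.0, 1.0  # illustrative prior
    for y in (2.0, 1.5, 1.8):  # made-up observations
        mean, var = kalman_step(mean, var, y)
        print(f"mean={mean:.4f}  var={var:.4f}")
```

With a = 1 and q = 0 the parameter is constant over time, and each step is simply a Bayes update of a prior with one more observation, which is the kind of link to regression estimators that the chapter develops.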

Chapter 14, the closing chapter, studies another measure of efficiency of regression estimators: the Rao distance (defined on geodesics) between distributions of estimators on statistical manifolds. This chapter includes some of the necessary differential geometry background along with geodesics, geodesic distances, and the Rao distance. Properties of the Rao distance between two distributions of linear Bayes estimators, the Rao distance between distributions of ridge estimators, and the Rao distance between distributions of mixed estimators are discussed. The book closes with an example of a sequence of geodesic distances associated with iterations of the Kalman Filter converging to zero.

It is unfortunate that this book does not use italics for variables, but the typesetting is still readable. It is also unfortunate that there are many typographical errors. Most of these errors are misspellings, incorrect references to equation and theorem numbers, and other minor errors in equations that generally are not difficult to correct. Other errors are difficult to catch without working through the details of the proofs. Several errors are more serious; for instance, in the statement of Theorem 2.6.4, the column vectors b and c and their transposes bᵀ and cᵀ should be interchanged. I must confess that I spent about 10–20 minutes puzzled as to why my result did not agree with Gruber’s; when I analyzed the dimensions of the matrix products, I realized that the theorem cannot make sense as stated. Another example is Theorem 2.3.1: the theorem and the conclusion both state that the James-Stein estimator is inadmissible, but what is proven, and what Gruber evidently means to prove, is that the maximum likelihood estimator (MLE) is inadmissible (the James-Stein estimator is uniformly better than the MLE, which proves that the MLE is inadmissible).
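The point about Theorem 2.3.1 can be checked numerically. The sketch below is my own and is not taken from the book. It assumes a single observation x from N(θ, I) in p ≥ 3 dimensions, for which the MLE is x itself, and compares its simulated squared-error risk with that of the James-Stein estimator (1 − (p − 2)/‖x‖²)x. The value of θ, the number of trials, and the seed are arbitrary.

```python
import random


def james_stein(x, sigma2=1.0):
    """Shrink x toward the origin: (1 - (p - 2) * sigma2 / ||x||^2) * x, for p >= 3."""
    p = len(x)
    norm2 = sum(v * v for v in x)
    shrink = 1.0 - (p - 2) * sigma2 / norm2
    return [shrink * v for v in x]


def simulate_risks(theta, trials=2000, seed=0):
    """Monte Carlo estimate of the squared-error risks of the MLE and James-Stein."""
    rng = random.Random(seed)
    mle_total = 0.0
    js_total = 0.0
    for _ in range(trials):
        x = [t + rng.gauss(0.0, 1.0) for t in theta]  # the MLE is x itself
        js = james_stein(x)
        mle_total += sum((xi - t) ** 2 for xi, t in zip(x, theta))
        js_total += sum((ji - t) ** 2 for ji, t in zip(js, theta))
    return mle_total / trials, js_total / trials


if __name__ == "__main__":
    mle_risk, js_risk = simulate_risks([0.0] * 10)
    print(f"MLE risk ~ {mle_risk:.3f}, James-Stein risk ~ {js_risk:.3f}")
```

A simulation only suggests the result for the particular θ tried; the inadmissibility of the MLE rests on the proof that the James-Stein risk is smaller for every θ.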

Despite its weaknesses, Gruber’s Regression Estimators (second edition) is strong in several ways. With much of the fundamental research literature cited and with many recent developments given, especially in Chapters 12–14, the book gives the reader who is interested in research a good survey of the subject. Gruber mentions that the list of references is by no means complete and recommends that the reader consult the lists of references in these cited books and articles to find further research literature. Chapter 14 mentions several problems not yet fully resolved, which may give the reader some ideas for original research.

One of the greatest strengths of this book is that it is one of the few resources that studies and compares the developments of regression estimators from both the Bayesian and frequentist points of view. In fact, Gruber mentions that the lack of such resources is one of his main reasons for writing this book.  

This book is a valuable resource for anyone interested in learning about alternatives to the least-squares estimator in multivariate linear regression and for anyone interested in a good survey of the research in this field.

Jonathan Groves is an adjunct professor of mathematics at Kaplan University, Argosy University, and Florida Institute of Technology (as of January 2011). Much of his tutoring at the Kaplan Math Center is in statistics. His main mathematical interests are in commutative algebra, algebraic geometry, statistics, and mathematics education. He has written several papers on D-nice polynomials, including two for the International Electronic Journal of Algebra, and hopes to find appropriate tools from algebraic geometry that will simplify the study of D-nice polynomials. He is considering either a full-time career in academia or as a mathematician or statistician for the government. His email address is
