
Regression Analysis by Example

Samprit Chatterjee and Ali S. Hadi
John Wiley

The Basic Library List Committee suggests that undergraduate mathematics libraries consider this book for acquisition.

[Reviewed by Ita Cirovic Donev]

Regression Analysis by Example is a book on applied regression models. The authors cover general aspects of univariate and multivariate regression modeling such as variable selection, problems with collinearity, parameter estimation, hypothesis testing, and biased estimation. There is also a chapter on logistic regression.

Honestly, when I first picked up the book I expected it to be more like a series of case studies, really depicting and explaining the examples in great detail (after all, that's what the title says). Examples are given in each chapter, which is not so bad as a starting strategy for a book, but in the end the depth of the explanations varies a lot from chapter to chapter. Some parts are explained better and in more detail than others.

Starting with a chapter on what constitutes regression, simple univariate and multivariate models are presented. Concepts are presented intuitively, using mathematical exposition only when it is necessary. There are no proofs or derivations of formulas. For example, even normal equations are just cited.
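To make concrete what "just cited" leaves out, here is a minimal sketch of the normal equations for the simple linear model y = b0 + b1*x, solved in closed form. The function name fit_simple_ols and the data are hypothetical illustrations, not taken from the book.

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = b0 + b1*x via the two normal equations.

    The normal equations for this model are
        n*b0      + b1*sum(x)   = sum(y)
        b0*sum(x) + b1*sum(x^2) = sum(x*y)
    and are solved here in closed form.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sx = sum(x)
    sy = sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b1 = (n * sxy - sx * sy) / denom  # slope
    b0 = (sy - b1 * sx) / n           # intercept
    return b0, b1


# Made-up data lying exactly on y = 1 + 2x (an assumption for illustration).
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_ols(x, y)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

Because the sample data lie exactly on a line, the fit recovers that line; with noisy data the same two equations give the least-squares estimates.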

I feel that there is a substantial lack of discussion of the foundations of regression methods, especially in the first couple of chapters. Given the assumption that the reader should only have some previous statistics knowledge, a more detailed mathematical exposition should have been provided to lead the reader to the correct understanding. This is especially important for other chapters. It would enable the reader to follow the later parts of the book without any serious open questions. Chapters on collinearity, estimation, hypothesis testing, variable selection and others are all presented in a descriptive manner. One can obtain a good perspective on the mentioned methods.

This new edition also adds a chapter on logistic regression. It provides a nice intuitive introduction to logistic regression. Concepts are presented in a lucid style. If you have some background in logistic regression this will refresh your memory, and as such it can be considered a rather easy read. The authors again do not go too much into the details and reasons why certain problems arise. For example, problems with computing the MLE for logistic regression are not explained; instead, the reader is referred to other references. The example on predicting bankruptcy is very popular today in the financial industry. However, the authors choose to provide us only with the most simplistic case.

Exercises are provided at the end of each chapter and there are quite a number of them. Naturally, almost all of the exercises are data-related, which gives good practice. Data used in the book can be downloaded from the book's website.

Referencing of equations in the text is done via equation numbering, which can be quite tedious, since we generally do not remember what (4.6) stood for.

With no proper theoretical basis for the methods and the presentation of only very simple examples, one may wonder about the usefulness of the book. However, this book is generally intended for students with only a basic knowledge of statistics. As such it is an excellent guidebook for doing regression. However, I have to note that this approach could potentially be very dangerous if the reader is engaged in some more complicated (i.e., real) regression problem where he/she could draw wrong or misleading conclusions. If you are looking for a more applied approach which also features a quite thorough theoretical discussion, an excellent reference is F. Harrell's book Regression Modeling Strategies.

One major drawback of the book is the lack of computer code. Given the intended audience, it is very likely that readers have little or no experience with R, S-Plus or SAS. This would make it very hard for students to follow the examples on their own or do the exercises efficiently.

Overall, I would recommend this book for all students (upper undergraduate and first year graduate students in nonmathematical areas of study) interested in regression modeling at a novice level. It will, together with some other references, provide a good way of completing school projects efficiently. Students in mathematics and statistics should look for additional references along with this book. The authors emphasize that this is a book for readers whose specialization is not statistics but who use statistics in their daily work. I do not agree that such users would be well served if they just knew what is in this book. Even for a practitioner there is a great need to know the details, because you definitely don't want to make professional decisions if you are not sure how you obtained some results.

Ita Cirovic Donev is a PhD candidate at the University of Zagreb. She holds a master's degree in statistics from Rice University. Her main research areas are in mathematical finance; more precisely, statistical methods of credit and market risk. Apart from the academic work she does consulting work for financial institutions.




1. Introduction.

2. Simple Linear Regression.

3. Multiple Linear Regression.

4. Regression Diagnostics: Detection of Model Violations.

5. Qualitative Variables as Predictors.

6. Transformation of Variables.

7. Weighted Least Squares.

8. The Problem of Correlated Errors.

9. Analysis of Collinear Data.

10. Biased Estimation of Regression Coefficients.

11. Variable Selection Procedures.

12. Logistic Regression.

13. Further Topics.

Appendix A: Statistical Tables.