Regression for Categorical Data

Gerhard Tutz
Cambridge University Press
Cambridge Series in Statistical and Probabilistic Mathematics
Reviewed by Peter Rabinovitch

Regression for Categorical Data is a fine mathematical statistics book.

The standard topics of univariate and multivariate logit and probit models are presented well, in full mathematical detail, if a little dryly. Interesting additional topics, including regularization, tree-based methods, and non-parametric methods, are also covered well, and their connections to the main topics are clearly established.

R software is used in the exercises, with code and data available. However, R is not integrated into the text; that is, there are no printouts from R with accompanying explanations. I would have liked R to be more integrated, but that would have made the book swell significantly beyond its 550 or so pages.

Each chapter ends with suggestions for further reading and about 10 exercises. About one-third of the exercises make use of the R data sets; the rest are theoretical. They range from the straightforward, testing the reader's comprehension of the chapter's material, to the quite interesting, asking for interpretations and the “why” behind some of the ideas.

The book will serve as a great reference. It would also be an excellent text for students who have completed a course at the level of Casella and Berger’s Statistical Inference. Although there is too much material for a one-semester course, the student will then have it as a useful reference.

The book looks attractive, as all the Cambridge Series in Statistical and Probabilistic Mathematics books do, but to my tired, old eyes some of the plots were difficult to read.

Peter Rabinovitch is a Systems Architect at Research in Motion. He recently defended his PhD thesis and is now trying to sell some lightly used Mallows permutations.

1. Introduction
2. Binary regression: the logit model
3. Generalized linear models
4. Modeling of binary data
5. Alternative binary regression models
6. Regularization and variable selection for parametric models
7. Regression analysis of count data
8. Multinomial response models
9. Ordinal response models
10. Semi- and nonparametric generalized regression
11. Tree-based methods
12. The analysis of contingency tables: log-linear and graphical models
13. Multivariate response models
14. Random effects models
15. Prediction and classification
Appendix A. Distributions
Appendix B. Some basic tools
Appendix C. Constrained estimation
Appendix D. Kullback–Leibler distance and information-based criteria of model fit
Appendix E. Numerical integration and tools for random effects modeling