An Introduction to Categorical Data Analysis

Alan Agresti
John Wiley
Wiley Series in Probability and Statistics

The Basic Library List Committee recommends this book for acquisition by undergraduate mathematics libraries.

[Reviewed by Ita Cirovic Donev]

Statistical methods for categorical data analysis have long been popular and are gaining further ground as more fields adopt them for study and research. Categorical data analysis is widely used in medical studies, in credit risk management for building scorecards, and in the social sciences. A second edition of this book was inevitable, as more and more researchers are drawn into the field and need a solid introductory text.

First we need to note that this book is really an introductory one (as the title says): there are no detailed technical explanations except where the presentation of certain formulas is needed. However, the author explains the concepts behind the theory in great detail, choosing an intuitive approach for his presentation. I believe this is essential for novices, as it gives them the background they need to go on and learn the theory or the computational aspects of categorical data analysis.

One can see from the table of contents that the main topics in categorical data analysis are covered. The introductory chapter does exactly what it says, leading the reader to the main concepts of categorical data analysis. The second chapter deals only with contingency tables. It is very detailed, covering the intuition behind contingency tables as well as presenting their use in depth. The author has filled the text with plenty of examples; usually an example follows right after a concept has been introduced. Odds ratios are also given a good deal of room, and I am really glad of it, since many authors of introductory books today devote only a few paragraphs to this quite important concept.

With chapter three we start the more general presentation of the theoretical aspects of categorical data analysis, with an introduction to generalized linear models. This chapter could also serve readers as a gentle, stand-alone introduction to generalized linear models.

The next three chapters are devoted to logistic regression. There are many books that deal with logistic regression at an introductory level, but none are like this one. Here Agresti really does a great job of bringing the subject to the readers, unlike the many authors who expect readers to consult more than one reference to obtain even a general understanding. I like the fact that a whole chapter is devoted solely to building and applying logistic models. Agresti leads the readers along the right path, and if they feel that the explanation is not detailed enough, he provides references for further research and study.

Next, loglinear models are presented, followed by models for correlated and clustered data. The style of presentation does not deviate from previous chapters, i.e., it is excellent.

Exercises form a major part of the book. There are numerous exercises at the end of each chapter. Most of them are computer based, but there are also some theoretical ones. The ratio of theoretical to applied exercises depends on the chapter. Completing the exercises brings another flavor of the subject and, of course, much more understanding. In the appendix, the reader can find some SAS programming code for each chapter. Readers can supplement it themselves with additional or more detailed SAS code (or code for any other statistical software application). Also, with a little experience in other statistical software, it is relatively straightforward to translate the presented SAS code into the desired format.

Some readers may find several drawbacks to the book, such as the lack of guidance on the applicable software and the absence of strict theoretical treatment. However, I must stress that I completely agree with the author's choices. The book is designed to lead readers with very little background in such statistical methods, maybe even in any statistical methods at all, into the world of categorical models. This is an introductory book and as such it is marvelous. If the majority of introductory books even tried to go in this direction, we would end up with readers, especially students, with a much better understanding of the concepts. If one builds a strong intuitive and applied base, then it is much easier to go on and grasp the theoretical concepts as well as the computational problems. There are natural follow-ups to this book, and one such example is another book by Agresti, entitled Categorical Data Analysis.

Ita Cirovic Donev is a PhD candidate at the University of Zagreb. She holds a master's degree in statistics from Rice University. Her main research areas are in mathematical finance; more precisely, statistical methods of credit and market risk. Apart from her academic work, she does consulting work for financial institutions.


1. Introduction.

2. Contingency Tables.

3. Generalized Linear Models.

4. Logistic Regression.

5. Building and Applying Logistic Regression Models.

6. Multicategory Logit Models.

7. Loglinear Models for Contingency Tables.

8. Models for Matched Pairs.

9. Modelling Correlated, Clustered Responses.

10. Random Effects: Generalized Linear Mixed Models.

11. A Historical Tour of Categorical Data Analysis.

Appendix: Software for Categorical Data Analysis.

Table of Chi-Squared Distribution Values.


Index of Examples.

Subject Index.

Answers to Selected Odd-Numbered Exercises.