Handbook for Applied Modeling: Non-Gaussian and Correlated Data

Jamie D. Riggs and Trent L. Lalonde
Cambridge University Press
Publication Date: 
Number of Pages: 
Reviewed by Peter Rabinovitch

This book is a guide to modeling and analyzing non-Gaussian and correlated data. There is clearly a need for such a book to help less experienced data scientists come up to speed with data that does not fit into the mold cast by their first stats text. There are many positive aspects to this book, and unfortunately a few negative ones too.

It starts by describing several data sets that will be used throughout the book. A chapter introducing several types of models follows: Constant variance response models; Non-constant variance response models; Discrete, categorical response models; Counts response models; Time-to-event response models; Longitudinal response models; and Structural equation modeling. The seven chapters that follow each focus on one type of model and apply it to each of the data sets. The book concludes with a chapter containing guidelines on how to choose a model that is suitable for your data.

The data sets and models are well explained, and the limitations of each type of model on the various data sets are illustrated by frequent plots.

The code and data take a bit of snooping to find. Unfortunately, the code does not work as is: it requires some tweaking to give it correct filenames, variable names, etc. This is not a big deal if you are comfortable with R, but for a beginner it might be a showstopper.

I find the writing to be a bit dry, and sometimes puzzling. On one page I found the authors describing the exponential function as the anti-logarithm, and then later on the same page referring to the logarithm as the anti-exponential. This dance happens throughout the book, and is unnecessary as the target audience surely is familiar with logarithms and exponentials.

Because every chapter revisits each of the data sets, I frequently had to review earlier material to reacquaint myself with the data and what had already been done. For me, it would have been better if each chapter analyzed one data set, applying the different methods and showing where they fail, rather than the current structure.

I also would have preferred the code and results to be integrated into the book, as I think that would make it easier for the target audience to learn to apply the techniques it describes.

That being said, I think a second edition addressing some of these issues would be extremely valuable to the group of new data scientists who have little experience with “non-standard” data.

Peter Rabinovitch is a Senior Performance Engineer at Akamai who has been doing data science since long before “data science” was a thing.

1. The data sets
2. The model-building process
3. Constant variance response models
4. Non-constant variance response models
5. Discrete, categorical response models
6. Counts response models
7. Time-to-event response models
8. Longitudinal response models
9. Structural equation modeling
10. Matching data to models