Design and Analysis of Experiments and Observational Studies using R

Nathan Taback
Chapman and Hall/CRC
Reviewed by Sara Stoudt
Taback’s Design and Analysis of Experiments and Observational Studies using R provides a comprehensive scaffolding for a variety of classes that cover experimental design and analysis of both experimental and observational data, allowing room for instructors to add narration to supplement the material depending on the focus of the course. 
Courses that spend more in-class time on theoretical content will appreciate the coding guidance, which can be used for self-study or out-of-class activities. Although the book expects pre-existing coding experience in a mix of base R and tidyverse (primarily pipes, dplyr, and ggplot2), the second chapter, which reviews mathematical statistics concepts, also provides code that spans much of the R needed for the rest of the book and can be reviewed to get up to speed. The Computation Lab sections throughout the text are helpfully labeled to pair the implementation with the description of each topic. For students with less exposure to R, an instructor could flesh out these sections further and use them as lab assignments in which students annotate the code, repurpose it for a similar investigation, and interpret the provided output in context.
More applications-focused courses will appreciate the streamlined nature of the text itself. Throughout, many results are stated without proof, such as the relationships between particular theoretical distributions and the ANOVA identity, but they are often illustrated and given intuition using simulation. Some readers may find the discussion too sparse, though, and will want to supplement this book with additional theoretical resources and/or discussion of interpreting findings in context.
Courses that want to cover causal inference as a culminating topic, or to preview a future course on the topic, will appreciate the last chapter’s consistency with the popular Imbens and Rubin notation. (Note that the second chapter’s review does include regression, but those who will work through the causal inference section may also want to supplement with a logistic regression review at some point to help with propensity score estimation.)
The concept of a propensity score is even introduced early on, in the third chapter, which starts the reader off with comparing two treatments. This primes students and hints at the overarching theme of the book: “the interplay between how data should be collected, and the strength of the answers to the questions”. This chapter provides instruction on both normal theory and randomization-based inference approaches. The randomization-based inference in particular motivates further coding approaches, and the code techniques used in this chapter help provide a framework for the more extensive simulation studies in the next chapter on power and sample size.
The book goes on to discuss how to compare more than two treatments, covering Analysis of Variance (ANOVA) and more complicated designs including blocking, Latin square, and factorial designs. Some of the craft of experimental design is also discussed here, including how to account for multiple comparisons, how to set up experiments under constraints, and the effect hierarchy principle, which suggests that lower-order effects should be prioritized in estimation.
Beyond the printed book, a variety of supplemental materials can be found online, including a free electronic copy of the textbook (which will be especially handy for courses that are not devoted entirely to experimental design, where the number of textbooks assigned to students escalates), an R package of datasets (from traditional agriculture to clinical trials) for easy use by students, and a repository of course materials including slides, homework assignments, and a sample midterm. There are also exercises at the end of each chapter in the book that are a good mix of theoretical, R-based, and open-ended or communication-focused questions.
As someone who has recently taught experimental design for the first time, I can especially appreciate knowing that this compact yet comprehensive resource is available to students and faculty alike, whether to see some sample code, to get a sense of which topics are most essential to cover, or to start seeing connections between experimental design and causal inference using observational data.


Sara Stoudt is an assistant professor in the Department of Mathematics at Bucknell University. She is interested in applied statistics and the pedagogy of writing in the STEM fields.