
Foundational and Applied Statistics for Biologists Using R

Ken A. Aho
Chapman & Hall/CRC
Reviewed by Michael Sutherland

This is a terrific intermediate-level modern applied statistics text for biologists… or anyone else who is interested in data analysis. It is a big, handsome book of 600-plus pages. That might seem annoyingly large, but the length makes it easy to do a thorough job of introducing and detailing the main concepts, the methods, and the pleasures of modern data analysis. It is a visually pleasing book with good layouts, nice typefaces and great tables and graphics… and the R code to produce them! That makes it a great way for a class to really engage with R graphics.

The Table of Contents shows 7 foundational chapters followed by 4 application chapters. The first chapter, on the philosophical and historical foundations of science and “knowing,” is novel, well done and too brief! But I deeply appreciated its presence, its importance for students and the opportunity it provides for making connections between fields. The remaining 6 foundational chapters present the traditional set of concepts: probability, density functions, their parameters and statistics, interval estimation (by sampling, resampling and simulation), hypothesis testing, and sampling and experimental design. The application chapters that follow cover the classic core of statistical models: correlation, regression, ANOVA, and contingency counts.

The author has put effort into making the book, its website, and a companion R package, asbio, serve two audiences: introductory classes and more advanced classes. He has succeeded nicely in writing a dual-level book. Selective skips and deletions will still leave a well-written introductory text with engaging worked examples, each offering the opportunity to develop one’s R skills. For my more mature students there is ample advanced material, many references to the literature, and still those many worked examples, with advanced questions that both exercise their statistical understanding and develop their R skills in a modern analytic environment.

As an example, the book discusses the advanced concept of sphericity in both introductory and advanced contexts. It then shows how easy it is to find and grab, from an R package off the internet, a repeated measures ANOVA procedure that addresses the issues of sphericity, and how to make the adjustment calculations needed. This is one of those many worked examples I mentioned earlier. It uses readily available R code. It proved to be just what I needed to help convince a colleague to leave her old, expensive statistical computing environment for the freedom of R. How nice to have an introductory book that includes grown-up conversations about advanced ideas!

The book is well produced and I enjoyed the companion website. I would strongly recommend the book for mature students, such as those who have engaged with and declared a major in bio/enviro/ecosystem studies programs. It is not a book I’d pick for an introductory survey course. I look forward to using it with my upper-level undergrads and the Masters and PhD students I continue to work with.

Mike Sutherland is a semi-retired statistical consultant who works on interesting academic and business problems. He was a founding faculty member of Hampshire College, then moved to the University of Massachusetts to become the Director of the Statistical Consulting Center.

Philosophical and Historical Foundations

Nature of Science
Scientific Principles
Scientific Method
Scientific Hypotheses
Variability and Uncertainty in Investigations
Science and Statistics
Statistics and Biology

Introduction to Probability
Introduction: Models for Random Variables
Classical Probability
Conditional Probability
Combinatorial Analysis
Bayes Rule

Probability Density Functions
Introductory Examples of pdfs
Other Important Distributions
Which pdf to Use?
Reference Tables

Parameters and Statistics
OLS and ML Estimators
Linear Transformations
Bayesian Applications

Interval Estimation: Sampling Distributions, Resampling Distributions, and Simulation Distributions
Sampling Distributions
Confidence Intervals
Resampling Distributions
Bayesian Applications: Simulation Distributions

Hypothesis Testing
Parametric Frequentist Null Hypothesis Testing
Type I and Type II Errors
Criticisms of Frequentist Null Hypothesis Testing
Alternatives to Parametric Null Hypothesis Testing
Alternatives to Null Hypothesis Testing

Sampling Design and Experimental Design
Some Terminology
The Question Is: What Is the Question?
Two Important Tenets: Randomization and Replication
Sampling Design
Experimental Design


Pearson’s Correlation
Robust Correlation
Comparisons of Correlation Procedures

Linear Regression Model
General Linear Models
Simple Linear Regression
Multiple Regression
Fitted and Predicted Values
Confidence and Prediction Intervals
Coefficient of Determination and Important Variants
Power, Sample Size, and Effect Size
Assumptions and Diagnostics for Linear Regression
Transformation in the Context of Linear Models
Fixing the Y-Intercept
Weighted Least Squares
Polynomial Regression
Comparing Model Slopes
Likelihood and General Linear Models
Model Selection
Robust Regression
Model II Regression (X Not Fixed)
Generalized Linear Models
Nonlinear Models
Smoother Approaches to Association and Regression
Bayesian Approaches to Regression

Inferences for Factor Levels
ANOVA as a General Linear Model
Random Effects
Power, Sample Size, and Effect Size
ANOVA Diagnostics and Assumptions
Two-Way Factorial Design
Randomized Block Design
Nested Design
Split-Plot Design
Repeated Measures Design
Unbalanced Designs
Robust ANOVA
Bayesian Approaches to ANOVA

Tabular Analyses
Probability Distributions for Tabular Analyses
One-Way Formats
Confidence Intervals for p
Contingency Tables
Two-Way Tables
Ordinal Variables
Power, Sample Size, and Effect Size
Three-Way Tables
Generalized Linear Models