Understanding Statistical Error: A Primer for Biologists

Marek Gierliński

Publisher: John Wiley-Blackwell
Publication Date: 2016
Number of Pages: 213
Format: Paperback
Price: 59.95
ISBN: 9781119106913
Category: Textbook

MAA Review

[Reviewed by Robert W. Hayden, on 02/8/2017]

Many statisticians are wary of books with titles like Statistics for Midwestern Dentists. After all, is not statistics the same no matter where you live? True, but one could make a case for a text geared to a specific discipline. One argument is that examples and exercises drawn from that field may be more engaging to students in that field. There may also be small changes in content to please a different audience. Even in a course for majors from the Business Department we may find that the managers want time series and the marketers want chi-squared. The book at hand targets biologists.

At first glance, the book is a bit frightening, since it is about one-third the size of a good general introductory statistics textbook. One has to wonder how deeply it gets into core ideas. A quick glance through the table of contents reveals that many traditional topics are missing, while other topics of interest to biologists are included. Also, there are far fewer exercises and examples than one might normally find in a textbook. This raises the question of what the intended audience is. Clearly it is for biologists, and many of the examples require a fair amount of biological knowledge to even begin to follow. There is also more mathematics than usual in an introductory course, including many cameo appearances by partial derivatives. Surely this must be for upper division students and beyond. Topical coverage would not make it equivalent to a normal introductory course. Perhaps one could imagine it as supplementary reading for upper-level biology lab courses.

Looking more carefully at the content we find some positive points. There is awareness of issues well known to statisticians but not yet widely present in beginning textbooks, such as the limitations of the traditional approach to confidence intervals for a single proportion. Many computer simulations are effectively used to illustrate abstract points. The meaning of “95%” in “95% confidence interval” is handled well. The exposition is breezy, informal, and full of humor. Many ideas are explained simply and clearly. Worth mentioning is the coverage of some topics not usually found in an introductory statistics course: error bars, rounding (a topic of great interest to AP Statistics teachers), and the propagation of errors. The last of these addresses estimating the error in a function based on errors in its arguments. As an example, we might measure the length, width and height of a box with known errors, and want to know the error in the volume we compute from those measurements.
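To make the box example concrete, here is a minimal sketch of first-order error propagation for the volume V = L × W × H. It is not taken from the book: the function name and the measurements are made up for illustration, and the formula assumes the three measurement errors are independent and small relative to the measurements.

```python
import math

def box_volume_error(length, width, height, s_len, s_wid, s_hgt):
    """First-order propagated error of V = length * width * height.

    Assumes independent measurement errors (an assumption of this sketch).
    """
    volume = length * width * height
    # Partial derivatives of V with respect to each measurement.
    d_len = width * height
    d_wid = length * height
    d_hgt = length * width
    # Independent errors combine in quadrature.
    sigma = math.sqrt((d_len * s_len) ** 2
                      + (d_wid * s_wid) ** 2
                      + (d_hgt * s_hgt) ** 2)
    return volume, sigma

# Hypothetical box: 10 x 5 x 2 units, each side measured to within 0.1 units.
volume, sigma = box_volume_error(10.0, 5.0, 2.0, 0.1, 0.1, 0.1)
print(f"volume = {volume:.1f} +/- {sigma:.2f}")
```

Note that the shortest side dominates the result: the same absolute error of 0.1 is a much larger relative error on a side of length 2 than on a side of length 10.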

On the negative side, the informality is sometimes carried too far. The Introduction ends by admitting

…there is a general lack of mathematical rigour. A mathematician might scowl at the content of this book, so if you are one, please shut your eyes now.

In response it might be helpful to consider what sorts of departures from rigor might be acceptable in an introductory course. Surely we cannot prove or derive everything. Perhaps we can keep silent about some hypotheses of theorems that rarely matter in practice. We may oversimplify here and there, but we should not tell students things that are usually (or always) not true. On this last point this book has some issues. For example, the author recommends using regression through the origin because his example data are far from the origin. This is exactly backwards. In many (most?) situations where we fit a straight line, that line is just an approximation to the true non-linear relationship that happens to work well over a narrow range of the independent variable. Forcing the line through the origin may give a good average slope between the origin and the data, but a poor fit in the range of the data.
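The point about regression through the origin can be illustrated with a small sketch. The data below are invented, not the book's: a noiseless curve y = 10√x is observed only for x between 50 and 60, far from the origin, and the two fits are compared by their sum of squared residuals within that range.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares with an intercept; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_through_origin(xs, ys):
    """Least squares with the intercept forced to zero; returns the slope."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def sse(xs, ys, intercept, slope):
    """Sum of squared residuals of the line over the observed data."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: a curved relationship seen only over a narrow range.
xs = [float(x) for x in range(50, 61)]
ys = [10.0 * math.sqrt(x) for x in xs]

intercept, slope = fit_line(xs, ys)
origin_slope = fit_through_origin(xs, ys)

sse_ols = sse(xs, ys, intercept, slope)
sse_origin = sse(xs, ys, 0.0, origin_slope)
print(f"SSE with intercept:  {sse_ols:.4f}")
print(f"SSE through origin:  {sse_origin:.4f}")
```

Over this range the curve is nearly straight, so the line with an intercept fits closely, while the line forced through the origin has the wrong slope for the data it is supposed to describe.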

The writing generally has a “stream of consciousness” style and, though there are many cross references, the logical development of the subject is seriously mangled. For example, early on the book says that the sampling distribution of the mean is normal, but later confidence intervals for the mean are computed with the t distribution without explanation (or even acknowledgment) of the shift in distributions. Throughout the book the breezy tone leads to a lot of sloppiness, such as saying that a confidence interval “is” the margin of error. In addition to being wrong, it is hard to imagine what a novice would take this to mean. There are just too many places where this book is unclear or simply wrong.
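The shift from the normal to the t distribution matters because the standard deviation is estimated from the sample. The simulation below is a rough sketch of that point, not something from the book: the sample size of 5, the number of replications, and the seed are arbitrary choices, and the critical values are the standard two-sided 95% values for the normal distribution and for t with 4 degrees of freedom.

```python
import random

N = 5            # sample size (arbitrary choice for this sketch)
Z_CRIT = 1.960   # 97.5th percentile of the standard normal
T_CRIT = 2.776   # 97.5th percentile of Student's t with N - 1 = 4 df

def coverage(crit, reps=20000, seed=1):
    """Fraction of intervals mean +/- crit * s / sqrt(N) that contain 0.

    Samples are drawn from a standard normal population, so the true mean is 0.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(N)]
        mean = sum(sample) / N
        # Sample standard deviation with the usual N - 1 divisor.
        s = (sum((x - mean) ** 2 for x in sample) / (N - 1)) ** 0.5
        if abs(mean) <= crit * s / N ** 0.5:
            hits += 1
    return hits / reps

z_cover = coverage(Z_CRIT)
t_cover = coverage(T_CRIT)
print(f"coverage with z critical value: {z_cover:.3f}")
print(f"coverage with t critical value: {t_cover:.3f}")
```

With so small a sample, intervals built with the normal critical value cover the true mean noticeably less often than the nominal 95%, while those built with the t critical value come close to it.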

There are a small number of mostly computational exercises with brief solutions supplied for most. Some of these make conceptual points but most of the student’s attention will be on the arithmetic rather than the ideas.

So, sadly, and despite the author’s entertaining sense of humor and generally readable prose, we have to rate this book “R” and suggest it not be read by statistical minors. For those who already have a solid understanding of statistical concepts, it may serve to provide information on how biologists think about statistics, and what statistical topics are of interest to them.

After a few years in industry, Robert W. Hayden (bob@statland.org) taught mathematics at colleges and universities for 32 years and statistics for 20 years. In 2005 he retired from full-time classroom work. He now teaches statistics online at statistics.com and does summer workshops for high school teachers of Advanced Placement Statistics. He contributed the chapter on evaluating introductory statistics textbooks to the MAA's Teaching Statistics.

See the table of contents on the publisher's webpage.

Tags: Biostatistics