You are here

An Introduction to Information Theory

Fazlollah M. Reza
Dover Publications
Publication Date: 
Number of Pages: 
[Reviewed by
Tom Schulte
, on

Originally published in 1961, this work appears on the surface to be a graduate-level text for computer engineering students. As such, it may seem to have only historical interest. It would be tempting to skip over it in favor of more up-to-date works. Besides being part of the constellation of seminal works on information theory, however, this book is still very illustrative and enlightening, especially for someone mathematically sophisticated.

The first four chapters take the reader from elementary probability to the basics of information theory. Beginning with such basic concepts as sets, sample spaces, and random variables, Reza develops the mathematics of measuring information, stochastic elements in communication, and channel capacity.

The final chapter, on group codes, can serve as a supplement in the classroom or for independent study, since it does not require the higher mathematics of earlier chapters. It is an excellent introduction to the topic of coding theory.

This is a straight republishing of the classic work and there is no new or updated material. Still, it is a complete and classic overview that includes advanced topics, avoids being overly rigorous, and makes room for practical applications. It is comparable to Information Theory by Robert B. Ash; both are suitable additions to any information theory library focused on historical development of the topic. I recommend reading Reza before Ash. Reza’s book is an excellent follow-up to An Introduction to Information Theory by John R. Pierce for those seeking greater insight into the development of this important area of applied mathematics. (All three seminal works exist as Dover reprints.)

Chapters rich with examples conclude with exercises that are very illustrative. Generally, answers are not given. Sometimes exercises include explicit guidance to help the reader accomplish the tasks. Closely mingling applications with theory and including detailed exercises designed for illumination makes the book an excellent opportunity for self-study. Shining a light on Shannon’s theoretical advances, Reza connects probability to the mathematics of information to coding schemes.

The first two chapters alone are a very good introduction to probability. An axiomatic approach based on fundamentals of set theory and Boolean algebra is used to develop discrete probability from the ground up. From these building blocks it is a very direct line of development to describe mathematically such concepts as information transition rate, channel noise, redundancy, and more.

This book is an excellent opportunity to connect basic mathematical literacy to a higher level understanding of information theory and is sufficiently self-contained for independent study.

Tom Schulte designs Statistical Process Control (SPC) software for manufacturers at Plex Systems in Michigan.

CHAPTER 1 Introduction
1-1. Communication Processes
1-2. A Model for a Communication System
1-3. A Quantitative Measure of Information
1-4. A Binary Unit of Information
1-5. Sketch of the Plan
1-6. Main Contributors to Information theory
1-7. An Outline of Information Theory
Part 1 : Discrete Schemes without Memory
CHAPTER 2 Basic Concepts of Probability
2-1. Intuitive Background
2-2. Sets
2-3. Operations on Sets
2-4. Algebra of Sets
2-5. Functions
2-6. Sample Space
2-7. Probability Measure
2-8. Frequency of Events
2-9. Theorem of Addition
2-10. Conditional Probability
2-11. Theorem of Multiplication
2-12. Bayes's Theorem
2-13. Combinatorial Problems in Probability
2-14. Trees and State Diagrams
2-15. Random Variables
2-16. Discrete Probability Functions and Distribution
2-17. Bivariate Discrete Distributions
2-18. Binomial Distribution
2-19. Poisson's Distribution
2-20. Expected Value of a Random Variable
CHAPTER 3 Basic Concepts of Information Theory: Memoryless Finite Schemes
3-1. A Measure of Uncertainty
3-2. An Intuitive Justification
3-3. Formal Requirements for the Average Uncertainty
3-4. H Function as a Measure of Uncertainty
3-5. An Alternative Proof That the Entropy Function Possesses a Maximum
3-6. Sources and Binary Sources
3-7. Measure of Information for Two-dimensional Discrete Finite Probability Schemes
3-8. Conditional Entropies
3-9. A Sketch of a Communication Network
3-10. Derivation of the Noise Characteristics of a Channel
3-11. Some Basic Relationships among Different Entropies
3-12. A Measure of Mutual Information
3-13. Set-theory Interpretation of Shannon's Fundamental Inequalities
3-14. "Redundancy, Efficiency, and Channel Capacity"
3-15. Capacity of Channels with Symmetric Noise Structure
3-16. BSC and BEC
3-17. Capacity of Binary Channels
3-18. Binary Pulse Width Communication Channel
3-19. Uniqueness of the Entropy Function
CHAPTER 4 Elements of Encoding
4-1. The Purpose of Encoding
4-2. Separable Binary Codes
4-3. Shannon-Fano Encoding
4-4. Necessary and Sufficient Conditions for Noiseless
4-5. A Theorem on Decodability
4-6. Average Length of Encoded Messages
4-7. Shannon's Binary Encoding
4-8. Fundamental Theorem of Discrete Noiseless Coding
4-9. Huffman's Minimum-redundancy Code
4-10. Gilbert-Moore Encoding
4-11. Fundamental Theorem of Discrete Encoding in Presence of
4-12. Error-detecting and Error-correcting Codes
4-13. Geometry of the Binary Code Space
4-14. Hammings Single-error Correcting Code
4-15. Elias's Iteration Technique
4-16. A Mathematical Proof of the Fundamental Theorem of Information Theory for Discrete BSC
4-17. Encoding the English Alphabet
Part 2: Continuum without Memory
CHAPTER 5 Continuous Probability Distribution and Density
5-1. Continuous Sample Space
5-2. Probability Distribution Functions
5-3. Probability Density Function
5-4. Normal Distribution
5-5. Cauchy's Distribution
5-6. Exponential Distribution
5-7. Multidimensional Random Variables
5-8. Joint Distribution of Two Variables: Marginal Distribution
5-9. Conditional Probability Distribution and Density
5-10. Bivariate Normal Distribution
5-11. Functions of Random Variables
5-12. Transformation from Cartesian to Polar Coordinate System
CHAPTER 6 Statistical Averages
6-1. Expected Values; Discrete Case
6-2. Expectation of Sums and Products of a Finite Number of Independent Discrete Random Variables
6-3. Moments of a Univariate Random Variable
6-4. Two Inequalities
6-5. Moments of Bivariate Random Variables
6-6. Correlation Coefficient
6-7. Linear Combination of Random Variables
6-8. Moments of Some Common Distribution Functions
6-9. Characteristic Function of a Random Variable
6-10. Characteristic Function and Moment-generating Function of Random Variables
6-11. Density Functions of the Sum of Two Random Variables
CHAPTER 7 Normal Distributions and Limit Theorems
7-1. Bivariate Normal Considered as an Extension of One-dimensional Normal Distribution
7-2. MuItinormal Distribution
7-3. Linear Combination of Normally Distributed Independent Random Variables
7-4. Central-limit Theorem
7-5. A Simple Random-walk Problem
7-6. Approximation of the Binomial Distribution by the Normal Distribution
7-7. Approximation of Poisson Distribution by a Normal Distribution
7-8. The Laws of Large Numbers
CHAPTER 8 Continuous Channel without Memory
8-1. Definition of Different Entropies
8-2. The Nature of Mathematical Difficulties Involved
8-3. Infiniteness of Continuous Entropy
8-4. The Variability of the Entropy in the Continuous Case with Coordinate Systems
8-5. A Measure of Information in the Continuous Case
8-6. Maximization of the Entropy of a Continuous Random Variable
8-7. Entropy Maximization Problems
8-8. Gaussian Noisy Channels
8-9. Transmission of Information in the Presence of Additive Noise
8-10. Channel Capacity in Presence of Gaussian Additive Noise and Specified Transmitter and Noise Average Power
8-11. Relation Between the Entropies of Two Related Random Vari
8-12. Note on the Definition of Mutual Information
CHAPTER 9 Transmission of Band-limited Signals
9-1. Introduction
9-2. Entropies of Continuous Multivariate Distributions
9-3. Mutual Information of Two Gaussian Random Vectors
9-4. A Channel-capacity Theorem for Additive Gaussian Noise
9-5. Digression
9-6. Sampling Theorem
9-7. A Physical Interpretation of the Sampling Theorem
9-8. The Concept of a Vector Space
9-9. Fourier-series Signal Space
9-10. Band-limited Singal Space
9-11. Band-limited Ensembles
9-12. Entropies of Band-limited Ensemble in Signal Space
9-13. A Mathematical Model for Communication of Continuous Signals
9-14 Optimal Decoding
9-15. A Lower Bound for the Probability of Error
9-16. An Upper Bound for the Probability of Error.
9-17. Fundamental Theorem of Continuous Memoryless Channels in Presence of Additive Noise
9-18. Thomasian's Estimate
Part 3 : Schemes with Memory
CHAPTER 10 Stochastic Processes
10-1. Stochastic Theory
10-2. Examples of a Stochastic Process
10-3. Moments and Expectations
10-4. Stationary Processes
10-5. Ergodic Processes
10-6. Correlation Coefficients and Correlation Functions
10-7. Example of a Normal Stochastic Process
10-8. Examples of Computation of Correlation Functions
10-9. Some Elementary Properties of Correlation Functions Stationary Processes
10-10. Power Spectra and Correlation Functions
10-11. Response of Linear Lumped Systems to Ergodic Excitation
10-12. Stochastic Limits and Convergence
10-13. Stochastic Differentiation and Integration
10-14. Gaussian-process Example of a Stationary Process
10-15. The Over-all Mathematical Structure of the Stochastic Processes
10-16. A Relation between Positive Definite Functions and Theory of Probability
CHAPTER 11 Communication under Stochastic Regimes
11-1. Stochastic Nature of Communication
11-2. Finite Markov Chains
11-3. A Basic Theorem on Regular Markov Chains
11-4. Entropy of a Simple Markov Chain
11-5. Entropy of a Discrete Stationary Source
11-6. Discrete Channels with Finite Memory
11-7. Connection of the Source and the Discrete Channel with Memory
11-8. Connection of a Stationary Source to a Stationary Channel
Part 4 : Some Recent Developments
CHAPTER 12 The Fundamental Theorem of Information Theory
12-1. A Decision Scheme
12-2. The Probability of Error in a Decision Scheme
12-3. A Relation between Error Probability and Equivocation
12-4. The Extension of Discrete Memoryless Noisy Channels
12-5. On Certain Random Variables Associated with a Communication S
12-6. Feinstein's Lemma
12-7. Completion of the Proof
12-8. Ensemble Codes
12-9. A Relation between Transinformation and Error Probability
12-10. An Exponential Bound for Error Probability
12-11. The Code Book
12-12. A Lemma and Its Application
12-13. Estimation of Bounds
12-14. Completion of Wolfowitz's Proof
CHAPTER 13 Group Codes
13-1. Introduction
13-2. The Concept of a Group
13-3. Fields and Rings
13-4. Algebra for Binary n-Digit Words
13-5. Hammings Codes
13-6. Group Codes
13-7. A Detection Scheme for Group Codes
13-8. Slepian's Technique for Single-error Correcting Group Codes
13-9. Further Notes on Group Codes
13-10. Some Bounds on the Number of Words in a Systematic Code
APPENDIX Additional Notes and Tables
N-1 The Gambler with a Private Wire
N-2 Some Remarks on Sampling Theorem
N-3 Analytic Signals and the Uncertainty Relation
N-4 Elias's Proof of the Fundamental Theorem for BSC
N-5 Further Remarks oil Coding Theory
N-6 Partial Ordering of Channels
N-7 Information Theory and Radar Problems
T-1 Normal Probability Integral
T-2 Normal Distributions
T-3 A Summary of Some Common Probability Functions
T-4 Probability of No Error for Best Group Code
T-5 Parity-check Rules for Best Group Alphabets
T-6 Logarithms to the Base 2
T-7 Entropy of a Discrete Binary Source