A Selection of Problems from A.A. Markov’s Calculus of Probabilities: Problem 8 – Sums of Independent Random Variables

Author(s): 
Alan Levine (Franklin and Marshall College)

 

This problem is very different from the others in this chapter. It is a variation of the material covered in Chapter III, where Markov was concerned with theorems about limiting probabilities, such as the Central Limit Theorem. As with the previous problems, there is ample room for the reader to fill in mathematical details.

 

Задача 8ая.  Пусть \(X_1, X_2,\dots, X_n\) будут \(n\) независимых величин и пусть совокупность чисел \(1, 2, 3,\dots, m\) представляет все возможные, и при том равновозможные значения каждой из них.

Требуется найти вероятность, что сумма \(X_1, X_2,\dots, X_n\) будет равна данному числу.

8th Problem. Let \(X_1, X_2, \dots, X_n\) be \(n\) independent variables and let the set of numbers \(1,2,3,\dots,m\) represent all possible, equally likely values of each of them.

It is required to find the probability that the sum \(X_1 + X_2 + \cdots + X_n\) will be equal to some given number.


 

A Selection of Problems from A.A. Markov’s Calculus of Probabilities: Problem 8 – Solution, Part 1


Solution:  Letting \(n= 1,2,3,\dots\) consecutively, we arrive at the conclusion that, for any value of \(n\), the probability of the equation \(X_1 + X_2 + \cdots + X_n = \alpha\), where \(\alpha\) is a given number, can be determined as the coefficient of \(t^\alpha\) in the expansion of the expression \[\left\lbrace\frac{t + t^2 + \cdots + t^m}{m} \right\rbrace ^n\] in powers of the arbitrary number \(t\).17
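To make the generating-function argument concrete, here is a minimal Python sketch (the function name `sum_distribution` and the use of exact `Fraction` arithmetic are editorial choices, not Markov's): multiplying the polynomial \(\frac{t + t^2 + \cdots + t^m}{m}\) by itself \(n\) times and reading off the coefficient of \(t^\alpha\) gives \(P_\alpha\).

```python
from fractions import Fraction

def sum_distribution(n, m):
    """Distribution of X_1 + ... + X_n, each X_k uniform on {1, ..., m}.

    Works by repeated convolution, i.e., multiplying out the generating
    function ((t + t^2 + ... + t^m)/m)^n; the value stored at alpha is
    the coefficient of t^alpha, which is exactly P_alpha."""
    # Distribution of a single variable: P(X = i) = 1/m.
    dist = {i: Fraction(1, m) for i in range(1, m + 1)}
    for _ in range(n - 1):
        new = {}
        for s, p in dist.items():
            for i in range(1, m + 1):
                new[s + i] = new.get(s + i, Fraction(0)) + p * Fraction(1, m)
        dist = new
    return dist

# Coefficient of t^10 in ((t + ... + t^6)/6)^3: the chance that three
# dice sum to 10.
print(sum_distribution(3, 6)[10])   # 1/8, i.e., 27/216
```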

On the other hand, we have \[\left\lbrace\frac{t + t^2 + \cdots + t^m}{m} \right\rbrace ^n = \frac{t^n}{m^n} \frac{(1-t^m)^n}{(1-t)^n} = \] \[ = \frac{t^n}{m^n} \left[ 1 - n t^m + \frac{n(n-1)}{1\cdot 2} t^{2m} - \cdots\right] \left[1 + nt + \frac{n(n+1)}{1\cdot 2} t^2 + \frac{n(n+1)(n+2)}{1\cdot 2\cdot 3} t^3 + \cdots \right]. \]

Therefore, denoting the probability of the equation \(X_1 + X_2 + \cdots + X_n = \alpha\) by the symbol \(P_\alpha\), we can establish the formula: \[m^n P_\alpha = \frac{n(n+1) \cdots (\alpha-1)}{1 \cdot 2 \cdot \cdots \cdot (\alpha-n)} - \frac{n}{1} \frac{n(n+1) \cdots (\alpha-m-1)}{1 \cdot 2 \cdot \cdots \cdot (\alpha-n- m)} + \frac{n(n-1)}{1\cdot 2} \frac{n(n+1) \cdots (\alpha-2m-1)}{1 \cdot 2 \cdot \cdots \cdot (\alpha-n-2m)} - \cdots,\] which represents an easy way of calculating \(P_\alpha\) for small values of \(\alpha\).  

It is also not hard to show the equation \(P_{\alpha} = P_{n(m+1) - \alpha}\), which allows us to substitute the difference \(n(m+1) - \alpha\) for the number \(\alpha\) and, in this way, gives the possibility of decreasing \(\alpha\) whenever \(\alpha > \frac{n(m+1)}{2}\).
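A short sketch, under the same assumptions, that implements Markov's alternating-sum formula and checks the symmetry \(P_\alpha = P_{n(m+1)-\alpha}\); it uses the fact that each factor \(\frac{n(n+1)\cdots(\alpha - jm - 1)}{1 \cdot 2 \cdots (\alpha - n - jm)}\) is the binomial coefficient \(\binom{\alpha - jm - 1}{n-1}\). The name `markov_P` is hypothetical.

```python
from math import comb

def markov_P(alpha, n, m):
    """Markov's alternating sum for m^n * P_alpha, as an integer.
    The j-th term is (-1)^j * C(n, j) * C(alpha - j*m - 1, n - 1);
    the terms stop once alpha - n - j*m becomes negative."""
    total, j = 0, 0
    while alpha - n - j * m >= 0:
        total += (-1) ** j * comb(n, j) * comb(alpha - j * m - 1, n - 1)
        j += 1
    return total

n, m = 3, 6
for alpha in range(n, n * m + 1):
    # Symmetry: P_alpha = P_{n(m+1) - alpha}
    assert markov_P(alpha, n, m) == markov_P(n * (m + 1) - alpha, n, m)
print(markov_P(9, 3, 6), markov_P(10, 3, 6))   # 25 27, matching 216*P_9, 216*P_10
```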

 



[17] Another instance of the use of probability generating functions. This conclusion is correct because he has assumed that, for each \(n\), \(P(X_n = i) = \frac{1}{m}\) for \(i = 1,2,\dots, m\).

 

A Selection of Problems from A.A. Markov’s Calculus of Probabilities: Problem 8 – Solution, Numerical Example


For example, for \(m = 6\) and \(n = 3\), we find:

\[216 P_{18} = 216 P_3 = 1, \ \ \ 216 P_{17} = 216 P_4 = 3, \]
\[216 P_{16}= 216 P_5 = \frac{3\cdot 4}{1 \cdot 2}  = 6, \ \ \ 216 P_{15}= 216 P_6 = \frac{3\cdot 4\cdot 5}{1 \cdot 2\cdot 3} = 10,\]
\[ 216 P_{14}= 216 P_7 = \frac{3\cdot 4\cdot 5\cdot 6}{1 \cdot 2\cdot 3 \cdot 4} = 15,\]
\[ 216 P_{13}= 216 P_8 = \frac{3\cdot 4\cdot 5\cdot 6 \cdot 7}{1 \cdot 2\cdot 3\cdot 4 \cdot 5} = 21,\]
\[ 216 P_{12}= 216 P_9 = \frac{3\cdot 4\cdot 5\cdot 6 \cdot 7 \cdot 8}{1 \cdot 2\cdot 3\cdot 4 \cdot 5 \cdot 6} - 3 = 25,\]
\[ 216 P_{11}= 216 P_{10} = \frac{3\cdot 4\cdot 5\cdot 6 \cdot 7 \cdot 8 \cdot 9}{1 \cdot 2\cdot 3\cdot 4 \cdot 5 \cdot 6 \cdot 7} - 3 \cdot 3  = 27.\]

Three ordinary six-sided dice, whose sides are numbered 1, 2, 3, 4, 5, 6, can serve to illustrate this example.

If these three dice are thrown on a surface and if \(X_1, X_2, X_3\) denote the numbers on the upper sides, then \(\{1, 2, 3, 4, 5, 6\}\) will be the exhaustive and equally likely values of \(X_1\), as well as of \(X_2\) and \(X_3\).

Corresponding to these, the numbers \(P_3, P_4, P_5, \dots, P_{18}\) represent the probabilities of the different assumptions on the sum of the numbers,18 revealed when the three dice are thrown.

And the equation \(P_3 + P_4 + P_5 + \cdots + P_{10} = P_{11} + P_{12} + P_{13} + \cdots + P_{18}\) shows the equal probability of the assumptions that this sum does not exceed 10 and, conversely, that it is bigger than 10.
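The values above are easy to confirm by brute force; a quick Python check, enumerating all \(6^3 = 216\) equally likely outcomes:

```python
from itertools import product

# Tally the sums over all 216 equally likely outcomes of three dice,
# reproducing Markov's values of 216 * P_alpha.
counts = {}
for faces in product(range(1, 7), repeat=3):
    s = sum(faces)
    counts[s] = counts.get(s, 0) + 1

print([counts[a] for a in range(3, 19)])
# [1, 3, 6, 10, 15, 21, 25, 27, 27, 25, 21, 15, 10, 6, 3, 1]

# The two halves are equally likely: P(sum <= 10) = P(sum > 10) = 1/2.
print(sum(counts[a] for a in range(3, 11)))   # 108 of 216
```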

Figure 7. Three dice made of bone or ivory, ca 9th–10th century CE, excavated from Nishapur, Iran.
Metropolitan Museum of Art, Rogers Fund, 1938, public domain.

 


 


[18] The text has the last entry in this list as \(P_{21}\), which clearly is incorrect. The sum of the faces on three dice cannot exceed 18.

 

A Selection of Problems from A.A. Markov’s Calculus of Probabilities: Problem 8 – Solution, Part 2


For large values of \(n\), the exact calculation of \(P_\alpha\) requires tedious computations and can hardly be of great interest.

The question then arises of finding an approximate expression for this probability that is both simple and close to exact.

Assuming that only \(n\) is large, while \(m\) is not, and considering not the probability of a particular value of the sum \(X_1 + X_2 + \cdots +X_n\) but rather the probability that this sum lies within given limits, we can return to the general approximate calculations of Chapter III.

To apply these, we should find the mathematical expectation of the first and second powers of the considered variables.

Since the mathematical expectation of any of these variables \(X_1, X_2, \dots, X_n\) is equal to \(\frac{1+2 + \cdots + m}{m} = \frac{m+1}{2}\), and the mathematical expectation of their squares is equal to \(\frac{1^2 + 2^2 + \cdots + m^2}{m} = \frac{(m+1)(2m+1)}{6}\), the difference between the mathematical expectation of their squares and the square of their mathematical expectation19 reduces to \[\frac{(m+1)(2m+1)}{6} - \left(\frac{m+1}{2}\right) ^2 = \frac{m^2 - 1}{12},\] and therefore, the results of the third chapter give, for the probability of the inequalities \[n \frac{m+1}{2} - \tau \sqrt{n \frac{m^2 - 1}{6}} < X_1 + X_2 + \cdots + X_n < n \frac{m+1}{2}  + \tau \sqrt{n \frac{m^2 - 1}{6}} ,\] an approximate expression in the form of the known integral20 \[\frac{2}{\sqrt{\pi}} \int_0^\tau e^{-z^2}\, dz.\]
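As a sanity check on this approximation, note that \(\frac{2}{\sqrt{\pi}} \int_0^\tau e^{-z^2}\, dz\) is precisely the error function \(\operatorname{erf}(\tau)\). The following sketch (the illustrative values \(n = 10\), \(m = 6\), \(\tau = 1\) are my choice) compares the exact probability of the inequalities with Markov's approximation:

```python
from math import erf, sqrt

def exact_prob_within(n, m, tau):
    """Exact P( |S_n - n(m+1)/2| < tau * sqrt(n(m^2 - 1)/6) ),
    computed by convolving the uniform distribution n times."""
    dist = {i: 1.0 / m for i in range(1, m + 1)}
    for _ in range(n - 1):
        new = {}
        for s, p in dist.items():
            for i in range(1, m + 1):
                new[s + i] = new.get(s + i, 0.0) + p / m
        dist = new
    center = n * (m + 1) / 2
    half_width = tau * sqrt(n * (m * m - 1) / 6)
    return sum(p for s, p in dist.items() if abs(s - center) < half_width)

n, m, tau = 10, 6, 1.0
print(exact_prob_within(n, m, tau))   # exact value, by enumeration
print(erf(tau))                       # erf(1) ≈ 0.8427; agreement within about 0.01
```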

We will take advantage of this particular example to show another derivation of this approximate expression for the probability, one which can be applied in the general case.

And before anything, we note that, in the expansion of any entire function \(F(t)\) in powers of \(t\), the coefficient of \(t^\alpha\) can be represented in the form of the integral \[\frac{1}{2\pi} \int_{-\pi}^{\pi}  F(e^{\phi\sqrt{-1}} )e^{-\alpha\phi \sqrt{-1}}\,d\phi,\] because \(\int_{-\pi}^\pi d\phi = 2\pi\), and, for any integer \(k\) different from 0, we have \(\int_{-\pi}^\pi e^{k\phi \sqrt{-1}} \, d\phi = 0\).21

Therefore,22 \[\begin{align} P_\alpha&= \frac{1}{2\pi} \int_{-\pi}^\pi \frac{ e^{(n-\alpha) \phi \sqrt{-1}}\left(1-e^{m \phi \sqrt{-1}}\right)^n}{m^n\left(1-e^{\phi \sqrt{-1}}\right)^n} \, d\phi \\
    &=\frac{1}{2\pi} \int_{-\pi}^\pi \frac{ e^{ (n \frac{m+1}{2} -\alpha) \phi \sqrt{-1}} \left(e^{\frac{m}{2} \phi \sqrt{-1}} - e^{-\frac{m}{2} \phi \sqrt{-1}}\right)^n}{m^n \left(e^{\frac{1}{2} \phi \sqrt{-1}} - e^{-\frac{1}{2} \phi \sqrt{-1}}\right)^n}\,d\phi \\
    &=\frac{1}{2\pi} \int_{-\pi}^\pi e^{(n \frac{m+1}{2} - \alpha) \phi \sqrt{-1}} \left( \frac{\sin \frac{m}{2} \phi}{m \sin \frac{\phi}{2}}\right)^n\, d\phi \\
    &= \frac{1}{\pi} \int_0^\pi \cos \left(n \frac{m+1}{2} - \alpha\right) \phi\  \left( \frac{\sin \frac{m}{2} \phi}{m \sin \frac{\phi}{2}}\right)^n\, d \phi.
\end{align}\]
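The final integral can be checked numerically. Here is a sketch using the midpoint rule (the function name `P_via_integral` and the step count are editorial choices); for the three-dice example it should reproduce \(P_{10} = 27/216\):

```python
from math import sin, cos, pi

def P_via_integral(alpha, n, m, steps=20000):
    """Evaluate Markov's integral
        P_alpha = (1/pi) * integral_0^pi of
                  cos((n(m+1)/2 - alpha) phi) * (sin(m phi/2) / (m sin(phi/2)))^n
    by the midpoint rule; midpoints avoid the removable point phi = 0."""
    h = pi / steps
    total = 0.0
    for k in range(steps):
        phi = (k + 0.5) * h
        kernel = sin(m * phi / 2) / (m * sin(phi / 2))
        total += cos((n * (m + 1) / 2 - alpha) * phi) * kernel ** n
    return total * h / pi

# Three dice again: the integral representation is exact, so up to
# quadrature error this returns 27/216 = 0.125.
print(P_via_integral(10, 3, 6))   # ≈ 0.125
```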

 


 


[19] In other words, the variance.

[20] Markov never defines the normal distribution, as we know it today. When he talks about continuous random variables for the first time in Chapter V, his only examples are those which have a constant density function (i.e., what we would now call the “uniform” distribution) and one whose density is defined for all \(x\) and decreases as \(|x|\) increases. He then claims that the density is proportional to \(e^{-x^2}\), although there are many other such possibilities. This gives us a “normal distribution” with mean 0 and variance \(\frac12\).  So, his approximation can be written as \[P\left(\left| \frac{S_n - \mu}{\sqrt{2} \sigma} \right| < \tau \right) = \frac{2}{\sqrt{\pi}} \int_0^\tau e^{-z^2}\, dz.\]

[21] This is a Fourier transform. Markov chose not to use the symbol \(i\) to represent \(\sqrt{-1}\), although that notation had been around for at least a century.

[22] He is using the facts that \(e^\frac{ix}{2} - e^{- \frac{ix}{2}} = 2i \sin\left(\frac{x}{2}\right)\) and \(\cos(x) = \mathrm{Re}\,(e^{ix})\).

 

A Selection of Problems from A.A. Markov’s Calculus of Probabilities: Problem 8 – Solution, Part 3


Returning to the approximate calculation and letting \(n \frac{m+1}{2} - \alpha = \beta = \gamma \sqrt{n \frac{m^2-1}{6}}\), we substitute23 the exponential function \(e^{-n \frac{m^2-1}{24} \phi^2}\) for \(\left( \frac{\sin \frac{m}{2} \phi}{m \sin \frac{\phi}{2}}\right)^n\) based on the concept shown in Chapter III, and for the upper limit of the integral, we put \(\infty\) in place of \(\pi\).

In this way, we get the approximate formula \[P_{n\frac{m+1}{2} - \beta} \approx \frac{1}{\pi} \int_0^\infty \cos\beta\phi \cdot e^{-n\frac{m^2-1}{24} \phi^2}\, d\phi,\] whose right side is equal to \[\frac{1}{\sqrt{\pi}} \sqrt{\frac{6}{n(m^2-1)}} e^{-\frac{6 \beta^2}{n(m^2-1)}} = \frac{1}{\sqrt{\pi}} \sqrt{\frac{6}{n(m^2-1)}} e^{-\gamma^2}.\]
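A small sketch of this local approximation (the helper name `P_local` is hypothetical); even for \(n = 3\), \(m = 6\) it already lands in the neighborhood of the exact value:

```python
from math import exp, sqrt, pi

def P_local(alpha, n, m):
    """Markov's local approximation
    (1/sqrt(pi)) * sqrt(6/(n(m^2-1))) * e^{-gamma^2},
    with gamma = (n(m+1)/2 - alpha) / sqrt(n(m^2-1)/6)."""
    gamma = (n * (m + 1) / 2 - alpha) / sqrt(n * (m * m - 1) / 6)
    return exp(-gamma ** 2) * sqrt(6 / (n * (m * m - 1))) / sqrt(pi)

print(P_local(10, 3, 6))   # ≈ 0.133, versus the exact 27/216 = 0.125
```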

According to these, the probability of the inequalities \[n \frac{m+1}{2} - \tau \sqrt{n \frac{m^2 - 1}{6}} < X_1 + X_2 + \cdots + X_n < n \frac{m+1}{2}  + \tau \sqrt{n \frac{m^2 - 1}{6}}\] is represented approximately by the sum of all products \(\frac{1}{\sqrt{\pi}} \sqrt{\frac{6}{n(m^2-1)}} e^{-\gamma^2}\) for which \(\gamma\) satisfies \(-\tau < \gamma < \tau\) and turns the expression \(n \frac{m+1}{2} - \gamma \sqrt{n \frac{m^2 - 1}{6}}\) into an integer.

All terms in the sum shown contain a factor \(\sqrt{\frac{6}{n(m^2-1)}}\), which is equal to the difference between each two adjacent values of \(\gamma\) and will be arbitrarily small for sufficiently large \(n\).

Replacing on this basis the sum by the integral, we get for the probability of the inequalities \[n \frac{m+1}{2} - \tau \sqrt{n \frac{m^2 - 1}{6}} < X_1 + X_2 + \cdots + X_n < n \frac{m+1}{2}  + \tau \sqrt{n \frac{m^2 - 1}{6}}\] the previous approximate expression24 \[\frac{2}{\sqrt{\pi}} \int_0^\tau e^{-\gamma^2}\, d\gamma.\]
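Markov's passage from the sum to the integral can also be seen numerically: summing the local terms over all integers \(\alpha\) with \(-\tau < \gamma < \tau\) is a Riemann sum with mesh \(\sqrt{\frac{6}{n(m^2-1)}}\), and it approaches \(\frac{2}{\sqrt{\pi}}\int_0^\tau e^{-\gamma^2}\,d\gamma = \operatorname{erf}(\tau)\) as \(n\) grows. A sketch (the function name and the choice \(n = 100\) are mine):

```python
from math import exp, sqrt, pi, erf, ceil, floor

def riemann_sum(n, m, tau):
    """Sum the local terms (1/sqrt(pi)) sqrt(6/(n(m^2-1))) e^{-gamma^2}
    over all integers alpha with |gamma| < tau. Adjacent alphas sit
    sqrt(6/(n(m^2-1))) apart on the gamma-scale, so this is a Riemann
    sum for (1/sqrt(pi)) * integral of e^{-gamma^2} over (-tau, tau)."""
    scale = sqrt(n * (m * m - 1) / 6)   # one gamma-unit, in alpha-units
    center = n * (m + 1) / 2
    total = 0.0
    for alpha in range(ceil(center - tau * scale), floor(center + tau * scale) + 1):
        gamma = (center - alpha) / scale
        total += exp(-gamma ** 2) / (scale * sqrt(pi))
    return total

print(riemann_sum(100, 6, 1.0))   # close to erf(1), and closer as n grows
print(erf(1.0))                   # ≈ 0.8427
```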

 

This is the end of the solution to this problem.

 


 


[23] The Taylor series for \(e^{-n \frac{m^2-1}{24} \phi^2}\) and \(\left( \frac{\sin \frac{m}{2} \phi}{m \sin \frac{\phi}{2}}\right)^n\) both begin with \(1 + \frac{n}{24} (1-m^2) \phi^2\). The third terms of these series differ by \(\frac{n}{2880}(1-m^4)\phi^4\). Thus, for sufficiently small \(\phi\), the two functions are approximately equal.

[24] In essence, this example shows that the sum of independent, identically-distributed discrete random variables can be approximated by a normal distribution with appropriate mean and standard deviation. In the previous chapter, Markov showed that the distribution of the sample mean can also be approximated by an appropriate normal distribution; i.e., the Central Limit Theorem. He also relaxed the requirement that the variables have the same mean and variance.