Histogram Approaches to Probability Density Functions and Probability Mass Functions
This project will explore how a histogram approaches a probability density function (pdf or PDF) or a probability mass function (pmf or PMF) for a random variable and illustrate how to use the pmf/pdf to compute probabilities.
HISTOGRAMS, PDFS AND PMFS
By now, you should all know about histograms; after all, we discussed them in ENES101. We take a large collection of events from multiple independent repetitions of a specific experiment, assign numbers to those events by means of a random variable, and then sort the values of the random variable into bins, counting the number in each bin. If we divide the number in each bin (i.e., the value of the histogram in that bin) by the total number of independent trials, we form a naïve estimate of the probability that the value of the random variable falls in that bin. Clearly, if we sum up all of these estimates, we should compute exactly 1.000…, because we will have accounted for all of the trials. Thus, in a simple view, the scaled histogram represents the scaled probability density function (if the random variable is continuous) or the probability mass function (if the random variable is discrete). As the number of trials increases, the histogram should more and more closely approach the analytical pdf or the analytical pmf evaluated at the center of each bin.
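The scaling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name scaled_histogram, the four-valued random variable, the trial count, and the seed are all hypothetical choices, not part of the assignment.

```python
import random
from collections import Counter


def scaled_histogram(samples):
    """Return {bin value: fraction of trials that landed in that bin}."""
    counts = Counter(samples)          # raw histogram: occurrences per value
    n_trials = len(samples)            # total number of independent trials
    return {value: count / n_trials for value, count in sorted(counts.items())}


# Hypothetical experiment: 1000 trials of a random variable taking values 1..4.
rng = random.Random(0)                 # fixed seed so the run is repeatable
samples = [rng.randint(1, 4) for _ in range(1000)]

estimate = scaled_histogram(samples)
for value, fraction in estimate.items():
    print(f"bin {value}: estimated probability {fraction:.3f}")

# The estimates account for every trial, so they sum to 1 (up to rounding).
print("sum of estimates:", sum(estimate.values()))
```

Because each bin count is divided by the same total, the sum can differ from 1 only by floating-point rounding, whatever the number of trials.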
PMF for a single fair die
Using the MATLAB function randi(imax, m, n), model the number of dots showing on a fair six-sided die. In this case the number of dots – that is, the resulting random integer – is the random variable. Each element of the returned matrix of values is one trial. Generate histograms using 120; 1200; 12,000; and 120,000 trials, and generate the scaled histogram for each, as described above. In each case, compute the (sample) mean and (sample) variance of your trials.
For those programming in a language other than MATLAB, randi(imax, m, n) creates an m-by-n array of uniformly distributed random integers in the range 1 to imax.
Hint: use the appropriate MATLAB functions for this! Discuss your observations as the number of points increases. How do the histograms vary (or not) from what you expect? As Figures 1 through 4 show, as the number of trials increases, the sample mean and sample variance approach the expected values.
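For readers working outside MATLAB, the experiment can be sketched in Python with the standard library. This is a minimal sketch, not the assignment's reference solution: random.randint stands in for randi, the helper names roll_die and summarize are hypothetical, and the seed is arbitrary.

```python
import random
import statistics
from collections import Counter


def roll_die(n_trials, rng, sides=6):
    """Simulate n_trials rolls of a fair die (a stand-in for MATLAB's randi)."""
    return [rng.randint(1, sides) for _ in range(n_trials)]


def summarize(rolls, sides=6):
    """Return the scaled histogram, sample mean, and sample variance."""
    counts = Counter(rolls)
    # Scaled histogram: fraction of trials that showed each face 1..sides.
    pmf_estimate = [counts[face] / len(rolls) for face in range(1, sides + 1)]
    return pmf_estimate, statistics.mean(rolls), statistics.variance(rolls)


rng = random.Random(2024)  # arbitrary seed, chosen only for repeatability
results = {}
for n_trials in (120, 1200, 12_000, 120_000):
    pmf_estimate, mean, variance = summarize(roll_die(n_trials, rng))
    results[n_trials] = (pmf_estimate, mean, variance)
    bars = " ".join(f"{p:.3f}" for p in pmf_estimate)
    print(f"{n_trials:>7} trials: mean={mean:.4f} variance={variance:.4f} pmf=[{bars}]")
```

With more trials, each scaled bar should settle near 1/6 ≈ 0.1667, while the sample mean and variance should settle near 3.5 and 2.9167. The exact numbers depend on the seed.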
Comparing Sample Mean and Variance with Analytical Values
For this problem, the analytical expected value, or mean, is 3.5, and the analytical variance is 2.9167. How do the (sample) mean and (sample) variance compare with the analytical values?
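The analytical mean and variance of a fair six-sided die follow directly from its pmf. The notation here is an assumption introduced for illustration: X is the number of dots and each face k = 1, ..., 6 has probability 1/6.

```latex
\begin{align*}
E[X]   &= \sum_{k=1}^{6} k \cdot \frac{1}{6} = \frac{21}{6} = 3.5 \\
E[X^2] &= \sum_{k=1}^{6} k^2 \cdot \frac{1}{6} = \frac{91}{6} \\
\operatorname{Var}(X) &= E[X^2] - \left(E[X]\right)^2
        = \frac{91}{6} - \frac{49}{4} = \frac{35}{12} \approx 2.9167
\end{align*}
```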
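As Figure 4 shows, with 120,000 trials the sample variance is 2.9228, which agrees closely with the analytical value of 2.9167. Note that the value of 0.166 reported as the mean matches the per-face probability 1/6 ≈ 0.1667 (the height of each bar in the scaled histogram) rather than the sample mean, which should lie near 3.5.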
Figure 1. 120 trials (mean = 0.166, variance = 2.6048)
Figure 2. 1200 trials (mean = 0.166, variance = 3.0075)
Figure 3. 12000 trials (mean = 0.166, variance = 2.9159)
Figure 4. 120000 trials (mean = 0.166, variance = 2.9228)
Figure 5. P1=0.5 and m=20
Figure 6. P1=0.5 and m=200
Figure 7. P1=0.5 and m=2000
Figure 8. P1=0.5 and m=20000
Figure 9. P1=0.1 and m=20
Figure 10. P1=0.1 and m=200
Figure 11. P1=0.1 and m=2000
Figure 12. P1=0.1 and m=20000
Figure 13. P1=0.9 and m=20
Figure 14. P1=0.9 and m=200
Figure 15. P1=0.9 and m=2000
Figure 16. P1=0.9 and m=20000
Figure 17. Exponential distribution with 10 trials
Figure 18. Exponential distribution with 100 trials
Figure 19. Exponential distribution with 1000000 trials
Figure 20. with 10 trials
Figure 21. with 100 trials
Figure 22. with 100000 trials
Figure 23. with 10 trials
Figure 24. with 100 trials
Figure 25. with 100000 trials
Figure 26.