STA 4321 – Discrete Distribution Project
Part 1: Poisson Distribution: Youth Soccer League
An elite coed youth soccer league is very balanced, with all teams at approximately the same skill level. There are 25 teams, and each team plays each of the other teams once (they have very patient parents). On average, in a conceptual population of an infinite number of games among these teams, the total number of goals scored is λ = 3.2 per game. Games are 30 minutes long, and no overtime is played. Suppose we break each game down into 20-second intervals, where the probability of 1 goal in a 20-second interval is p, and the probability of 0 goals is 1-p. Treating the Poisson Distribution as a limit of the Binomial Distribution as n gets large and p gets small (but np remains constant), and based on all previous information, answer the following parts:
- How many games will be played?
- If Y ≡ # of goals in a given game, what is the probability distribution of Y?
- What are the mean, variance, and standard deviation of Y (in the conceptual population of games)?
- Treating the game as a sequence of 20-second intervals, what is n? What is p?
- Simulate the season, where m1 is the number of games and m2 is the number of intervals per game, by completing the following steps in EXCEL (using the Analysis ToolPak add-in, which you may need to enable):
- Click on the Data Tab on the Tool Bar
- Click on Data Analysis
- Click on Random Number Generation
- For Number of Variables, choose m2
- For Number of Random Numbers, choose m1
- For Distribution, choose Uniform
- For Parameters, Between 0 and 1 (Default)
- For Random Seed, choose the last 4 digits of your UFID; this allows you to reproduce the same random numbers at a later time.
- For Output Options, select Output Range: A1
- Click OK
- You will now have an array of m1 rows (games) and m2 columns (game sub-intervals)
- Go to the first row of the (m2+1)st column and count the number of goals in that game, that is, the number of sub-intervals with random numbers ≤ p, by entering the following formula: =COUNTIF(A1:*1,"<="&p) where p is your numeric value from the earlier part and * represents the letter(s) of the m2-th column (for instance, if m2=26, *=Z; if m2=27, *=AA).
- After your answer appears, double-click the fill handle (the small box in the southeast corner of the cell) to copy the formula down all m1 rows.
- Obtain the empirical probability distribution by counting the number of occurrences of each outcome and dividing it by the number of games, and compare it to the theoretical probability distribution. Repeat for the mean, variance, and standard deviation.
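For those who want to cross-check the spreadsheet work, the same procedure can be sketched in Python. This is only an illustrative sketch: the function names, the seed 1234, and the inputs M1, M2, and LAM are placeholders chosen here and are deliberately not the assignment's values, and Python's random number generator will not reproduce Excel's numbers even with the same seed.

```python
import math
import random
from collections import Counter
from statistics import mean, variance, stdev


def simulate_counts(m1, m2, p, seed):
    """Mimic the m1-by-m2 grid: count the uniforms <= p in each row."""
    rng = random.Random(seed)
    return [sum(rng.random() <= p for _ in range(m2)) for _ in range(m1)]


def empirical_pmf(counts):
    """Relative frequency of each observed outcome."""
    freq = Counter(counts)
    return {y: freq[y] / len(counts) for y in sorted(freq)}


def poisson_pmf(y, lam):
    """Poisson probability P(Y = y) for mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


# Placeholder inputs (NOT the assignment's answers); substitute your own.
M1, M2, LAM, SEED = 1000, 60, 2.0, 1234
P = LAM / M2  # keeps n * p equal to the Poisson mean

counts = simulate_counts(M1, M2, P, SEED)
pmf_hat = empirical_pmf(counts)

for y, rel_freq in pmf_hat.items():
    print(f"y={y}  empirical={rel_freq:.4f}  Poisson={poisson_pmf(y, LAM):.4f}")

# Sample variance and standard deviation (n - 1 divisor).
print("mean", mean(counts), "variance", variance(counts), "sd", stdev(counts))
```

The empirical column should track the Poisson column more closely as the number of simulated games grows.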
Part 2: Bernoulli Trials: Binomial and Negative Binomial Distributions
A fashion design team has prepared thousands of items during its production season. A large retail chain, a potential buyer, would purchase 45% of these items to mass-produce; it considers the other 55% not potentially profitable for its market segment. Consider the following 2 scenarios.
Scenario 1: Due to logistics and time constraints of the meeting, they decide to bring 10 items to the meeting with the retail chain’s buyers. Let Y ≡ # of Items purchased.
- What is the probability distribution of Y?
- What are the Mean, Variance, and Standard Deviation of Y?
- Simulate 500 possible meetings (with 500 different random samples from their population of items). Note: This will be similar to part 1, with 500 rows, and 10 columns.
- Obtain the empirical probability distribution by counting the number of occurrences of each outcome and dividing it by the number of meetings, and compare it to the theoretical probability distribution. Repeat for the mean, variance, and standard deviation.
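As a cross-check on the Excel work for Scenario 1, the comparison can be sketched in Python. The sketch assumes the 500-by-10 layout and the 45% purchase rate stated above; the seed 1234 and all variable names are placeholders, and the simulated values will differ from Excel's.

```python
import math
import random
from collections import Counter
from statistics import mean, variance

N_ITEMS, P_BUY, N_MEETINGS = 10, 0.45, 500


def binomial_pmf(y, n, p):
    """Binomial probability of y successes in n independent trials."""
    return math.comb(n, y) * p ** y * (1 - p) ** (n - y)


rng = random.Random(1234)  # placeholder seed
# One row per meeting: an item is "purchased" when its uniform draw is <= P_BUY.
purchases = [
    sum(rng.random() <= P_BUY for _ in range(N_ITEMS))
    for _ in range(N_MEETINGS)
]

freq = Counter(purchases)
print(" y  empirical  theoretical")
for y in range(N_ITEMS + 1):
    print(f"{y:2d}  {freq[y] / N_MEETINGS:9.3f}  {binomial_pmf(y, N_ITEMS, P_BUY):11.3f}")

# Theoretical moments computed directly from the pmf.
theo_mean = sum(y * binomial_pmf(y, N_ITEMS, P_BUY) for y in range(N_ITEMS + 1))
theo_var = sum(
    (y - theo_mean) ** 2 * binomial_pmf(y, N_ITEMS, P_BUY)
    for y in range(N_ITEMS + 1)
)
print("mean:", mean(purchases), "vs", theo_mean)
print("variance:", variance(purchases), "vs", theo_var)
```

Computing the theoretical mean and variance from the pmf, rather than from a closed-form formula, gives an independent check on the hand calculation.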
Scenario 2: The buyer comes to the design team’s studio, and says they will view items until they have chosen 3 items to purchase. Let Y ≡ # of items shown until the 3rd item is purchased.
- What is the probability distribution of Y?
- What are the Mean, Variance, and Standard Deviation of Y?
- Simulate 200 possible meetings (with 200 different random samples from their population of items). Note: This will be similar to part 1, with 200 rows, and (say) 20 columns.
- Obtain the empirical probability distribution by counting the number of occurrences of each outcome and dividing it by the number of meetings, and compare it to the theoretical probability distribution. Repeat for the mean, variance, and standard deviation.
- Hint for Scenario 2. After generating your 200 rows by 20 columns of Uniform RVs in cells A1:T200, do the following:
- In Cell U1 Type: =COUNTIF($A1:A1,"<="&0.45)
- Click on the box in the Southeast Corner and drag to Cell AN1
- While Cells U1:AN1 are still highlighted, double-click the square in the SE corner of cell AN1 to copy the formulas down to rows 2:200
- In Cell AO1, copy and paste the following command, which will save the trial on which the 3rd item was selected for purchase.
=if(w1=3,3,if(x1=3,4,if(y1=3,5,if(z1=3,6,if(aa1=3,7,if(ab1=3,8,if(ac1=3,9,if(ad1=3,10,if(ae1=3,11,if(af1=3,12,if(ag1=3,13,if(ah1=3,14,if(ai1=3,15,if(aj1=3,16,if(ak1=3,17,if(al1=3,18,if(am1=3,19,if(an1=3,20))))))))))))))))))
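The logic of the hint (keep a running count of purchases across the row, then report the first trial at which the count reaches 3) can also be sketched in Python. The function names and the seed are placeholders. One point to be aware of: a row with fewer than 3 purchases in its 20 draws has no answer. The sketch returns None in that case, whereas the nested IF formula returns FALSE, so such rows (which should be rare) need to be handled before computing summaries.

```python
import math
import random
from itertools import accumulate


def trial_of_rth_success(row, p, r):
    """Return the 1-based trial of the r-th success, or None if never reached."""
    # Running count of successes, like the cumulative COUNTIF columns.
    running = accumulate(1 if u <= p else 0 for u in row)
    for trial, count in enumerate(running, start=1):
        if count == r:
            return trial
    return None


def neg_binomial_pmf(y, r, p):
    """P(the r-th success occurs on trial y)."""
    if y < r:
        return 0.0
    return math.comb(y - 1, r - 1) * p ** r * (1 - p) ** (y - r)


P_BUY, R, N_ROWS, N_COLS = 0.45, 3, 200, 20
rng = random.Random(1234)  # placeholder seed
rows = [[rng.random() for _ in range(N_COLS)] for _ in range(N_ROWS)]

results = [trial_of_rth_success(row, P_BUY, R) for row in rows]
observed = [y for y in results if y is not None]  # drop unfinished rows

for y in range(R, N_COLS + 1):
    rel_freq = observed.count(y) / N_ROWS
    print(f"y={y:2d}  empirical={rel_freq:.3f}  theoretical={neg_binomial_pmf(y, R, P_BUY):.3f}")
print("rows that never reached", R, "purchases:", results.count(None))
```

Because the simulation is cut off at 20 columns, the empirical distribution is slightly truncated relative to the theoretical one, which has no upper limit on Y.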