Understanding graphs and logarithms
The aim of this activity is to improve your understanding of what a logarithm is and why people use it, and to deepen your understanding of how to interpret graphs.
The activity is based on an episode of the television series NUMB3RS, which has been shown in the US for some years, but which is currently only available on ITV3 here in the UK. It follows the exploits of an FBI agent, Don, and his brother Charlie, a mathematician who helps him solve crimes.
NUMB3RS Activity: The Graph Tells the Story
In “Backscatter,” Don and his team bust a group of Internet hackers associated with the Russian mob. The mob retaliates, and Charlie must help solve the Internet scheme before the situation worsens. Charlie uses a technique called backscatter analysis to help track the flow and distribution of an Internet attack. He uses the analysis to figure out the prevalence of denial-of-service attacks on the Internet. He gathers data to assess the number, duration, and focus of attacks, and to characterize their behaviour. The process is very complicated.
Suppose that Charlie has finally produced the two graphs shown below and needs to interpret them. The list of attacks on Don’s Internet provider over the course of one week is first sorted into increasing order according to the duration of each attack. The first graph shows the cumulative distribution of these attack durations. The second graph shows the probability density of attack durations for Don’s Internet provider over the course of one week.
[Source:
1. Describe the units on the x-axis. Explain how Charlie can use three different units to label the same scale. (This scale is called a logarithmic scale. For more information about logarithmic scales, see the Extensions page.)
2. Explain the units on the y-axis. Why do you think the units are written in this way? (This scale is also a logarithmic scale.)
3. According to the graph, what is the cumulative % Attacks for an attack duration of 2 minutes or less? 10 minutes or less? 12 hours or less?
4. According to the graph, describe the time durations of the shortest 10% of the attacks.
This graph shows the probability density of attack durations for Don’s Internet provider over the course of one week. The y-axis is the percentage of attacks that lasted a given amount of time.
[Source:
5. What percent of attacks lasted 5 minutes?
6. If the % Attack was 1%, what are the possible Attack Durations?
7. If a mob member claimed that an attack lasted for 7 days, would you believe it? Why or why not?
8. During which time interval do all the attack durations have % Attack greater than 2%?
Answers:
1. The time durations on the x-axis range from 1 minute to 7 days. Because this is such a large range of values, he uses three different units on one axis: minutes, hours, and days. Therefore, the scale marks on the graph represent different units.
2. The percentages on the y-axis range from 0 to 100, but a logarithmic scale is used: the distances from 0 to 1, 1 to 10, and 10 to 100 are all the same. Each y represents the percentage of attacks with time duration less than or equal to x.
3. Approximately 8%, 20%, and 100%
4. Approximately between 1 and 3 minutes
5. About 3%
6. About 1.5 minutes, 2 minutes, and 30 minutes
7. No, because the chance of an attack lasting 7 days is very close to 0.
8. About 3 minutes to 30 minutes
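The idea behind answers 1 and 2 can be illustrated with a short Python sketch. It converts several durations to minutes and takes the base-10 logarithm, which is how a logarithmic axis places its marks. The tick values in `ticks` are illustrative assumptions and are not read off Charlie's graph.

```python
import math

# Durations converted to minutes.
# These tick values are illustrative assumptions, not taken from the graph.
ticks = {
    "1 min": 1,
    "10 min": 10,
    "1 hour": 60,
    "10 hours": 600,
    "1 day": 1440,
    "7 days": 10080,
}

def log_position(minutes):
    """Position on a base-10 logarithmic axis, measured from the 1-minute mark."""
    return math.log10(minutes)

positions = {label: log_position(m) for label, m in ticks.items()}

for label, pos in positions.items():
    print(f"{label:>8}: {pos:.2f}")
```

Multiplying a duration by 10 always moves the mark by the same distance (1 unit here), whether the step is from 1 minute to 10 minutes or from 1 hour to 10 hours. That is why a range from 1 minute to 7 days fits on one axis, and why minutes, hours, and days can share it.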
NUMB3RS Activity: Checkbook (Mis)Calculations
A running gag in “Provenance” concerns Charlie’s inability to balance his checkbook. In this activity, you will examine the three most common errors in entering numbers such as the amount of a check and a classic problem related to one of these errors.
1. The most common error, by far, in entering numbers such as the amount of a check is a single-digit error; for example, entering an 8 instead of a 3 somewhere in the number. For simplicity, suppose all amounts in this checkbook are in whole pounds (no pence) from £1 to £9999.
a. If you make one single-digit error, what is the greatest possible difference between your total and the bank’s total?
b. What is the least possible difference?
2. Another common error is a transposition error where you reverse the order of two adjacent digits; for example writing “83” instead of “38.” A common rule for spotting this error is “if the difference between your total and the bank’s total is divisible by 9, look for a transposition error.” Using algebra, explain why this rule makes sense. (Hint: First consider the case where the transposition error is in the last two digits. If the correct amount of a check is 1000a + 100b + 10c + d, then the incorrect amount entered will be 1000a + 100b + 10d + c. Then analyze the other two cases.)
3. A third type of error is a jump transposition error, in which two non-adjacent digits are switched. For example, you write “483” instead of “384” somewhere in the number. Develop a rule for detecting when two digits separated by another digit have been switched. Explain why your rule works.
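Before proving the divisible-by-9 rule quoted in question 2, it can help to check it numerically. The Python sketch below tries every adjacent swap in every amount from £1 to £9999. The helper names are hypothetical, and padding amounts with leading zeros to four digits is an assumption made for convenience.

```python
def swap_adjacent(amount, i):
    """Return `amount`, written as 4 digits, with digits i and i+1 swapped.

    Assumption: amounts below 1000 are padded with leading zeros.
    """
    digits = list(f"{amount:04d}")
    digits[i], digits[i + 1] = digits[i + 1], digits[i]
    return int("".join(digits))

def all_adjacent_swaps_divisible_by_9():
    """Check the rule for every amount from 1 to 9999 and every adjacent pair."""
    for amount in range(1, 10000):
        for i in range(3):
            if (amount - swap_adjacent(amount, i)) % 9 != 0:
                return False
    return True

print(swap_adjacent(38, 2))                 # 83
print(all_adjacent_swaps_divisible_by_9())  # True
```

A check like this is evidence, not a proof; the algebra asked for in question 2 explains why it can never fail.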
A classic problem related to this topic begins as follows:
Brett cashes a check worth less than £100 for x pounds and y pence, but the teller inadvertently pays him y pounds and x pence. After Brett buys a newspaper for k pence, the remaining money is twice as much as the original value of the check.
4. If k = 50, find the amount of the check. (Hint: Develop an equation which begins 2(100x + y) = _____, solve for y and examine the table of values to search for integer solutions.)
5. If k = 75, show there is no such check.
6. What is the largest possible original value for such a check? (Hint: Generalize the method used in Question #4. The price of the paper k could be any price.)
Answers:
1a. £8000
1b. £1
2. Last two digits switched: (1000a + 100b + 10c + d) – (1000a + 100b + 10d + c) = 9(c – d). Middle two digits switched: (1000a + 100b + 10c + d) – (1000a + 100c + 10b + d) = 90(b – c). First two digits switched: (1000a + 100b + 10c + d) – (1000b + 100a + 10c + d) = 900(a – b). All three differences are divisible by 9.
3. The difference between your total and the bank’s total is divisible by 99.
4. Use the equation Y1 = (199x + 50)/98 to show £16.33 is the unique solution.
5. The solution x = 73, y = 149 has y > 99.
6. £48.99
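Answers 4 to 6 can be confirmed by brute force. Working in pence, the check is worth 100x + y, the teller pays 100y + x, and the condition is 2(100x + y) = 100y + x – k, which rearranges to 98y = 199x + k. The Python sketch below searches all x and y from 0 to 99. The function names are hypothetical, and for question 6 it assumes the paper may cost any positive whole number of pence.

```python
def checks_for(k):
    """All (x, y) with 0 <= x, y <= 99 satisfying 2(100x + y) = 100y + x - k."""
    return [
        (x, y)
        for x in range(100)
        for y in range(100)
        if 2 * (100 * x + y) == 100 * y + x - k
    ]

def largest_check():
    """Largest check (x, y, k), assuming k can be any positive price in pence."""
    candidates = [
        (x, y, 98 * y - 199 * x)
        for x in range(100)
        for y in range(100)
        if 98 * y - 199 * x > 0
    ]
    # Compare candidates by the value of the check in pence.
    return max(candidates, key=lambda t: 100 * t[0] + t[1])

print(checks_for(50))   # [(16, 33)]
print(checks_for(75))   # []
print(largest_check())  # (48, 99, 150)
```

The last result says the largest check is £48.99, which occurs when the newspaper costs 150 pence.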
NUMB3RS Activity: Numb3rs of the I Ching
When the body of a dead woman washes up on the Los Angeles shore, Larry notices that she has a tattoo on her foot in the form of five I Ching symbols, or hexagrams. Charlie expends a great deal of mathematical energy trying to determine their meaning, although Amita is the one who discovers the truth. The tattoo looks like this:
Each symbol is called a hexagram because it is made up of six lines, either dashed (yin, or “receptive principle”) or solid (yang, or “creative principle”). In turn, each hexagram is made up of two trigrams, or sets of three lines. The lower trigram describes an inner (personal) aspect as related to the upper trigram, which is the outer (external) aspect or situation.
1. Why are there exactly eight trigrams?
2. In the spaces below, use a pattern to generate all eight I Ching trigrams, using the first and last as a guide.
3. Because each hexagram consists of a lower and upper trigram, how many possible I Ching hexagrams are there?
Charlie believes that the I Ching tattoo on the victim’s foot may be some kind of encoded or encrypted message. In truth, each hexagram has its own number (see the Extensions page of this activity). For the victim’s tattoo, they are: Influence (31), Waiting (05), Abundance (55), Strength (01), and Inner Truth (61).
Suppose you wanted to encode information using I Ching hexagrams. For example, if your birthday is October 3rd, you could encode this with two hexagrams: 10 and 03. Your code would look like this:
4. Because the hexagrams are numbered from 01 to 64, what kinds of data could you encode using one, two, or three hexagrams? What limitations are there?
5. The victim’s tattoo consisted of five hexagrams. What kind of numerical information do you think this could represent? Again, what limitations are there?
Answers:
1. Because each of the 3 lines of a trigram can be either dashed or solid, there are 2^3 = 8 possibilities in all.
2. Order may vary. One way is to show all possibilities for one dashed line, then two dashed lines, as shown:
3. 8 × 8 = 64
4. Answers may vary. Using one hexagram, the first 26 could be used to encode the letters of the alphabet. These could even be combined with a shift cipher to encrypt a message. With two, any four-digit number, such as a birthdate, could be encoded, provided that neither the first nor the second pair of digits is higher than 64. Three hexagrams could represent five- or six-digit numbers (for a five-digit number, use 01 – 09 for the first digit), such as a ZIP code or a student number. The same limitation applies, namely that no pair of digits can exceed 64.
5. Any 9- or 10-digit number. Examples include a Social Security number, a ZIP code with the +4 digits, or a telephone number (including area code). The same limitations apply.
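The counting in answers 1 and 3 and the encoding idea in answers 4 and 5 can be sketched in Python. The `encode` function is a hypothetical helper, not part of the original activity. It also rejects the pair 00, since the hexagrams are numbered from 01 to 64.

```python
from itertools import product

# Each of the 3 lines is either dashed (yin) or solid (yang).
trigrams = list(product(("yin", "yang"), repeat=3))
# A hexagram is a lower trigram paired with an upper trigram.
hexagrams = list(product(trigrams, repeat=2))

def encode(digits):
    """Split an even-length digit string into hexagram numbers (1 to 64)."""
    if len(digits) % 2 != 0:
        raise ValueError("pad to an even number of digits first")
    pairs = [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]
    for p in pairs:
        if not 1 <= p <= 64:
            raise ValueError(f"{p:02d} is not a hexagram number")
    return pairs

print(len(trigrams), len(hexagrams))  # 8 64
print(encode("1003"))                 # [10, 3]
print(encode("3105550161"))           # [31, 5, 55, 1, 61]
```

The last call shows that the five tattoo hexagrams (31, 05, 55, 01, 61) correspond to the ten-digit string 3105550161. A string such as "6500" cannot be encoded, which is the limitation described in the answers.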
NUMB3RS Activity: Is It Really Rare?
Suppose m is the expected number of times that an event will occur based on what has happened in the past. Also suppose that the occurrences of such an event are independent. A Poisson distribution gives the probability that such a random event will occur in a time interval when the probability of the event occurring has a known historical distribution. The probability that an event occurs exactly k times is given by
P(k; m) = (m^k · e^(–m)) / k!
where e is the base of the natural logarithm (e ≈ 2.71828...). For a given value of m, the formula might be shortened to
P(k) = (m^k · e^(–m)) / k!
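The standard Poisson formula, P(k) = m^k · e^(–m) / k!, translates directly into a few lines of Python. This is a minimal sketch: the function name `poisson` is an assumption, and m = 3 is an arbitrary illustrative value that is not used in the questions below.

```python
import math

def poisson(k, m):
    """Probability of exactly k occurrences when m are expected."""
    return m ** k * math.exp(-m) / math.factorial(k)

m = 3  # illustrative value only
for k in range(5):
    print(k, round(poisson(k, m), 4))
```

The same function can be reused for the questions that follow by changing the value of m.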
1. Suppose that on average, 1 gang-related shooting occurs every 12 hours in Los Angeles.
a. In a 24-hour day, how many shootings are expected?
b. Find the probabilities of 0, 1, 2, 3, 4, 5, and 6 gang related shootings occurring in a 24-hour day.
c. Which numbers of shootings have the highest probability of occurring? Why is this answer reasonable?
d. Using the information found in part b, draw a probability distribution below.
e. Suppose 10 shootings occur in a 24-hour day. Do you think that this could be a random occurrence? Use the probabilities you found to explain your reasoning.
For any sample space, the sum of the probabilities of the outcomes is 1. So, for a given value of m,
P(0) + P(1) + P(2) + ... = 1
From this, you can use the complement of an event to help you find a probability. For example, given m, to find the probability that an event will happen more than 2 times (k > 2), subtract the probability that the event will happen 2 or fewer times (k ≤ 2) from 1.
P(2 or fewer times) + P(more than 2 times) = 1
P(more than 2 times) = 1 – P(2 or fewer times)
P(more than 2 times) = 1 – P(0 times) – P(1 time) – P(2 times)
P(more than 2 times) = 1 – (P(0 times) + P(1 time) + P(2 times))
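The complement rule above can be written as a short Python function. This sketch uses hypothetical names (`poisson`, `prob_more_than`) and, again, an illustrative value m = 3 that does not appear in the questions.

```python
import math

def poisson(k, m):
    """Probability of exactly k occurrences when m are expected."""
    return m ** k * math.exp(-m) / math.factorial(k)

def prob_more_than(n, m):
    """P(more than n times) = 1 - (P(0) + P(1) + ... + P(n))."""
    return 1 - sum(poisson(k, m) for k in range(n + 1))

m = 3  # illustrative value only
p_two_or_fewer = sum(poisson(k, m) for k in range(3))
p_more_than_two = prob_more_than(2, m)
print(round(p_two_or_fewer, 4), round(p_more_than_two, 4))
```

The two printed probabilities add up to 1, because the events are complementary.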
2. The number of crimes that are typically committed in one local precinct of Los Angeles between 11 P.M. and 2 A.M. can be thought of as a Poisson variable. In this precinct, an average of 2 crimes are committed during this 3-hour period.
a. What is the probability that 2 or fewer crimes will be committed between 11 P.M. and 2 A.M.?
b. What is the probability that more than 2 crimes will be committed between 11 P.M. and 2 A.M.?
c. How are the answers to part a and part b related?
d. What is the probability that more than 6 crimes will be committed between 11 P.M. and 2 A.M.? [Hint: Refer back to what you know from 2b.]
e. Challenge: What is the probability that at least 1 crime will be committed between midnight and 1 A.M.? [Hint: Find the value of m for this time interval.]
Answers:
1a. 24 ÷ 12 = 2 (that is, m = 2)
1b. P(0) = 0.1353, or about 13.5%; P(1) = 0.2707, or about 27.1%; P(2) = 0.2707, or about 27.1%; P(3) = 0.1804, or about 18.0%; P(4) = 0.0902, or about 9.0%; P(5) = 0.0361, or about 3.6%; P(6) = 0.0120, or about 1.2%
1c. 1 or 2 shootings. This is reasonable because there are an average of 2 shootings expected in a 24-hour period.
1d. [Graph: probability distribution of the values found in part b]
1e. On a day when only 2 shootings are normally expected, the probability of 10 shootings occurring is about 0.004%, or about 0%. So, it is not likely that it is a random occurrence.
2a. About 67.7%
2b. About 32.3%
2c. Part a and part b describe complementary events. So, the probabilities have a sum of 1.
2d. About 0.0045 or 0.5%. The probability that 6 or fewer crimes will be committed in the time period is about 0.9955 or about 99.6%, so the probability of more than 6 crimes occurring is about 1 – 0.9955 = 0.0045 or about 0.5%.
2e. About 48.7%
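Answer 2e depends on rescaling m to the shorter interval. The Python sketch below assumes that crimes occur at a constant average rate across the 3-hour window, so a 1-hour interval has one third of the expected count. The variable names are hypothetical.

```python
import math

def poisson(k, m):
    """Probability of exactly k occurrences when m are expected."""
    return m ** k * math.exp(-m) / math.factorial(k)

m_three_hours = 2
# Assumption: a constant rate, so one hour has a third of the expected crimes.
m_one_hour = m_three_hours / 3

# "At least 1" is the complement of "exactly 0".
p_at_least_one = 1 - poisson(0, m_one_hour)
print(round(p_at_least_one, 3))  # 0.487
```

If the crime rate were not constant over the three hours, m for midnight to 1 A.M. would have to be estimated separately.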