3.1. $$\overline{A}=[E_2, E_4, E_5, E_7, E_8, E_{10}]$$

3.2.a. $$A \cap B = [E_3, E_9]$$

3.2.b. $$A \cup B = [E_1, E_2, E_3, E_7, E_8, E_9]$$

3.2.c. It’s not, because \(E_4\) to \(E_6\), as well as \(E_{10}\) are not covered.

3.3.a. $$A \cap B = [E_4, E_5, E_6, E_{10}]$$

3.3.b. $$A \cup B = [E_1, E_2, E_4, E_5, E_6, E_7, E_8, E_{10}]$$

3.3.c. It’s not, because \(E_3\) and \(E_9\) are missing.

3.4.a. $$A \cap B = [E_3, E_6]$$

3.4.b. $$A \cup B = [E_3, E_4, E_5, E_6, E_9, E_{10}]$$

3.4.c. It’s not.

3.5.a. \(\overline{A}\) is the event “it will be 4 or fewer days before the machinery becomes available”.

3.5.b. \(A \cap B\) is the event “it will be exactly 5 days before the machinery becomes available”.

3.5.c. \(A \cup B\) is collectively exhaustive (any number of days before the machinery becomes available).

3.5.d. \(A \cap B\) is not the empty set, but rather contains the outcome of 5 days, so A and B are not mutually exclusive.

3.5.e. A and B are collectively exhaustive because any outcome is either below 6 or above 4.

3.5.f. According to Table 3.2, \(B \cap \overline{A} = B - (A \cap B)\), so \((A \cap B) \cup (\overline{A} \cap B) = (A \cap B) \cup (B - (A \cap B)) = B\).

An alternative demonstration: \((A \cap B) \cup (\overline{A} \cap B) = (A \cup \overline{A}) \cap B = S \cap B = B\).

As for our specific A and B, \(A \cap B\) is only day 5, and \(\overline{A} \cap B\) is 4 days or less, so the union of the two is anything that’s less than 6, or B.

3.5.g. I can’t think of a mathematical way to describe this, but it’s clear intuitively, that \(\overline{A} \cap B\) is all of B, except for the intersection with A, and the union of that with A is all of B and all of A, so \(A \cup B\).

As for our specific A and B, \(\overline{A} \cap B\) is 4 days or less, and A is more than 4 days, so the union is the entire sample space, and \(A \cup B\) is collectively exhaustive so it too equals the entire sample space.
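One possible formal route for 3.5.g, assuming the distributive law of union over intersection (an identity that is not quoted in the answer above), is the following sketch:

$$A \cup (\overline{A} \cap B) = (A \cup \overline{A}) \cap (A \cup B) = S \cap (A \cup B) = A \cup B$$

This mirrors the alternative demonstration in 3.5.f, with the roles of union and intersection swapped.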

3.6.a. \(A \cap B\) matches \(O_1\) and \(\overline{A} \cap B\) matches \(O_3\), so their union is exactly \(B = [O_1, O_3]\).

3.6.b. \(\overline{A} \cap B = [O_3]\) and \(A = [O_1, O_2]\), so the union is \(A \cup B = [O_1, O_2, O_3]\).

3.7.a. $$[(M_1, M_2), (M_1, M_3), (M_1, T_1), (M_1, T_2), (M_2, M_3), (M_2, T_1), (M_2, T_2), (M_3, T_1), (M_3, T_2), (T_1, T_2)]$$

3.7.b. $$A = [(M_1, T_1), (M_1, T_2), (M_2, T_1), (M_2, T_2), (M_3, T_1), (M_3, T_2), (T_1, T_2)]$$

3.7.c. $$B=[(M_1, M_2), (M_1, M_3), (M_2, M_3), (T_1, T_2)]$$

3.7.d. $$\overline{A}=[(M_1, M_2), (M_1, M_3), (M_2, M_3)]$$

3.7.e. \(A \cap B = [(T_1, T_2)]\), whereas \(\overline{A} \cap B = [(M_1, M_2), (M_1, M_3), (M_2, M_3)]\). Therefore, the union is all four outcomes that make up B.

3.7.f. \(\overline{A} \cap B = [(M_1, M_2), (M_1, M_3), (M_2, M_3)]\), and \(A = [(M_1, T_1), (M_1, T_2), (M_2, T_1), (M_2, T_2), (M_3, T_1), (M_3, T_2), (T_1, T_2)]\), so the union is \(S = [(M_1, M_2), (M_1, M_3), (M_2, M_3), (M_1, T_1), (M_1, T_2), (M_2, T_1), (M_2, T_2), (M_3, T_1), (M_3, T_2), (T_1, T_2)]\), whereas \(S = A \cup B\), so the two subsets match.
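The outcome lists in 3.7 can be cross-checked by enumeration. The sketch below is only an illustration: the event definitions (A as "at least one T is chosen", B as "both picks are of the same kind") are assumptions inferred from the outcomes listed above, not quoted from the exercise.

```python
from itertools import combinations

# Labels taken from the answers above: three M items and two T items.
items = ["M1", "M2", "M3", "T1", "T2"]

# Unordered pairs, which gives the 10 outcomes of 3.7.a.
sample_space = set(combinations(items, 2))

# Assumed event definitions, inferred from the listed outcomes:
# A = at least one T in the pair, B = both picks share the same letter.
A = {pair for pair in sample_space if any(x.startswith("T") for x in pair)}
B = {pair for pair in sample_space if pair[0][0] == pair[1][0]}
not_A = sample_space - A

# 3.7.e: (A and B) together with (not-A and B) rebuilds B.
assert (A & B) | (not_A & B) == B
# 3.7.f: A together with (not-A and B) rebuilds the union of A and B.
assert A | (not_A & B) == A | B

print(len(sample_space), len(A), len(B), len(not_A))
```

The two assertions are the set identities from parts e and f, checked on this particular sample space.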

3.8. 35/66 = 0.53030303

3.9. 36/120 = 0.3

3.10. 675/1820 = 0.370879121

3.11. 20000/120000 = 0.166666667

3.12. 1/9 * 1/9 = 1/81

3.13.a. 0.68

3.13.b. 0.73

3.13.c. 0.32

3.13.d. 0.41

3.13.e. 1

3.14.a. 0.54

3.14.b. 0.18

3.14.c. Rate of return will be less than 10%.

3.14.d. 0.46

3.14.e. The empty set.

3.14.f. 0

3.14.g. Rate of return will be at least 10% or negative.

3.14.h. 0.72

3.14.i. Yes, because \(A \cap B = \emptyset\)

3.14.j. No, \(A \cup B\) does not cover 0% to 10%.

3.15.a. 0.5

3.15.b. 0.25

3.15.c. 0.25

3.16. \(A \cup B = [E_1, E_2, E_3, E_7, E_8, E_9]\), so \(P(A \cup B) = 6/10 = 0.6\), whereas \(P(A) = P(B) = 4/10 = 0.4\), so \(P(A) + P(B) = 0.8 \neq P(A \cup B)\).

3.17.a. 0.86

3.17.b. 0.91

3.17.c. 0.14

3.17.d. 1

3.17.e. 0.77

3.17.f. No, any number of complaints from 1 to 9 falls under both.

3.17.g. Yes, \(P(A \cup B) = 1\)

3.18.a. 0.87

3.18.b. 0.35

3.18.c. The five classes cover all possible outcomes, and P(S) = 1.

3.19. $$P(A \cap B) = 0.25$$

3.20. $$P(A \cap B) = 0$$

3.21. $$P(A \cap B) = 0.24$$

3.22. $$P(A \cup B) = 0.75$$

3.23. \(P(A|B) = 0.67\). This does not equal P(A), so the events are not statistically independent.

3.24. \(P(A|B) = 0.8 = P(A)\), therefore, the events are statistically independent.

3.25. \(P(A|B) = 0.75\). The events are not statistically independent.

3.26. \(P(A|B) = 0.625\). The events are not statistically independent.

3.27. 1/9

3.28.a. 7! = 5040

3.28.b. 1/5040

3.29. 49*50 = 2450

3.30. 1/120

3.31. 0.2

3.32. N = 60, P = 1/60

3.33. 28

3.34.a. 42

3.34.b. 6

3.34.c. 6

3.34.d. 6/42, if considering only the lead role, the chance is 1/7, which is the same probability.

3.34.e. (6 + 6)/42 = 2/7. \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\), and \(P(A \cap B) = 0\) because \(A \cap B = \emptyset\), so we can add up 1/7 + 1/7 to reach the same result.

3.35.a. 150

3.35.b. 40/150 = 0.27

3.35.c. If A is the event that the craftsman brother gets selected, and B is the event that the labourer brother gets selected, then \(P(A) = 4/10\), \(P(B) = 10/15\) and \(P(A \cap B) = 40/150\), so \(P(A \cup B) = 60/150 + 100/150 - 40/150 = 120/150\), and the answer is the probability of its complement, \(P(\overline{A \cup B}) = 30/150\).

3.36.a. 90

3.36.b. We look at the complement. There are 10 combinations of U.S funds that won’t under-perform, and 3 combinations of such international funds. So the probability of the complement is 30/90, and the probability of the original event in question is 60/90, or 0.67.

3.37. $$P(A \cup B) = 0.3 + 0.25 - 0.2 = 0.35$$

3.38. $$P(A \cup B) = 0.3 + 0.2 - 0.15 = 0.35$$

3.39.a. We take ‘unsuccessful’ to mean ‘no immediate action’, and if we call that A, then \(P(A) = 0.95\), and the probability of 4 consecutive outcomes that belong to A is \(0.95^4 = 0.81\).

3.39.b. We’d like the probability of “at least 4 unsuccessful calls”, where here “unsuccessful” means anything not leading to a donation at all. Anything after those 4 calls works, because the question states that we’re looking for the first successful call after “at least 4 unsuccessful” ones. So the answer is the same whether there were 4 unsuccessful calls and then a successful one, or 10 unsuccessful calls and only then a successful one. The probability of any type of donation happening is 0.1, so the answer is \(0.9^4 = 0.66\)
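Both parts of 3.39 multiply the same per-call probability four times. A minimal sketch of that arithmetic follows, assuming the calls are independent of each other (which the answers above rely on but do not state); the variable names are illustrative only.

```python
# Per-call probabilities taken from the answers to 3.39.a and 3.39.b.
p_no_immediate_action = 0.95  # part a: call does not lead to immediate action
p_no_donation = 1 - 0.1       # part b: call does not lead to any donation

calls = 4  # number of consecutive unsuccessful calls

# Assumption: calls are independent, so the probabilities multiply.
prob_a = p_no_immediate_action ** calls
prob_b = p_no_donation ** calls

print(round(prob_a, 4), round(prob_b, 4))
```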

3.40. Because B and C are mutually exclusive, \(P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C)\), and because A is independent of both, \(P(A \cap B) = P(A)P(B)\), and \(P(A \cap C) = P(A)P(C)\), so \(P(A \cup B \cup C) = 0.069\)

3.41. For independent events, \(P(A \cap B) = P(A)P(B)\), so 0.833.

3.42.a. $$P(B|A) = \frac{P(A \cap B)}{P(A)}=\frac{0.1}{0.18}=0.556$$

3.42.b. $$P(A|B) = \frac{P(A \cap B)}{P(B)}=\frac{0.1}{0.12}=0.833$$

3.43. If A is “item is defective”, and B is “inspector accepted item”, then \(P(B|A) = 0.08, P(A \cap B) = 0.01\), so \(P(A) = \frac{P(A \cap B)}{P(B|A)} = \frac{0.01}{0.08} = 0.125\)

3.44. Let A be “analyst is successful at stocks” and let B be “analyst is successful at bonds”. \(P(A) = 1/12, P(B) = 1/20\). Because they’re independent, \(P(B \cap A) = P(A)P(B)=1/240\). So \(P(A \cup B) = 1/12 + 1/20 - 1/240 = 31/240\)

3.45. Let A be the event “loan is for a high risk client”, and B will be “loan in default”. So \(P(A) = 0.15, P(B) = 0.05, P(A \cap B) = 0.02\), and the answer is \(P(B|A) = 0.02/0.15 = 0.133\)

3.46.a. $$P(A \cup B) = 0.4 + 0.5 - 0 = 0.9$$

3.46.b. $$P(A \cup C) = 0.4 + 0.8 - (0.4)(0.8) = 0.88$$

3.46.c. \(P(C|B) = \frac{P(B \cap C)}{P(B)} = 0.75\), so \(P(B \cap C) = (0.75)(0.5) = 0.375\) and \(P(B \cup C) = 0.5 + 0.8 - 0.375 = 0.925\)

3.47. The number of combinations for each of the events A and B is 10, so the probabilities are 0.1. The number of combinations for getting both A and B is 210, so the overall probability is: \(P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.1 + 0.1 - \frac{1}{210} = 0.195\), and the analyst’s claim is a bit presumptuous.

3.48.a. Let A be the event “occurred on Monday” and B be the event “occurred on the last hour of the shift”, so \(P(A) = 0.3, P(B) = 0.2, P(A \cap B) = 0.04\). Then \(P(A) = P(A \cap B) + P(A \cap \overline{B}) \Rightarrow 0.3 = 0.04 + P(A \cap \overline{B}) \Rightarrow P(A \cap \overline{B}) = 0.26\), so \(P(\overline{B}|A) = \frac{P(A \cap \overline{B})}{P(A)} = \frac{0.26}{0.3} = 0.8667\)

3.48.b. \(P(A)P(B) = (0.3)(0.2) = 0.06 \neq 0.04 = P(A \cap B)\), so no.

3.49.a. Let A be the event “signed up for reading class” and B be the event “signed up for math class”. \(P(A) = 0.4, P(B) = 0.5, P(B|A) = 0.3, P(A \cap B) = P(B|A)P(A) = (0.3)(0.4) = 0.12\)

3.49.b. $$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{0.12}{0.5} = 0.24$$

3.49.c. $$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.4 + 0.5 - 0.12 = 0.78$$

3.49.d. \(P(A)P(B) = (0.4)(0.5) = 0.2 \neq 0.12 = P(A \cap B)\), so no.

3.50. Let A be “new customer”, and B be “used rival”, so \(P(A) = 0.15, P(B|A) = 0.8, P(B) = 0.6, P(A \cap B) = P(B|A)P(A) = (0.8)(0.15) = 0.12, P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{0.12}{0.6} = 0.2\)
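The pattern in 3.49 and 3.50 (turn a conditional probability into a joint one, then divide by the other marginal) can be written as two small helpers. This is only a sketch; the function names are hypothetical and the numbers are the ones used in 3.50.

```python
def joint_probability(p_b_given_a, p_a):
    """Multiplication rule: P(A and B) = P(B|A) * P(A)."""
    return p_b_given_a * p_a


def reverse_conditional(p_b_given_a, p_a, p_b):
    """P(A|B) obtained from P(B|A), P(A) and P(B)."""
    return joint_probability(p_b_given_a, p_a) / p_b


# Values from 3.50: P(A) = 0.15, P(B|A) = 0.8, P(B) = 0.6.
p_joint = joint_probability(0.8, 0.15)
p_a_given_b = reverse_conditional(0.8, 0.15, 0.6)

print(round(p_joint, 4), round(p_a_given_b, 4))
```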

3.51. $$P(B) = 0.2, P(A|B) = 0.8, P(A \cap B) = P(D) = 0.16, P(D \cap C) = P(A \cap B \cap C) = 0.02, P(C|D) = \frac{P(C \cap D)}{P(D)} = \frac{0.02}{0.16} = 0.125$$

3.52. 0.05

3.53. 0.05

3.54. 0.05

3.55. 0.20

3.56. $$\frac{0.05}{0.3} = 0.1667$$

3.57. $$\frac{0.1}{0.4} = 0.25$$

3.58. $$\frac{0.1}{0.25} = 0.4$$

3.59. 4

3.60. 1

3.61. $$\frac{P(A|B_1)}{P(A|B_2)} = \frac{0.8}{0.4} = 2$$

3.62. $$\frac{P(A|B_1)}{P(A|B_2)} = \frac{0.4}{0.2} = 2$$

3.63. $$\frac{P(A|B_1)}{P(A|B_2)} = \frac{0.2}{0.4} = 0.5$$

3.64.a. 0.12

3.64.b. $$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{0.19}{0.27} = 0.704$$

3.64.c. No, \(P(A)P(B) = (0.27)(0.79) = 0.2133 \neq 0.19 = P(A \cap B)\)

3.64.d. \(P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{0.07}{0.21} = 0.333\)

3.64.e. No, \(P(A)P(B) = (0.19)(0.21) = 0.0399 \neq 0.07 = P(A \cap B)\)

3.64.f. 0.79

3.64.g. 0.27

3.64.h. $$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.79 + 0.27 - 0.19 = 0.87$$

3.65.a. 0.3

3.65.b. 0.38

3.65.c. Let A be a high prediction and B be a high outcome, $$ P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{0.23}{0.38} = 0.605 $$

3.65.d. $$P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{0.23}{0.3} = 0.767$$

3.65.e. Let B now be a low outcome, \(P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{0.01}{0.3} = 0.033\)

3.66.a. 0.25

3.66.b. 0.32

3.66.c. Let A be “traded” and B will be “never reads”, so $$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{0.04}{0.25} = 0.16$$

3.66.d. $$P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{0.04}{0.32} = 0.125$$

3.66.e. Let B be “regularly reads the paper”, so \(P(B) = 0.34, P(\overline{B}) = 0.66, P(A \cap \overline{B}) = P(A) - P(A \cap B) = 0.32 - 0.18 = 0.14, P(A|\overline{B}) = \frac{P(A \cap \overline{B})}{P(\overline{B})} = \frac{0.14}{0.66} = 0.212\)

3.67.a. Let D be “defective”. Because A, B & C are mutually exclusive and collectively exhaustive, \(P(D) = P((D \cap A) \cup (D \cap B) \cup (D \cap C)) = P(D \cap A) + P(D \cap B) + P(D \cap C) - P((D \cap A) \cap (D \cap B)) - P((D \cap A) \cap (D \cap C)) - P((D \cap B) \cap (D \cap C)) + P((D \cap A) \cap (D \cap B) \cap (D \cap C)) = 0.02 + 0.05 + 0.03 - 0 - 0 - 0 + 0 = 0.1\)

3.67.b. Let G be “good”. Because G and D are mutually exclusive and collectively exhaustive, \(P(B) = P((G \cap B) \cup (D \cap B)) = P(G \cap B) + P(D \cap B) - P((G \cap B) \cap (D \cap B)) = 0.3 + 0.05 - 0 = 0.35\)

3.67.c. $$P(D|B) = \frac{P(D \cap B)}{P(B)} = \frac{0.05}{0.35} = 0.143$$

3.67.d. $$P(B|D) = \frac{P(D \cap B)}{P(D)} = \frac{0.05}{0.1} = 0.5$$

3.67.e. No, for example: $$P(A|D) = \frac{P(A \cap D)}{P(D)} = \frac{0.02}{0.1} = 0.2 \neq P(A) = 0.29$$ And the same goes for other intersections.

3.67.f. \(P(G|A) = 0.931, P(G|B) = 0.857, P(G|C) = 0.917\), so A.

3.68.a. 0.32

3.68.b. 0.25

3.68.c. Let W be “worked on additional problems”, so \(P(A|W) = \frac{P(A \cap W)}{P(W)} = \frac{0.12}{0.32} = 0.375\)

3.68.d. $$P(W|A) = \frac{P(A \cap W)}{P(A)} = \frac{0.12}{0.25} = 0.48$$

3.68.e. Let D be “expects a grade below C”. Because C and D are mutually exclusive, \(P(C \cup D) = P(C) + P(D) - P(C \cap D) = 0.38 + 0.1 - 0 = 0.48\), so \(P((C \cup D) \cap W) = P((C \cap W) \cup (D \cap W)) = P(C \cap W) + P(D \cap W) - P((C \cap W) \cap (D \cap W)) = 0.12 + 0.02 - 0 = 0.14\), and \(P((C \cup D)|W) = \frac{P((C \cup D) \cap W)}{P(W)} = \frac{0.14}{0.32} = 0.4375\)

3.68.f. No, for example: \(P(A|W) = \frac{P(A \cap W)}{P(W)} = \frac{0.12}{0.32} = 0.375 \neq P(A) = 0.25\), and the same goes for other intersections.

3.69.a. 0.77

3.69.b. 0.19

3.69.c. Let S be “single”, and L be “left the job within the year”, so \(P(L|S) = \frac{P(L \cap S)}{P(S)} = \frac{0.06}{0.23} = 0.26\)

3.69.d. \(P(\overline{S}|\overline{L}) = \frac{P(\overline{S} \cap \overline{L})}{P(\overline{L})} = \frac{0.64}{0.81} = 0.79\)

3.70.a. 0.76

3.70.b. 0.77

3.70.c. 0.1

3.71.

| | Men | Women | Total |
|---|---|---|---|
| Joined | 0.028 | 0.054 | 0.082 |
| Not Joined | 0.372 | 0.546 | 0.918 |
| Total | 0.4 | 0.6 | 1 |

a. 0.082

b. Let W be “women” and J be “joined the club”, so \(P(W|J) = \frac{P(W \cap J)}{P(J)} = \frac{0.054}{0.082} = 0.659\)
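The 3.71 answers read a marginal and a conditional probability off the joint table. A short sketch of that lookup, using the four joint cells from the table above (the dictionary layout and key names are illustrative choices):

```python
# Joint probabilities from the 3.71 table, keyed by (club status, gender).
joint = {
    ("joined", "men"): 0.028,
    ("joined", "women"): 0.054,
    ("not joined", "men"): 0.372,
    ("not joined", "women"): 0.546,
}

# Marginal P(J): sum the "joined" row across both genders.
p_joined = sum(p for (status, _), p in joint.items() if status == "joined")

# Conditional P(W|J): joint cell divided by the row marginal.
p_women_given_joined = joint[("joined", "women")] / p_joined

print(round(p_joined, 3), round(p_women_given_joined, 4))
```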

3.72. Let G be “significant growth”, and let I1, I2 & I3 be the greater, similar and lower interest events.

| | I1 | I2 | I3 | Total |
|---|---|---|---|---|
| \(G\) | 0.025 | 0.3 | 0.12 | 0.445 |
| \(\overline{G}\) | 0.225 | 0.3 | 0.03 | 0.555 |
| Total | 0.25 | 0.6 | 0.15 | 1 |

a. 0.025

b. 0.445

c. \(P(I3|G) = \frac{P(I3 \cap G)}{P(G)} = \frac{0.12}{0.445} = 0.270\)

3.73. \(P(H) = 0.42, P(S) = 0.22, P(S|H) = 0.34\)

a. $$P(H \cap S) = P(S|H)P(H) = (0.34)(0.42) = 0.1428$$

b. $$P(H \cup S) = P(S) + P(H) - P(S \cap H) = 0.22 + 0.42 - 0.1428 = 0.4972$$

c. $$P(H|S) = \frac{P(H \cap S)}{P(S)} = \frac{0.1428}{0.22} = 0.649$$

3.74. Let U be the event of graduating at the top 10% of the class.

| | \(Q_1\) | \(Q_2 \cup Q_3\) | \(Q_4\) | Total |
|---|---|---|---|---|
| \(U\) | 0.175 | 0.25 | 0.05 | 0.475 |
| \(\overline{U}\) | 0.075 | 0.25 | 0.2 | 0.525 |
| Total | 0.25 | 0.5 | 0.25 | 1 |

a. 0.475

b. $$P(Q_1|U) = \frac{P(Q_1 \cap U)}{P(U)} = \frac{0.175}{0.475} = 0.368$$

c. We can reach this result in one of two ways:

c.1. $$P(\overline{Q_1}|\overline{U}) = 1 - P(Q_1|\overline{U}) = 1 - \frac{P(Q_1 \cap \overline{U})}{P(\overline{U})} = 1 - \frac{0.075}{0.525} = 0.857$$

c.2. Because \(Q_4\) and \((Q_2 \cup Q_3)\) are mutually exclusive, \(P(((Q_2 \cup Q_3) \cap \overline{U}) \cup (Q_4 \cap \overline{U})) = P((Q_2 \cup Q_3) \cap \overline{U}) + P(Q_4 \cap \overline{U}) - P(((Q_2 \cup Q_3) \cap \overline{U}) \cap (Q_4 \cap \overline{U})) = P((Q_2 \cup Q_3) \cap \overline{U}) + P(Q_4 \cap \overline{U}) - 0\), so we can do: \(P(\overline{Q_1}|\overline{U}) = P(((Q_2 \cup Q_3) \cup Q_4)|\overline{U}) = \frac{P(((Q_2 \cup Q_3) \cup Q_4) \cap \overline{U})}{P(\overline{U})} = \frac{P((Q_2 \cup Q_3) \cap \overline{U}) + P(Q_4 \cap \overline{U})}{P(\overline{U})} = \frac{0.25 + 0.2}{0.525} = 0.857\)

3.75.a. Let H be “high sales” and F be “favorable reaction”. \(P(H|F) = \frac{P(H \cap F)}{P(F)} = \frac{0.173}{0.303} = 0.570957096\)

3.75.b. Let L be “low sales” and U be “unfavorable reaction”. \(P(L|U) = \frac{P(L \cap U)}{P(U)} = \frac{0.141}{0.272} = 0.518382353\)

3.75.c. Let N be “neutral”. Because N and F are mutually exclusive: \(P(L \cap \overline{U}) = P(L \cap (N \cup F)) = P((L \cap N) \cup (L \cap F)) = P(L \cap N) + P(L \cap F) - P((L \cap N) \cap (L \cap F)) = P(L \cap N) + P(L \cap F)\). Equivalently, with \(P(\overline{U}) = 1 - P(U) = 1 - 0.272 = 0.728\): $$ P(L|\overline{U}) = \frac{P(L \cap \overline{U})}{P(\overline{U})} = \frac{P(L) - P(L \cap U)}{P(\overline{U})} = \frac{0.296 - 0.141}{0.728} = 0.2129 $$

3.75.d. $$P(\overline{U}|L) = \frac{P(L \cap \overline{U})}{P(L)} = \frac{0.155}{0.296} = 0.523648649$$

3.76. Let M1 and M2 be the machines, M1 being the faulty one, and let F be the event of a faulty piece.

| | \(M_1\) | \(M_2\) | Total |
|---|---|---|---|
| \(F\) | 0.04 | 0 | 0.04 |
| \(\overline{F}\) | 0.36 | 0.6 | 0.96 |
| Total | 0.4 | 0.6 | 1 |

$$P(M_1|\overline{F}) = \frac{P(M_1 \cap \overline{F})}{P(\overline{F})} = \frac{0.36}{0.96} = 0.375$$

3.77.a. Let E be the event of finding a course enjoyable, and let V be the event of receiving strong positive evaluations.

$$P(E|V)^3 = \left(\frac{P(E \cap V)}{P(V)}\right)^3 = \left(\frac{P(E \cap V)}{P(V \cap E) + P(V \cap \overline{E})}\right)^3 = \left(\frac{0.42}{0.42 + 0.075}\right)^3 \approx 0.6108$$

3.77.b. $$1 - P(\overline{E}|V)^3 = 1 - \left(\frac{P(\overline{E} \cap V)}{P(V)}\right)^3 = 1 - \left(\frac{P(\overline{E} \cap V)}{P(V \cap E) + P(V \cap \overline{E})}\right)^3 = 1 - \left(\frac{0.075}{0.42 + 0.075}\right)^3 \approx 0.9965$$

For all the following exercises, it should be assumed that A1 and A2 are complements, and that B1 and B2 are complements.

3.78. $$P(A_2) = 0.6, P(A_1|B_1) = \frac{P(B_1|A_1)P(A_1)}{P(B_1|A_1)P(A_1) + P(B_1|A_2)P(A_2)} = \frac{0.24}{0.24 + 0.42} = 0.36$$

3.79. $$P(A_2) = 0.2, P(A_1|B_1) = \frac{P(B_1|A_1)P(A_1)}{P(B_1|A_1)P(A_1) + P(B_1|A_2)P(A_2)} = \frac{0.48}{0.48 + 0.04} = 0.923$$

3.80. $$P(A_2) = 0.5 \\ P(B_1) = P(B_1|A_1)P(A_1) + P(B_1|A_2)P(A_2) = (0.4)(0.5) + (0.7)(0.5) = 0.55 \\ P(B_2) = 1 - P(B_1) = 0.45\\ P(A_1 \cap B_1) = P(B_1|A_1)P(A_1) = (0.4)(0.5) = 0.2 \\ P(A_1) = P(A_1 \cap B_1) + P(A_1 \cap B_2) \Rightarrow P(A_1 \cap B_2) = P(A_1) - P(A_1 \cap B_1) = 0.5 - 0.2 = 0.3 \\ P(B_2|A_1) = \frac{P(A_1 \cap B_2)}{P(A_1)} = \frac{0.3}{0.5} = 0.6 \\ P(A_1|B_2) = \frac{P(B_2|A_1)P(A_1)}{P(B_2)} = \frac{(0.6)(0.5)}{0.45} = 0.67$$

3.81. $$P(A_2) = 0.6 \\ P(B_1) = P(B_1|A_1)P(A_1) + P(B_1|A_2)P(A_2) = (0.6)(0.4) + (0.7)(0.6) = 0.66 \\ P(B_2) = 1 - P(B_1) = 0.34 \\ P(B_1 \cap A_2) = P(B_1|A_2)P(A_2) = (0.7)(0.6) = 0.42 \\ P(A_2) = P(A_2 \cap B_1) + P(A_2 \cap B_2) \Rightarrow P(A_2 \cap B_2) = P(A_2) - P(A_2 \cap B_1) = 0.6 - 0.42 = 0.18 \\ P(A_2|B_2) = \frac{P(A_2 \cap B_2)}{P(B_2)} = \frac{0.18}{0.34} = 0.529$$

3.82. $$P(A_2) = 0.4 \\ P(A_1|B_1) = \frac{P(B_1|A_1)P(A_1)}{P(B_1|A_1)P(A_1) + P(B_1|A_2)P(A_2)} = \frac{(0.6)(0.6)}{(0.6)(0.6) + (0.4)(0.4)} = 0.692$$

3.83. Let A be “received the material” and let B be “adopted the book”. \(P(\overline{A}) = 0.2\), so $$P(A|B) = \frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|\overline{A})P(\overline{A})} = \frac{(0.3)(0.8)}{(0.3)(0.8) + (0.1)(0.2)} = 0.923$$

3.84. Let P1, P2 and P3 be stocks that performed better, same or worse than last year, and let R be stocks that were rated as good by the analyst. $$P(P_1|R) = \frac{P(R|P_1)P(P_1)}{P(R|P_1)P(P_1) + P(R|P_2)P(P_2) + P(R|P_3)P(P_3)} = \frac{(0.4)(0.25)}{(0.4)(0.25) + (0.2)(0.5) + (0.1)(0.25)} = 0.44$$
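Exercises 3.78 to 3.84 all apply Bayes’ theorem over a set of mutually exclusive, collectively exhaustive events. The helper below is a minimal sketch of that calculation; the function name `posterior` is hypothetical, and the sample numbers are the ones used in 3.84.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem over a partition.

    priors[i]      = P(H_i), which must sum to 1
    likelihoods[i] = P(E | H_i)
    Returns the list of P(H_i | E).
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # law of total probability gives P(E)
    return [j / evidence for j in joint]


# 3.84: priors for P1, P2, P3 and the chance of a good rating under each.
post = posterior([0.25, 0.5, 0.25], [0.4, 0.2, 0.1])
print([round(p, 3) for p in post])
```

The first entry of the result corresponds to \(P(P_1|R)\); the remaining entries are the posteriors for the other two categories.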

3.85. $$P(B) = P(A \cap B) + P(\overline{A} \cap B) \\ \Rightarrow P(\overline{A} \cap B) = P(B) - P(A \cap B) \\ \Rightarrow P(\overline{A}|B) = \frac{P(\overline{A} \cap B)}{P(B)} = \frac{P(B)}{P(B)} - \frac{P(A \cap B)}{P(B)} = 1 - P(A|B)$$

Let F be the event of the process functioning correctly, and let D be the event of a defective bulb.
$$P(D) = P(D|F)P(F) + P(D|\overline{F})P(\overline{F}) = (0.1)(0.9) + (0.5)(0.1) = 0.14 \\ P(F|D) = \frac{P(D|F)P(F)}{P(D)} = \frac{(0.1)(0.9)}{0.14} = 0.643$$
$$P(\overline{D}) = 1 - P(D) = 0.86 \\ P(\overline{D}|F) = 1 - P(D|F) = 0.9 \text{ (proof at beginning of question)} \\ P(F|\overline{D}) = \frac{P(\overline{D}|F)P(F)}{P(\overline{D})} = \frac{(0.9)(0.9)}{0.86} = 0.94$$

3.86. Don’t purchase corpses of dead animals. It’s not rational behaviour for a self-interested economic agent.

As for the question, let A1 be the event of a chicken coming from Free Range Farms, and let A2 be the event of a chicken coming from Big Foods. Let B be the event of a chicken weighing less than 3 pounds. $$ P(\overline{B}|A_1) = 1 - P(B|A_1) = 0.9 \text{ (according to proof at 3.85)} \\ P(A_2) = 1 - P(A_1) = 0.6 \\ P(B) = P(B|A_1)P(A_1) + P(B|A_2)P(A_2) = (0.1)(0.4) + (0.2)(0.6) = 0.16 \\ P(\overline{B}) = 1 - P(B) = 0.84 \\ P(A_1|\overline{B}) = \frac{P(\overline{B}|A_1)P(A_1)}{P(\overline{B})} = \frac{(0.9)(0.4)}{0.84} = 0.429$$

Calculating the chances of 3 chickens out of 5 matching the previous event requires the use of the binomial distribution formula, which, as far as I’ve seen, has not yet been introduced in this book.
$$P = \binom{5}{3}0.429^3(1 - 0.429)^2 = 0.257$$
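The binomial step in 3.86 can be reproduced with the standard library. This is a sketch only: `binomial_pmf` is a hypothetical helper name, and the per-chicken probability is taken from the Bayes calculation above without intermediate rounding.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# P(A1 | not B) from 3.86, kept as an unrounded fraction.
p_free_range = (0.9 * 0.4) / 0.84

# Exactly 3 of the 5 chickens come from Free Range Farms.
result = binomial_pmf(3, 5, p_free_range)
print(round(result, 3))
```

Carrying the unrounded fraction through avoids the small drift that comes from cubing a value already rounded to three decimals.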

3.87. This question is so poorly phrased that I can’t even approach it.

3.88. Mutually exclusive events are events that can’t both happen at the same time. Independence is when the occurrence of one event has no effect on the probability of another.

3.89.a. True, \(P(\overline{A \cup B}) = 1 - P(A \cup B) = 1 - [P(A) + P(B) - P(A \cap B)] = 1 - [P(B) + P(A \cap \overline{B})] = 1 - [P(B) + P(\overline{B}) - P(\overline{A} \cap \overline{B})] = 1 - [1 - P(\overline{A} \cap \overline{B})] = P(\overline{A} \cap \overline{B})\)

3.89.b. False, because collectively exhaustive events might not be mutually exclusive, in which case their sum is greater than their union. The union of collectively exhaustive events is equal to 1, and if their sum is greater than that, it cannot equal 1.

3.89.c. True, \(\frac{n!}{x!(n-x)!} = \frac{n!}{(n-x)!(n-(n-x))!}\)
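The symmetry in 3.89.c can also be spot-checked numerically. The snippet below is only a sanity check over small values of n, not a proof; the range of 0 to 20 is an arbitrary choice.

```python
from math import comb

# Check C(n, x) == C(n, n - x) for every 0 <= x <= n <= 20.
symmetric = all(
    comb(n, x) == comb(n, n - x)
    for n in range(21)
    for x in range(n + 1)
)
print(symmetric)
```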

3.89.d. True, provided \(P(A) = P(B)\): according to Bayes’ theorem, \(P(A|B) = \frac{P(B|A)P(A)}{P(B)} = P(B|A)\frac{P(A)}{P(B)} = P(B|A)\)

3.89.e. True, \(P(A) = P(\overline{A}) = 1-P(A) \Rightarrow 2P(A) = 1 \Rightarrow P(A) = 0.5 \text{ and } P(\overline{A}) = 1-P(A) = 0.5\)

3.89.f. True, we’ll prove for A, \(P(A|B) = P(A)=1-P(\overline{A}) \Rightarrow P(\overline{A}) = 1-P(A|B)=P(\overline{A}|B)\)

3.89.g. False, suppose A and B are not collectively exhaustive, we’ll get a probability greater than 1, while P(S) = 1, \(P(A \cup B) = P(A) + P(B) < 1 \text{ and } P(\overline{A}) + P(\overline{B}) = 1-P(A) + 1-P(B) = 2-(P(A)+P(B)) > 1\)

3.90. Conditional probability is the ratio of occurrences of some event A, out of all the times that some other event B has occurred. Sometimes in real world data all we have is conditional probabilities and we’d like to calculate the individual probabilities, and sometimes it’s the other way around.

3.91. Given a subjective idea of the chances of some event A happening, if we know that some other event B has happened, we’d like to update our prediction as to the likelihood of event A. Bayes’ theorem provides us with a tool in these scenarios.

3.92.a. True, \(P(A \cup B) \geq P(A)=P(A \cap B) + P(A \cap \overline{B}) \geq P(A \cap B)\)

3.92.b. True, \(P(A \cup B) = P(A) + P(B) - P(A \cap B) \leq P(A) + P(B)\)

3.92.c. True, \(P(A) = P(A \cap B) + P(A \cap \overline{B}) \geq P(A \cap B)\)

3.92.d. True, if we can say that the intersection of an event with itself equals itself, then: \(P(A) = P(A \cap A) + P(A \cap \overline{A}) = P(A \cap A) = P(A) \Rightarrow P(A \cap \overline{A})= 0\)

3.92.e. False. For any event A, \(0 \leq P(A) \leq 1\), so two events A and B can both have a probability of 1, and their sum will be greater than 1, particularly, 2.

3.92.f. False. \(P(A \cap B) = 0 \Rightarrow P(A \cup B) = P(A) + P(B) - P(A \cap B) = P(A) + P(B)\), but \(P(A) + P(B) = 1\) only if A and B are complements, not in every case where they are mutually exclusive.

3.92.g. False. \(P(A \cup B) = P(A)+P(B)-P(A \cap B) = 1 \Rightarrow P(A \cap B) = P(A)+P(B)-1\). Therefore the intersection equals zero only if \(P(A) + P(B) = 1\), which is only the case if A and B are complements.

3.94.a. False. See example 3.23. above. \(P(T_1|D_1)=0.1 \lt 0.18 = P(T_1)\)

3.94.b. False. Suppose event A with probability of 0.5. \(P(A\cap\overline{A})=0 \Rightarrow P(A|\overline{A})=\frac{P(A\cap\overline{A})}{P(\overline{A})}=0 \neq P(A)\)

3.94.c. True. \(0 \lt P(B) \leq 1\), therefore \(P(A|B) = \frac{P(A \cap B)}{P(B)} \geq P(A \cap B)\)

3.94.d. False. \( P(A \cap B) = P(A|B)P(B) \leq P(A)P(B) \Rightarrow P(A|B) \leq P(A) \), and in example 3.23, we see a counter-example where \(P(T_1|D_1) = 0.9 \gt 0.18 = P(T_1)\)

3.94.e. In subjective probability, if an event B is assumed to have some probability P(B), we say this is a prior probability, and given knowledge that event A has happened, we update that probability to P(B|A). The assumption that the posterior probability must be at least as large as the prior probability means \(P(B|A) \geq P(B)\), which we’ve already disproved in section a of this question.

3.95. $$P(A \cup B) = P(A) + P(B) - P(A \cap B) = P(A) + P(B) - P(A|B)P(B) = P(A) + P(B)[1-P(A|B)]$$

3.96.a. $$P(A \cap B) = P(A|B)P(B) = 0.08$$

3.96.b. The events are not independent because \(P(A|B) = 0.4 \neq 0.3 = P(A)\)

3.96.c. $$P(B|A) = \frac{P(A|B)P(B)}{P(A)} = \frac{(0.4)(0.2)}{0.3} = 0.26667$$

3.96.d. $$P(\overline{A}) = P(\overline{A} \cap B) + P(\overline{A} \cap \overline{B}) \Rightarrow P(\overline{A} \cap \overline{B}) = P(\overline{A}) - P(\overline{A} \cap B)$$
$$ P(B) = P(A \cap B) + P(\overline{A} \cap B) \Rightarrow P(\overline{A} \cap B) = P(B) - P(A \cap B)$$
$$ P(\overline{A} \cap \overline{B}) = P(\overline{A}) - P(\overline{A} \cap B) = P(\overline{A}) - P(B) + P(A \cap B) = 0.7 - 0.2 + 0.08 = 0.58$$
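As a cross-check on 3.96.d, a shorter route goes through the complement of the union, using the identity shown in 3.89.a. This is an alternative sketch rather than the method used above:

$$P(\overline{A} \cap \overline{B}) = 1 - P(A \cup B) = 1 - (P(A) + P(B) - P(A \cap B)) = 1 - (0.3 + 0.2 - 0.08) = 0.58$$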

3.97.a. Given event A that the thinner wire will arrive within a week, and event B that the thicker wire will arrive within a week, we have: $$P(A \cap B) = (0.6)P(B)$$
$$(0.6)P(B)=(0.4)P(A) \Rightarrow P(A) = 1.5P(B)$$
$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 1.5P(B) + P(B) - 0.6P(B) = 1.9P(B) \Rightarrow P(B) = 0.42$$

3.97.b. $$P(A) = \frac{P(A|B)P(B)}{P(B|A)} = \frac{(0.6)(0.42)}{0.4} = 0.63$$

3.97.c. $$P(A \cap B) = P(A|B)P(B) = (0.6)(0.42) = 0.252$$

3.98.a. Given event A “employee has an MBA” and event B “employee is over 35”:
$$P(A \cap B) = P(B|A)P(A) = 0.105$$

3.98.b. $$P(A|B) = \frac{P(B|A)P(A)}{P(B)} = 0.2625$$

3.98.c. $$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.35 + 0.4 - 0.105 = 0.645$$

3.98.d. $$P(\overline{A} \cap B) = P(B) - P(A \cap B) = 0.4 - 0.105 = 0.295$$
$$P(\overline{A}|B) = \frac{P(\overline{A} \cap B)}{P(B)} = \frac{0.295}{0.4} = 0.7375$$

3.98.e. No, \[P(B|A) = 0.3 \neq 0.4 = P(B)\]

3.98.f. No, \[P(A \cap B) = 0.105 \neq 0\]

3.98.g. No, \[P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.645 \neq 1\]

3.99.a. Let A be the event of someone ordering a vegetarian meal, and B will be the event of a customer being a student.
$$P(A \cap B) = P(A|B)P(B) = (0.25)(0.5)=0.125$$

3.99.b. $$P(B|A) = \frac{P(A|B)P(B)}{P(A)} = \frac{(0.25)(0.5)}{0.35} = 0.357$$

3.99.c. $$P(\overline{A} \cap \overline{B}) = 1-P(A \cup B) = 1-(P(A) + P(B) - P(A \cap B)) = 1 - (0.35 + 0.5 - 0.125) = 0.275$$

3.99.d. No, \[P(A|B) = 0.25 \neq 0.35 = P(A)\]

3.99.e. No, \[P(A \cap B) = 0.125 \neq 0\]

3.99.f. No, \[P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.725 \neq 1\]

3.100.a. Let A be ‘exceeds 160 acres’ and let B be ‘owned by persons over 50 years old’. \[P(A \cap B) = P(B|A)P(A) = (0.55)(0.2) = 0.11\]

3.100.b. $$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.2 + 0.6 - 0.11 = 0.69$$

3.100.c. $$P(A|B) = \frac{P(B|A)P(A)}{P(B)} = \frac{(0.55)(0.2)}{0.6} = 0.18333$$

3.100.d. No, \[ P(B|A) = 0.55 \neq 0.6 = P(B)\]

3.101.a. Let H be the event of an employee only having high school training, and let M be the event of an employee being male. \[P(H \cap M) = P(H|M)P(M) = 0.48\]

3.101.b. Let G be the event of an employee having graduate training and let W be the event of an employee being a woman. W and M are mutually exclusive and collectively exhaustive, so \[P(G) = P(G \cap M) + P(G \cap W) = P(G|M)P(M) + P(G|W)P(W) = (0.1)(0.8) + (0.15)(0.2) = 0.11\]
3.101.c. $$P(M|G) = \frac{P(G|M)P(M)}{P(G)} = \frac{(0.1)(0.8)}{0.11} = 0.727$$
3.101.d. No, for example, \[P(G|M) = 0.1 \neq 0.11 = P(G)\]
3.101.e. $$P(W|\overline{G})=\frac{P(W \cap \overline{G})}{P(\overline{G})} = \frac{P(W) - P(W \cap G)}{P(\overline{G})} = \frac{P(W) - P(G|W)P(W)}{P(\overline{G})} = 0.191$$

3.102.a. Let F be the event of an employee favoring the plan, W be the event of an employee being a woman, and N be the event of an employee being a night-shift worker. \[P(W \cap F) = P(F|W)P(W) = (0.4)(0.3) = 0.12\]

3.102.b. $$P(N \cup W) = P(N) + P(W) - P(N \cap W) = P(N) + P(W) - P(W|N)P(N) = 0.5 + 0.3 - (0.2)(0.5) = 0.7$$

3.102.c. No, \[P(W|N) = 0.2 \neq 0.3 = P(W)\]

3.102.d. $$P(N|W) = \frac{P(W|N)P(N)}{P(W)} = \frac{(0.2)(0.5)}{0.3} = 0.333$$

3.102.e. $$P(\overline{N} \cap \overline{F}) = 1 - P(N \cup F) = 1 - (P(N) + P(F) - P(N \cap F)) = 1 - P(N) - (P(F|W)P(W) + P(F|\overline{W})P(\overline{W})) + P(F|N)P(N) = 1 - 0.5 - (0.4)(0.3) - (0.5)(0.7) + (0.65)(0.5) = 0.355$$

3.103.a. \[\binom{16}{12} = \frac{16!}{12!(16-12)!} = 1820\]

3.103.b. The number of possibilities of 8 men and 4 women is \[\binom{8}{4} = \frac{8!}{4!(8-4)!} = 70\], and the number of possibilities of 7 men and 5 women is \[\binom{8}{7}\binom{8}{5} = \frac{8!}{7!(8-7)!} \cdot \frac{8!}{5!(8-5)!} = 448\], so in total there are 518 possibilities and the probability is \[\frac{518}{1820} = 0.28\]
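The counts in 3.103 can be verified with `math.comb`. The sketch assumes, as the answer above implies, a pool of 8 men and 8 women from which 12 people are chosen, with the event of interest being "at least 7 men".

```python
from math import comb

# 3.103.a: ways to choose 12 people out of 16.
total = comb(16, 12)

# 3.103.b: all 8 men with 4 of the 8 women, or 7 men with 5 women.
eight_men = comb(8, 8) * comb(8, 4)
seven_men = comb(8, 7) * comb(8, 5)

probability = (eight_men + seven_men) / total
print(total, eight_men, seven_men, round(probability, 4))
```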

3.104.a. \[\binom{12}{2} = \frac{12!}{2!(12-2)!} = 66\]

3.104.b. \[\frac{1}{12} + \frac{1}{12} = \frac{1}{6}\]

3.105. Let A be the event of a stock being up two years after it has been purchased, and let B be the event of Mr. Roberts receiving the first year bonus for a stock. Then we have the following: $$P(A) = 0.4 \Rightarrow P(\overline{A}) = 0.6$$ $$P(B|A) = 0.6$$ $$P(B|\overline{A}) = 0.4$$ $$P(B) = P(A \cap B) + P(\overline{A} \cap B) = P(B|A)P(A) + P(B|\overline{A})P(\overline{A}) = 0.48$$

3.106.a. Let C be the event of a patient being cured, and let T be the event of a patient receiving the treatment. $$P(C \cap T) = P(C|T)P(T) = (0.75)(0.1) = 0.075$$

3.106.b. $$P(T|C) = \frac{P(C|T)P(T)}{P(C)} = \frac{P(C|T)P(T)}{P(C|T)P(T) + P(C|\overline{T})P(\overline{T})} = \frac{(0.75)(0.1)}{(0.75)(0.1) + (0.5)(0.9)} = 0.1429$$

3.106.c. $$\frac{1}{\frac{100!}{10!(100-10)!}} = \frac{10!(100-10)!}{100!}$$

3.107.a. The probability of renewals (R) in January (J), is \[P(R|J) = (0.08)(0.81) + (0.41)(0.79)+ (0.06)(0.6) + (0.45)(0.21) = 0.5192\]

3.107.b. $$P(R|J) = (0.1)(0.8) + (0.57)(0.76)+ (0.24)(0.51) + (0.09)(0.14) = 0.6482$$
3.107.c. The probability of renewal has risen, but in most categories the percentage of renewals dropped. The drop was offset by the decrease in the share of subscriptions under subscription service, which have the lowest renewal rate. So in the long term this is not necessarily good news.
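The effect described in 3.107.c shows up when the two weighted sums are computed side by side. In the sketch below each pair is (share of subscriptions, renewal rate), read off the products in 3.107.a and 3.107.b; the names `period_a` and `period_b` are placeholders for the two periods being compared.

```python
# (share of subscriptions, renewal rate) per category, from 3.107.a and 3.107.b.
period_a = [(0.08, 0.81), (0.41, 0.79), (0.06, 0.60), (0.45, 0.21)]
period_b = [(0.10, 0.80), (0.57, 0.76), (0.24, 0.51), (0.09, 0.14)]


def overall_rate(mix):
    """Law of total probability: sum of share * rate over the categories."""
    return sum(share * rate for share, rate in mix)


# True for a category when its renewal rate fell between the two periods.
rate_fell = [rb < ra for (_, ra), (_, rb) in zip(period_a, period_b)]

print(round(overall_rate(period_a), 4), round(overall_rate(period_b), 4))
print(rate_fell)
```

The overall rate rises even though the per-category rates fall, because the mix shifts away from the category with the lowest rate.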

3.108. Let A be the event of a passenger carrying illegal amounts of liquor, and let B be the event of a passenger identified by the system. $$P(A|B) = \frac{P(B|A)P(A)}{P(B)} = \frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|\overline{A})P(\overline{A})} = \frac{(0.8)(0.2)}{(0.8)(0.2) + (0.2)(0.8)} = 0.5$$ It appears that the system produces the same results as picking passengers at random.

3.109. Let A be the event of a person having contracted the disease, and let B be the event of a positive test result. $$P(B|\overline{A}) = 1 - P(\overline{B}|\overline{A}) = 0.2$$ $$P(A|B) = \frac{P(B|A)P(A)}{P(B)} = \frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|\overline{A})P(\overline{A})} = \frac{(0.8)(0.08)}{(0.8)(0.08) + (0.2)(0.92)} = 0.258$$

3.110. Let A be the event of a sale, and let B be the event of an existing customer. $$P(A|B) = \frac{P(B|A)P(A)}{P(B)} = \frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|\overline{A})P(\overline{A})} = \frac{(0.7)(0.4)}{(0.7)(0.4) + (0.5)(0.6)} = 0.483$$

3.111. Let A be the event of a final A grade, and let B be the event of an A on the midterm. $$P(A|B) = \frac{P(B|A)P(A)}{P(B)} = \frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|\overline{A})P(\overline{A})} = \frac{(0.7)(0.2)}{(0.7)(0.2) + (0.1)(0.8)} = 0.63636$$

3.112.a. $$P(O_w|F_w) = \frac{P(O_w \cap F_w)}{P(F_w)} = \frac{0.149}{0.29} = 0.514$$

3.112.b. $$1 - P(O_i|F_i) = 1 - \frac{P(O_i \cap F_i)}{P(F_i)} = 1 - \frac{0.21}{0.391} = 0.463$$

3.113.a. Let G be the event of a student who will graduate, and let A be the event of an entering freshman. $$P(A \cap G) = P(G|A)P(A) = (0.62)(0.73) = 0.4526$$

3.113.b. $$P(G) = P(G|A)P(A) + P(G|\overline{A})P(\overline{A}) = (0.62)(0.73) + (0.78)(0.27) = 0.6632$$

3.113.c. $$P(A \cup G) = P(A) + P(G) - P(A \cap G) = 0.73 + 0.6632 - 0.4526 = 0.9406$$

3.113.d. No, \[P(G|\overline{A}) = 0.78 \neq 0.6632 = P(G)\]

3.114.a. Let S be the event of a store being successful, and let G be the event of a good assessment.
$$P(G) = P(G|S)P(S) + P(G|\overline{S})P(\overline{S}) = (0.7)(0.6) + (0.2)(0.4) = 0.5$$
3.114.b. $$P(S|G) = \frac{P(G|S)P(S)}{P(G)} = \frac{(0.7)(0.6)}{0.5} = 0.84$$
3.114.c. No, \[P(G|S) = 0.7 \neq 0.5 = P(G)\]

3.114.d. We’ll calculate the probability of no store being successful, as \[0.4^5 = 0.01024\], therefore, the probability of at least one being successful is \[1 - 0.01024 = 0.98976\]

3.115.a. Let A be the event of a customer ordering wine, and let \[C_r, C_o, C_n\] be the events that each type of customer visits. $$P(A) = P(A|C_r)P(C_r) + P(A|C_o)P(C_o) + P(A|C_n)P(C_n) = (0.7)(0.5) + (0.5)(0.4) + (0.3)(0.1) = 0.58$$

3.115.b. $$P(C_r|A) = \frac{P(A|C_r)P(C_r)}{P(A)} = \frac{(0.7)(0.5)}{0.58} = 0.603$$

3.115.c. $$P(C_o|A) = \frac{P(A|C_o)P(C_o)}{P(A)} = \frac{(0.5)(0.4)}{0.58} = 0.345$$

3.116.a. Let A be the event of a purchase and let the events of customers falling into the categories be \[C_h, C_c, C_o\]. $$P(A) = P(A|C_h)P(C_h) + P(A|C_c)P(C_c) + P(A|C_o)P(C_o) = (0.2)(0.3) + (0.6)(0.5) + (0.8)(0.2) = 0.52$$

3.116.b. $$P(C_h|A) = \frac{P(A|C_h)P(C_h)}{P(A)} = \frac{(0.2)(0.3)}{0.52} = 0.115$$
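When the sample space is partitioned into more than two events, the law of total probability and Bayes' theorem can be computed together. The following is a small sketch of that calculation; the function name `posteriors` and the dictionary keys are labels I chose for illustration.

```python
def posteriors(priors: dict[str, float], likelihoods: dict[str, float]) -> tuple[float, dict[str, float]]:
    """Return P(A) and P(C_i|A) for every event C_i of a partition."""
    # Joint probabilities P(A and C_i) = P(A|C_i) * P(C_i)
    joint = {name: likelihoods[name] * priors[name] for name in priors}
    total = sum(joint.values())  # law of total probability
    return total, {name: value / total for name, value in joint.items()}


# Exercise 3.116: priors P(C_i) and purchase probabilities P(A|C_i)
p_a, post = posteriors(
    {"C_h": 0.3, "C_c": 0.5, "C_o": 0.2},
    {"C_h": 0.2, "C_c": 0.6, "C_o": 0.8},
)
print(round(p_a, 2), round(post["C_h"], 3))  # 0.52 0.115
```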

3.117. $$\frac{\binom{8}{5}}{\binom{16}{5}} = \frac{56}{4368} = 0.013$$
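The ratio of binomial coefficients is easy to check with Python's standard library. This is only a verification sketch; the variable names are mine.

```python
from math import comb

favorable = comb(8, 5)   # ways to choose 5 items out of 8
possible = comb(16, 5)   # ways to choose 5 items out of 16
print(favorable, possible, favorable / possible)
```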

3.118.a. Let C be the event of someone being guilty of a crime, and let G be the event of someone wearing gloves. $$P(C|G) = \frac{P(G|C)P(C)}{P(G)} = \frac{P(G|C)P(C)}{P(G|C)P(C) + P(G|\overline{C})P(\overline{C})} = \frac{(0.6)(0.5)}{(0.6)(0.5) + (0.8)(0.5)} = \frac{3}{7}$$

3.118.b. A jury should not convict based on evidence that has such low accuracy.

3.119. Let the types of error events be \[E_d, E_m, E_o\] and let F be the event of a failure. $$P(E_d|F) = \frac{P(F|E_d)P(E_d)}{P(F)} = \frac{P(F|E_d)P(E_d)}{P(F|E_d)P(E_d) + P(F|E_m)P(E_m) + P(F|E_o)P(E_o)} = \frac{(0.6)(0.5)}{(0.6)(0.5) + (0.7)(0.3) + (0.3)(0.2)} = 0.526$$

3.120. Let N be the event of a new operating system introduced, and let G be the event of growth. $$P(G|N) = \frac{P(N|G)P(G)}{P(N)} = \frac{P(N|G)P(G)}{P(N|G)P(G) + P(N|\overline{G})P(\overline{G})} = \frac{(0.3)(0.7)}{(0.3)(0.7) + (0.1)(0.3)} = 0.875$$

3.121. Let D be the event of lumber having defects and let the event of lumber coming from each supplier be the events \[S_n, S_m, S_s\]. Then:
$$P(S_n) = P(S_n|D)P(D) + P(S_n|\overline{D})P(\overline{D}) = (0.3)(0.2) + (0.4)(0.8) = 0.38$$
$$P(S_m) = P(S_m|D)P(D) + P(S_m|\overline{D})P(\overline{D}) = (0.5)(0.2) + (0.2)(0.8) = 0.26$$
$$P(S_s) = P(S_s|D)P(D) + P(S_s|\overline{D})P(\overline{D}) = (0.2)(0.2) + (0.4)(0.8) = 0.36$$
$$P(\overline{D}|S_n) = 1 - P(D|S_n) = 1 - \frac{P(S_n|D)P(D)}{P(S_n)} = 1 - \frac{(0.3)(0.2)}{0.38} = 0.8421$$
$$P(\overline{D}|S_m) = 1 - P(D|S_m) = 1 - \frac{P(S_m|D)P(D)}{P(S_m)} = 1 - \frac{(0.5)(0.2)}{0.26} = 0.6154$$
$$P(\overline{D}|S_s) = 1 - P(D|S_s) = 1 - \frac{P(S_s|D)P(D)}{P(S_s)} = 1 - \frac{(0.2)(0.2)}{0.36} = 0.8889$$
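Because this exercise chains several steps (total probability per supplier, Bayes' theorem, then the complement), a short script helps to double-check the arithmetic. The names below are my own labels for the quantities defined in the solution.

```python
P_D = 0.2  # P(D): probability that lumber has defects

# P(S|D) and P(S|not D) for the three suppliers
P_S_GIVEN_D = {"S_n": 0.3, "S_m": 0.5, "S_s": 0.2}
P_S_GIVEN_NOT_D = {"S_n": 0.4, "S_m": 0.2, "S_s": 0.4}


def no_defect_given_supplier(supplier: str) -> float:
    """Return P(not D | S) for the given supplier."""
    joint_defect = P_S_GIVEN_D[supplier] * P_D
    # Law of total probability gives P(S)
    p_supplier = joint_defect + P_S_GIVEN_NOT_D[supplier] * (1 - P_D)
    # Bayes' theorem gives P(D|S); the complement is P(not D|S)
    return 1 - joint_defect / p_supplier


for name in P_S_GIVEN_D:
    print(name, round(no_defect_given_supplier(name), 4))
```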

3.122. Let R be the event of an acre with regular plowing and let H be the event of high yields. $$P(H|R) = 1 - P(\overline{H}|R) = 0.6$$ $$P(R|H) = \frac{P(H|R)P(R)}{P(H)} = \frac{P(H|R)P(R)}{P(H|R)P(R) + P(H|\overline{R})P(\overline{R})} = \frac{(0.6)(0.4)}{(0.6)(0.4) + (0.5)(0.6)} = 0.4444$$

The algorithm for calculating percentiles and quartiles presented in this chapter does not produce the same results as Google Spreadsheet does, and it appears that the same issue arises with MS Excel. I wrote the following Google Spreadsheet formulas for correctly calculating quartiles (assuming the dataset is sorted in ascending order in column A, starting at row 1):

Q1: =if(MOD(count(A:A) + 1, 4)=0, indirect("A"&(count(A:A) + 1)/4), indirect("A"&rounddown((count(A:A) + 1)/4)) + MOD((count(A:A) + 1)/4, 1)*(indirect("A"&roundup((count(A:A) + 1)/4)) - indirect("A"&rounddown((count(A:A) + 1)/4))))

Q3: =if(MOD(count(A:A) + 1, 4)=0, indirect("A"&(count(A:A) + 1)*0.75), indirect("A"&rounddown((count(A:A) + 1)*0.75)) + MOD((count(A:A) + 1)*0.75, 1)*(indirect("A"&roundup((count(A:A) + 1)*0.75)) - indirect("A"&rounddown((count(A:A) + 1)*0.75))))

I also found an answer to a similar question, which suggests different (though similar) formulas: https://superuser.com/a/343368
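For anyone working outside a spreadsheet, here is a minimal Python sketch of the same idea: take the value at ordered position p(n + 1), interpolating linearly by the fractional part of the position when it is not a whole number. The function names are my own, and the clamping at the ends of the dataset is an assumption I added for very small samples.

```python
def value_at_position(sorted_data, position):
    """Value at a 1-based, possibly fractional, ordered position."""
    lower = int(position)        # whole part of the position
    fraction = position - lower  # fractional part, used as the weight
    if lower < 1:                # assumption: clamp below the first value
        return sorted_data[0]
    if lower >= len(sorted_data):  # assumption: clamp above the last value
        return sorted_data[-1]
    low_value = sorted_data[lower - 1]
    high_value = sorted_data[lower]
    return low_value + fraction * (high_value - low_value)


def quartiles(data):
    """Return (Q1, median, Q3) using positions 0.25, 0.5 and 0.75 times (n + 1)."""
    ordered = sorted(data)
    n = len(ordered)
    return tuple(value_at_position(ordered, p * (n + 1)) for p in (0.25, 0.5, 0.75))


print(quartiles(range(1, 11)))  # (2.75, 5.5, 8.25)
```

As far as I can tell, `statistics.quantiles(data, n=4)` from the Python standard library, with its default `method='exclusive'`, follows the same (n + 1) convention and should agree with this sketch.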

2.1.a. The mean is 66, the median 75, and there’s no mode.

2.1.b. Because of the small outlier contained in the dataset, the median is the best choice for predicting future weekly specials. However, for examining past performances, such as gross revenue as a factor of weekly specials, the mean is still the better choice.

2.2.a. 12

2.2.b. 13

2.2.c. 8

2.3.a. 3.5

2.3.b. 3.55

2.3.c. 3.7

2.4.a. 5.94

2.4.b. 6.35

2.5.a. 17.75

2.5.b. 20.74

2.6.a. Mean is 10.1, the median is 10.5, and the mode is 11

2.6.b. 6 < 7.25 < 10.5 < 12.75 < 14

2.7. The mode and median are both zero, and the mean is 0.44.

2.8.a. 25.58

2.8.b. 22.5

2.8.c. 22

2.9.a. Q1 = 2.9825, Q3 = 3.3675

2.9.b. 3.1

2.9.c. 3.39

2.10.a. 8.54

2.10.b. 9

2.10.c. Comparing the mean and the median suggests a left-skewed distribution, but this is not accurate because this isn’t a continuous unimodal dataset. A visual examination of a histogram of this dataset indicates that it is right-skewed. The positive value of skewness confirms that this is the case.

2.10.d. 2 < 6 < 9 < 10.75 < 21

2.11.a. The mean volume is 236.99 mL, so the total volume of the 100-bottle sample is 23,699 mL, a small fraction less than the 23,700 mL implied by the advertised 237 mL per bottle.

2.11.b. The median volume is 237.

2.11.c. It’s difficult to tell the skewness from the shape of the histogram, and different class widths might visually indicate different results. The calculated skewness is 0.13, which confirms that the distribution is almost symmetric, although, being positive, it indicates a slight skew to the right.

2.11.d. 224 < 233.25 < 237 < 241 < 249

2.12.

s^2 = 5.14

s = 2.27

2.13.

s^2 = 20.3

s = 4.50

2.14. 17.57%

2.15.a. 28.77

2.15.b. 12.70

2.15.c. 44.15%

2.16.

| Stem | Leaf |
| --- | --- |
| 1 | 2,3,4,5,7,8,9 |
| 2 | 0,1,2,3,7,9 |
| 3 | 1,3,5,8 |
| 4 | 0,2,5,9 |
| 5 | 3 |
| 6 | 5 |

IQR = 38 – 18 = 20

2.17.a. Trick question. The variance is 25 and not the standard deviation, so the standard deviation is 5, and we need k=2 so that the interval would be [75 – 2*5, 75 + 2*5]. For k=2, Chebyshev’s theorem gives us: [1 – 1/(2^2)] * 100% = 75%.

2.17.b. Approximately 95% of observations are between 65 and 85.

2.18. The question says population, which implies the use of the empirical rule.

- Almost all observations, [230 – 3 * 20, 230 + 3 * 20]
- Approximately 95%, [230 – 2 * 20, 230 + 2 * 20]

2.19.a. The 68% within [425, 475], plus half of the 27% that lies inside [400, 500] but outside [425, 475], plus half of the 5% that lies outside [400, 500], so in total: (68 + 13.5 + 2.5)% = 84%

2.19.b. The 95% within [400, 500], plus half of the 5% that lies outside that interval, so (95 + 2.5)% = 97.5%

2.19.c. Almost none.

2.20.a. Common stocks have a mean annual percentage return of 8.16%. It should be noted that the real annual growth should be calculated with a geometric mean, but this is irrelevant to this exercise. U.S. Treasury Bills have a mean annual percentage return of 5.78%. From the perspective of annual returns alone (disregarding the risk of fluctuations), common stocks are a better investment, according to past performance.

2.20.b. The standard deviation for the annual percentage return on common stocks is approximately 22.30%. The standard deviation for the annual percentage return on U.S. Treasury Bills is approximately 1.47%. The standard deviation on stocks is much higher, implying a higher risk of fluctuations, which could partly explain the higher returns. We’ll need to examine the coefficients of variation to determine which investment is more worthwhile. The coefficient of variation on stocks is 273.41%, whereas that of treasury bills is 25.43%. This reinforces our assumption that stocks are riskier.

2.21.a. 26.8

2.21.b. 8.48266

2.21.c. 8.48266

2.21.d. 8.48266

2.21.e. 31.65%

2.22.a. The range is 0.54, the variance is 0.010, and the standard deviation is 0.10. Higher accuracy is hard to reach with Google Spreadsheet because of floating point calculation errors. My standard deviation was 0.1017 and my variance was 0.01034, the STDEV output was 0.1024 and the VAR output was 0.01048.

2.22.b. The IQR is 0.13. Given that it’s significantly lower than the range, we conclude that the dataset has either very high or very low outliers.

2.22.c. The coefficient of variation is 2.67%.

2.23.a. The mean is 261.0545

2.23.b. The variance is 306.4373 and the standard deviation is 17.5053

2.23.c. The coefficient of variation is 6.7%.
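The coefficient of variation is simply the standard deviation expressed as a percentage of the mean. A one-function sketch (the function name is mine) using the figures from 2.23.a and 2.23.b:

```python
def coefficient_of_variation(mean: float, std_dev: float) -> float:
    """Return the coefficient of variation as a percentage."""
    return std_dev / mean * 100


# Mean and standard deviation from 2.23.a and 2.23.b
print(round(coefficient_of_variation(261.0545, 17.5053), 1))  # 6.7
```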

2.24.a. The standard deviation is 1.0048

2.24.b. According to Chebyshev’s theorem we know that at least 75% lie within 2 standard deviations of the mean. However, because the standard deviation is very small relative to the mean, we can assume that it’s closer to the empirical rule’s 95%.

2.25. The mean is 52.64 and the standard deviation is 12.7147

2.26.a. 4.2

2.26.b. 4.5833

2.27.a. 101

2.27.b. The sample variance is 4195 and the sample standard deviation is 64.76.

2.28.

| # of Hours | fi | mi | fi*mi | mi – mean | (mi – mean)^2 | fi*(mi – mean)^2 |
| --- | --- | --- | --- | --- | --- | --- |
| 4 < 10 | 8 | 7 | 56 | -8.4 | 70.56 | 564.48 |
| 10 < 16 | 15 | 13 | 195 | -2.4 | 5.76 | 86.4 |
| 16 < 22 | 10 | 19 | 190 | 3.6 | 12.96 | 129.6 |
| 22 < 28 | 7 | 25 | 175 | 9.6 | 92.16 | 645.12 |

- Approximate mean = (56 + 195 + 190 + 175) / (8 + 15 + 10 + 7) = 15.4
- Approximate variance = (564.48 + 86.4 + 129.6 + 645.12) / (8 + 15 + 10 + 7 – 1) = 36.55, and the approximate standard deviation is 6.05.
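The grouped-data calculation is easy to get wrong by hand, so here is a small sketch that recomputes it from the class midpoints and frequencies. The function name is my own; it uses the sample (n – 1) divisor, as in the calculation above.

```python
def grouped_mean_variance(midpoints, frequencies):
    """Approximate mean and sample variance from grouped data."""
    n = sum(frequencies)
    mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n
    # Each squared deviation of a class midpoint is weighted by its frequency
    squared = sum(f * (m - mean) ** 2 for f, m in zip(frequencies, midpoints))
    return mean, squared / (n - 1)


mean, variance = grouped_mean_variance([7, 13, 19, 25], [8, 15, 10, 7])
print(round(mean, 2), round(variance, 2), round(variance ** 0.5, 2))
```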

2.29. 3.2251

2.30.a. Mean = 1.4

2.30.b. Sample variance = 23.8710 and standard deviation = 4.8857.

2.31.a. 9.36

2.31.b. 8.9063

2.32.a. 11.025

2.32.b. 0.9195

2.33. Mean = 1.654, and standard deviation = 10.6850.

2.34.a. 261.5454

2.34.b. 2735.3564

2.34.c. The exact mean was 261.0545, and we see that the approximate mean, although very close, is not precise. The variance was 306.4373 and now it’s 2735.3564, which indicates that the grouped approximation is much less precise for the variance.

2.35.a. 2.33

2.35.b. 0.9058

2.36.a. 1392.5

2.36.b. 0.9930

2.37.a. -45

2.37.b. -0.9

2.38.a. Cov = 4.2679

2.38.b. r = 0.1283

2.38.c. |r| < 2/sqrt(n), so there isn’t enough data to identify a linear correlation between the drug units and recovery times.

2.39.a. Cov = -5.5, r = -0.7760

2.39.b. |r| > 2/sqrt(n), so there is a linear correlation and it is negative, meaning that the higher level of service results in lower delivery times.
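The covariance, the correlation coefficient, and the |r| > 2/sqrt(n) rule of thumb used in these answers can be sketched as follows. The function names are mine, and the data at the end is made up purely for illustration; it is not the exercise's dataset.

```python
from math import sqrt


def covariance_and_r(xs, ys):
    """Return the sample covariance and the correlation coefficient r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov, cov / (s_x * s_y)


def has_linear_relationship(r, n):
    """Rule of thumb: a linear relationship exists if |r| > 2 / sqrt(n)."""
    return abs(r) > 2 / sqrt(n)


# Hypothetical data, for illustration only
xs, ys = [1, 2, 3, 4, 5], [10, 8, 7, 4, 2]
cov, r = covariance_and_r(xs, ys)
print(cov, r, has_linear_relationship(r, len(xs)))
```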

2.40.a. Cov = -20.75

2.40.b. r = -0.9366

2.41.a. 2.9072

2.41.b. 0.9617

2.42. r = 0.9300

2.43. Cov = 9.9642, r = 0.9852

2.44.a. mean = 18.1325

2.44.b. s^2 = 204.7017, s = 14.3074

2.45.a. 43.1

2.45.b. s = 10.1644

2.45.c. 20 < 35 < 45 < 50 < 60

2.46.

Location 2 variance: 52.62222222

Location 2 standard deviation: 7.254117605

Location 3 variance: 75.82222222

Location 3 standard deviation: 8.707595663

Location 4 variance: 22.27777778

Location 4 standard deviation: 4.719934086

2.47. Cov = 5.18954, r = 0.24499, there is no linear correlation.

2.48.a.

2.48.b. r = 0.5602

2.49. I mistakenly assumed that population #3 would be smaller than population #1. The real variances are:

Population #1: 6

Population #2: 14

Population #3: 7.14

Population #4: 54

2.50.a. [295 – 1.59*63, 295 + 1.59*63]

2.50.b. [295 – 2.5*63, 295 + 2.5*63]

2.51.a. [9.2 – 2.5*3.5, 9.2 + 2.5*3.5]

2.51.b. [9.2 – 3.5, 9.2 + 3.5]

2.52.a. [29,000 – 2*3000, 29,000 + 2*3000]

2.52.b. [29,000 – 2*3000, 29,000 + 2*3000]

2.53.a. IQR = 21.5, so if the dataset is bell-shaped, most of the employees completed the task within the same range of ~20 seconds.

2.53.b. 222 < 249.5 < 263 < 271 < 299

2.54.a. The mean is 41.6826.

2.54.b. s^2 = 284.3546, s = 16.8628.

2.54.c. 95th percentile = 70

2.54.d. Five number summary: 18 < 25.75 < 39 < 54.25 < 73

2.54.e. CV = 40.45%

2.54.f. 100*[1 – (1/k^2)]% = 90% => k = sqrt(10) = ~3.16, so we take k = 3.17, and the result is:

[41.68 – 3.17*16.86, 41.68 + 3.17*16.86]
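Chebyshev's theorem can be applied in both directions: from k to the minimum share of observations, or from a required share back to k. A minimal sketch of both, with function names of my own choosing:

```python
from math import sqrt


def chebyshev_min_fraction(k: float) -> float:
    """Minimum fraction of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is only informative for k > 1")
    return 1 - 1 / k ** 2


def chebyshev_k(fraction: float) -> float:
    """Smallest k for which at least `fraction` of observations are covered."""
    return sqrt(1 / (1 - fraction))


def interval(mean: float, std_dev: float, k: float) -> tuple[float, float]:
    """Interval of k standard deviations around the mean."""
    return mean - k * std_dev, mean + k * std_dev


print(chebyshev_min_fraction(2))  # 0.75
k = chebyshev_k(0.90)             # sqrt(10), roughly 3.16
print(interval(41.68, 16.86, 3.17))
```

Rounding k up to 3.17, as done above, keeps the guaranteed share at or above 90%.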

2.55.a. Cov = 16.55

2.55.b. r = 0.8653

2.56.a. Cov = 106.9333

2.56.b. r = 0.9887

1.1.a. Continuous numerical variable

1.1.b. Categorical variable with a nominal level of measurement

1.1.c. Categorical variable with an ordinal level of measurement

1.1.d. Discrete numerical variable

1.2.a. Categorical variable with a nominal level of measurement

1.2.b. Categorical variable with an ordinal level of measurement

1.2.c. Continuous numerical variable

1.3. Categorical with an ordinal level of measurement

1.4.a. Categorical variable with an ordinal level of measurement

1.4.b. Discrete numerical variable

1.4.c. Categorical variable with a nominal level of measurement

1.4.d. Categorical variable with a nominal level of measurement

1.5.a. Categorical variable with a nominal level of measurement

1.5.b. Discrete numerical variable

1.5.c. Categorical variable with a nominal level of measurement

1.5.d. Categorical variable with an ordinal level of measurement

1.6.a. Categorical variable with a nominal level of measurement

1.6.b. Discrete numerical variable

1.6.c. Categorical variable with a nominal level of measurement

1.6.d. Categorical variable with an ordinal level of measurement

1.7.a. Shift / Benefits?

1.7.b. Employee ID / Gender

1.7.c. Time (in seconds)

1.8.a. activity_level

1.8.b. col_grad, smoker,

1.8.c. BMI, daily_cost

1.8.d. hh_income_est, age

1.9.a

1.9.b.

1.10.

1.11.a.

1.11.b.

1.12.

1.13.

1.14.a.

1.14.b.

1.14.c.

1.15.a.

1.15.b.

1.15.c.

1.15.d.

1.16.

1.17.a.

1.17.b.

1.18.a.

1.18.b.

1.19.a.

1.19.b.

1.19.c. (Israel)

1.20.

1.21.

1.22.

1.23.a.

Data from: http://www2.census.gov/library/publications/2010/compendia/statab/130ed/tables/11s1002.xls

Looking at the TOC here: https://www.census.gov/eos/www/naics/2017NAICS/2017_NAICS_Manual.pdf

It’s clear that durable goods are aggregated on the “33, 321, 327” line and non-durable goods are aggregated on the “31, 32 (except 321 and 327)” line.

1.23.b.

Skipped a few exercises…

1.30.a. 6 (5-7)

1.30.b. 8 (7-8)

1.30.c. 8 (8-10)

1.30.d. 10 (8-10)

1.30.e. 11 (10-11)

1.31. The sample size is 110 observations, so for all subsections, we choose the number of classes to be k=8.

- upper((85-20)/8) = 9
- upper((190-30)/8) = 20
- upper((230-40)/8) = 24
- upper((500-140)/8) = 45
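The class width rule used here (range divided by the number of classes, rounded up) in a couple of lines of Python. `class_width` is a name I made up, and `ceil` plays the role of `upper` in the list above.

```python
from math import ceil


def class_width(smallest: float, largest: float, k: int) -> int:
    """Class width: the range divided by k classes, rounded up."""
    return ceil((largest - smallest) / k)


ranges = [(20, 85), (30, 190), (40, 230), (140, 500)]
print([class_width(low, high, 8) for low, high in ranges])  # [9, 20, 24, 45]
```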

1.32. Sample size is 28 observations, so we choose the number of classes to be k=6.

w=upper((65-12)/6)=9

a.

| Class | Frequency |
| --- | --- |
| 10-19 | 5 |
| 20-29 | 3 |
| 30-39 | 7 |
| 40-49 | 4 |
| 50-59 | 5 |
| 60-69 | 4 |

b.

c.

d.

| Stem | Leaf |
| --- | --- |
| 1 | 2,3,5,7 |
| 2 | 1,4,8 |
| 3 | 2,5,6,7,9 |
| 4 | 0,1,4 |
| 5 | 1,4,6,9 |
| 6 | 2,4,5 |

1.33

| Stem | Leaf |
| --- | --- |
| 1 | 0 |
| 2 | 3, 4, 6, 8, 9 |
| 3 | 0, 5, 6, 9 |
| 4 | 4, 5, 8 |
| 5 | 0, 2, 5 |
| 6 | 2, 7 |

1.34.a.

| Class | Relative Frequency |
| --- | --- |
| 0 < 10 | 16.3% |
| 10 < 20 | 20.4% |
| 20 < 30 | 26.5% |
| 30 < 40 | 24.5% |
| 40 < 50 | 12.2% |

1.34.b.

| Class | Cumulative Frequency |
| --- | --- |
| 0 < 10 | 8 |
| 10 < 20 | 18 |
| 20 < 30 | 31 |
| 30 < 40 | 43 |
| 40 < 50 | 49 |

1.34.c.

| Class | Cumulative Relative Frequency |
| --- | --- |
| 0 < 10 | 16.3% |
| 10 < 20 | 36.7% |
| 20 < 30 | 63.3% |
| 30 < 40 | 87.8% |
| 40 < 50 | 100% |
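The cumulative tables follow mechanically from the class frequencies. In the sketch below, the frequencies 8, 10, 13, 12 and 6 are the differences between consecutive entries of the 1.34.b table; the function name is my own.

```python
from itertools import accumulate


def cumulative_tables(frequencies):
    """Return cumulative frequencies and cumulative relative frequencies (%)."""
    cumulative = list(accumulate(frequencies))
    n = cumulative[-1]  # total number of observations
    cumulative_pct = [round(100 * c / n, 1) for c in cumulative]
    return cumulative, cumulative_pct


counts, percentages = cumulative_tables([8, 10, 13, 12, 6])
print(counts)       # [8, 18, 31, 43, 49]
print(percentages)  # [16.3, 36.7, 63.3, 87.8, 100.0]
```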

1.35.

Skipping 1.36 – 1.74 (the rest of the chapter).

- There are infinite potential integrations, so the work is unending.
- This diverts the attention of your development staff from what they do well (project management, in the case described above) to other types of work.
- Bugs are usually hard to control because you’re working against systems that you don’t host, and whose code you can’t see.

The need for integration tools aimed at software vendors derives from the scenario just described, which I like to call the integration pitfall. Application providers face pressure to offer built-in integrations with a growing set of related products, so that their own product is useful to new target audiences. Some tools were developed to assist these software providers, instead of targeting their end users directly.

Elastic.io is one such tool. It lets you connect with a variety of third-party services and map fields from these data sources to your own application with a simple drag-and-drop interface. In sum, Elastic.io is meant for non-technical users who want to extend the application they offer with built-in integrations. A very different approach is that of Nectil, a French service provider that developed its own software for web application integrations and offers small businesses tailored solutions on top of that framework.

Most other tools aim at the developer crowd, trying to help them consume APIs with greater ease and speed. Temboo provides SDKs for server-side API consumption. It supports a very large number of APIs and provides access to them through a unified interface. The service is SaaS-based, though cheap, and meant for low levels of consumption (anything really big would require custom pricing). Webshell is an API that lets you integrate different APIs from third-party services and create new functionality. It’s a JS library, which handles authentication for you and lets you construct your custom APIs using an editor. DERI Pipes is software that lets you create automated processes that transform and mash together web content (much like Yahoo! Pipes, but more technically complex).

Cumula is a PHP application framework that takes the idea even further. Based on the assumption that modern web applications no longer sit on top of a single on-premise database, but instead connect with a multiplicity of services, its authors have devised a framework that lets developers build applications with a unified interface for local and remote data sources. It’s modular, using components to describe parts of the application. Each component has its own routing and templating. A collection of DataStore and DataService packages is available as dependencies, included in the same GitHub repo. As of now, GitHub’s popularity metrics indicate very low adoption, and the same goes for the associated Google Group. Nevertheless, this is a relatively new project, and it has already received some media coverage.

Other platforms aim to help developers find and consume APIs, without altering the way they work or requiring the use of a specific framework. APIHub (renamed the Anypoint portal after being acquired by MuleSoft) and ProgrammableWeb are API explorers, in which one can quickly search for and learn about new APIs. Mashape does the same, although it focuses more on developer APIs and less on connecting with third-party applications. Mashape acts as a middleman between providers and consumers, relieving API consumers from having to maintain separate authentication processes with a multiplicity of remote services, and thus encouraging wider use of such services.

Edit: Since the time of writing this, a new contestant has emerged, called CloudElements, that provides a uniform API for accessing multiple cloud based applications.

For most of us, this is already a given truth, and looks like a rather obvious progression from previous technological advancements. However, at the same time, we also take for granted that the web applications we use for our day-to-day trade are contained silos, with their own special UIs and terminologies. Each of these contained silos holds a segment of our data, and forces us to sign in so we can view it, and to adapt to their idea of how to present our data to us.

Gradually, businesses are adopting data aggregators that simplify specific segments of their day-to-day work. For instance, a business that relies on PPC-based marketing might start using a PPC management tool to manage its entire ad spend from a single user interface. Now it no longer has to sign in to multiple services and cope with the varied complexities of each system.

With the adoption of such tools, comes a new expectation – that other aspects of their business be managed with the same simplicity. Each aspect of a business’s operation deserves a specific tool that would abstract away the complexities of the underlying systems. But then arises a new multiplicity of tools, which suffer from the same sorts of problems – multiple sign-ins, multiple UIs, and data that’s contained within specific apps, not usable anywhere else.

I propose a bigger revolution: a super-distribution of data at large. The principal logic behind this change is the recognition that building a true one-stop shop for any and every need of a business is unachievable. Instead, one should focus on enabling the distributed model, in which multiple complementary services are used in parallel. With this recognition in mind, I wouldn’t suggest that anyone try to replace the PPC management tool mentioned above, but rather complement it by enabling it to communicate with other tools meant for other specific aspects of a business’s operation.