Probability rules

FW8051 Statistics for Ecologists

Learning objectives

Understand and be able to work with basic rules of probability.

Understand and be able to work with probability distributions in R.

Needed for:

  • Maximum Likelihood and Bayes Theorem; the latter is fundamental to Bayesian statistics.

  • Formulating models using different probability distributions.

Summary of Probability Rules

  1. \(P(A \mbox{ and } B) = P(A)P(B \mid A) = P(B)P(A \mid B)\)
  2. \(P(A \mbox{ or } B) = P(A) + P(B) - P(A \mbox{ and } B)\)

Visualization of the P(A or B) rule from Whitlock and Schluter's book.

  3. \(P(\mbox{not } A) = 1-P(A)\)
  4. \(P(A \mid B) =\frac{P(A \mbox{ and } B)}{P(B)}\)

(probability of “A given B”)

Visualization of the P(A or B) rule from Whitlock and Schluter's book.

[Think-Pair-Share] Black Bear Diet Analysis

You are analyzing scat samples from a population of black bears to understand their seasonal diet.

  • 50% of the samples contain Berries (Event \(B\)).
  • 45% of the samples contain Fish (Event \(F\)).
  • 37% of the samples contain both Berries and Fish (Event \(B \text{ and } F\)).

If you randomly select a scat sample that contains Berries, what is the probability that it also contains Fish?

Want to find: \(P(F \mid B)\)

\(= \frac{P(F \mbox{ and } B)}{P(B)}\)

\(= \frac{0.37}{0.50} = 0.74\)
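The calculation above can be checked numerically. The following is a minimal Python sketch; the helper name `conditional_probability` is introduced here for illustration and is not part of the course materials.

```python
def conditional_probability(p_joint, p_given):
    """Return P(A | B) = P(A and B) / P(B)."""
    if p_given <= 0:
        raise ValueError("P(B) must be positive to condition on B")
    return p_joint / p_given

p_berries = 0.50           # P(B)
p_fish_and_berries = 0.37  # P(F and B)

p_fish_given_berries = conditional_probability(p_fish_and_berries, p_berries)
print(round(p_fish_given_berries, 2))  # 0.74
```

Note that conditioning on Berries raises the probability of Fish from 0.45 to 0.74, so the two diet items tend to occur together in these samples.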

Bycatch [Think-Pair-Share]

You are an observer on a commercial vessel targeting Yellowfin Tuna. You are monitoring hauls for the presence of Tuna and protected Sea Turtles.

Based on your observer logs:

  • 20% of the hauls contain Tuna (Event \(T\)).
  • 5% of the hauls include a Sea Turtle (Event \(S\)).
  • 1% of the hauls include both Tuna and a Sea Turtle (Event \(T \text{ and } S\)).

What percent of the hauls include neither Tuna nor a Sea Turtle?

\(P(\mbox{not } (T \mbox{ or } S))\)

\(= 1 - P(T \mbox{ or } S)\)

\(= 1 - [P(T) + P(S) - P(T \mbox{ and } S)]\)

\(= 1 - [0.20 + 0.05 - 0.01] = 0.76\), so 76% of the hauls include neither.
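The two rules used here (the general addition rule, then the complement rule) can be chained in a few lines. This is a small Python sketch; the function name `prob_or` is an assumption made for illustration.

```python
def prob_or(p_a, p_b, p_a_and_b):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

p_tuna, p_turtle, p_both = 0.20, 0.05, 0.01

p_either = prob_or(p_tuna, p_turtle, p_both)  # P(T or S)
p_neither = 1 - p_either                      # complement rule
print(f"{p_neither:.0%}")  # 76%
```

Subtracting the overlap matters: simply adding 0.20 and 0.05 would count the hauls containing both twice.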

[Think-Pair-Share]

A wildlife biologist surveys 100 different plots, looking for pheasants. Suppose:

  • 30% of the plots contain pheasants.

  • The biologist has a 60% chance of detecting pheasants when they are present.

On what percentage of the plots should we expect the wildlife biologist to see a pheasant?

P(present and seen) = P(seen \(|\) present)P(present)

= 0.6 x 0.3 = 0.18, or about 18% of the plots
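One way to build intuition for this multiplication rule is to simulate many plots. The sketch below is illustrative only: it assumes no false detections (a pheasant can be seen only where pheasants are present), and the function name, number of simulated plots, and seed are arbitrary choices.

```python
import random

def simulate_seen_fraction(n_plots, p_present, p_detect, seed=1):
    """Fraction of simulated plots where pheasants are present AND detected."""
    rng = random.Random(seed)
    seen = 0
    for _ in range(n_plots):
        present = rng.random() < p_present
        # Assumption: detection is only possible when pheasants are present.
        if present and rng.random() < p_detect:
            seen += 1
    return seen / n_plots

estimate = simulate_seen_fraction(n_plots=100_000, p_present=0.3, p_detect=0.6)
print(estimate)  # should be close to 0.6 * 0.3 = 0.18
```

With only 100 plots, as in the survey described above, the observed fraction would vary noticeably from one survey to the next around 0.18.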

Mutually Exclusive Events

Two events are mutually exclusive if they cannot both be true: P(A and B) = 0.

Sample Space

  • Cause of deer fawn mortality: {wolf, cougar, bear, other}
  • Location: {inside protected area, outside of a protected area} (disjoint areas)

What about:

  • Mammal and lays eggs: Not mutually exclusive (the platypus and four species of echidna are egg-laying mammals)
  • Teeth and feathers: Yes, mutually exclusive

P(A or B) = P(A) + P(B) for mutually exclusive events.

Mutually exclusive events

You are tracking radio-collared elk calves in Yellowstone to determine what is driving population declines. You classify the primary cause of death for each calf.

Let’s assume the probabilities for a specific season are:

  • 15% chance of being killed by a Wolf (Event \(W\)).
  • 10% chance of being killed by a Bear (Event \(B\)).
  • 5% chance of being killed by a Cougar (Event \(C\)).

What is the probability that a calf is killed by any predator? (i.e., Wolf OR Bear OR Cougar).

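Each calf has a single primary cause of death, so the three events are mutually exclusive:

P(W or B or C) = P(W) + P(B) + P(C) = 0.15 + 0.10 + 0.05 = 0.30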

Independence

Events A and B are independent if \(P(A \mid B) = P(A)\)


Intuitively, knowing that event B happened does not change the probability that event A happened.


If A and B are independent then:

\(P(A \mbox{ and } B) = P(A)P(B \mid A) = P(A)P(B)\)


We will use this rule to construct Likelihoods!
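As a preview of how the product rule builds a likelihood, the Python sketch below multiplies the probabilities of independent yes/no outcomes. The data and the function name `bernoulli_likelihood` are hypothetical, chosen only to illustrate the idea.

```python
import math

def bernoulli_likelihood(p, outcomes):
    """Likelihood of independent 0/1 outcomes, each with success probability p.

    Independence lets us multiply the probabilities of the individual outcomes.
    """
    return math.prod(p if y == 1 else 1 - p for y in outcomes)

# Hypothetical data: 1 = pheasant seen on the plot, 0 = not seen.
outcomes = [1, 0, 0, 1, 0]

for p in (0.2, 0.4, 0.6):
    print(p, bernoulli_likelihood(p, outcomes))
```

Among the three values tried, the likelihood is largest at p = 0.4, which matches the observed proportion of plots with a sighting (2 of 5).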

Summary of Special Cases

If events A and B are mutually exclusive:

  • \(P(A \mbox{ or } B) = P(A) + P(B)\)
  • \(P(A \mbox{ and } B) = 0\)

If events A and B are independent:

  • \(P(A \mid B) = P(A)\)
  • \(P(A \mbox{ and } B) = P(A)P(B)\)
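The product rule also gives a quick check for independence. The Python sketch below applies it to the numbers from the black bear and bycatch exercises; the helper `is_independent` and its tolerance are assumptions made for illustration.

```python
import math

def is_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """True if P(A and B) equals P(A) * P(B) up to a small tolerance."""
    return math.isclose(p_a_and_b, p_a * p_b, abs_tol=tol)

# Black bear scat: P(B) = 0.50, P(F) = 0.45, P(B and F) = 0.37
print(is_independent(0.50, 0.45, 0.37))  # False: 0.50 * 0.45 = 0.225, not 0.37

# Bycatch: P(T) = 0.20, P(S) = 0.05, P(T and S) = 0.01
print(is_independent(0.20, 0.05, 0.01))  # True: 0.20 * 0.05 = 0.01
```

In the bycatch example, knowing that a haul contains Tuna does not change the probability that it contains a Sea Turtle, whereas Berries and Fish in bear scat are clearly associated.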

Law of Total Probability

If events \(B_1, B_2, \ldots, B_k\) are mutually exclusive and together make up all possibilities, then:

\(P(A) = \sum_iP(A|B_i)P(B_i)\)

Visualization of the law of total probability.

Special Case: \(P(A) = P(A | B)P(B) + P(A | \mbox{not } B)P(\mbox{not } B)\)

Jewel wasp: from Whitlock and Schluter Example 5.8

Picture of a Jewel Wasp from Whitlock and Schluter's book.

Females can manipulate the sex of the eggs they lay:

  • In previously parasitized hosts, females lay more male eggs
  • In other hosts (not previously parasitized), females lay more female eggs

Jewel Wasp [Exercise]

Suppose:

  • When a wasp finds a host, there is a 0.20 probability another wasp has already laid eggs in it
  • If the host is unparasitized, the female lays a male egg with prob = 0.05 (and female egg with prob = 0.95)
  • If the host already has eggs, the female lays a male egg with prob = 0.90 (and female egg with prob = 0.10)

Use the law of total probability to determine P(sex of new egg is male).

\(P(A) = \sum_iP(A|B_i)P(B_i)\)

Hint: let A = {male}, B = {previously parasitized, not previously parasitized}

Jewel Wasp

Probability(sex of new egg is male)

= P(male and previously parasitized) + P(male and not previously parasitized)

= P(male | previously parasitized)P(previously parasitized) + P(male | not previously parasitized)P(not previously parasitized)

= 0.90 x 0.20 + 0.05 x 0.80 = 0.18 + 0.04 = 0.22
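The same sum can be written as a short, reusable function. This is a Python sketch under the stated setup; `total_probability` is a name chosen here for illustration, and the check that the \(P(B_i)\) sum to 1 reflects the requirement that the \(B_i\) cover all possibilities.

```python
def total_probability(conditionals, priors):
    """Law of total probability: P(A) = sum_i P(A | B_i) P(B_i)."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("the P(B_i) must sum to 1")
    return sum(c * p for c, p in zip(conditionals, priors))

# Order: [previously parasitized, not previously parasitized]
p_host = [0.20, 0.80]             # P(B_i)
p_male_given_host = [0.90, 0.05]  # P(male | B_i)

p_male = total_probability(p_male_given_host, p_host)
print(round(p_male, 2))  # 0.22
```

Because the function takes lists, it works unchanged when there are more than two groups.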

Tree Diagram

Branching tree diagram representing different potential events (host parasitized, yes or no), then sex of the egg conditional on the host being parasitized or not. The probability of each branch is determined by multiplying the probability of the first event (host parasitized or not) by the conditional probability of the second event (sex of the egg being male or female, conditional on parasitism status).

Bayes Theorem

Let \(\bar{A}\) = not(A)


\(P(A \mid B) = \frac{P(A \mbox{ and } B)}{P(B)}\)

\(= \frac{P(B \mid A)P(A)}{P(B \mbox{ and } A) + P(B \mbox{ and } \bar{A}) }\)

\(= \frac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid \bar{A})P(\bar{A}) }\)


The last two expressions can be extended to more than two groups using the law of total probability.
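As a sketch of that extension, suppose \(A_1, \ldots, A_k\) are mutually exclusive and together make up all possibilities (the indexed notation \(A_j\) is introduced here for illustration). Applying the law of total probability to the denominator gives:

```latex
P(A_j \mid B) = \frac{P(B \mid A_j)P(A_j)}{\sum_{i=1}^{k} P(B \mid A_i)P(A_i)}
```

With \(k = 2\), \(A_1 = A\), and \(A_2 = \bar{A}\), this reduces to the two-group form of Bayes Theorem.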

Example: Trisomy 21 or Down Syndrome

Caused by an extra copy of chromosome 21.

  • 1 in 800 children have Down Syndrome, i.e., \(P(D) = 1/800 = 0.00125\)
  • A multiple-marker screening test can be performed in the second trimester of pregnancy
  • False Positive: \(P(+ | \bar{D}) = 0.05\)
  • False Negative: \(P(- | D) = 0.19\)

Given that one tests positive, what is the probability that the fetus has Down Syndrome? \(P(D | +)\)

Use Bayes rule with \(A = D\) and \(B = +\): \(P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid \bar{A})P(\bar{A}) }\). It might also help to draw a probability tree.

Probability Tree

Branching tree diagram representing different potential events (Down syndrome, yes or no) and the result of the test conditional on having or not having Down syndrome. The probability of each branch is determined by multiplying the probability of the first event (Down syndrome, yes or no) by the conditional probability of the second event (testing positive given one has or does not have Down syndrome).

\(P(D \mid +) = P(D \mbox{ and } +)/P(+)\), where \(P(+ \mid D) = 1 - P(- \mid D) = 0.81\)

\(= \frac{0.81 \times 0.00125}{0.81 \times 0.00125 + 0.05 \times 0.99875}\)

= 0.0010125/[0.0010125 + 0.0499375] = 0.02 (approximately)
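The screening-test calculation can be reproduced with a small function for the two-group form of Bayes rule. This is a Python sketch; the function and variable names are assumptions made for illustration.

```python
def bayes_posterior(prior, p_pos_given_true, p_pos_given_false):
    """P(A | B) for a two-group problem via Bayes rule.

    prior             = P(A)
    p_pos_given_true  = P(B | A)
    p_pos_given_false = P(B | not A)
    """
    numerator = p_pos_given_true * prior
    denominator = numerator + p_pos_given_false * (1 - prior)
    return numerator / denominator

p_d = 1 / 800            # P(D)
p_pos_given_d = 1 - 0.19  # P(+ | D) = 1 - P(- | D)
false_positive = 0.05     # P(+ | not D)

p_d_given_pos = bayes_posterior(p_d, p_pos_given_d, false_positive)
print(round(p_d_given_pos, 4))  # 0.0199
```

Even with a positive test, the probability of Down Syndrome is only about 2%, because the condition is rare and false positives from the much larger unaffected group dominate the positive results.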

Let's Make a Deal

Picture of 3 closed doors.

  • 3 doors (2 goats and 1 car)
  • Monty knows where the car is, but you don’t
  • You pick a door and Monty opens one of the remaining doors holding a goat.
  • Should you switch doors?

Let's Make a Deal

Picture of a car (far left) and two goats.

Answer

4 options, determined by 2 decisions

Step 1:

  • You choose the door with the car behind it.
  • You choose the door without the car behind it.

Step 2:

  • You switch your choice
  • You do not switch your choice.

Let's Make a Deal

Branching diagram for the Monty Hall problem with the first branch representing the initial choice (door hides a car or a goat) and the second branch representing whether the individual switches doors or not.

Probability Tree

Branching diagram for the Monty Hall problem with the first branch representing the initial choice (door hides a car or a goat) and the second branch representing whether the individual switches doors or not. Probabilities are associated with each choice along each branch.

P(Win \(|\) Switch) = 0 + 2/3 = 2/3

P(Win \(|\) do not Switch) = 1/3 + 0 = 1/3


P(win \(|\) switch) = P(win \(|\) switch, car first)P(car first) + P(win \(|\) switch, goat first)P(goat first) = 0 x 1/3 + 1 x 2/3 = 2/3

P(win \(|\) stay put) = P(win \(|\) stay put, car first)P(car first) + P(win \(|\) stay put, goat first)P(goat first) = 1 x 1/3 + 0 x 2/3 = 1/3
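If the result still feels counterintuitive, a simulation can help. The Python sketch below plays the game many times under the rules stated earlier (the host always opens a goat door that is not your pick); the function names, number of games, and seed are arbitrary choices made for illustration.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a goat door that is neither the pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, n_games=100_000, seed=42):
    """Proportion of simulated games won under a fixed strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(n_games)) / n_games

print(win_rate(switch=True))   # should be close to 2/3
print(win_rate(switch=False))  # should be close to 1/3
```

Switching wins exactly when the first pick was a goat, which happens with probability 2/3, and the simulated win rates should reflect that.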

For some interesting comments on the problem, see this website