Since probability is a model for uncertainty and a lack of information, one expects that probabilities change as new information comes to light. For instance, when playing poker, you would like to know how the probability of certain hands changes as various cards are revealed. Likewise, when modeling the weather, you would like to keep your models and predicted probabilities up to date with the current information.
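Conditional probability deals with how the probability of an event changes with certain information. In general, given two events $A$ and $B$, we write $P(A|B)$ for the probability of $A$ when $B$ is known to have occurred. Typically we read the above probability as "the probability of $A$ given $B$".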
Let's illustrate this first with an example:
Suppose you flip a fair coin three times. What is the probability of three heads?
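As we have seen already, the sample space is given by
$$S = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}.$$
Let $A = \{HHH\}$ be the event of getting three heads. Since all eight outcomes are equally likely, $P(A) = 1/8$.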
Now let's see what happens to the probabilities when you are given some information.
Suppose an oracle told you that the first coin flip is going to be heads. How does this change the probability of getting three heads?
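Since we know the first flip is going to be heads, we can update our sample space to the outcomes which have heads on the first toss. Namely, we can pretend our sample space is the event that the first flip is heads,
$$B = \{HHH, HHT, HTH, HTT\}.$$
Exactly one of these four equally likely outcomes consists of three heads, so the probability of three heads given this information is $1/4$.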
Indeed, by revealing that a head has occurred on the first flip, the probability of getting all heads has improved.
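The "shrink the sample space" reasoning in this example can be checked by brute-force enumeration. The following is a minimal sketch, assuming all eight outcomes are equally likely; the helper name `prob` and the variable names are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of three fair coin flips, e.g. "HHT".
sample_space = ["".join(flips) for flips in product("HT", repeat=3)]

def prob(event, space):
    """Probability of `event` when every outcome in `space` is equally likely."""
    favourable = [outcome for outcome in space if outcome in event]
    return Fraction(len(favourable), len(space))

three_heads = {"HHH"}

# With no extra information, the whole sample space is in play.
p_unconditional = prob(three_heads, sample_space)

# The oracle's information shrinks the sample space to outcomes starting with H.
first_is_heads = [outcome for outcome in sample_space if outcome[0] == "H"]
p_given_first_heads = prob(three_heads, first_is_heads)

print(p_unconditional)      # 1/8
print(p_given_first_heads)  # 1/4
```

Using `Fraction` keeps the probabilities exact, so they can be compared directly with the hand computation.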
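How do we make this more general? Here, it is useful to think of the conditioning event as a new, smaller sample space, with probabilities rescaled so that this event has total probability one.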
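The following definition makes this intuition more precise. Given two events $A$ and $B$ with $P(B) > 0$, define
$$P(A|B) = \frac{P(A \cap B)}{P(B)}.$$
The quantity $P(A|B)$ is called the conditional probability of $A$ given $B$.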
Can you obtain the result of Example 1 using the formula $P(A|B) = P(A \cap B)/P(B)$?
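Suppose you flip a fair coin three times. What is the probability of getting exactly two heads, given that the first flip is heads?
To solve this, define the events $A = \{HHT, HTH, THH\}$ (exactly two heads) and $B = \{HHH, HHT, HTH, HTT\}$ (the first flip is heads). Then $A \cap B = \{HHT, HTH\}$, so
$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{2/8}{4/8} = \frac{1}{2}.$$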
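The ratio $P(A \cap B)/P(B)$ can also be evaluated mechanically with sets. This is a small sketch under the assumption of equally likely outcomes; the function names `P` and `conditional` are hypothetical helpers introduced only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for three fair coin flips.
outcomes = {"".join(flips) for flips in product("HT", repeat=3)}

def P(event):
    """Probability of an event (a set of outcomes) under equally likely outcomes."""
    return Fraction(len(event), len(outcomes))

def conditional(a, b):
    """P(a | b) = P(a and b) / P(b); only defined when P(b) > 0."""
    if P(b) == 0:
        raise ValueError("cannot condition on an event of probability zero")
    return P(a & b) / P(b)

exactly_two_heads = {o for o in outcomes if o.count("H") == 2}
first_is_heads = {o for o in outcomes if o[0] == "H"}
all_heads = {"HHH"}

print(conditional(exactly_two_heads, first_is_heads))  # 1/2
print(conditional(all_heads, first_is_heads))          # 1/4
```

The second call reproduces the three-heads computation from the first coin example using the same formula.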
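The concept of conditional probability allows us to calculate the probability of the intersection of two events, using what is called the law of multiplication:
$$P(A \cap B) = P(A|B)P(B).$$
Note that, by swapping the roles of $A$ and $B$, we also have $P(A \cap B) = P(B|A)P(A)$.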
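This formula is very similar to the rule of products we used for counting, but it is in fact far more general, since it doesn't require outcomes in the sample space to be equally likely. Indeed, it is often useful to think of the sets $A$ and $B$ as describing two successive stages of an experiment: the probability that both occur is the probability of the first stage times the conditional probability of the second stage given the first.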
Suppose you draw two cards from a standard 52 card deck. What is the probability that the two cards are spades?
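In order to do this, we suppose that the two cards are drawn one after another and define two events: $A$, the first card is a spade, and $B$, the second card is a spade.
We are then interested in the probability of $A \cap B$. By the law of multiplication,
$$P(A \cap B) = P(A)P(B|A) = \frac{13}{52} \cdot \frac{12}{51} = \frac{1}{17},$$
since once a spade has been removed, 12 of the remaining 51 cards are spades.
We can compare this to our combinatorial approach based on the rule of products, using the fact that all two-card combinations are equally likely, which gives
$$P(A \cap B) = \frac{\binom{13}{2}}{\binom{52}{2}} = \frac{78}{1326} = \frac{1}{17}.$$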
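Both routes to the two-spades probability can be compared numerically. The sketch below assumes a standard deck with 13 spades among 52 cards; the variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

# Law of multiplication: P(first spade) * P(second spade | first spade).
p_first_spade = Fraction(13, 52)
p_second_spade_given_first = Fraction(12, 51)  # 12 spades left among 51 cards
p_multiplication = p_first_spade * p_second_spade_given_first

# Counting approach: choose 2 of the 13 spades out of all 2-card combinations.
p_counting = Fraction(comb(13, 2), comb(52, 2))

print(p_multiplication)  # 1/17
print(p_counting)        # 1/17
```

The two answers agree here because all two-card hands are equally likely; the multiplication route would still apply if they were not.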
The law of multiplication can be iterated to apply to multiple sets. For instance, given sets $A$, $B$ and $C$,
$$P(A \cap B \cap C) = P(A)P(B|A)P(C|A \cap B).$$
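The law of multiplication can often be applied systematically to find the probability of a certain event by breaking things down into various cases. Specifically, suppose that $A$ and $B$ are two events. Since $B$ and its complement $B^c$ are disjoint and together make up the whole sample space, $A$ is the disjoint union of $A \cap B$ and $A \cap B^c$.
Therefore, using the additive property of disjoint events and the law of multiplication, we can write
$$P(A) = P(A \cap B) + P(A \cap B^c) = P(A|B)P(B) + P(A|B^c)P(B^c).$$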
The second equality above is often referred to as the law of total probability. It is often convenient to represent this law graphically as a probability tree:
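The above tree is to be interpreted as follows: the top node of the tree corresponds to the entire sample space. The first level of branches splits the sample space into $B$ and $B^c$, with branch probabilities $P(B)$ and $P(B^c)$. The second level splits each of these cases according to whether $A$ occurs, with branch probabilities given by the corresponding conditional probabilities.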
The formula
$$P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)$$
is obtained by working from the bottom of the tree upwards along the red paths, multiplying probabilities on the way up and then adding the probabilities of each branch.
In the probability tree, we chose to include the event
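Naturally this can be generalized to $n$ disjoint events $B_1, \dots, B_n$ whose union is the whole sample space:
$$P(A) = \sum_{i=1}^{n} P(A|B_i)P(B_i).$$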
Suppose an urn contains 5 red balls and 2 green balls. You reach in and draw two balls out, one by one, without returning the balls to the urn. What is the probability that the second ball is red?
This problem is actually quite simple and doesn't require the law of total probability or conditional probability, since every ball is equally likely to be the second ball (why is this?). Therefore, since there are 7 balls and 5 of them are red, the probability that the second ball is red is just $5/7$.
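However, it is illustrative to see how to use the law of total probability to compute this.
Let's define some events: $R_1$, the first ball is red, and $R_2$, the second ball is red.
Of course, we know that $P(R_1) = 5/7$ and $P(R_1^c) = 2/7$, while $P(R_2|R_1) = 4/6$ and $P(R_2|R_1^c) = 5/6$, since the first ball is not returned. The law of total probability then gives
$$P(R_2) = P(R_2|R_1)P(R_1) + P(R_2|R_1^c)P(R_1^c) = \frac{4}{6} \cdot \frac{5}{7} + \frac{5}{6} \cdot \frac{2}{7} = \frac{30}{42} = \frac{5}{7}.$$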
This can be visualized using the following tree:
where we have highlighted the relevant branches red.
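The two branches of the urn tree can be added up in a few lines, and the result cross-checked by listing every ordered pair of distinct balls. This is a minimal sketch; the names `p_r1`, `p_g1` and so on are hypothetical labels for the branch probabilities, not notation from the text.

```python
from fractions import Fraction
from itertools import permutations

# Branch probabilities for an urn with 5 red and 2 green balls, no replacement.
p_r1 = Fraction(5, 7)           # first ball red
p_g1 = Fraction(2, 7)           # first ball green
p_r2_given_r1 = Fraction(4, 6)  # 4 red left among 6 balls
p_r2_given_g1 = Fraction(5, 6)  # 5 red left among 6 balls

# Law of total probability: multiply along each branch, then add the branches.
p_r2 = p_r2_given_r1 * p_r1 + p_r2_given_g1 * p_g1

# Cross-check: enumerate all ordered draws of two distinct balls (by index).
balls = ["R"] * 5 + ["G"] * 2
draws = list(permutations(range(len(balls)), 2))
second_is_red = [d for d in draws if balls[d[1]] == "R"]
p_r2_brute_force = Fraction(len(second_is_red), len(draws))

print(p_r2)              # 5/7
print(p_r2_brute_force)  # 5/7
```

The enumeration treats the 42 ordered pairs as equally likely, which is the same symmetry that makes the direct answer of 5/7 immediate.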
An example of the type above is often referred to as a probability urn model (Pólya's urn). It is a convenient toy model that serves as a general framework for many exercises in probability and can unify many probabilistic models (see Urn problem or Pólya urn model). For instance, we can model a fair coin flip by drawing from an urn with equally many red and green balls.
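As mentioned in the above example, one does not really need to use the law of total probability to show that the second ball is red with probability $5/7$. The next example changes the setup so that this shortcut is no longer available.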
Consider the same setup as Example 4. Suppose instead that after the first ball is drawn, a ball of the opposite color is added to the urn: if the first ball is green, a red ball is added, and if it is red, a green ball is added (while keeping the first ball you drew). Then the second ball is drawn. What is the probability that the second ball is red?
In this case the first and second balls are no longer equally likely to be red, since the contents of the urn change depending on the outcome of the first draw. Here the law of total probability becomes a very valuable tool for keeping track of the various dependencies. An easy calculation shows that the probability tree is now given by:
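Therefore, writing $R_1$ and $R_2$ for the events that the first and second balls are red, and reading "keeping the first ball" as leaving it out of the urn, we have
$$P(R_2) = P(R_2|R_1)P(R_1) + P(R_2|R_1^c)P(R_1^c) = \frac{4}{7} \cdot \frac{5}{7} + \frac{6}{7} \cdot \frac{2}{7} = \frac{32}{49}.$$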
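The bookkeeping for this modified urn is easy to get wrong by hand, so here is a hedged sketch that spells out each branch. It assumes the first ball stays out of the urn and a ball of the opposite color is added before the second draw; the function name `second_red_probability` is a hypothetical helper, and it assumes at least one ball of each color.

```python
from fractions import Fraction

def second_red_probability(red, green):
    """P(second ball is red) when the first ball is kept out of the urn and a
    ball of the opposite color is added before the second draw (assumed reading)."""
    total = red + green

    # Branch 1: first ball is red, so one red leaves and one green is added.
    p_r1 = Fraction(red, total)
    p_r2_given_r1 = Fraction(red - 1, (red - 1) + (green + 1))

    # Branch 2: first ball is green, so one green leaves and one red is added.
    p_g1 = Fraction(green, total)
    p_r2_given_g1 = Fraction(red + 1, (red + 1) + (green - 1))

    # Law of total probability: multiply along each branch, then add.
    return p_r2_given_r1 * p_r1 + p_r2_given_g1 * p_g1

print(second_red_probability(5, 2))  # 32/49
```

Note that 32/49 is smaller than 5/7 = 35/49: the more likely first outcome (red) is the one that makes a red second ball less likely.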
It is often the case that knowledge of one event has no effect on the probability of another event. This is known in probability as independence.
Outcomes of successive coin flips are generally treated as independent events, since the outcome of one flip should not affect the outcome of the second flip.
The weather at two significantly different times could be considered independent. For instance, knowing that it rained today generally doesn't have any bearing on the probability that it will rain a month from today. This is a hallmark property of chaos.
The concept of independence is an approximate property. It is used for its mathematical and conceptual convenience. In practice, it is unlikely that two events are actually independent, since there are often many convoluted factors that can lead one event to depend on the other. However, for many events, like successive coin flips, the dependence between two events is so weak and so convoluted that for all intents and purposes they are independent.
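We can make the idea of independence more precise using conditional probability. If knowing that $B$ occurred has no effect on the probability of $A$, then $P(A|B) = P(A)$.
In general, using the definition of conditional probability, this condition can be rewritten as
$$P(A \cap B) = P(A)P(B),$$
and two events $A$ and $B$ satisfying this identity are called independent.
Using the definition of conditional probability it is easy to see that this identity also gives $P(B|A) = P(B)$ whenever $P(A) > 0$.
Therefore independence is symmetric in $A$ and $B$.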
Can two mutually exclusive events be independent?
The definition
Suppose you roll a fair six sided die. Define the following events based on
Are
Let's use definition
Therefore, since
For
Independence is a property of the probability measure and not just the events in question. In general, one cannot decide whether two events are independent without knowing the probability model used. Indeed, any two events with a non-empty overlap can be made independent with the right assignment of probabilities. The next example illustrates this.
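Before turning to that example, the dependence on the probability model can be seen in a quick numerical sketch. The two events below (an even roll, and a roll of at most four) and the loaded-die weights are hypothetical choices made purely for illustration; they are not the events used in the text.

```python
from fractions import Fraction

def prob(event, measure):
    """Probability of an event (a set of faces) under a given probability measure."""
    return sum((measure[face] for face in event), Fraction(0))

def independent(a, b, measure):
    """Check the product rule P(a and b) == P(a) * P(b) exactly."""
    return prob(a & b, measure) == prob(a, measure) * prob(b, measure)

# Hypothetical events on a six-sided die (illustrative only).
even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}

# Model 1: a fair die.
fair = {face: Fraction(1, 6) for face in range(1, 7)}

# Model 2: an assumed loaded die that shows 6 half of the time.
loaded = {face: Fraction(1, 10) for face in range(1, 6)}
loaded[6] = Fraction(1, 2)

print(independent(even, at_most_four, fair))    # True:  1/3 == 1/2 * 2/3
print(independent(even, at_most_four, loaded))  # False: 1/5 != 7/10 * 2/5
```

The same two sets of faces are independent under one measure and dependent under the other, which is exactly why the probability model has to be specified.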
Consider the following Venn diagram for sets
The probabilities of various non-overlapping regions are labeled, namely
Are
Here we can find