Welcome to calculus.
I'm Professor Ghrist. We're about to begin lecture 43 on
probability densities. We've seen that fair or uniform
probabilities lead to geometry, to counting, length, area and volume.
But what happens when probability is not fair?
In this lesson, we'll define and describe probability density functions.
In our last lesson, we computed probabilities under the assumption of
fairness. Namely, that any point is as likely as
any other point to be chosen at random. This is not always a good assumption.
There are many instances where there is a bias.
Where certain outcomes are more likely than others.
This bias is encoded in the notion of a probability density function, sometimes
called a PDF. This is a function on a domain that tells
you which outcomes are more likely than others, such as exam scores or heights.
We define a probability density function rho as a function that satisfies the
following two criteria. First, rho is non-negative.
And second, the integral of rho is equal to 1.
We have to specify a little bit more. Namely, a domain D on which we are
discussing the PDF. So, in particular, the integral of rho
over D equals 1. Now, that's the definition, but it's
certainly not a very intuitive definition.
What does it mean? Well, before answering that, let's
consider a specific example in the context of a collection of light bulbs.
These light bulbs will eventually fail. But the question is, when?
It happens with some sort of randomness. But how is this randomness regulated?
Well, there's some underlying probability density function.
Let's assume that it is exponential. That is, the light bulb is more
likely to fail early, and less likely to fail later on.
This would be a function rho of t of the form e to the minus alpha t,
where t is time and alpha is some positive constant.
Is this a PDF? Well, it certainly satisfies the
first criterion: it is non-negative.
As for the second criterion, let's specify a domain D for the time as 0 to
infinity, then in this case, what would the integral over this domain be?
Well, integrating an exponential function is easy enough.
This gives e to the minus alpha t times negative 1 over alpha, evaluated
from t equals 0 to infinity; we get 1 over alpha.
This is not going to work unless, of course, alpha is equal to 1.
So, what we could do is modify the PDF by multiplying by a coefficient
of alpha out in front.
If we do that, then the integral is going to be equal to 1.
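As a quick sanity check (not part of the lecture), a few lines of Python can verify numerically that the normalized density alpha e to the minus alpha t integrates to 1; the choice alpha = 2 is arbitrary.

```python
import math

# Numerical sketch: rho(t) = alpha * e^(-alpha * t) should integrate
# to 1 over [0, infinity). The value alpha = 2 is an arbitrary choice.
alpha = 2.0

def rho(t):
    return alpha * math.exp(-alpha * t)

# Truncate the infinite domain at T = 40; the tail beyond is negligible.
T, n = 40.0, 400_000
dt = T / n
# Midpoint rule for the integral of rho over [0, T].
integral = sum(rho((i + 0.5) * dt) * dt for i in range(n))

print(abs(integral - 1.0) < 1e-6)  # True
```

Any positive alpha gives the same answer, which is exactly what the coefficient out in front is for.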
Now, that's a good example of a PDF but we still don't know quite what it means.
Well, let's consider that meaning in the context of fairness, which we already
have some experience with. Fairness connotes a uniform density
function. That means a PDF that is constant on the
domain. What would that constant be?
Well, the integral of rho over D equals rho, this constant, times
the volume of the domain. Now, in order to be a PDF, this has to
satisfy that the integral equals 1. So, what does that tell us about rho?
This constant must be 1 over the volume of the domain.
Let's see what that looks like in the context of the domain being an interval,
let's say, from a to b. In this case, rho is 1 over the length of
this interval. That is 1 over b minus a.
What would it look like in the case of a discrete or zero-dimensional domain?
Well, let's say we had a die, single die. Then, the domain consists of six points,
the different outcomes for the faces. The PDF would be one over the volume of
this domain. Volume in dimension zero being simply
counting. This means that rho is equal to the
constant one sixth. If we had a different discrete set,
let's say for flipping a coin, then since we only have two points in that
domain, heads and tails, rho would be equal to 1 over 2, or one half.
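The discrete uniform case is easy to write down in code; here is a minimal sketch using exact fractions.

```python
from fractions import Fraction

# Uniform density on a discrete set of n points: rho is the constant 1/n.
die = [1, 2, 3, 4, 5, 6]
coin = ["heads", "tails"]

rho_die = Fraction(1, len(die))    # 1/6 for each face
rho_coin = Fraction(1, len(coin))  # 1/2 for each side

print(rho_die, rho_coin)  # 1/6 1/2
```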
Now, consider this one carefully because what we have in general is that for a
discrete set of n points, rho, a uniform density is a constant 1 over n.
In the case of, say, flipping a coin, notice that the value of rho is precisely
the probability of getting that outcome. You have a 50-50 chance of getting
heads. If you roll a six-sided die, your probability of landing on any one
outcome is one sixth.
Notice, also, what happens if we want to
consider the probability of landing in a collection of outcomes.
Let's say, what's the probability of getting four or five?
Well, we would add up these values of rho.
One sixth plus one sixth is one third. Now, does that intuition carry over into
the continuous case? No, the probability of landing at any
single point in an interval is not one over the length of that interval, not at
all. However, if we take a sub-interval, then
we can make sense of the probability in terms of lengths.
Consider the question: with what probability does a randomly chosen point in the
domain D lie within a subset A of D? We have already answered this question.
In the case, of a uniform probability density function, we know that the
probability of landing in A is the volume fraction.
That is the volume of A divided by the volume of D.
We could write that as the integral over the domain A of 1 over the volume of D.
But that is precisely the integral of the uniform PDF rho, that constant 1 over
the volume of D, but integrated over A, not over all of D.
This leads us to consider, more generally, the formula that the
probability, capital P, of landing in A with a point chosen at random
is the ratio of the integral of rho over A to the integral of rho over D. This
explains why we want the integral of the PDF rho over all of D to be equal to 1:
so that we can simply write the probability of landing in A as the
integral of the PDF over the sub-domain A.
This holds in the uniform case, but it also holds in general.
If we have a non-constant PDF, and we want to know what is the probability of
lying or landing in subset A, we integrate the probability element.
That is, rho of x dx over the domain A. Let us interpret these results in the
simple case of the domain being the interval from a to b, given our PDF rho.
What is the probability that a randomly chosen point in that domain lies between
a and b? By our definition, this probability, P,
is the integral of rho of x dx, as x goes from a to b.
Well, that integral is by definition 1. What does that mean?
When you see a probability of 1, that means yes, it will happen.
Let's keep going. What's the probability that a randomly
chosen point is exactly a? Well, that probability is the integral of
rho of x, dx, as x goes from a to a. From what we know about integrals, that
is equal to 0. When you have a probability of zero, this
means no, it's not going to happen. What's the probability that a randomly
chosen point is closer to a than to b? Well, we would simply integrate rho of
x dx from the left endpoint a to the midpoint of the domain.
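For a uniform density this integral is just height times width; a short sketch with arbitrarily chosen endpoints a = 2, b = 10:

```python
# Uniform density on [a, b]: rho = 1/(b - a). The probability that a
# random point is closer to a than to b is the integral of rho from a
# to the midpoint (a + b)/2. Endpoints are arbitrary choices.
a, b = 2.0, 10.0
rho = 1.0 / (b - a)
midpoint = (a + b) / 2
prob = rho * (midpoint - a)  # constant integrand: height times width

print(prob)  # 0.5
```

Whatever the interval, the answer is one half, as the symmetry of the uniform density suggests.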
For concreteness, consider the example of a company that advertises half of its
customers are served within five minutes. What are your odds of having to wait for
more than ten? Let's assume an exponential PDF, rho of t
equals alpha e to the minus alpha t, over the
domain from zero to infinity.
Our first problem is, we don't know alpha, but we do know the probability of your
serving time being in the interval from zero to five.
That is, by definition, the integral of alpha e to the minus alpha t d t, as t
goes from 0 to 5. And we're told that that probability is
one half. Now, we can do that integral easily
enough, evaluating at the limits, and then doing a little bit of algebra to
solve for alpha. I'm going to leave it to you to follow
the computations, and see that alpha is 1 5th times log of 2.
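If you want to check that computation, here is a sketch: the condition P(0 ≤ t ≤ 5) = 1/2 reads 1 minus e to the minus 5 alpha equals one half, so e to the minus 5 alpha equals one half, giving alpha = (1/5) log 2.

```python
import math

# Verifying the claimed alpha: with rho(t) = alpha * e^(-alpha * t),
# the condition P(0 <= t <= 5) = 1/2 forces alpha = log(2) / 5.
alpha = math.log(2) / 5
p_within_five = 1 - math.exp(-5 * alpha)

print(abs(p_within_five - 0.5) < 1e-12)  # True
```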
With that in hand, we can now address the question of the probability of having to
wait for more than ten minutes. Now, we would compute the probability of
being in the interval from 10 to infinity.
Thus, we would perform the same integral as before,
but evaluated at limits t goes from 10 to infinity; this yields e to the
minus 10 alpha.
If alpha is 1 5th log 2, what is negative 10 alpha?
That's negative 2 times log of 2, that is log of 2 to the negative 2 power.
When we exponentiate that, we get 1 over 2 squared or 1 4th.
That means that you have a 25% chance of having to wait for more than 10
minutes. That doesn't sound so good.
But what are the odds of having to wait for more than 30 minutes?
Well, we would follow the same computation, and need to compute negative
30 alpha. That is, log of 2 to the negative 6th power.
Exponentiating gives 1 over 2 to the 6th, or 1 over 64: odds of about 1.5%.
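The tail probabilities in this example reduce to powers of one half; a short sketch reproducing both numbers:

```python
import math

# Wait-time example: rho(t) = alpha * e^(-alpha * t) with
# alpha = log(2) / 5, so the tail probability is P(t > T) = e^(-alpha * T).
alpha = math.log(2) / 5

p_more_than_10 = math.exp(-alpha * 10)  # 2^(-2) = 1/4
p_more_than_30 = math.exp(-alpha * 30)  # 2^(-6) = 1/64, about 1.5%

print(abs(p_more_than_10 - 0.25) < 1e-12)    # True
print(abs(p_more_than_30 - 1 / 64) < 1e-12)  # True
```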
There's one type of PDF that is of crucial importance that you're going to
see again and again. This is called a Gaussian or sometimes a
normal PDF. This is the function rho of x equals 1
over the square root of 2 pi, times e to the minus x squared over 2.
You've probably seen this before; it is sometimes called a bell curve.
It has a peak at x equals 0 and then drops off.
Now, there are a few things to observe. First of all, in this case your domain is
the entire real line. That is, this is a setting of infinite
extent; anything could happen. Your PDF is certainly positive; in fact,
it's strictly positive. But the tricky thing is verifying
that it's a PDF. That is, verifying that the integral over
the entire real line is equal to 1. You're going to have to trust me on that
for now. You don't quite have enough at your
disposal to prove this. Now, you will often see Gaussians that
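While a proof needs tools we don't yet have, a numerical check is easy; here is a sketch that integrates the Gaussian density over a large truncated domain. Note the exponent minus x squared over 2, which is what makes the normalizing constant 1 over root 2 pi come out right.

```python
import math

# Numerical sketch: the Gaussian density e^(-x^2 / 2) / sqrt(2 * pi)
# should integrate to 1 over the whole real line. We truncate at
# +/- 10, where the tails are negligible.
def rho(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

L, n = 10.0, 100_000
dx = 2 * L / n
total = sum(rho(-L + (i + 0.5) * dx) * dx for i in range(n))  # midpoint rule

print(abs(total - 1.0) < 1e-6)  # True
```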
are translated about some middle point or mean.
You'll often see them stretched out or rescaled somehow.
What I want you to know about Gaussians for the moment is that they are
everywhere and all about. Gaussians come up in somewhat surprising
places. If you look at the binomial coefficients
that you obtain from Pascal's triangle and consider what the rows look like, you
notice that the rows tend to go up in the middle and then down at the sides, in a
manner reminiscent of a shifted Gaussian. In fact, if you were to divide these
binomial coefficients by 2 to the n, where n is the row number, then you'd
obtain something that, in the limit as you go down, converges to something very
much like a Gaussian. This is a hint at one of the deeper
truths of mathematics, that Gaussians are limits of individual decisions.
Left or right. Heads or tails.
That compound upon one another to converge to such distributions.
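You can glimpse this limit numerically. In the following sketch, the row entries C(n, k) divided by 2 to the n are recentered and rescaled (the choice of scaling factor root n over 2 is an assumption motivated by the coin-flip picture), and compared against the Gaussian density at the center of the row.

```python
import math

# Sketch of the binomial-to-Gaussian limit: the rescaled values
# (sqrt(n)/2) * C(n, k) / 2^n approach the Gaussian density
# e^(-x^2/2) / sqrt(2*pi) at x = (k - n/2) / (sqrt(n)/2).
n = 1000
scale = math.sqrt(n) / 2

def binom_density(k):
    return scale * math.comb(n, k) / 2**n

def gauss(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Compare at the center of the row (k = n/2, i.e. x = 0).
print(abs(binom_density(n // 2) - gauss(0.0)) < 1e-3)  # True
```

Taking n larger makes the agreement tighter, which is the limit the lecture alludes to.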
Gaussians are indeed everywhere. So, now we see, not only what a
probability density is but also how to compute probability by means of
integration. In our next lesson, we'll introduce a few
of the main characters of probability theory and see what roll they have to
play in our story of calculus.