Welcome to calculus. I'm Professor Ghrist. We're about to begin Lecture 49, numerical integration. In Chapter 3 of this course, you had a lot of practice computing definite and indefinite integrals. In this lesson, we're going to see how discrete, or digital, calculus helps us to evaluate definite integrals. This is the beginning of the subject known as numerical analysis.

In this lesson, we'll continue with our applications of discrete calculus, this time in the context of the definite integral. Instead of using the fundamental theorem of integral calculus, we sometimes need to take a different approach since, as you know, some integrals are hard to compute. However, even the very definition of the definite integral as a limit of Riemann sums invokes the language of discrete calculus, using what looks like a forward difference. Let's explore how sequences enter the picture.

Assume that one is given a sequence of x values, x_n, going from, say, x_0 to x_N; this bounds our domain of integration. Then one is given a sequence f of function values, f_n, thought of as samples of some smooth or continuous function f. Now, the question is: given only these values, how can we approximate the integral, even though we do not know the actual function f, but only the sample values? Well, of course we can't get the exact answer, but we can approximate it.

The simplest approximation is the Riemann sum itself. If we consider our sequence of x values and our sequence of f values as determining the partition and the function values, then we can write out a Riemann sum of the form: the sum, as n goes from 0 to N - 1, of f_n times Δx_n, where Δx_n is the forward difference of x. We might call this the left Riemann sum, because if we take a partition starting with x_0 and using the forward difference of our sequence x, then the sample point chosen in each subinterval is the leftmost point. Notice that the sum only goes up to N - 1; we do not include the right endpoint. Now, we can reverse things and use what we might call a right Riemann sum, ignoring the first sample point at x_0 and summing up all the rest using the backward difference of x.

As approximations, these Riemann sums are not terribly good. On any given subinterval, you're probably going to estimate too low or too high. There's a better way to do things. Instead of using rectangles, one could use trapezoids. This is the basis for what is called the trapezoid rule of numerical integration. What's the area of one of these trapezoidal elements? Well, of course, it's the same as taking the midpoint and evaluating the area of the rectangle defined thereby. When one works out what that means in terms of the sequences, it is the midpoint height, f_n + ½ Δf_n, times the width Δx_n, where we're using the forward difference. If we sum this up over all of the subintervals, as n goes from 0 to N - 1, then one obtains a much better estimate for the definite integral of f.

A few remarks are in order. First, the trapezoid rule is really just the average of the left and the right Riemann sums.
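Here is a minimal sketch of these three estimates in Python, assuming the samples are given as plain lists; the function names riemann_left, riemann_right, and trapezoid are my own, not notation from the lecture.

```python
def riemann_left(x, f):
    # Sum of f_n * (x_{n+1} - x_n) for n = 0 .. N-1: forward differences,
    # sampling each subinterval at its left endpoint.
    return sum(f[n] * (x[n + 1] - x[n]) for n in range(len(x) - 1))

def riemann_right(x, f):
    # Sum of f_n * (x_n - x_{n-1}) for n = 1 .. N: backward differences,
    # sampling each subinterval at its right endpoint and skipping x_0.
    return sum(f[n] * (x[n] - x[n - 1]) for n in range(1, len(x)))

def trapezoid(x, f):
    # Midpoint height (f_n + (1/2) * delta f_n) times width delta x_n:
    # equivalently, the average of the left and right Riemann sums.
    return sum((f[n] + 0.5 * (f[n + 1] - f[n])) * (x[n + 1] - x[n])
               for n in range(len(x) - 1))
```

Note that nothing in this sketch requires the grid to be uniform; the uniform-grid simplification comes next.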
Second, if you're in the context of a uniform grid, that is, if the spacing between the x values is a constant h, then the trapezoid rule takes on a very nice form: it is this width, or step size, h, times a weighted sum of the function values f. The first point, f_0, is weighted with coefficient one-half; the last point, f_N, likewise. All of the other interior sample points have weight one, and so one simply takes the sum. This is a very simple formula, both to remember and to use.

There's an even better approximation called Simpson's rule, and its derivation tends to go unexplained in most calculus courses, but we're going to take the time and do it. Simpson's rule is a third-order method. That means we approximate the integrand f by a polynomial of degree three. How do you approximate a function with a polynomial? Well, you already know the answer to that: we do a Taylor expansion. Let's set things up so that we're expanding about x = 0 for convenience. We keep all terms of order three and below and pack everything else into O(x⁴). We're going to focus on the range from x = -h to x = +h, where h is our step size. Therefore, the higher-order terms can be written as O(h⁴). That's going to be important later.

Now what do we want to do? We want to integrate from -h to h, and so we can integrate each term in this Taylor expansion. Keeping in mind that these derivatives are all evaluated at zero and hence constant, we see that, because we're integrating over a symmetric domain, all of the odd-degree terms in x integrate out to give us zero. We're left with a much simpler integral. How do we do that? Well, the first term is a constant, f(0); that integrates to that constant times x. The second term has an x² in it; that integrates to x³/3, with the constants out in front. The error term, O(h⁴), when integrated from -h to h, still gives us something that is O(h⁴). Now, when we evaluate from -h to h, we can simply evaluate from 0 to h and pull out a factor of 2 if we like. This gives us an expression for the integral that requires knowing the function value at 0 and the second derivative of the function at 0:

∫ from -h to h of f(x) dx = 2h·f(0) + (h³/3)·f″(0) + O(h⁴).

Now, the first term is going to be easy. If I'm given sample points, let's say at x_{n-1}, x_n, and x_{n+1}, then evaluating at the middle point, x_n, is simple: the function value there is simply f_n. But how do we estimate the second derivative of our integrand, when all we know is the function value where we are and to the left and the right? Well, this requires remembering a few things from discrete calculus. First of all, we write the forward and the backward differences in terms of operators: the shift operator E and the identity operator I. When you don't have the ability to move two steps ahead or behind, the proper way to estimate a second derivative is to use the second central difference. This is the forward difference of the backward difference, or the backward difference of the forward difference; it doesn't matter. In either case, what is this going to give you? Well, in the language of operators, we can simply multiply things out, and we see that the second central difference of f at n is f_{n-1} - 2f_n + f_{n+1}. This is a good estimate for the second derivative. Well, not exactly; we have to be a little bit careful here. Remember, the second derivative is d²f/dx². So to estimate that, we'll use the second central difference of f in the numerator, and in the denominator, we'll use (Δx_n)². Of course, that Δx is our uniform step size h, and so we get an h² in the denominator.
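As a quick sanity check on this estimate, here is a short sketch using a function with a known second derivative; the name second_central_difference is my own.

```python
import math

def second_central_difference(f, x, h):
    # (f(x-h) - 2 f(x) + f(x+h)) / h^2: the second central difference
    # divided by the squared step size, an estimate of f''(x).
    return (f(x - h) - 2 * f(x) + f(x + h)) / h**2

# Check against f(x) = sin(x), whose second derivative is -sin(x).
print(second_central_difference(math.sin, 1.0, 0.01))  # ~ -0.8415 = -sin(1)
```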
Applying the second central difference, we obtain, after a little bit of algebraic simplification, an estimate for the integral from -h to h as h times the quantity one-third f_{n-1} + four-thirds f_n + one-third f_{n+1}. That is the basic element of Simpson's rule.

That's a lot of work, but let's see how it all comes together. We're going to assume that our integration domain is divided into equally spaced subintervals of width h, and we're going to set it up so that there is an even number of these subintervals, so that we can apply Simpson's rule on adjacent pairs. On each such pair of subintervals, we weight the function values by the coefficients one-third, four-thirds, and one-third that we derived in the previous few slides. Now, we do this on all adjacent pairs of subintervals and then add the results of these integrals together, being careful to account for what happens on the overlaps. When we do so, we obtain a system of weights for the function values, and this comprises Simpson's rule. We can approximate the integral of f as h, the width, times one-third f_0, plus four-thirds f_1, plus two-thirds f_2, and so on, with four-thirds and two-thirds alternating, until the very end, where the last point f_N has weight one-third.

Let's summarize our integration methods, under the assumption of a uniform grid of step size h. In this case, all of the methods can be described in terms of the weights placed on the sampled function values f_n. In the left and right Riemann sums, we simply ignore one of the endpoints, whereas in the trapezoid rule, we average things together. Simpson's rule, as we saw, has a more complicated system of weights. Now, all of these methods have an order associated to them. We used a third-order curve to approximate the integrand in Simpson's rule. The trapezoid rule is a first-order method: we used line segments. The left and right Riemann sums are zeroth-order methods: we just took constants and built rectangles from them. Because all of these methods really use a Taylor series to approximate the integral, they all have an error term that can be expressed in big O of the step size h, and one can describe the accuracy of the method in those terms. Here, Simpson's rule is best, with error in O(h⁴); the trapezoid rule has error in O(h²); and the left and right Riemann sums have error in O(h).

Let's put these methods to the test in the context of an example, the answer to which we already know. Consider the integral of 1/x as x goes from 1 to e. This is, of course, log(e) minus log(1), that is, 1. If we take our domain and divide it up into six equally spaced subintervals, computing the sample points at each, then how do we proceed? Well, in this case, h is (e - 1)/6. For the left Riemann sum, we multiply the sample values f_n by this h and then add them all up, not taking that last endpoint. When we do so, we get an answer that is not too bad; it's within 10% of the correct answer. If we do a right Riemann sum, then, because this curve, 1/x, is decreasing, we get a very different value: it is under the true value instead of over, again by about 10%. The trapezoid rule is a great improvement over these methods, returning an answer that is within 1% of the correct answer. And notice that we're using almost exactly the same values, just balancing out what happens at the left and right endpoints. But Simpson's rule beats them all: when we use that system of weights, those third-order approximations return a value for the integral that is much better than any of the other methods we have studied.
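Here is a sketch that carries out this exact experiment, computing all four estimates on six equal subintervals; the helper name simpson and the list-based setup are my own.

```python
import math

def simpson(f_vals, h):
    # Composite Simpson's rule on a uniform grid with an even number N of
    # subintervals: h * (1/3 f_0 + 4/3 f_1 + 2/3 f_2 + ... + 1/3 f_N).
    N = len(f_vals) - 1
    assert N % 2 == 0, "Simpson's rule needs an even number of subintervals"
    total = f_vals[0] + f_vals[N]
    total += 4 * sum(f_vals[1:N:2])  # odd-indexed points: weight 4/3
    total += 2 * sum(f_vals[2:N:2])  # even interior points: weight 2/3
    return h * total / 3

# The integral of 1/x from 1 to e (true value 1), with six subintervals.
N = 6
h = (math.e - 1) / N
x = [1 + n * h for n in range(N + 1)]
f = [1 / xn for xn in x]

left = h * sum(f[:-1])                    # left Riemann sum
right = h * sum(f[1:])                    # right Riemann sum
trap = h * (sum(f) - (f[0] + f[-1]) / 2)  # trapezoid rule
print(left, right, trap, simpson(f, h))
```

This should print values of roughly 1.096, 0.915, 1.006, and 1.0002: over by about 10%, under by about 9%, within 1%, and within a few hundredths of a percent, respectively, matching the errors described above.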
This is really just the very beginning of the wonderful subject of numerical analysis, which comprises numerical ODEs, numerical integration, and more. If you have enjoyed these introductory lectures, you'll want to take some dedicated courses in those areas. This concludes our brief introduction to numerical analysis. In our next lesson, we'll continue the theme of building bridges between smooth and discrete calculus by considering the digital version of an improper integral. These are familiar objects, infinite series, but you'll see them in a new light.