Welcome to calculus. I'm Professor Ghrist. We're about to begin Lecture 49, numerical integration. In Chapter 3 of this course, you had a lot of practice computing definite and indefinite integrals. In this lesson, we're going to see how discrete, or digital, calculus helps us to evaluate definite integrals. This is the beginning of the subject known as numerical analysis.

In this lesson, we'll continue with our applications of discrete calculus, this time in the context of the definite integral. Instead of using the fundamental theorem of integral calculus, we sometimes need to take a different approach since, as you know, some integrals are hard to compute. However, even the very definition of the definite integral as a limit of Riemann sums invokes the language of discrete calculus, using what looks like a forward difference. Let's explore how sequences enter the picture.

Assume that one is given a sequence of x values, x_n, going from, say, x_0 to x_N; this bounds our domain of integration. Then one is given a sequence f of function values, f_n, thought of as samples of some smooth or continuous function f. Now, the question is: given only these values, how can we approximate the integral, even though we do not know the actual function f, but only the sample values? Well, of course we can't get the exact answer, but we can approximate it.

The simplest approximation is the Riemann sum itself. If we consider our sequence of x values and our sequence of f values as determining the partition and the function values, then we can write out a Riemann sum of the form: the sum, as n goes from 0 to N - 1, of f_n times Δx_n, where Δx_n is the forward difference of x. We might call this the left Riemann sum, because if we take a partition starting with x_0 and using the forward difference of our sequence x, then the sample point chosen in each subinterval is the leftmost point. Notice that the sum only goes up to N - 1; we do not include the right endpoint. Now, we can reverse things and use what we might call a right Riemann sum, ignoring the first sample point at x_0 and summing up all the rest using the backward difference of x.

As approximations, these Riemann sums are not terribly good. On any given subinterval, you're probably going to estimate too low or too high. There's a better way to do things. Instead of using rectangles, one could use trapezoids. This is the basis for what is called the trapezoid rule of numerical integration. What's the area of one of these trapezoidal elements? Well, of course, it's the same as taking the midpoint and evaluating the area of the rectangle defined thereby. When one works out what that means in terms of the sequences, it is the midpoint height, f_n + ½ Δf_n, times the width Δx_n, where we're using the forward difference. If we sum this up over all of the subintervals, as n goes from 0 to N - 1, then one obtains a much better estimate for the definite integral of f.

A few remarks are in order. First, the trapezoid rule is really just the average of the left and the right Riemann sums.
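Here is a minimal sketch of these three estimates in Python, assuming the samples are given as plain lists; the function names riemann_left, riemann_right, and trapezoid are my own, not notation from the lecture.

```python
def riemann_left(x, f):
    # Sum of f_n * (x_{n+1} - x_n) for n = 0 .. N-1: forward differences,
    # sampling each subinterval at its left endpoint.
    return sum(f[n] * (x[n + 1] - x[n]) for n in range(len(x) - 1))

def riemann_right(x, f):
    # Sum of f_n * (x_n - x_{n-1}) for n = 1 .. N: backward differences,
    # sampling each subinterval at its right endpoint and skipping x_0.
    return sum(f[n] * (x[n] - x[n - 1]) for n in range(1, len(x)))

def trapezoid(x, f):
    # Midpoint height (f_n + (1/2) * delta f_n) times width delta x_n:
    # equivalently, the average of the left and right Riemann sums.
    return sum((f[n] + 0.5 * (f[n + 1] - f[n])) * (x[n + 1] - x[n])
               for n in range(len(x) - 1))
```

Note that nothing in this sketch requires the grid to be uniform; the uniform-grid simplification comes next.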
Second, if you're in the context of a uniform grid, that is, if the spacing between the x values is a constant h, then the trapezoid rule takes on a very nice form: it is this width, or step size, h, times a weighted sum of the function values f. The first point, f_0, is weighted with coefficient one-half; the last point, f_N, likewise. All of the other interior sample points have weight one, and so one simply takes the sum. This is a very simple formula, both to remember and to use.

There's an even better approximation called Simpson's rule, and its derivation tends to go unexplained in most calculus courses, but we're going to take the time and do it. Simpson's rule is a third-order method. That means we approximate the integrand f by a polynomial of degree three. How do you approximate a function with a polynomial? Well, you already know the answer to that: we do a Taylor expansion. Let's set things up so that we're expanding about x = 0 for convenience. We keep all terms of order three and below and pack everything else into O(x⁴). We're going to focus on the range from x = -h to x = +h, where h is our step size. Therefore, the higher-order terms can be written as O(h⁴). That's going to be important later.

Now what do we want to do? We want to integrate from -h to h, and so we can integrate each term in this Taylor expansion. Keeping in mind that these derivatives are all evaluated at zero and hence constant, we see that, because we're integrating over a symmetric domain, all of the odd-degree terms in x integrate out to give us zero. We're left with a much simpler integral. How do we do that? Well, the first term is a constant, f(0); that integrates to that constant times x. The second term has an x² in it; that integrates to x³/3, with the constants out in front. The error term, O(h⁴), when integrated from -h to h, still gives us something that is O(h⁴). Now, when we evaluate from -h to h, we can simply evaluate from 0 to h and pull out a factor of 2 if we like. This gives us an expression for the integral that requires knowing the function value at 0 and the second derivative of the function at 0:

∫ from -h to h of f(x) dx = 2h·f(0) + (h³/3)·f″(0) + O(h⁴).

Now, the first term is going to be easy. If I'm given sample points, let's say at x_{n-1}, x_n, and x_{n+1}, then evaluating at the middle point, x_n, is simple: the function value there is simply f_n. But how do we estimate the second derivative of our integrand, when all we know is the function value where we are and to the left and the right? Well, this requires remembering a few things from discrete calculus. First of all, we write the forward and the backward differences in terms of operators: the shift operator E and the identity operator I. When you don't have the ability to move two steps ahead or behind, the proper way to estimate a second derivative is to use the second central difference. This is the forward difference of the backward difference, or the backward difference of the forward difference; it doesn't matter. In either case, what is this going to give you? Well, in the language of operators, we can simply multiply things out, and we see that the second central difference of f at n is f_{n-1} - 2f_n + f_{n+1}. This is a good estimate for the second derivative. Well, not exactly; we have to be a little bit careful here. Remember, the second derivative is d²f/dx². So to estimate that, we'll use the second central difference of f in the numerator, and in the denominator, we'll use (Δx_n)². Of course, that Δx is our uniform step size h, and so we get an h² in the denominator.
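As a quick sanity check on this estimate, here is a short sketch using a function with a known second derivative; the name second_central_difference is my own.

```python
import math

def second_central_difference(f, x, h):
    # (f(x-h) - 2 f(x) + f(x+h)) / h^2: the second central difference
    # divided by the squared step size, an estimate of f''(x).
    return (f(x - h) - 2 * f(x) + f(x + h)) / h**2

# Check against f(x) = sin(x), whose second derivative is -sin(x).
print(second_central_difference(math.sin, 1.0, 0.01))  # ~ -0.8415 = -sin(1)
```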
Applying the second central difference, we obtain, after a little bit of algebraic simplification, an estimate for the integral from -h to h as h times the quantity one-third f_{n-1} + four-thirds f_n + one-third f_{n+1}. That is the basic element of Simpson's rule.

That's a lot of work, but let's see how it all comes together. We're going to assume that our integration domain is divided into equally spaced subintervals of width h, and we're going to set it up so that there is an even number of these subintervals, so that we can apply Simpson's rule on adjacent pairs. On each such pair of subintervals, we weight the function values by the coefficients one-third, four-thirds, and one-third that we derived in the previous few slides. Now, we do this on all adjacent pairs of subintervals and then add the results of these integrals together, being careful to account for what happens on the overlaps. When we do so, we obtain a system of weights for the function values, and this comprises Simpson's rule. We can approximate the integral of f as h, the width, times one-third f_0, plus four-thirds f_1, plus two-thirds f_2, and so on, with four-thirds and two-thirds alternating, until the very end, where the last point f_N has weight one-third.

Let's summarize our integration methods, under the assumption of a uniform grid of step size h. In this case, all of the methods can be described in terms of the weights placed on the sampled function values f_n. In the left and right Riemann sums, we simply ignore one of the endpoints, whereas in the trapezoid rule, we average things together. Simpson's rule, as we saw, has a more complicated system of weights. Now, all of these methods have an order associated to them. We used a third-order curve to approximate the integrand in Simpson's rule. The trapezoid rule is a first-order method: we used line segments. The left and right Riemann sums are zeroth-order methods: we just took constants and built rectangles from them. Because all of these methods really use a Taylor series to approximate the integral, they all have an error term that can be expressed in big O of the step size h, and one can describe the accuracy of the method in those terms. Here, Simpson's rule is best, with error in O(h⁴); the trapezoid rule has error in O(h²); and the left and right Riemann sums have error in O(h).

Let's put these methods to the test in the context of an example, the answer to which we already know. Consider the integral of 1/x as x goes from 1 to e. This is, of course, log(e) minus log(1), that is, 1. If we take our domain and divide it up into six equally spaced subintervals, computing the sample points at each, then how do we proceed? Well, in this case, h is (e - 1)/6. For the left Riemann sum, we multiply the sample values f_n by this h and then add them all up, not taking that last endpoint. When we do so, we get an answer that is not too bad; it's within 10% of the correct answer. If we do a right Riemann sum, then, because this curve, 1/x, is decreasing, we get a very different value: it is under the true value instead of over, again by about 10%. The trapezoid rule is a great improvement over these methods, returning an answer that is within 1% of the correct answer. And notice that we're using almost exactly the same values, just balancing out what happens at the left and right endpoints. But Simpson's rule beats them all: when we use that system of weights, those third-order approximations return a value for the integral that is much better than any of the other methods we have studied.
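Here is a sketch that carries out this exact experiment, computing all four estimates on six equal subintervals; the helper name simpson and the list-based setup are my own.

```python
import math

def simpson(f_vals, h):
    # Composite Simpson's rule on a uniform grid with an even number N of
    # subintervals: h * (1/3 f_0 + 4/3 f_1 + 2/3 f_2 + ... + 1/3 f_N).
    N = len(f_vals) - 1
    assert N % 2 == 0, "Simpson's rule needs an even number of subintervals"
    total = f_vals[0] + f_vals[N]
    total += 4 * sum(f_vals[1:N:2])  # odd-indexed points: weight 4/3
    total += 2 * sum(f_vals[2:N:2])  # even interior points: weight 2/3
    return h * total / 3

# The integral of 1/x from 1 to e (true value 1), with six subintervals.
N = 6
h = (math.e - 1) / N
x = [1 + n * h for n in range(N + 1)]
f = [1 / xn for xn in x]

left = h * sum(f[:-1])                    # left Riemann sum
right = h * sum(f[1:])                    # right Riemann sum
trap = h * (sum(f) - (f[0] + f[-1]) / 2)  # trapezoid rule
print(left, right, trap, simpson(f, h))
```

This should print values of roughly 1.096, 0.915, 1.006, and 1.0002: over by about 10%, under by about 9%, within 1%, and within a few hundredths of a percent, respectively, matching the errors described above.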
This is really just the very beginning of the wonderful subject of numerical analysis, which comprises numerical ODEs, numerical integration, and more. If you have enjoyed these introductory lectures, you'll want to take some dedicated courses in those areas. This concludes our brief introduction to numerical analysis. In our next lesson, we'll continue the theme of building bridges between smooth and discrete calculus by considering the digital version of an improper integral. These are familiar objects, infinite series, but you'll see them in a new light.