Section 4: Derivatives

4.0 The Trials of the Animals

The lion, as we all know, is king of the beasts. The other animals live in fear of him and his appetite for their flesh. And so it came to pass that Giraffe, Zebra, Gazelle, and Warthog decided to make a public works project of protecting themselves from the lion.

"We all know that the lion is lazy," said Zebra. "Perhaps while he naps the afternoon away, we might trap him inside his lair."

"By what means?" asked Gazelle.

"We shall pile up dirt around his lair to make a wall," suggested the warthog.

Giraffe was silent, because, as everybody knows, Giraffe has no voice.

And so the project was agreed to. On the designated afternoon, hundreds of animals came to help move earth to erect the wall. As the sun sank low, the wall was complete.

"But how shall we know if the sides of the wall are steep enough?" asked the zebra.

"One of us will have to test them," said Warthog.

And so they sent Giraffe. Giraffe ran toward the wall. When he reached its base, he bounded to the top in just one stride. There he observed that his height was exactly half the horizontal distance he had bounded. He returned to the others and wrote, "50% grade," in the sand.

This upset Zebra. "Does anybody doubt that the lion is sure-footed enough to negotiate a 50% grade?" he asked. "I was certain we had built the wall steeper than that. I myself shall go and inspect it."

And off he ran. Like Giraffe, when Zebra reached the base of the wall, he bounded up. But Zebra, whose legs are shorter than Giraffe's, needed two strides to reach the top. Once there, he observed how high he was and where his footprint was in the middle of the wall. He hurried back to the others and reported, "My observation is that the wall begins with a 75% grade, then shallows out to a 25% grade halfway up."

This time it was Gazelle who was miffed. "We worked so hard at that wall," he complained, "yet I would have no trouble scrambling up a 75% grade. The lion wouldn't either. I too shall inspect the wall."

And off he ran. He too reached the base of the wall and bounded up from there. But Gazelle's legs are even shorter than Zebra's, and he needed four strides to reach the top. He too observed the height of his footprints, then hurried back to give his report. "It's not as bad as Zebra makes it out to be," he said. "It seems that the wall begins with an 87.5% grade, then a quarter of the way up goes to a 62.5% grade, then halfway up to a 37.5% grade, and from three quarters up to the top, it's a 12.5% grade."

Warthog wasn't sure whether the lion could climb such a wall or not. But he was sure he was the only one of the four who had not yet spoken on the subject. And being one who liked the sound of his own voice, "I'll check it too," he said, and off he went.

It took the short-legged Warthog eight strides to reach the top. And he took the time to figure the grade from each of his footprints. The first eighth of the way was a 93.75% grade, the second eighth an 81.25% grade, the third eighth a 68.75% grade, the fourth eighth a 56.25% grade, the fifth eighth a 43.75% grade, the sixth eighth a 31.25% grade, the seventh eighth an 18.75% grade, and the last eighth a 6.25% grade.

And while he pondered all this, he heard a yawn from inside the wall. "You foolish creatures," came a sleepy voice. It was, of course, the lion. "You think you have penned me in, yet each of you is able to climb the wall to measure it. And do you think I would be unable to climb it just to fetch something as mundane as my dinner? Tonight I'll be having ham and bacon, and I'm certain you'll be joining me."

Figure 4-1 shows a graph of the profiles that each of the animals perceived climbing the wall. Because Giraffe could bound 8 meters horizontally in each stride, he saw the entire wall to have the same slope. Zebra saw it differently because he could only go 4 meters per stride. Gazelle could only go 2, and Warthog only 1. And so each saw his own version of the wall.

But look at what they each saw in their first bound. Giraffe: 50% grade, Zebra: 75% grade, Gazelle: 87.5% grade, and Warthog: 93.75% grade. And by grade, I mean how far the wall goes up, expressed as a percentage of how far it goes horizontally. After Warthog's demise, Rabbit, satisfied that the lion was no longer hungry, tried the first step of the wall. Rabbit went 0.5 meters horizontally and found the initial grade to be 96.875%. Mouse went 5 cm horizontally up the wall and found the initial grade to be 99.6875%. Ant tried it, going 1 mm horizontally in his first step, and found the grade to be 99.99375%.

With a little experimentation, you will see that all the numbers given above are consistent with the following function:

            -x²
   f(x)  =  ---  +  4                                            eq. 4.0-1
             16

f(x) represents the height of the wall as a function of how far you are from its center, which is given at x = 0. At 8 meters either side of the center of the wall, you find yourself at ground level, that is, f(8) = f(-8) = 0. One of those two spots (and I'll say that it is at x = -8) is what we have been calling the base of the wall, or where the animals take their first step in climbing it.
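
If you would rather let a computer do the "little experimentation," here is a minimal Python sketch that recomputes each animal's report from eq. 4.0-1. The names wall_height and perceived_grades are my own, and the sketch assumes each animal starts at x = -8 and takes equal horizontal strides that divide the 8 meters evenly.

```python
def wall_height(x):
    """Height of the wall in meters at horizontal position x (eq. 4.0-1)."""
    return -x**2 / 16 + 4


def perceived_grades(stride):
    """Grades, in percent, seen on each stride from the base (x = -8) to the top (x = 0).

    Assumes the stride divides 8 meters evenly.
    """
    steps = round(8 / stride)
    grades = []
    for i in range(steps):
        start = -8 + i * stride
        rise = wall_height(start + stride) - wall_height(start)
        grades.append(100 * rise / stride)   # vertical over horizontal, as a percent
    return grades


for animal, stride in [("Giraffe", 8), ("Zebra", 4), ("Gazelle", 2), ("Warthog", 1)]:
    print(animal, perceived_grades(stride))
```

Each printed list should match the grades that animal reported in the story.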

And how did each animal figure out the percent grade that he perceived? He simply took the vertical progress of one step and divided by the horizontal progress of that same step. For example, Giraffe took a huge bound 4 meters vertically and 8 meters horizontally. So he divided it out and saw a 50% grade on his first (and only) step. Ant, on the other hand, took a tiny step of 0.9999375 mm vertically and 1.000000 mm horizontally. When he divided, he got a 99.99375% grade. We see that at the base of the wall, the size of your stride makes a big difference as to the grade you will experience. But as the strides get tinier and tinier, the grade that the tiny animals experience seems to approach a limit of 100%. So, is it true for this wall that at x = -8:

                   vertical progress
           lim    -------------------  =  100%                   eq. 4.0-2
 step size --> 0  horizontal progress

What each animal perceived was an average grade over his step size. At the base of the wall, that average started at the base (i.e., x = -8) and continued some way past the base. But if we want to know the exact grade at exactly the base of the wall, and not averaged with any other part of the wall, we had better hope that the limit shown in 4.0-2 exists.
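
To watch that average close in on 100%, here is a small Python sketch. It uses exact fractions so that rounding does not blur the tiny steps. The function names and the particular list of step sizes are my own choices, not part of the story.

```python
from fractions import Fraction


def wall_height(x):
    """Height of the wall at horizontal position x (eq. 4.0-1), in exact arithmetic."""
    return -Fraction(x) ** 2 / 16 + 4


def first_step_grade(h):
    """Average grade over one horizontal step of size h starting at the base, x = -8."""
    h = Fraction(h)
    return (wall_height(-8 + h) - wall_height(-8)) / h


# Step sizes in meters, written as strings so Fraction reads them exactly.
for h in ["8", "4", "2", "1", "0.5", "0.05", "0.001", "0.000001"]:
    print(h, float(100 * first_step_grade(h)))
```

The first seven lines reproduce the grades seen by Giraffe, Zebra, Gazelle, Warthog, Rabbit, Mouse, and Ant. The last step size is an extra one I added to show the trend continuing.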

So, what is horizontal progress? Well, that's simply step size. For reasons that are tied up in tradition, I will call the step size h. What about the vertical progress? Well, each animal starts at the base of the wall, takes a step, and goes up a little. If the animal starts out at -8 and has a horizontal step size of h = 0.1, then his vertical progress will be the height he ends up at minus the height he started at. So:

   vertical progress  =  f(-8 + 0.1) - f(-8)  =                  eq. 4.0-3

              / -(-7.9)²     \     / -(-8)²     \
             (  -------- + 4  ) - (  ------ + 4  )  =   0.099375
              \    16        /     \   16       /

Or, if we simply wanted to stick in the step size symbol, h, we would have:

   vertical progress  =  f(-8 + h) - f(-8)  =                    eq. 4.0-4

              / -(-8 + h)²    \      / -(-8)²    \
             (  --------- + 4  ) -  (  ----- + 4  )
              \    16         /      \   16      /

If you multiply out the square, then you have:

   vertical progress  =  f(-8 + h) - f(-8)  =                    eq. 4.0-5

              / -64 + 16h - h²     \     / -64     \
             (  -------------- + 4  ) - (  --- + 4  )
              \       16           /     \  16     /

A lot of stuff cancels in 4.0-5. The -64's go away and the 4's go away as well. That leaves you with:

                                                16h - h²
   vertical progress  =  f(-8 + h) - f(-8)  =   --------        eq. 4.0-6
                                                   16

And now we're supposed to divide that by the horizontal progress to get the grade. But the horizontal progress is simply the step size, h. So now we have:

    vertical progress      f(-8 + h) - f(-8)     16h - h²
   -------------------  =  ----------------  =   --------       eq. 4.0-7
   horizontal progress            h                 16h

Finally, we wanted to know if there was a limit to the grade as the step size, h, goes to zero. Look at the right-hand part of 4.0-7. Isn't that exactly a quotient of polynomials of the kind we handled in section 2.5, except with h taking the place of x? (Note also that the polynomial in the numerator is written backwards from the traditional way; that is, it has its highest power last, and the traditional way to write it is with the highest power first. But it makes no difference what order we sum a thing in, right?) Recall that the rule from section 2.5 was simply to find the coefficients of the lowest power in the numerator and denominator that are not both zero. We proved that rule, so here we can just apply it and not bother with all that delta-epsilon stuff. The coefficients that the rule specifies in this case are 16 for the numerator and 16 for the denominator (that is, the 1st power coefficients). The rule says to simply take their ratio, which is 1, or in percent, that's 100%. And that is what the exact grade at exactly the base of the wall must be.
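
The rule as it is restated above can be sketched in a few lines of Python. This is only my reading of the rule, not code from section 2.5. The function name limit_at_zero is hypothetical, and returning None when the deciding denominator coefficient is zero is my own assumption about the case this paragraph does not cover.

```python
from fractions import Fraction


def limit_at_zero(numerator, denominator):
    """Limit of numerator(h) / denominator(h) as h goes to 0.

    Each polynomial is a list of integer coefficients, lowest power first,
    so 16h - h^2 is [0, 16, -1]. Returns None when the denominator
    coefficient at the deciding power is zero (assumed: no finite limit).
    """
    length = max(len(numerator), len(denominator))
    num = list(numerator) + [0] * (length - len(numerator))
    den = list(denominator) + [0] * (length - len(denominator))
    for n, d in zip(num, den):
        if n != 0 or d != 0:      # lowest power where they are not both zero
            return None if d == 0 else Fraction(n, d)
    return None                   # both polynomials are identically zero


# Right-hand side of eq. 4.0-7: (16h - h^2) / (16h)
print(limit_at_zero([0, 16, -1], [0, 16]))   # prints 1
```

Writing the coefficients lowest power first is the same "backwards" order in which the numerator of 4.0-7 is written.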

But what if we have a more general interest in the grade of the wall? What if just knowing its grade at the base is not enough? We would like to know the grade of the wall as a function of how far, horizontally, we are from the wall's center.

Let's do the same thing we just did, but instead of putting in -8 (which corresponds to the base of the wall), let's just put in x, which corresponds to any point on the wall you'd like. If you take a 0.1 meter step horizontally anywhere on the wall, your vertical progress is:

   vertical progress  =  f(x + 0.1) - f(x)  =                   eq. 4.0-8

              / -(x + 0.1)²     \     / -x²     \
             (  ----------- + 4  ) - (  --- + 4  )
              \      16         /     \  16     /

If you multiply out 4.0-8 and do cancellations (which I'm sure you are busy doing in your notebook as you read along), you will get:

                                             -(2x*0.1 + 0.1²)
   vertical progress = f(x + 0.1) - f(x)  =  ----------------   eq. 4.0-9
                                                     16

You can see that the size of the vertical step we take depends not just on the size of the horizontal step, but also on where you are on the wall (that is, at what x) when you take that step. The more negative x is, the greater the vertical progress will be with each 0.1 meter horizontal step you take. If x is positive, which means that you are past the center of the wall, then your vertical progress is definitely negative; that is, you are going downhill, toward the lion's lair.
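
If you want to check your notebook work on 4.0-9, here is a short Python sketch that compares the direct difference f(x + 0.1) - f(x) with the simplified formula at a handful of points. The function names and the sample x values are my own.

```python
from fractions import Fraction


def wall_height(x):
    """Height of the wall at horizontal position x (eq. 4.0-1), in exact arithmetic."""
    return -Fraction(x) ** 2 / 16 + 4


STEP = Fraction(1, 10)   # the 0.1 meter horizontal step


def vertical_progress(x):
    """Direct computation: f(x + 0.1) - f(x), as in eq. 4.0-8."""
    return wall_height(Fraction(x) + STEP) - wall_height(x)


def vertical_progress_formula(x):
    """The simplified right-hand side of eq. 4.0-9."""
    return -(2 * Fraction(x) * STEP + STEP**2) / 16


for x in [-8, -4, -1, 0, 1, 4]:
    # The two computations agree exactly at every sample point.
    assert vertical_progress(x) == vertical_progress_formula(x)
    print(x, float(vertical_progress(x)))
```

The printed values shrink as x moves from -8 toward the center and turn negative for positive x, which is the downhill side.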

But what if we take steps even smaller than 0.1 meters? What if we want to know the limit of the grade as the step size goes toward zero? Let's do 4.0-8 and 4.0-9 again, except this time let's put in the variable step size, h, instead of the fixed size, 0.1.

   vertical progress  =  f(x + h) - f(x)  =                     eq. 4.0-10

              / -(x + h)²     \     / -x²     \
             (  --------- + 4  ) - (  --- + 4  )
              \     16        /     \  16     /

Go ahead and multiply out the square then cancel whatever cancels. You should get:

                                             -(2x*h + h²)
   vertical progress  =  f(x + h) - f(x)  =  ------------       eq. 4.0-11
                                                   16

But vertical progress is not what we are really interested in. What we want to know is the grade. Remember that grade is vertical progress divided by horizontal progress. And remember that the horizontal progress is simply the step size, h. So to get the grade you divide 4.0-11 by h.

    vertical progress      f(x + h) - f(x)      -(2x*h + h²)
   -------------------  =  ----------------   =  ------------    eq. 4.0-12
   horizontal progress             h                 16h

But this still gives us the average grade over a step of horizontal length, h, starting at the point x. We don't want the average. We want the exact grade at exactly x. And to find that, we take the limit as the step size, h, goes toward zero. Once again, 4.0-12 shows that we have a quotient of polynomials. Don't be confused. The variable, h, is playing the role that x played back in section 2.5. So to apply the rule we established back in section 2.5, we have to find the lowest power of h at which the numerator coefficient and the denominator coefficient are not both zero. Again that occurs in the 1st power of h.

The numerator coefficient of the 1st power of h is simply whatever multiplies the term, h, in the numerator. Because of the minus sign in front of the parentheses, that is -2x. The denominator coefficient is whatever multiplies the term, h, in the denominator. In this case, it's 16. The rule tells us to use the quotient of those two for the limit. So the grade is:

                      f(x + h) - f(x)     -2x
   grade(x)  =  lim   ---------------  =  ---                   eq. 4.0-13
              h --> 0        h             16

Now, if I give you any horizontal point on the wall, x, you can give me the grade at that point by multiplying x by -2/16 (if you want the grade in percent, you must, of course, multiply that result by 100%). As a check, at the base of the wall, where x = -8, this gives 16/16 = 1, or 100%, just as we found before. The point is, the grade is a function of x, just as the height of the wall, f(x), is. If you name an x, I can tell you both the height of the wall, f(x), and the grade of the wall, grade(x). Notice that we derived the function grade(x) from f(x), and hence, we call grade(x) the derivative of f(x).
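
As a numerical sanity check, here is a Python sketch that compares the limiting grade with the average grade over a very small step. The numerator of 4.0-12 carries a minus sign, so the sketch uses grade(x) = -2x/16, which gives +100% at the base, x = -8, in agreement with the limit found there. The function names and the step size of one millionth of a meter are my own choices.

```python
def wall_height(x):
    """Height of the wall in meters at horizontal position x (eq. 4.0-1)."""
    return -x**2 / 16 + 4


def grade(x):
    """Limiting grade at x, as a plain ratio (multiply by 100 for percent)."""
    return -2 * x / 16


def average_grade(x, h):
    """Average grade over a horizontal step h starting at x (eq. 4.0-12)."""
    return (wall_height(x + h) - wall_height(x)) / h


for x in [-8, -4, 0, 4]:
    print(x, grade(x), average_grade(x, 1e-6))
```

At each x the two printed numbers should agree to several decimal places. The small leftover difference is the h² term that the limit discards.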

Another everyday example of the derivative of something is speed. Suppose you drive from home to your friend's house 60 miles away. It takes you 1 hour. Clearly your average speed is 60 miles per hour. But was that your speed the whole way? You might have stopped at McD's and eaten a burger and fries for half an hour, then whipped along at 120 miles per hour the rest of the time.

But let's say that you don't like to get speeding tickets, and you drove around 60 miles per hour the whole way. If there are mile markers along the road every tenth of a mile, you can gauge your speed by measuring the time elapsed between passage of mile markers, and dividing that elapsed time (converted to hours) into 0.1 miles. Still, that only gives you the average speed over that period of time. Within a 0.1 mile distance you may have slowed to 50 mph to avoid rear-ending a school bus, then accelerated to 70 mph in order to pass it. But your reckoning of speed by the mile markers still indicates 60 mph.

Your speedometer ought to give you a better reckoning of your instantaneous speed. Modern speedometers work by counting the number of magnetic ticks on an automobile's drive shaft that pass by a sensor in some fixed period of time. Let's say that time is 0.1 seconds. The computer in the dashboard knows how much travel distance each tick represents. It can multiply the count by that distance (in miles), then divide by 0.00002778 hours (which is the same as 0.1 seconds), and lo, it has computed your average speed in miles per hour over a tenth of a second.
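
The dashboard arithmetic just described can be sketched in Python as follows. The distance per tick and the tick count are made-up numbers for illustration only; a real car would have its own calibration.

```python
SAMPLE_SECONDS = 0.1
SAMPLE_HOURS = SAMPLE_SECONDS / 3600   # about 0.00002778 hours

# Hypothetical calibration: miles of travel represented by one drive-shaft tick.
MILES_PER_TICK = 0.0001


def average_speed_mph(tick_count):
    """Average speed over one sampling window, in miles per hour."""
    return tick_count * MILES_PER_TICK / SAMPLE_HOURS


# With this made-up calibration, 17 ticks in a tenth of a second is about 61.2 mph.
print(round(average_speed_mph(17), 1))
```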

But that is still not your instantaneous speed. If you had an old-style speedometer, it might work by spinning a centrifugal lever at the same rate as the drive shaft. The lever would be attached to a spring. The degree to which the lever stretched the spring would be an instantaneous measure of your speed. In this case, by using a mechanical trick, we avoided having to take a limit. But the centrifugal mechanism does, in its own way, determine the limit of distance traveled divided by elapsed time as elapsed time goes toward zero. And that is, in fact, the definition of instantaneous speed.

At any time, t, during your trip, there is a number, x, which represents how far you have traveled up to that moment. So we have a function of time, x(t), that represents your distance traveled. At any moment, t, we also have your instantaneous speed, v(t), assuming you have a suitable way of measuring it. It is true that:

                   x(t + h) - x(t)
  v(t)  =   lim    ---------------                              eq. 4.0-14
          h --> 0         h

What 4.0-14 means is, "find the distance between where you are now and where you'll be in a short time, h, from now. Divide that by the time that elapses until then, which is also h. Your instantaneous speed, right now, is the limit of that quotient as the elapsed time, h, goes toward zero." And so, the instantaneous speed is derived from what your position is at various times. That is why speed (or more properly, velocity) is the time derivative of position, or more commonly, the derivative of position with respect to time.
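
Here is that recipe as a Python sketch. The position function is purely hypothetical: I picked x(t) = 40t + 20t² (miles after t hours) because it covers the 60 miles in exactly 1 hour while the speed changes the whole way.

```python
def position(t):
    """Hypothetical miles traveled after t hours; reaches 60 miles at t = 1."""
    return 40 * t + 20 * t**2


def average_speed(t, h):
    """The quotient in eq. 4.0-14 before the limit is taken."""
    return (position(t + h) - position(t)) / h


t = 0.5   # half an hour into the trip
for h in [0.1, 0.01, 0.001, 0.0001]:
    print(h, average_speed(t, h))
```

For this made-up trip the printed averages settle toward 60 miles per hour, which is the instantaneous speed at the half-hour mark.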

4.1 The Main Pillar of the Temple

In the last section we saw two examples of deriving a function from another function. In both cases we used the same recipe. The recipe is this:

If you have a function, f(x), and you want to find the value of the derived function at x, then find both f(x) and f of a nearby point, x + h, take the difference of those two values, and divide that by h. When you take the limit of that quotient (which is commonly called the divided difference) as h (which is the distance between x and the point that is nearby) goes toward zero, you have the derived function.

That is what we did to find the grade of the wall that the animals built. We knew the height of the wall at any point. Each animal used the divided difference (his change in altitude divided by his horizontal step size) to determine his own perception of the wall's grade. We used smaller and smaller animals to see what happens as their horizontal step size goes toward zero. And we saw that a limit exists. That limit was the derived function that gave us the exact grade at an exact point on the wall. We derived it from the function that gives us the height of the wall.

That is what we did also to determine instantaneous speed. We took the difference between where we are now and where we will be in a little while, and we divided that by the time elapsed over that little while. As the little while goes toward zero, we found that there was a limit, and that limit is the instantaneous speed. We derived the speed function from the function that gives us position as a function of time.

So here is the main idea: Take a real function of a real variable, f(x). Form the divided difference of f(x):

    f(x + h) - f(x)
    ---------------                                              eq. 4.1-1
           h

In other words, take the difference between what the function is at x and what it is a short distance, h, from x, then divide by the short distance, h. If you take the limit as h goes toward zero, and that limit exists, then you have the derived function's value at x. The derived function is called the derivative.

The concept of taking this limit of the divided difference to find the derivative is so commonly used in mathematics that we have special notations for it. If f(x) is a function and we find the limit of the divided difference exists over some domain, then we can express the derivative of f(x) as either f'(x) or as

   df
   --
   dx

The first notation is often loosely called the Newton notation, as it is in this tutorial, although the tick mark (or prime) is usually credited to Joseph-Louis Lagrange. The second is due to Gottfried Leibniz, who wanted to show that the derivative was a quotient of the differential in the function, f, divided by the differential in x. Leibniz imagined that the difference in both the numerator and in the denominator of 4.1-1 had an infinitesimal existence even as they both went to zero. He coined the term, differential, to describe such infinitesimal quantities. Today both notations are in common use. Yet another notation you might see in some books, and the one Isaac Newton himself used, has a dot above the function name (which in this case is f) instead of the tick mark just to its right.

Important: The definition of the derivative of any function, f(x), is:

             df             f(x + h) - f(x)
   f'(x)  =  --  =    lim   ---------------                      eq. 4.1-2
             dx     h --> 0        h

wherever that limit exists.
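
A computer cannot take the limit in 4.1-2 directly, but it can evaluate the divided difference for smaller and smaller h. The Python sketch below does that for the wall function. It also tries abs(x) at x = 0, an example of my own choosing, to show a place where the limit does not exist.

```python
def divided_difference(f, x, h):
    """The quotient inside the limit of eq. 4.1-2."""
    return (f(x + h) - f(x)) / h


def wall_height(x):
    """Height of the wall from section 4.0 (eq. 4.0-1)."""
    return -x**2 / 16 + 4


# Shrinking h at the base of the wall: the quotients settle toward 1 (a 100% grade).
for h in [0.1, 0.001, 0.00001]:
    print(h, divided_difference(wall_height, -8, h))

# abs(x) at x = 0: steps to the right give +1 and steps to the left give -1,
# so there is no single limit and abs has no derivative at that point.
print(divided_difference(abs, 0, 1e-6), divided_difference(abs, 0, -1e-6))
```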

More on the "d" Notation

There really is nothing about Leibniz' "d" notation that is not contained in the limit equation given in equation 4.1-2. If we let the symbol, Δx, be the same as h, and if we let the symbol, Δf, be the same as f(x + h) - f(x) (which is the same as f(x + Δx) - f(x)), then equation 4.1-2 becomes

             df               Δf
   f'(x)  =  --  =     lim    --                                  eq. 4.1-2a
             dx     Δx --> 0  Δx

So from a notation point of view, it's just another way of notating this limit of a ratio. The Δ operator stands for "difference." Δx is the difference between this x and another x a little ways away. Δf is the difference between f of this x and f of the x that's a little ways away. The d operator stands for what happens to those differences in the limit. Which brings up an important point. The d and the Δ are both operators, NOT numbers. So when you see a d or a Δ in both numerator and denominator, you cannot cancel them. But if you saw a dx/dx, that you can cancel and say that it is equal to 1. That's because the dx in the numerator is identical to the dx in the denominator. Likewise if you saw Δx/Δx.

In Leibniz' way of looking at things, the symbol, dx, means that when you take the limit as Δx goes to zero, dx is the value that Δx takes on the instant before it winks out entirely and becomes zero. There is no real number that describes dx. It is closer to zero (though not equal to zero) than any nonzero real number can possibly be. Likewise df is the value that Δf takes on the instant before Δx winks out entirely and becomes zero. Presumably Δf winks out as well at that point, but remember that we are interested in the value of Δf immediately before that happens. Again there is no real number that can describe df because it is closer to zero than any nonzero real number can possibly be. Yet although both dx and df are infinitesimal, their ratio is real whenever the limit exists. And that ratio, df/dx, is the derivative.

(For what it's worth to this discussion, mathematicians have devised an entirely self-consistent system of arithmetic among infinitesimal quantities. And yes, there is a whole tinier set of infinitesimal quantities that are as tiny compared to the infinitesimals we have been discussing as the ones we have been discussing are to the reals. They are the d² infinitesimals. And there is a set of even tinier infinitesimals for d³, and so on indefinitely.)

When you think about it, the Leibniz notation better indicates what is going on when you take a derivative than does the Newton notation. For one thing, it clearly shows that a derivative of a function is taken with respect to a particular independent variable. In this case, that variable is x. It also shows that a derivative is always a ratio or quotient that happens in the limit as its denominator goes to zero (of course the numerator must go to zero at the same time for the limit to exist). Still the Newton notation is a convenient shorthand that requires fewer pencil strokes and fewer keystrokes at the keyboard. That is why I'll be using mostly the Newton notation throughout this tutorial.

The Derivative Is a Slope

You recall that in algebra you described straight lines that were not vertical using the equation, y = mx + b. And you recall as well that the term, m, you called the slope of the line. Suppose we take such a line as a function:

   f(x)  =  mx + b                                               eq. 4.1-3

If you make up values for m and b and plot it, you will find that it is indeed a straight line. Let's apply the definition given by 4.1-2 to find the derivative of this function.

                    (m(x + h) + b)  -  (mx + b)
   f'(x)  =   lim   ---------------------------                  eq. 4.1-4
            h --> 0              h

Do you see how we got 4.1-4 from 4.1-2 and 4.1-3? Make sure you understand how to make those substitutions. You are likely to have to do it on an exam.

When you multiply out the m(x + h), you get:

                    mx + mh + b  -  mx - b
   f'(x)  =   lim   ----------------------                       eq. 4.1-5
            h --> 0           h

There are some major cancellations here. Once you do them, you are left with:

                    mh
   f'(x)  =   lim   --                                           eq. 4.1-6
            h --> 0  h

And when you apply the rule we discovered back in section 2.5, you get, simply

   f'(x)  =  m                                                   eq. 4.1-7

That means that the derivative of a straight line function (also called a linear function) is exactly its slope, m. And it doesn't matter what you choose for x. The derivative of a straight line is everywhere equal to its slope.
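
For a straight line you do not even need the limit: the divided difference already equals m for every x and every nonzero h. The Python sketch below checks this with exact fractions. The particular slope, intercept, and sample points are made up for the illustration.

```python
from fractions import Fraction


def divided_difference(f, x, h):
    """The quotient inside the limit of eq. 4.1-2."""
    return (f(x + h) - f(x)) / h


m, b = Fraction(3, 4), Fraction(-2)   # hypothetical slope and intercept


def line(x):
    """The straight line of eq. 4.1-3 with the sample m and b above."""
    return m * x + b


for x in [Fraction(-5), Fraction(0), Fraction(7, 2)]:
    for h in [Fraction(1), Fraction(1, 100), Fraction(-1, 3)]:
        # Exactly m every time, whatever x and h are.
        assert divided_difference(line, x, h) == m

print("divided difference equals m =", m)
```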

But what about functions that are not straight lines? What do their derivatives mean? Back in algebra, you talked about straight lines and their slopes. You also talked about parabolas and other curves, but you never talked about their slopes.

Remember the wall that the animals built? It was a parabola, wasn't it? The animals wanted to know its grade, but that is just a different word for slope. Here again is figure 4-1. Starting at the base of the wall, each animal found a straight line that intersected the parabola at two points. Each animal determined the slope of that line and called that the grade at the base. We subsequently discovered that as you bring the two points of intersection closer and closer together, the slope of the resulting line approaches a limit. And at the limit, we have a line that is tangent to the wall, and we are finding the slope of that tangent line.

That is how a derivative is a slope. If when you graph f(x) you get some curve, then the derivative, f'(x), gives you the slope of the line that is tangent to that same curve at x.

Figure 4-2 shows an arbitrary function graphed in green together with its derivative, which is graphed in red.

Fig. 4-2: A Function and its Derivative

Never mind what the equation is for f(x). That is unimportant for now. Instead, look carefully at the behavior of the two functions. From x = -1 to x = 0, the green function increases by almost 3 squares. In that region it is sloping nearly 3 squares up for every square to the right. In that same region, the red function, which is the derivative of the green function, is between +2 and +3. That is because the red function graphs the slope of the green function.

From x = 0 to x = 1, the green function grows less steep. In other words, its slope lessens. At the same time, the red function goes down, because it is representing the lesser slope of the green function.

Somewhere between x = 1 and x = 2 the green function levels out completely, that is, its slope becomes zero. At the corresponding x value, the red function is zero.

From that point to about x = 5, the green function is sloping down, that is, it has a negative slope. In that entire region, the red function is less than zero, as you would expect.

At about x = 5, the green function levels out again, having at that point a slope of zero. At that same x value, the red function is again zero. To the right of that, the green function slopes back up again, and correspondingly the red function is positive in that region.

You might try holding a straight edge up to the screen, tangent to the green function in various places. Count the squares up and squares to the right that the straight edge traverses, then use the quotient of squares up divided by squares to the right to estimate the slope of the green function at the point of tangency. Then compare your estimate to the value of the red function at the same x.

The Derivative Is a Rate

Let's attach a different story to figure 4-2. Let's say that the horizontal axis measures seconds. For the green function, the vertical axis measures tens of meters. In fact, it measures your progress down the road in your car. The story the green function tells goes something like this: "Prior to time -1 seconds, you were tooling along at about 30 meters per second (67 miles per hour) when you spotted a 50 dollar bill in the road. You screeched to a halt, coming to a stop at about time 1.5 seconds. You immediately threw it into reverse, backed up, halted again, this time at about time 5 seconds, when you came even with the bill. Right away you snatched it up, then proceeded on your way, but at a lesser speed." In this story, the red graph shows exactly how fast you were going at each second in tens of meters per second. When you were going in reverse, your speed is considered negative. The red graph is your rate.

In algebra you probably solved rate problems ad nauseam. But in all the problems, the rate (e.g., speed, dollars per hour, yen per Deutschmark, etc.) remained constant throughout the problem. Even when the rate did change, it changed in jumps (e.g., for 4 hours you are paid $5 per hour, then for the next four hours you are paid $8 per hour). The math you were learning then just wasn't up to dealing with rates that changed constantly with time. Yet the real world is full of rates that do change constantly with time or with other variables. And that is why you are learning calculus now. The concept of a derivative is simply a rate that can change constantly with time or with some other variable. It is the most central concept in calculus, even though the concept of limits underlies it. The derivative has some remarkable properties that you will learn about shortly. Those properties are so elegant that you will eventually come to know the derivative primarily by its properties, and that's how it should be. But don't ever forget that you came to the derivative by taking a limit. When you get confused, come back to that. Everything you need to know about derivatives is hidden in the definition given here in equation 4.1-2.

Link to Important Coached Exercise

It is with near certainty that you will be required on some exam to find the derivative of some function by applying equation 4.1-2. So here I give you a coached exercise for finding the derivative of f(x) = x². In the same box we shall cover how you can find the derivative of g(x) = xⁿ where n is any counting number. So to dive deeply into derivatives, click here.
