If you drop a ball from a certain height, it is easy to calculate its average speed. Divide the distance traveled by the time elapsed. But that doesn't tell you how fast the ball is falling at any given time, i.e. the instantaneous rate of change. It is motionless the instant you let go of it, and falling quite quickly when it hits the ground. How fast is it traveling when it is halfway down?
To see this another way, graph the position of the ball as a function of time. At time 0 (x = 0), the graph begins at y = 2 meters, the height of a human being. The graph is almost horizontal at this point, since the ball is moving very slowly. But by 0.2 seconds the curve is definitely bending downward. At 0.6 seconds the curve crashes into the x axis, the floor, at a steep angle. How fast is the ball falling after 0.3 seconds? What is its velocity?
We know its average speed, the slope of the line joining the start and end points of the curve. We can calculate its approximate speed at time 0.3 seconds: plot the points at 0.2 and 0.4 seconds, draw a line between them, and calculate the slope. If that isn't accurate enough for you, draw a line connecting the points at 0.25 seconds and 0.35 seconds, or perhaps 0.29 and 0.31 seconds, and so on. As the time interval shrinks, the line segments approach the line that is tangent to the curve at 0.3 seconds, and the slopes of the shrinking line segments approach the slope of the tangent line. This is the derivative, the speed of the ball.
This process works well, as long as the curve doesn't wiggle around at the microscopic scale. It may feel like a sudden jolt when you crash into a brick wall, but if the time interval is small enough, the curve is smooth, and there is a well defined speed at each millisecond. If you're looking for an example, in the real world, where this process fails, you have to go down to quantum mechanics, where a particle suddenly tunnels through a barrier, without a definite speed or position. Well - we've got other mathematics to describe that. Let's get back to the derivative.
If f is a function of a real variable x, the derivative of f at x is the limit of (f(x+h)-f(x))/h as h approaches 0. This fraction is called the difference quotient, and it represents the slope of the ever shrinking line segments as they approach the tangent line at x. If f is not defined at x, or the limit does not exist, f has no derivative at that point.
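The definition translates almost word for word into code. The following is a minimal sketch, not a definitive implementation: the names difference_quotient and square are our own, and the function x² is chosen only for illustration. Exact fractions are used so that rounding does not blur the picture.

```python
from fractions import Fraction


def difference_quotient(f, x, h):
    """Slope of the line through (x, f(x)) and (x+h, f(x+h))."""
    if h == 0:
        raise ValueError("h must be nonzero; the derivative is a limit, not a value at h = 0")
    return (f(x + h) - f(x)) / h


def square(x):
    # Illustrative function only; it does not appear in the text above.
    return x * x


# Shrink h toward 0 and watch the slopes at x = 1 settle down.
for h in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(h, difference_quotient(square, Fraction(1), h))
```

For this particular function the quotient at x = 1 works out to 2+h, so the printed slopes close in on 2 as h shrinks.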
One-sided limits, where h is exclusively positive or negative, make one-sided derivatives meaningful. This is useful when we want to know how fast the ball was moving when it hit the floor, realizing that the graph does not continue beyond that point.
To illustrate, let y = f(x) = 180-80x², as x runs from 0 to 1.5. This is the height, in centimeters, of a ball falling on the moon, where we don't have to worry about air resistance. At time 0 the ball starts at 180 cm, the height of the astronaut, and at 1.5 seconds the graph reaches the x axis, as the ball hits the moon's surface. How fast is it traveling at 0.3 seconds? We must compute f(0.3+h)-f(0.3), divide by h, and take the limit as h approaches 0. The numerator is -48h-80h², so the difference quotient becomes -80h-48, which approaches -48 as h goes to 0. After 0.3 seconds the ball is descending (hence the minus sign) at 48 cm/s.
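This calculation is easy to check by machine. Expanding f(0.3+h)-f(0.3) for f(x) = 180-80x² gives -48h-80h², so the quotient should equal -80h-48 for every nonzero h. The sketch below confirms this with exact fractions, and then takes one-sided quotients (negative h only) at 1.5 seconds to estimate the speed at impact. The function names are our own.

```python
from fractions import Fraction


def height(x):
    """Height of the ball in centimeters after x seconds: 180 - 80x^2."""
    return 180 - 80 * x * x


def difference_quotient(f, x, h):
    """Slope of the line through (x, f(x)) and (x+h, f(x+h)); h must be nonzero."""
    return (f(x + h) - f(x)) / h


# Two-sided check at 0.3 seconds: the quotient is exactly -80h - 48.
t = Fraction(3, 10)
for h in (Fraction(1, 10), Fraction(-1, 10), Fraction(1, 1000)):
    slope = difference_quotient(height, t, h)
    assert slope == -80 * h - 48
    print(float(h), float(slope))

# One-sided quotients at impact (1.5 seconds): h is negative, because
# the graph does not continue past the moment the ball lands.
impact = Fraction(3, 2)
for h in (Fraction(-1, 10), Fraction(-1, 100), Fraction(-1, 1000)):
    print(float(h), float(difference_quotient(height, impact, h)))
```

At impact the quotient is -80h-240, so the one-sided slopes approach -240: the ball lands at 240 cm/s.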
You'll notice that the units are right. The numerator, given in terms of f(x), measures height, in centimeters, and the denominator, h, is in seconds. For every value of h, the fraction has units of cm/s, hence the limit, or derivative, is also measured in cm/s.
Since units are our own creation, we hope we can rescale the units as we like, and the derivative will scale accordingly. Had we measured in meters, rather than centimeters, the ball would fall at 0.48 m/s. More generally, multiply f by a constant k, and the derivative should be multiplied by k as well. We will see that this is the case.
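A numerical check is not a proof, but it shows why the scaling should hold: multiplying f by k multiplies every difference quotient by k, before any limit is taken. The sketch below uses the moon ball f(x) = 180-80x² with k = 1/100 to convert centimeters to meters. The helper names (scaled, height_cm, height_m) are our own.

```python
from fractions import Fraction


def difference_quotient(f, x, h):
    """Slope of the line through (x, f(x)) and (x+h, f(x+h)); h must be nonzero."""
    return (f(x + h) - f(x)) / h


def height_cm(x):
    """Height of the ball in centimeters after x seconds."""
    return 180 - 80 * x * x


def scaled(f, k):
    """Return the function x -> k * f(x)."""
    return lambda x: k * f(x)


k = Fraction(1, 100)            # centimeters to meters
height_m = scaled(height_cm, k)

t = Fraction(3, 10)
for h in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    slope_cm = difference_quotient(height_cm, t, h)
    slope_m = difference_quotient(height_m, t, h)
    assert slope_m == k * slope_cm   # holds for every h, not just in the limit
    print(float(h), float(slope_cm), float(slope_m))
```

The slopes in cm/s approach -48 while the slopes in m/s approach -0.48, each one exactly 1/100 of its partner.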
The process of computing a function's derivative is called differentiation, from the word differential, or difference. Actually the derivative should be called the differential, and sometimes it is.
If a function's derivative exists, the function is "differentiable" at that point. Of course the function may be differentiable throughout an interval, or over its entire domain, which is the usual meaning.
A function is piecewise differentiable if it is differentiable throughout a specified domain, except for a discrete set of points. In other words, f might be differentiable for x < 0 and for x > 0, but not at x = 0, as is the case for abs(x). The adverb "piecewise" can be applied to other adjectives as well, such as continuous. And it may apply in higher dimensions. If white = 0 and black = 1, the checkerboard is piecewise continuous. Each square is continuous, in fact each square is constant, but the function is not continuous, and may not even be defined, on the borders between the squares.
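To see why abs(x) fails at 0, compare the difference quotients from the right (h positive) with those from the left (h negative). The sketch below is only an illustration, with names of our own choosing; it uses Python's built-in abs and exact fractions.

```python
from fractions import Fraction


def difference_quotient(f, x, h):
    """Slope of the line through (x, f(x)) and (x+h, f(x+h)); h must be nonzero."""
    return (f(x + h) - f(x)) / h


# Step sizes 1/10, 1/100, 1/1000, 1/10000, approaching 0 from each side.
steps = [Fraction(1, 10 ** n) for n in range(1, 5)]

right = [difference_quotient(abs, 0, h) for h in steps]    # h > 0
left = [difference_quotient(abs, 0, -h) for h in steps]    # h < 0

print("from the right:", [int(s) for s in right])
print("from the left: ", [int(s) for s in left])
```

Every slope from the right is 1 and every slope from the left is -1. Each one-sided derivative exists, but the two disagree, so there is no single limit and abs(x) has no derivative at 0.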
Differentiation at all points is a transformation mapping one function into another. We shall use the notation f′ to indicate the differentiated function. Thus f′(x) is the derivative of f at x. Take the derivative of f′ at every point, giving the second derivative, written f′′. Repeat this process, giving the higher order derivatives. f′ⁿ is the nth derivative of f, i.e. differentiate n times.
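The idea of differentiation as a map from functions to functions can be imitated in code. The sketch below is a stand-in, not true differentiation: it fixes a small h and uses the difference quotient in place of the limit, then applies that transformation n times. The names derivative_approx and nth_derivative_approx are our own, and the test function is the moon ball 180-80x², whose derivatives are -160x, -160, and 0.

```python
from fractions import Fraction


def derivative_approx(f, h):
    """Map f to the function x -> (f(x+h) - f(x)) / h, an approximation of f' for small h."""
    return lambda x: (f(x + h) - f(x)) / h


def nth_derivative_approx(f, n, h):
    """Apply the approximate differentiation map n times."""
    g = f
    for _ in range(n):
        g = derivative_approx(g, h)
    return g


def height(x):
    """Height of the ball in centimeters after x seconds."""
    return 180 - 80 * x * x


h = Fraction(1, 1000)
x = Fraction(3, 10)

first = nth_derivative_approx(height, 1, h)(x)    # close to -48
second = nth_derivative_approx(height, 2, h)(x)   # -160 for this quadratic
third = nth_derivative_approx(height, 3, h)(x)    # 0 for this quadratic

print(float(first), float(second), float(third))
```

Because the function is a quadratic, the second and third results are exact for any nonzero h. For a general function each pass only approximates the derivative, and the error grows with n.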
A function is smooth if it is differentiable, and the derivative is continuous. Such a function is sometimes called C¹. If the second derivative exists, and is continuous, the function is C², and so on.