If this error term approaches 0 as x approaches 0, then f is continuous at p. However, if the error is an ever decreasing fraction of the length of x as x approaches 0, then f is differentiable at p. Thus differentiable implies continuous: if e(x)/|x| approaches 0 while |x| approaches 0, then e(x) itself approaches 0. To state it analytically, f is differentiable if e(x)/|x| approaches 0 as |x| approaches 0, where |x| is the length of the vector x, or the distance from p.
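A minimal numerical sketch of this definition; the function f(x,y) = x² + y², the point p = (1, 2), and the candidate vector v = (2, 4) are illustrative choices, not from the text. The ratio e(x)/|x| shrinks along with the displacement.

```python
import math

# Hypothetical example: f(x,y) = x^2 + y^2 at p = (1, 2), candidate v = (2, 4).
# The error term is e(x) = f(p + x) - f(p) - x.v, and differentiability
# requires e(x)/|x| -> 0 as |x| -> 0.

def f(x, y):
    return x * x + y * y

p = (1.0, 2.0)
v = (2.0, 4.0)

def error_ratio(dx, dy):
    """e(x)/|x| for the displacement x = (dx, dy)."""
    length = math.hypot(dx, dy)
    e = f(p[0] + dx, p[1] + dy) - f(p[0], p[1]) - (dx * v[0] + dy * v[1])
    return e / length

# Shrink the displacement; the ratio shrinks with it.
for t in (1.0, 0.1, 0.01, 0.001):
    print(error_ratio(t, t))
```

For this f the ratio works out to t×√2, so each line of output is a tenth of the one before, as the definition demands.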
Sometimes we fold the division by |x| into e, and write f(p+x) = f(p) + x.v + |x|×e(x). Here f is differentiable if e approaches 0 as |x| approaches 0. It's the same definition, just a matter of convention.
Let u be a unit vector, indicating the direction of travel, and let h be a positive real number, indicating the distance traveled. Thus x is h×u, and |x| = h. What about f(p+x)-f(p) over h? Using the latter definition of e, this quotient becomes h×u.v+h×e(x) over h, or u.v+e(x). Since e approaches 0 as h approaches 0, the directional derivative exists, and is the dot product of u and v. If we are moving in the general direction of v, we are moving uphill. If we are moving away from v, we are moving downhill. If we are moving perpendicular to v, the elevation is constant, as though we were walking along the side of a mountain.
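The limit above can be checked numerically. This is a sketch with illustrative choices: f(x,y) = x² + y² at p = (1, 2), where the vector v is (2, 4), and u points northeast. A small finite difference along u should land close to the dot product u.v.

```python
import math

# Hypothetical illustration: for f(x,y) = x^2 + y^2 at p = (1, 2),
# the vector v is (2, 4).  The directional derivative along a unit
# vector u should equal the dot product u.v.

def f(x, y):
    return x * x + y * y

p = (1.0, 2.0)
v = (2.0, 4.0)

# Direction of travel: a unit vector at 45 degrees.
u = (1 / math.sqrt(2), 1 / math.sqrt(2))

h = 1e-6  # a small step along u
numeric = (f(p[0] + h * u[0], p[1] + h * u[1]) - f(p[0], p[1])) / h
dot = u[0] * v[0] + u[1] * v[1]

print(numeric)  # close to 3*sqrt(2), about 4.2426
print(dot)      # exactly 3*sqrt(2)
```

Shrinking h pushes the finite difference ever closer to u.v, which is the limit argument in the paragraph above played out in floating point.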
The n partial derivatives are equal to v dotted with the n unit vectors that point along the axes. In other words, the n partials are the n components of the vector v.
Suppose there are two vectors, v and w, that both demonstrate differentiability at the point p, each with its own error function that approaches 0 as we approach p. Suppose v and w differ in the jth component. Then v and w give different answers for the partial of f with respect to the jth coordinate. But there is only one partial in this direction, one right answer, hence there is only one vector v that establishes differentiability of f at p. Call this vector the gradient of f at p. This is written ∇f(p). It represents the derivative, or the total derivative, of f at p.
If f is differentiable, we know what the gradient is; its components must be the partials of f with respect to the coordinates in n space. Let's consider an example.
If f(x,y) = xy, then ∇f = (y,x). If you're looking east, the slope up, or down, is determined by the y coordinate. And as you look north, the slope is determined by the x coordinate. Hence the gradient is the vector (y,x). At the origin, the gradient is 0, and the surface is flat. Of course it does curve up as you move into the first quadrant, and down as you move into the second quadrant, but if you're small enough, like a tiny ant, the surface looks flat at the origin.
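A quick numerical check of this example: central differences approximate each partial of f(x,y) = xy, and the results match (y, x). The sample point (3, 5) is an arbitrary illustrative choice.

```python
# Verify that f(x,y) = xy has gradient (y, x) by central differences.

def f(x, y):
    return x * y

def grad(x, y, h=1e-6):
    """Approximate (df/dx, df/dy) by central differences."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy

print(grad(3.0, 5.0))   # close to (5, 3), i.e. (y, x)
print(grad(0.0, 0.0))   # close to (0, 0): flat at the origin
```

The second line confirms the ant's-eye view: at the origin both partials vanish, even though the surface bends up into the first quadrant and down into the second.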
Note that a nontrivial gradient implies an "uphill" direction, precluding a local minimum or maximum. This is analogous to the one dimensional case, where a local maximum or minimum requires a derivative of 0.
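The uphill claim can be sketched numerically. Using the earlier example f(x,y) = xy at the illustrative point p = (1, 2), the gradient is (2, 1): a small step along it raises f and a step against it lowers f, so p is neither a local max nor a local min.

```python
# A nonzero gradient precludes a local extremum: stepping with the
# gradient increases f, stepping against it decreases f.
# Hypothetical sample point: f(x,y) = xy at p = (1, 2), gradient (y, x) = (2, 1).

def f(x, y):
    return x * y

p = (1.0, 2.0)
g = (2.0, 1.0)      # gradient of xy at p, namely (y, x)
t = 1e-3            # small step size

uphill   = f(p[0] + t * g[0], p[1] + t * g[1])
downhill = f(p[0] - t * g[0], p[1] - t * g[1])

print(uphill > f(*p) > downhill)  # True
```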