Neither Div nor Curl nor Both Constitute the Derivative

Charles S. Barnett

Adjunct Mathematics Instructor

Las Positas College

Presented at the 41st annual fall conference

California Mathematics Council, Community Colleges

Monterey, California

December 13 and 14, 2013

Abstract

The Div and Curl operators are derivative-like, but are only aspects of the derivative of a map from 3-space to 3-space. The derivative of such a map at a given point is a linear map from 3-space to 3-space. By employing the Fréchet difference quotient at the outset, we can construct that derivative without use of the methods of advanced analysis. The process, results, and properties parallel those of the one-dimensional case.

Introduction

In the usual calculus sequence we begin by defining the derivative of a real-valued function of a real variable via the limit of a difference quotient. Next come the techniques for finding the derivative of familiar functions without resort to the defining relation. As we work our way to the study of functions from n-space to m-space where 1≤n≤3 and 1≤m≤3, we encounter “partial derivatives,” “gradient,” “divergence,” and “curl.” But “derivative” standing alone seems to disappear, and it reappears, if at all, in analysis courses above, or to the side of, Advanced Calculus. That traditional approach has the advantage of quickly getting to methods that have applications in engineering and scientific disciplines, the source of most students who study the calculus sequence. But these derivative-like concepts, especially Divergence and Curl, seem to appear out of a vacuum, not as a continuation of the development of the derivative concept. An alternative approach is available.

There exists a relatively smooth route from the definition of the derivative for the one-variable case to the definition of the derivative for the case in which both domain and range can have any (not necessarily equal) dimensions. The approach involves constructing the limit for the Fréchet difference quotient, essentially a directional derivative. If this limit satisfies certain smoothness conditions, then a derivative emerges, and it is a linear map; the linearity is demonstrable directly from the defining relation.

When both the domain and range of a function are real 3-space, then the derivative is a linear map from 3-space to 3-space, and the Divergence and Curl are aspects of that derivative, but neither one nor both constitute the derivative. Further properties of the derivative emerge, without the use of coordinates, along lines that parallel those used to derive comparable properties of the derivative for the one-variable case. We do return to coordinates when performing calculations.

This approach to the multi-dimensional derivative illustrates the beauty and utility of generalization without having to resort to the more advanced methods of higher analysis.

What follows is expository in nature, hence the level of justification for the assertions given varies from a minimum of none to, in a few cases, a maximum of relatively acceptable plausibility arguments. I assert that solid proofs of the claims displayed appear in the references cited in the annotated bibliography.

The usual calculus course proceeds through the material somewhat as follows: (1) scalar-valued functions of a scalar, (2) vector-valued functions of a scalar, (3) scalar-valued functions of a vector, and (4) vector-valued functions of a vector. I will follow that sequence to the main goal of (4), except that I will skip (2).

I begin by defining the derivative of a real-valued function of a real variable in a way that may seem strange and unnecessarily convoluted. However, the definition I use sets the stage for the definitions that apply to the other two cases. This first section, and only this section, is new; at least I have not seen it in print. All else is, well, derivative.

Scalar-valued Functions of a Scalar

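Situation: $\mathbb{R}$ represents the real numbers, and $f:\mathbb{R}\to\mathbb{R}$.

Goal: Define the derivative of $f$ at $x$, $f'(x)$.

For $(x,y)\in\mathbb{R}^2$ consider the following limit:

$$f'(x,y)=\lim_{h\to 0}\frac{1}{h}\left[f(x+hy)-f(x)\right]. \tag{1}$$

If this limit exists, then $f'(x,y)$ is called “the derivative of $f$ at $x$ with respect to $y$.”

Definition: $f:\mathbb{R}\to\mathbb{R}$ is “differentiable” in $\mathbb{R}$ if $f'(x,y)$ exists for each $x\in\mathbb{R}$ and each $y\in\mathbb{R}$. If, further, for each $y\in\mathbb{R}$, $f'(x,y)$ is continuous in $x$, then $f:\mathbb{R}\to\mathbb{R}$ is said to be “continuously differentiable in $\mathbb{R}$.”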

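Claim: $f'(x,y)$ is linear in the variable $y$.

Pick and fix $a$ in $\mathbb{R}$ and consider $f'(x,ay)$. Show first that $f'(x,ay)=af'(x,y)$. If $a=0$, the difference quotient is 0 and both sides vanish. If $a\neq 0$, then

$$\begin{aligned}
f'(x,ay)&=\lim_{h\to 0}\frac{1}{h}\left[f(x+hay)-f(x)\right]\\
&=\lim_{h\to 0}\frac{a}{ha}\left[f(x+hay)-f(x)\right]\\
&=a\lim_{t\to 0}\frac{1}{t}\left[f(x+ty)-f(x)\right]\qquad (t=ha)\\
&=af'(x,y).
\end{aligned}$$

Therefore,

$$f'(x,ay)=af'(x,y). \tag{2}$$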

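Now let $y$ and $z$ denote two non-zero elements of $\mathbb{R}$ and consider $f'(x,y+z)$.

$$\begin{aligned}
f'(x,y+z)&=\lim_{h\to 0}\frac{1}{h}\left[f(x+hy+hz)-f(x)\right]\\
&\overset{1}{=}\lim_{h\to 0}\frac{1}{h}\left[f(x+hy+hky)-f(x)\right]\quad\text{for some }k\in\mathbb{R}\\
&=\lim_{h\to 0}\frac{1}{h}\left[f(x+h(1+k)y)-f(x)\right]\\
&\overset{2}{=}(1+k)f'(x,y)\\
&=f'(x,y)+kf'(x,y)\\
&\overset{3}{=}f'(x,y)+f'(x,ky)\\
&\overset{4}{=}f'(x,y)+f'(x,z).
\end{aligned}\tag{3}$$

Justifications for the numbered steps:

$\overset{1}{=}$: $k=z/y$

$\overset{2}{=}$: $a=1+k$ in (2)

$\overset{3}{=}$: $a=k$ in (2)

$\overset{4}{=}$: $z=ky$

So, $f'(x,y)$ is linear in the second variable.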

Definition: The derivative of $f$ at $x$, denoted by $f'(x)$, is the linear map from $\mathbb{R}\to\mathbb{R}$ defined by $f'(x)y=f'(x,y)$.

Examples:

1.  $f$ is a constant function, $f(x)=k$, where $k$ is a constant. Then $f'(x)y=0$, i.e., $f'(x)$ sends all $y$ to zero for every $x\in\mathbb{R}$.

Argument: $f(x+hy)-f(x)=k-k=0$.

2.  $f$ is a linear map, i.e., $f(x)=ax$, where $a\in\mathbb{R}$ is an arbitrary but fixed number.

Then $x\mapsto f'(x)$ is a constant map, i.e.,

$$f'(x)y=ay\quad\text{for every }x\in\mathbb{R},\ y\in\mathbb{R}.$$

Argument: $f(x+hy)-f(x)=a(x+hy)-ax=hay$.

3.  $f(x)=x^2$ implies that $f'(x)=2x$, i.e.,

$$f'(x)y=2xy\quad\text{for all }(x,y)\in\mathbb{R}^2.$$

Argument:

$$\begin{aligned}
f(x+hy)-f(x)&=(x+hy)^2-x^2\\
&=x^2+2hxy+h^2y^2-x^2\\
&=2hxy+h^2y^2.
\end{aligned}$$

Now, divide by $h$ and pass to the limit to see that

$$f'(x,y)=2xy\implies f'(x)=2x.$$
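As a numerical sanity check, the limit in Eqn. (1) can be approximated by evaluating the difference quotient at a small fixed step. The following is a minimal Python sketch, not part of the original development; the helper name `dquot`, the step size `h`, and the sample values are assumptions made for illustration.

```python
def dquot(f, x, y, h=1e-6):
    """Difference quotient (1/h)[f(x + h*y) - f(x)] from Eqn. (1), at a small fixed h."""
    return (f(x + h * y) - f(x)) / h


def square(x):
    """Example 3: f(x) = x^2."""
    return x * x


# Sample values, chosen only for illustration
x, y, z, a = 1.5, 2.0, -0.7, 3.0

approx = dquot(square, x, y)  # approximates f'(x, y)
exact = 2 * x * y             # the closed form 2xy derived in Example 3

# Gaps in homogeneity (2) and additivity (3); nonzero only because h is finite
homogeneity_gap = abs(dquot(square, x, a * y) - a * approx)
additivity_gap = abs(dquot(square, x, y + z) - (approx + dquot(square, x, z)))

print(approx, exact)
print(homogeneity_gap, additivity_gap)
```

Both gaps are of the order of `h`, which is what one expects from the leftover $h^2y^2$ term in the argument for Example 3.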

Best affine approximation to f

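Pick and fix $x_0\in\mathbb{R}$. Then the mean value theorem, together with the continuity of $f'$, says that

$$f(x)=f(x_0)+f'(x_0)(x-x_0)+\text{error}. \tag{4}$$

You will see direct parallels to (4) as we work our way up to functions from $\mathbb{R}^3\to\mathbb{R}^3$.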

Scalar-valued Functions of a Vector

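Situation: $f:\mathbb{R}^3\to\mathbb{R}$.

Goal: Define the derivative of $f$ at $X$, $f'(X)$. For $(X,Y)\in\mathbb{R}^3\times\mathbb{R}^3$, consider the following limit that defines the function $f':\mathbb{R}^3\times\mathbb{R}^3\to\mathbb{R}$.

$$f'(X,Y)=\lim_{h\to 0}\frac{1}{h}\left[f(X+hY)-f(X)\right]$$

If this limit exists, then $f'(X,Y)$ is called “the derivative of $f$ at $X$ with respect to $Y$”.

Definition: The function $f:\mathbb{R}^3\to\mathbb{R}$ is said to be differentiable in $\mathbb{R}^3$ if $f'(X,Y)$ exists for each $(X,Y)$ in $\mathbb{R}^3\times\mathbb{R}^3$. It is said to be continuously differentiable if, for each $Y\in\mathbb{R}^3$, $f'(X,Y)$ exists for each $X\in\mathbb{R}^3$ and is continuous in $X$.

Claim: $f'(X,Y)$ is linear in the variable $Y$. First, show that $f'(X,aY)=af'(X,Y)$ for $a\in\mathbb{R}$.

Pick and fix $a$ in $\mathbb{R}$. If $a=0$, the difference quotient is 0. If $a\neq 0$, then

$$\begin{aligned}
f'(X,aY)&=\lim_{h\to 0}\frac{1}{h}\left[f(X+haY)-f(X)\right]\\
&=\lim_{h\to 0}\frac{a}{ha}\left[f(X+haY)-f(X)\right]\\
&=a\lim_{t\to 0}\frac{1}{t}\left[f(X+tY)-f(X)\right]\\
&=af'(X,Y).
\end{aligned}\tag{5}$$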

Now, consider $f'(X,Y+Z)$. We must show that $f'(X,Y+Z)=f'(X,Y)+f'(X,Z)$,

where

$$f'(X,Y+Z)=\lim_{h\to 0}\frac{1}{h}\left[f(X+hY+hZ)-f(X)\right].$$

There exists a version of the mean value theorem that we will need:

If $f'(X+tY,Y)$ exists for $0\le t\le 1$, then there exists $\theta$, $0<\theta<1$, such that

$$f(X+Y)-f(X)=f'(X+\theta Y,Y). \tag{6}$$

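Now consider

$$\begin{aligned}
f'(X,Y+Z)&=\lim_{h\to 0}\frac{1}{h}\left[f(X+hY+hZ)-f(X)\right]\\
&\overset{1}{=}\lim_{h\to 0}\frac{1}{h}\left[f(X+hY+hZ)-f(X+hY)+f(X+hY)-f(X)\right]\\
&\overset{2}{=}\lim_{h\to 0}\frac{1}{h}\left[f(X+hY+hZ)-f(X+hY)\right]+\lim_{h\to 0}\frac{1}{h}\left[f(X+hY)-f(X)\right]\\
&\overset{3}{=}\lim_{h\to 0}\frac{1}{h}f'(X+hY+\theta hZ,hZ)+f'(X,Y)\\
&\overset{4}{=}\lim_{h\to 0}\frac{h}{h}f'(X+hY+\theta hZ,Z)+f'(X,Y)\\
&\overset{5}{=}f'(X,Z)+f'(X,Y).
\end{aligned}\tag{7}$$

Justifications:

$\overset{1}{=}$ Subtract and add $f(X+hY)$.

$\overset{2}{=}$ Clear.

$\overset{3}{=}$ Apply Eqn. (6) with $X\leftarrow X+hY$ and $Y\leftarrow hZ$ to obtain the first term. Apply the definition of $f'(X,Y)$ to yield the second term.

$\overset{4}{=}$ Apply Eqn. (5) with $X\leftarrow X+hY+\theta hZ$, $a\leftarrow h$, $Y\leftarrow Z$.

$\overset{5}{=}$ Execute the limit operation, using the continuity of $f'(X,Z)$ in $X$ (note that $0<\theta<1$, so $X+hY+\theta hZ\to X$ as $h\to 0$).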

Definition: If $f$ is continuously differentiable, then the derivative $f'(X)$ of $f$ at $X$ is the linear map

$$f'(X):\mathbb{R}^3\to\mathbb{R},$$

defined by

$$f'(X)Y=f'(X,Y). \tag{8}$$
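The linearity claims (5) and (7) can be checked numerically for a concrete function. The sketch below is illustrative only: the sample function `f`, the helper `dquot`, the step size `h`, and the test vectors are all assumptions introduced here, and a fixed small `h` stands in for the limit.

```python
def dquot(f, X, Y, h=1e-6):
    """(1/h)[f(X + hY) - f(X)] for f: R^3 -> R, evaluated at a small fixed h."""
    shifted = [x + h * y for x, y in zip(X, Y)]
    return (f(shifted) - f(X)) / h


def f(X):
    """Hypothetical sample function f(x1, x2, x3) = x1*x2 + x3^2."""
    x1, x2, x3 = X
    return x1 * x2 + x3 ** 2


X = [1.0, 2.0, 3.0]
Y = [0.5, -1.0, 2.0]
Z = [1.0, 1.0, -1.0]
a = 4.0

# Additivity, Eqn. (7): f'(X, Y + Z) versus f'(X, Y) + f'(X, Z)
lhs_add = dquot(f, X, [y + z for y, z in zip(Y, Z)])
rhs_add = dquot(f, X, Y) + dquot(f, X, Z)

# Homogeneity, Eqn. (5): f'(X, aY) versus a f'(X, Y)
lhs_hom = dquot(f, X, [a * y for y in Y])
rhs_hom = a * dquot(f, X, Y)

print(lhs_add, rhs_add)
print(lhs_hom, rhs_hom)
```

Each pair agrees up to terms of order `h`; the residual discrepancy comes from the finite step, not from any failure of linearity.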

The Gradient

Thus far I have not mentioned the gradient; classical textbooks introduce and discuss the gradient early in developing the theory of functions from $\mathbb{R}^3\to\mathbb{R}$. I turn now to connecting our derivative to the gradient.

Much of what follows could be developed without reference to a specific basis or a specific scalar product, but henceforth I will employ the usual dot product and the standard basis. I use the following notation for base vectors.

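$$A_1=i=(1,0,0),\quad A_2=j=(0,1,0),\quad A_3=k=(0,0,1).$$

Let

$$Y=\sum_{i=1}^{3}y_iA_i.$$

Then

$$\begin{aligned}
f'(X,Y)&\overset{1}{=}f'\!\left(X,\sum_{i=1}^{3}y_iA_i\right)\\
&\overset{2}{=}\sum_{i=1}^{3}f'(X,y_iA_i)\\
&\overset{3}{=}\sum_{i=1}^{3}y_if'(X,A_i)\\
&\overset{4}{=}\sum_{i=1}^{3}y_i\lim_{h\to 0}\frac{1}{h}\left[f(X+hA_i)-f(X)\right]\\
&\overset{5}{=}\sum_{i=1}^{3}y_i\frac{\partial f}{\partial x_i}.
\end{aligned}$$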

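Justifications:

$\overset{1}{=}$ Clear.

$\overset{2}{=}$ $f'$ is linear in the second variable (additivity, Eqn. (7)).

$\overset{3}{=}$ $f'$ is linear in the second variable (homogeneity, Eqn. (5)).

$\overset{4}{=}$ Definition of $f'$.

$\overset{5}{=}$ The components of $X+hA_i$ are equal to those of $X$ except for the $i$th, which is $x_i+h$. Therefore, the difference quotient used to define $f'(X,A_i)$ is identical to that used to define the partial derivative.

So,

$$f'(X,Y)=\sum_{i=1}^{3}y_i\frac{\partial f}{\partial x_i}. \tag{9}$$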

Now, there exists a theorem from linear algebra which says that if $T:\mathbb{R}^3\to\mathbb{R}$ is linear, then there exists a unique vector, $B$, say, in $\mathbb{R}^3$ such that

$$TY=B\cdot Y. \tag{10}$$

Invoke the fact that $f'$ is linear in the second variable, compare Eqns. (9) and (10), and you see that the unique vector $B$ is $\nabla f(X)$ in the case at hand. That is,

$$f'(X,Y)=f'(X)Y=\nabla f(X)\cdot Y. \tag{11}$$

So, knowing the gradient, $\nabla f(X)$, is equivalent to knowing the derivative $f'(X)$, but $\nabla f(X)$ is not the derivative. $\nabla f(X)$ is a member of $\mathbb{R}^3$; $f'(X)$ is a linear map from $\mathbb{R}^3$ to $\mathbb{R}$.
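To see Eqns. (9) and (11) at work on a concrete case, the sketch below compares the difference quotient along each basis vector with hand-computed partial derivatives, and then compares $f'(X,Y)$ with $\nabla f(X)\cdot Y$. The sample function, the names `dquot`, `grad_f`, and `BASIS`, and the step size are assumptions made for illustration.

```python
BASIS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # A1, A2, A3


def dquot(f, X, Y, h=1e-6):
    """(1/h)[f(X + hY) - f(X)] for f: R^3 -> R, evaluated at a small fixed h."""
    shifted = [x + h * y for x, y in zip(X, Y)]
    return (f(shifted) - f(X)) / h


def f(X):
    """Hypothetical sample function f(x1, x2, x3) = x1*x2 + x3^2."""
    x1, x2, x3 = X
    return x1 * x2 + x3 ** 2


def grad_f(X):
    """Gradient of the sample function, computed by hand: (x2, x1, 2*x3)."""
    x1, x2, x3 = X
    return [x2, x1, 2.0 * x3]


X = [1.0, 2.0, 3.0]
Y = [0.5, -1.0, 2.0]

# f'(X, A_i) approximates the i-th partial derivative (step 5 of the derivation)
partials = [dquot(f, X, A) for A in BASIS]
gradient = grad_f(X)

# Eqn. (11): f'(X, Y) versus grad f(X) . Y
directional = dquot(f, X, Y)
dot = sum(g * y for g, y in zip(gradient, Y))

print(partials, gradient)
print(directional, dot)
```

The gradient is a list of three numbers, while the derivative is the function `Y -> dot`; the code keeps the two separate in the same way the text does.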

Best affine approximation to f

Given $f:\mathbb{R}^3\to\mathbb{R}$, consider the best affine approximation to $f$ in the neighborhood of a fixed point $X_0$. We see from Eqn. (11) that the best such approximation is

$$f(X)=f(X_0)+f'(X_0)(X-X_0)+\text{error} \tag{12}$$

or, in terms of the gradient,

$$f(X)=f(X_0)+\nabla f(X_0)\cdot(X-X_0)+\text{error}. \tag{12'}$$

Equation (12') represents the classical version.
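The sense in which the affine approximation (12') is "best" can be illustrated by watching the error term shrink faster than the distance $|X-X_0|$. The following sketch uses a hypothetical sample function and direction `D`; for this particular quadratic function the error works out to exactly $0.75\,s^2$ when $X=X_0+sD$.

```python
def f(X):
    """Hypothetical sample function f(x1, x2, x3) = x1*x2 + x3^2."""
    x1, x2, x3 = X
    return x1 * x2 + x3 ** 2


def grad_f(X):
    """Gradient of the sample function, computed by hand: (x2, x1, 2*x3)."""
    x1, x2, x3 = X
    return [x2, x1, 2.0 * x3]


def affine(X0, X):
    """Right side of Eqn. (12') without the error term."""
    g = grad_f(X0)
    return f(X0) + sum(gi * (xi - x0i) for gi, xi, x0i in zip(g, X, X0))


X0 = [1.0, 2.0, 3.0]
D = [1.0, -1.0, 0.5]  # an arbitrary direction, chosen for illustration

errors = []
for s in (1e-1, 1e-2, 1e-3):
    X = [x0 + s * d for x0, d in zip(X0, D)]
    errors.append(abs(f(X) - affine(X0, X)))

print(errors)
```

Shrinking the step by a factor of 10 shrinks the error by a factor of about 100, so the error divided by the step length tends to zero.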

Vector-valued Functions of a Vector

Situation: $F:\mathbb{R}^3\to\mathbb{R}^3$.

Goal: Define the derivative of $F$ at $X$, $F'(X)$, and discuss some of its properties. In particular, study the relationship of $F'(X)$ to the Divergence and Curl.

For $(X,Y)\in\mathbb{R}^3\times\mathbb{R}^3$, consider the following limit that defines $F'(X,Y)$:

$$F'(X,Y)=\lim_{h\to 0}\frac{1}{h}\left[F(X+hY)-F(X)\right] \tag{13}$$

if the limit exists.

Definition: The function $F$ is said to be differentiable if $F'(X,Y)$ exists for every $X\in\mathbb{R}^3$ and $Y\in\mathbb{R}^3$. $F$ is continuously differentiable if it is differentiable and $F'(X,Y)$ is continuous in $X$.

In what follows I assume that all functions possess the smoothness required for whatever operation is indicated. The development is formal and sketchy. The references cited in the annotated bibliography contain extensive, careful, and broader-based arguments that lead to results much more general than those displayed here.

Examples:

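1.  $F:\mathbb{R}^3\to\mathbb{R}^3$ is a constant function.

$F(X)=B\in\mathbb{R}^3$ for every $X$. $B$ is a fixed vector.

Then $F(X+hY)-F(X)=B-B=0\in\mathbb{R}^3$.

So, $F'(X,Y)=0\in\mathbb{R}^3$.

2.  $F:\mathbb{R}^3\to\mathbb{R}^3$ is a linear map.

$F(X)=TX$, where $T$ is linear.

Then

$$F(X+hY)-F(X)=T(X+hY)-TX=TX+hTY-TX=hTY.$$

Now, divide by $h$ and pass to the limit to see that

$$F'(X,Y)=TY\quad\text{for all }(X,Y)\in\mathbb{R}^3\times\mathbb{R}^3.$$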

Let $A_1,A_2,A_3$ represent the usual basis in $\mathbb{R}^3$. Then

$$F(X)=\sum_{i=1}^{3}f_i(X)A_i,$$

where the $f_i$ represent the component functions. Let $f_i'(X,Y)$ represent the derivative with respect to $Y$ of the component functions. In the previous section we sketched an argument that showed that the $f_i'$ are linear in the variable $Y$. It can be shown that $F'(X,Y)$ inherits linearity in $Y$ from the $f_i'$. So,

$$F'(X,Y)=\lim_{h\to 0}\frac{1}{h}\left[F(X+hY)-F(X)\right]$$

is linear in the variable $Y$.

Definition: The derivative $F'(X)$ of $F$ at $X$ is the linear transformation

$$F'(X):\mathbb{R}^3\to\mathbb{R}^3$$

defined by

$$F'(X)Y=F'(X,Y). \tag{14}$$

Examples (restatement in terms of $F'(X)$ rather than $F'(X,Y)$):

1.  The derivative of a constant function is the zero transformation: $F'(X)$ transforms all of $\mathbb{R}^3$ into the zero vector.

2.  The derivative of a linear function is constant in $X$.

$F(X)=TX$, $T$ linear, implies that

$$F'(X)=T\quad\text{for every }X.$$
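The second example can be checked numerically: for a linear map the difference quotient returns $TY$ regardless of the base point. In the sketch below the matrix `T`, the helper names `apply` and `dquot`, and the sample points are hypothetical, introduced only for illustration.

```python
# A hypothetical linear map T, given by its matrix in the standard basis
T = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0],
     [4.0, 0.0, 1.0]]


def apply(M, V):
    """Matrix-vector product M V."""
    return [sum(m * v for m, v in zip(row, V)) for row in M]


def F(X):
    """F(X) = T X."""
    return apply(T, X)


def dquot(F, X, Y, h=1e-6):
    """(1/h)[F(X + hY) - F(X)] for F: R^3 -> R^3, evaluated at a small fixed h."""
    FX = F(X)
    FXh = F([x + h * y for x, y in zip(X, Y)])
    return [(b - a) / h for a, b in zip(FX, FXh)]


Y = [1.0, 2.0, -1.0]
TY = apply(T, Y)

# Two unrelated base points: the quotient should approximate T Y at both
at_first = dquot(F, [1.0, 2.0, 3.0], Y)
at_second = dquot(F, [-4.0, 0.5, 10.0], Y)

print(TY, at_first, at_second)
```

Any remaining difference between the three printed vectors is floating-point rounding, since the difference quotient of a linear map has no truncation error.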

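Now we can get back to more familiar, classical ground. $F'(X)$ is a linear map. Given a basis, a linear map can be represented by a matrix relative to that basis. So, as above, let $A_1,A_2,A_3$ represent the standard basis. Then

$$X=\sum_{i=1}^{3}x_iA_i,\quad Y=\sum_{i=1}^{3}y_iA_i,\quad F(X)=\sum_{i=1}^{3}f_i(X)A_i,$$

where the scalar-valued $f_i$ are the component functions of $F$.

Then

$$F'(X)Y=\sum_{i=1}^{3}\sum_{j=1}^{3}y_j\frac{\partial f_i}{\partial x_j}A_i. \tag{15}$$

I have invoked Eqn. (9) of the previous section and the linearity of $F'(X)$ to arrive at Eqn. (15).

Observe from Eqn. (15) that, relative to the standard basis, $F'(X)Y$ is represented by

$$[F'(X)Y]=\begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1}&\dfrac{\partial f_1}{\partial x_2}&\dfrac{\partial f_1}{\partial x_3}\\[2mm]
\dfrac{\partial f_2}{\partial x_1}&\dfrac{\partial f_2}{\partial x_2}&\dfrac{\partial f_2}{\partial x_3}\\[2mm]
\dfrac{\partial f_3}{\partial x_1}&\dfrac{\partial f_3}{\partial x_2}&\dfrac{\partial f_3}{\partial x_3}
\end{bmatrix}
\begin{bmatrix}y_1\\y_2\\y_3\end{bmatrix}, \tag{16}$$

where here and below I use $[\,\cdot\,]$ to indicate the matrix of $\cdot$.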
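The matrix in Eqn. (16) can be assembled column by column, because column $j$ is $F'(X)A_j$. The sketch below does this with difference quotients for a hypothetical sample field $F(X)=(x_1x_2,\ x_2x_3,\ x_3x_1)$; the field, the helper names, and the step size are assumptions made for illustration.

```python
BASIS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # A1, A2, A3


def F(X):
    """Hypothetical sample field F(x1, x2, x3) = (x1*x2, x2*x3, x3*x1)."""
    x1, x2, x3 = X
    return [x1 * x2, x2 * x3, x3 * x1]


def dquot(F, X, Y, h=1e-6):
    """(1/h)[F(X + hY) - F(X)] for F: R^3 -> R^3, evaluated at a small fixed h."""
    FX = F(X)
    FXh = F([x + h * y for x, y in zip(X, Y)])
    return [(b - a) / h for a, b in zip(FX, FXh)]


X = [1.0, 2.0, 3.0]

# Column j of the matrix is F'(X, A_j); J[i][j] approximates d f_i / d x_j
columns = [dquot(F, X, A) for A in BASIS]
J = [[columns[j][i] for j in range(3)] for i in range(3)]

# Hand-computed partial derivatives of the sample field at X = (1, 2, 3)
J_exact = [[2.0, 1.0, 0.0], [0.0, 3.0, 2.0], [3.0, 0.0, 1.0]]

# Eqn. (16): the matrix times Y should agree with F'(X, Y) computed directly
Y = [0.5, -1.0, 2.0]
JY = [sum(J[i][j] * Y[j] for j in range(3)) for i in range(3)]
direct = dquot(F, X, Y)

print(J)
print(JY, direct)
```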

Best affine approximation to F

As in the previous $\mathbb{R}\to\mathbb{R}$ and $\mathbb{R}^3\to\mathbb{R}$ cases, we have a best approximation via Taylor’s formula:

$$F(X)=F(X_0)+F'(X_0)(X-X_0)+\text{error}. \tag{17}$$

We now have available the definition of the derivative of a function from $\mathbb{R}^3\to\mathbb{R}^3$. That derivative is a linear map from $\mathbb{R}^3\to\mathbb{R}^3$. To define Div and Curl and to show their connection to the derivative, I need some results from Linear Algebra. They appear in Appendix A as assertions without proof. In Appendix A I restrict attention to results about real n-space where 1≤n≤3, which suffices for present purposes. More general versions of most exist.

Divergence and Curl of a Vector Field

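Given $F:\mathbb{R}^3\to\mathbb{R}^3$ as above, then $F'(X)$ is a linear map from $\mathbb{R}^3\to\mathbb{R}^3$. Linear maps may be broken into the sum of their symmetric and skew-symmetric parts.

Let $F'(X)^*$ represent the adjoint of $F'(X)$. Then

$$F_+'(X)=\tfrac{1}{2}\left[F'(X)+F'(X)^*\right]=\text{symmetric part} \tag{18}$$

and

$$F_-'(X)=\tfrac{1}{2}\left[F'(X)-F'(X)^*\right]=\text{skew-symmetric part}. \tag{19}$$

Observe that $F'(X)=F_+'(X)+F_-'(X)$, the sum of the two parts.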

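Formula (16) displays the matrix representation of $F'(X)$, and I wish to display some additional matrix representations. So, to simplify typography, until further notice let

$$f_{ij}\equiv\frac{\partial f_i}{\partial x_j}. \tag{20}$$

Then Formula (16) becomes

$$[F'(X)Y]=\begin{bmatrix}f_{11}&f_{12}&f_{13}\\f_{21}&f_{22}&f_{23}\\f_{31}&f_{32}&f_{33}\end{bmatrix}\begin{bmatrix}y_1\\y_2\\y_3\end{bmatrix}, \tag{16'}$$

and the matrix of $F'(X)^*$ is

$$\begin{bmatrix}f_{11}&f_{21}&f_{31}\\f_{12}&f_{22}&f_{32}\\f_{13}&f_{23}&f_{33}\end{bmatrix}. \tag{21}$$

The matrix representations (16') and (21), combined with Eqns. (18) and (19) and followed by some matrix algebra, lead to the matrix representations of $2F_+'(X)$ and $2F_-'(X)$.

They are

$$[2F_+'(X)]=\begin{bmatrix}2f_{11}&f_{12}+f_{21}&f_{13}+f_{31}\\f_{12}+f_{21}&2f_{22}&f_{23}+f_{32}\\f_{13}+f_{31}&f_{23}+f_{32}&2f_{33}\end{bmatrix} \tag{22}$$

$$[2F_-'(X)]=\begin{bmatrix}0&f_{12}-f_{21}&f_{13}-f_{31}\\f_{21}-f_{12}&0&f_{23}-f_{32}\\f_{31}-f_{13}&f_{32}-f_{23}&0\end{bmatrix}. \tag{23}$$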
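The split in Eqns. (18) and (19) is easy to carry out on a concrete matrix. The sketch below uses the hand-computed matrix of partial derivatives of a hypothetical sample field $F(X)=(x_1x_2,\ x_2x_3,\ x_3x_1)$; the field and all helper names are assumptions made for illustration.

```python
def transpose(M):
    """Matrix of the adjoint relative to the standard basis, as in Eqn. (21)."""
    return [list(col) for col in zip(*M)]


def jacobian(X):
    """Matrix [f_ij] of the hypothetical field F = (x1*x2, x2*x3, x3*x1), by hand."""
    x1, x2, x3 = X
    return [[x2, x1, 0.0],
            [0.0, x3, x2],
            [x3, 0.0, x1]]


X = [1.0, 2.0, 3.0]
J = jacobian(X)
Jt = transpose(J)

# Eqns. (18) and (19): symmetric and skew-symmetric parts
sym = [[0.5 * (J[i][j] + Jt[i][j]) for j in range(3)] for i in range(3)]
skew = [[0.5 * (J[i][j] - Jt[i][j]) for j in range(3)] for i in range(3)]

# Their sum recovers the original matrix
recombined = [[sym[i][j] + skew[i][j] for j in range(3)] for i in range(3)]

print(sym)
print(skew)
```

Doubling `sym` and `skew` gives the entry patterns displayed in Eqns. (22) and (23).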

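Definition: The Divergence of $F$ is a map $\operatorname{Div}F:\mathbb{R}^3\to\mathbb{R}$, defined by

$$\operatorname{Div}F(X)=\operatorname{Trace}F'(X).$$

The Trace of $F'(X)$ is equal to the sum of the roots of the characteristic polynomial of $F'(X)$, a property of the linear map that is invariant to matrix representation. The trace of any linear map from $\mathbb{R}^3\to\mathbb{R}^3$ is equal to the sum of the diagonal elements of any matrix representation. Hence, the familiar version of the Divergence:

$$\begin{aligned}
\operatorname{Div}F(X)&=f_{11}+f_{22}+f_{33}\\
&=\frac{\partial f_1}{\partial x_1}+\frac{\partial f_2}{\partial x_2}+\frac{\partial f_3}{\partial x_3}.
\end{aligned}$$

Observe also that $\operatorname{Div}F(X)=\operatorname{Trace}F_+'(X)$.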
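The invariance of the trace under a change of matrix representation can be illustrated numerically. The sketch below re-expresses the matrix of $F'(X)$ in a rotated orthonormal basis and compares traces. The sample field $F(X)=(x_1x_2,\ x_2x_3,\ x_3x_1)$, the rotation angle, and the helper names are assumptions made for illustration; a rotation is used because its inverse is simply its transpose.

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(M):
    return [list(col) for col in zip(*M)]


def trace(M):
    """Sum of the diagonal elements."""
    return sum(M[i][i] for i in range(3))


def jacobian(X):
    """Matrix [f_ij] of the hypothetical field F = (x1*x2, x2*x3, x3*x1), by hand."""
    x1, x2, x3 = X
    return [[x2, x1, 0.0],
            [0.0, x3, x2],
            [x3, 0.0, x1]]


X = [1.0, 2.0, 3.0]
J = jacobian(X)

# Rotation about the third axis; Q is orthogonal, so its inverse is its transpose
c, s = math.cos(0.7), math.sin(0.7)
Q = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
J_rotated = matmul(transpose(Q), matmul(J, Q))  # same map, rotated basis

# Symmetric part, Eqn. (18); its trace equals that of J
Jt = transpose(J)
sym = [[0.5 * (J[i][j] + Jt[i][j]) for j in range(3)] for i in range(3)]

divergence = trace(J)  # for this field, x1 + x2 + x3
print(divergence, trace(J_rotated), trace(sym))
```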

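Definition: The Curl of $F$ at $X$ is defined by

$$2F_-'(X)Y=\operatorname{Curl}F(X)\times Y. \tag{24}$$

How do we know that a vector that makes the right side of Eqn. (24) equal to the left-hand side exists? Well, $2F_-'(X)$ is a skew-symmetric transformation, and item A5 of the Appendix asserts that such a vector exists for such maps; we name it $\operatorname{Curl}F(X)$. Also, according to A5, if the matrix of a skew-symmetric map looks like

$$\begin{bmatrix}0&a&b\\-a&0&c\\-b&-c&0\end{bmatrix}, \tag{25}$$

then the vector at issue ($\operatorname{Curl}F(X)$ here) is

$$-c\,i+b\,j-a\,k. \tag{26}$$

Compare Eqns. (23), (25), and (26) and you see that

$$\operatorname{Curl}F(X)=(f_{32}-f_{23})\,i+(f_{13}-f_{31})\,j+(f_{21}-f_{12})\,k$$

or, in the usual notation,

$$\operatorname{Curl}F(X)=\left(\frac{\partial f_3}{\partial x_2}-\frac{\partial f_2}{\partial x_3}\right)i+\left(\frac{\partial f_1}{\partial x_3}-\frac{\partial f_3}{\partial x_1}\right)j+\left(\frac{\partial f_2}{\partial x_1}-\frac{\partial f_1}{\partial x_2}\right)k. \tag{27}$$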
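The defining relation (24) and the recipe in Eqns. (25) and (26) can be verified on a concrete field. The sketch below reads $a$, $b$, $c$ off the matrix of $2F_-'(X)$, forms the vector $-c\,i+b\,j-a\,k$, and checks that crossing it with $Y$ reproduces $2F_-'(X)Y$. The sample field $F(X)=(x_1x_2,\ x_2x_3,\ x_3x_1)$, the test vector, and the helper names are assumptions made for illustration.

```python
def jacobian(X):
    """Matrix [f_ij] of the hypothetical field F = (x1*x2, x2*x3, x3*x1), by hand."""
    x1, x2, x3 = X
    return [[x2, x1, 0.0],
            [0.0, x3, x2],
            [x3, 0.0, x1]]


def cross(W, Y):
    """Cross product W x Y in R^3."""
    return [W[1] * Y[2] - W[2] * Y[1],
            W[2] * Y[0] - W[0] * Y[2],
            W[0] * Y[1] - W[1] * Y[0]]


X = [1.0, 2.0, 3.0]
J = jacobian(X)

# Matrix of 2 F_-'(X), Eqn. (23): entries f_ij - f_ji
two_skew = [[J[i][j] - J[j][i] for j in range(3)] for i in range(3)]

# Read a, b, c off the pattern in Eqn. (25) and build the vector of Eqn. (26)
a, b, c = two_skew[0][1], two_skew[0][2], two_skew[1][2]
curl = [-c, b, -a]

# Eqn. (24): 2 F_-'(X) Y should equal Curl F(X) x Y
Y = [1.0, 0.0, 2.0]
lhs = [sum(two_skew[i][j] * Y[j] for j in range(3)) for i in range(3)]
rhs = cross(curl, Y)

print(curl)
print(lhs, rhs)
```

For this field Eqn. (27) gives $\operatorname{Curl}F(X)=(-x_2,-x_3,-x_1)$, which at $X=(1,2,3)$ matches the vector extracted from the skew-symmetric part.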

Divergence and Curl of a Steady Flow (an application)

Consider a vector field $F:\mathbb{R}^3\to\mathbb{R}^3$ and the flow induced by the differential equation