Part 1
Professor Emeritus of Physics
Yale University

Throughout the decade of the 1990’s, I taught a one-year course of a specialized nature to
students who entered Yale College with excellent preparation in Mathematics and the
Physical Sciences, and who expressed an interest in Physics or a closely related field. The
level of the course was that typified by the Feynman Lectures on Physics. My one-year
course was necessarily more restricted in content than the two-year Feynman Lectures.
The depth of treatment of each topic was limited by the fact that the course consisted of a
total of fifty-two lectures, each lasting one-and-a-quarter hours. The key role played by
invariants in the Physical Universe was constantly emphasized. The material that I
covered each Fall Semester is presented, almost verbatim, in this book.
The first chapter contains key mathematical ideas, including some invariants of
geometry and algebra, generalized coordinates, and the algebra and geometry of vectors.
The importance of linear operators and their matrix representations is stressed in the early
lectures. These mathematical concepts are required in the presentation of a unified
treatment of both Classical and Special Relativity. Students are encouraged to develop a
“relativistic outlook” at an early stage. The fundamental Lorentz transformation is
developed using arguments based on symmetrizing the classical Galilean transformation.
Key 4-vectors, such as the 4-velocity and 4-momentum, and their invariant norms, are
shown to evolve in a natural way from their classical forms. A basic change in the subject
matter occurs at this point in the book. It is necessary to introduce the Newtonian
concepts of mass, momentum, and energy, and to discuss the conservation laws of linear
and angular momentum, and mechanical energy, and their associated invariants. The
discovery of these laws, and their applications to everyday problems, represents the high
point in the scientific endeavor of the 17th and 18th centuries. An introduction to the
general dynamical methods of Lagrange and Hamilton is delayed until Chapter 9, where
they are included in a discussion of the Calculus of Variations. The key subject of
Einsteinian dynamics is treated at a level not usually met at the introductory level. The
4-momentum invariant and its uses in relativistic collisions, both elastic and inelastic, are
discussed in detail in Chapter 6. Further developments in the use of relativistic invariants
are given in the discussion of the Mandelstam variables, and their application to the study
of high-energy collisions. Following an overview of Newtonian Gravitation, the general
problem of central orbits is discussed using the powerful method of [p, r] coordinates.
Einstein’s General Theory of Relativity is introduced using the Principle of Equivalence and
the notion of “extended inertial frames” that include those frames in free fall in a
gravitational field of small size in which there is no measurable field gradient. A heuristic
argument is given to deduce the Schwarzschild line element in the “weak field
approximation”; it is used as a basis for a discussion of the refractive index of space-time in
the presence of matter. Einstein’s famous predicted value for the bending of a beam of
light grazing the surface of the Sun is calculated. The Calculus of Variations is an
important topic in Physics and Mathematics; it is introduced in Chapter 9, where it is
shown to lead to the ideas of the Lagrange and Hamilton functions. These functions are
used to illustrate in a general way the conservation laws of momentum and angular
momentum, and the relation of these laws to the homogeneity and isotropy of space. The
subject of chaos is introduced by considering the motion of a damped, driven pendulum.
A method for solving the non-linear equation of motion of the pendulum is outlined. Wave
motion is treated from the point-of-view of invariance principles. The form of the general
wave equation is derived, and the Lorentz invariance of the phase of a wave is discussed in
Chapter 12. The final chapter deals with the problem of orthogonal functions in general,
and Fourier series, in particular. At this stage in their training, students are often underprepared
in the subject of Differential Equations. Some useful methods of solving ordinary
differential equations are therefore given in an appendix.
The students taking my course were generally required to take a parallel one-year
course in the Mathematics Department that covered Vector and Matrix Algebra and
Analysis at a level suitable for potential majors in Mathematics.
Here, I have presented my version of a first-semester course in Physics — a version
that deals with the essentials in a no-frills way. Over the years, I demonstrated that the
contents of this compact book could be successfully taught in one semester. Textbooks
are concerned with taking many known facts and presenting them in clear and concise
ways; my understanding of the facts is largely based on the writings of a relatively small
number of celebrated authors whose work I am pleased to acknowledge in the
Guilford, Connecticut
February, 2000
1.1 Invariants
1.2 Some geometrical invariants
1.3 Elements of differential geometry
1.4 Gaussian coordinates and the invariant line element
1.5 Geometry and groups
1.6 Vectors
1.7 Quaternions
1.8 3-vector analysis
1.9 Linear algebra and n-vectors
1.10 The geometry of vectors
1.11 Linear operators and matrices
1.12 Rotation operators
1.13 Components of a vector under coordinate rotations
2.1 Velocity and acceleration
2.2 Differential equations of kinematics
2.3 Velocity in Cartesian and polar coordinates
2.4 Acceleration in Cartesian and polar coordinates
3.1 The Galilean transformation
3.2 Einstein’s space-time symmetry: the Lorentz transformation
3.3 The invariant interval: contravariant and covariant vectors
3.4 The group structure of Lorentz transformations
3.5 The rotation group
3.6 The relativity of simultaneity: time dilation and length contraction
3.7 The 4-velocity
4.1 The law of inertia
4.2 Newton’s laws of motion
4.3 Systems of many interacting particles: conservation of linear and angular momentum
4.4 Work and energy in Newtonian dynamics
4.5 Potential energy
4.6 Particle interactions
4.7 The motion of rigid bodies
4.8 Angular velocity and the instantaneous center of rotation
4.9 An application of the Newtonian method
5.1 Invariance of the potential under translations and the conservation of linear momentum
5.2 Invariance of the potential under rotations and the conservation of angular momentum
6.1 4-momentum and the energy-momentum invariant
6.2 The relativistic Doppler shift
6.3 Relativistic collisions and the conservation of 4-momentum
6.4 Relativistic inelastic collisions
6.5 The Mandelstam variables
6.6 Positron-electron annihilation-in-flight
7.1 Properties of motion along curved paths in the plane
7.2 An overview of Newtonian gravitation
7.3 Gravitation: an example of a central force
7.4 Motion under a central force and the conservation of angular momentum
7.5 Kepler’s 2nd law explained
7.6 Central orbits
7.7 Bound and unbound orbits
7.8 The concept of the gravitational field
7.9 The gravitational potential
8.1 The principle of equivalence
8.2 Time and length changes in a gravitational field
8.3 The Schwarzschild line element
8.4 The metric in the presence of matter
8.5 The weak field approximation
8.6 The refractive index of space-time in the presence of mass
8.7 The deflection of light grazing the sun
9.1 The Euler equation
9.2 The Lagrange equations
9.3 The Hamilton equations
10.1 The conservation of mechanical energy
10.2 The conservation of linear and angular momentum
11.1 The general motion of a damped, driven pendulum
11.2 The numerical solution of differential equations
12.1 The basic form of a wave
12.2 The general wave equation
12.3 The Lorentz invariant phase of a wave and the relativistic Doppler shift
12.4 Plane harmonic waves
12.5 Spherical waves
12.6 The superposition of harmonic waves
12.7 Standing waves
13.1 Definitions
13.2 Some trigonometric identities and their Fourier series
13.3 Determination of the Fourier coefficients of a function
13.4 The Fourier series of a periodic saw-tooth waveform
1.1 Invariants
It is a remarkable fact that very few fundamental laws are required to describe the
enormous range of physical phenomena that take place throughout the universe. The
study of these fundamental laws is at the heart of Physics. The laws are found to have a
mathematical structure; the interplay between Physics and Mathematics is therefore
emphasized throughout this book. For example, Galileo found by observation, and
Newton developed within a mathematical framework, the Principle of Relativity:
the laws governing the motions of objects have the same mathematical
form in all inertial frames of reference.
Inertial frames move at constant speed in straight lines with respect to each other – they
are non-accelerating. We say that Newton’s laws of motion are invariant under the
Galilean transformation (see later discussion). The discovery of key invariants of Nature
has been essential for the development of the subject.
Einstein extended the Newtonian Principle of Relativity to include the motions of
beams of light and of objects that move at speeds close to the speed of light. This
extended principle forms the basis of Special Relativity. Later, Einstein generalized the
principle to include accelerating frames of reference. The general principle is known as
the Principle of Covariance; it forms the basis of the General Theory of Relativity (a theory
of Gravitation).
2 M A T H E M A T I C A L P R E L I M I N A R I E S
A review of the elementary properties of geometrical invariants, generalized
coordinates, linear vector spaces, and matrix operators, is given at a level suitable for a
sound treatment of Classical and Special Relativity. Other mathematical methods,
including contra- and covariant 4-vectors, variational principles, orthogonal functions, and
ordinary differential equations are introduced, as required.
1.2 Some geometrical invariants
In his book The Ascent of Man, Bronowski discusses the lasting importance of the
discoveries of the Greek geometers. He gives a proof of the most famous theorem of
Euclidean Geometry, namely Pythagoras’ theorem, that is based on the invariance of
length and angle (and therefore of area) under translations and rotations in space. Let a
right-angled triangle with sides a, b, and c, be translated and rotated into the following
four positions to form a square of side c:
[Figure: the four positions of the triangle, numbered 1 to 4, forming a square of side c that encloses a shaded square of side |b – a|]
The total area of the square = $c^2$ = area of four triangles + area of shaded square.
If the right-angled triangle is translated and rotated to form the rectangle:
[Figure: the four triangles, numbered 1 to 4, rearranged into rectangles with sides a and b]
then the area of four triangles = 2ab.
The area of the shaded square is $(b - a)^2 = b^2 - 2ab + a^2$.
We have postulated the invariance of length and angle under translations and rotations, and
therefore
$$c^2 = 2ab + (b - a)^2 = a^2 + b^2. \tag{1.1}$$
We shall see that this key result characterizes the locally flat space in which we live. It is
the only form that is consistent with the invariance of lengths and angles under
translations and rotations.
The scalar product is an important invariant in Mathematics and Physics. Its invariance
properties can best be seen by developing Pythagoras’ theorem in a three-dimensional
coordinate form. Consider the square of the distance between the points $P[x_1, y_1, z_1]$ and
$Q[x_2, y_2, z_2]$ in Cartesian coordinates:
[Figure: the points P[x1, y1, z1] and Q[x2, y2, z2] in Cartesian coordinates]
We have
$$\begin{aligned}
(PQ)^2 &= (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 \\
&= x_2^2 - 2x_1x_2 + x_1^2 + y_2^2 - 2y_1y_2 + y_1^2 + z_2^2 - 2z_1z_2 + z_1^2 \\
&= (x_1^2 + y_1^2 + z_1^2) + (x_2^2 + y_2^2 + z_2^2) - 2(x_1x_2 + y_1y_2 + z_1z_2) \\
&= (OP)^2 + (OQ)^2 - 2(x_1x_2 + y_1y_2 + z_1z_2).
\end{aligned} \tag{1.2}$$
The lengths PQ, OP, OQ, and their squares, are invariants under rotations and therefore
the entire right-hand side of this equation is an invariant. The admixture of the
coordinates (x1x2 + y1y2 + z1z2) is therefore an invariant under rotations. This term has a
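geometric interpretation: in the triangle OPQ, we have the generalized Pythagorean
theorem
$$(PQ)^2 = (OP)^2 + (OQ)^2 - 2\,(OP)(OQ)\cos\theta,$$
where $\theta$ is the angle between OP and OQ; therefore
$$(OP)(OQ)\cos\theta = x_1x_2 + y_1y_2 + z_1z_2 \equiv \text{the scalar product}. \tag{1.3}$$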
Invariants in space-time with scalar-product-like forms, such as the interval
between events (see 3.3), are of fundamental importance in the Theory of Relativity.
Although rotations in space are part of our everyday experience, the idea of rotations in
space-time is counter-intuitive. In Chapter 3, this idea is discussed in terms of the relative
motion of inertial observers.
1.3 Elements of differential geometry
Nature does not prescribe a particular coordinate system or mesh. We are free to
select the system that is most appropriate for the problem at hand. In the familiar
Cartesian system in which the mesh lines are orthogonal, equidistant, straight lines in the
plane, the key advantage stems from our ability to calculate distances given the
coordinates – we can apply Pythagoras’ theorem, directly. Consider an arbitrary mesh:
[Figure: an arbitrary mesh with origin O; the u-direction is labelled 1u, 2u, 3u; the point P[3u, 4v] is marked in the v-direction; an infinitesimal parallelogram with sides du and dv has the diagonal ds, a length]
Given the point P[3u, 4v], we cannot use Pythagoras’ theorem to calculate the distance OP.
In the infinitesimal parallelogram shown, we might think it appropriate to write
$$ds^2 = du^2 + dv^2 + 2\,du\,dv\cos\alpha \qquad (ds^2 = (ds)^2, \text{ a squared “length”}),$$
where $\alpha$ is the angle between the mesh lines.
This we cannot do! The differentials du and dv are not lengths – they are simply
differences between two numbers that label the mesh. We must therefore multiply each
differential by a quantity that converts each one into a length. Introducing dimensioned
coefficients, we have
$$ds^2 = g_{11}\,du^2 + 2g_{12}\,du\,dv + g_{22}\,dv^2 \tag{1.4}$$
where $\sqrt{g_{11}}\,du$ and $\sqrt{g_{22}}\,dv$ are now lengths.
The problem is therefore one of finding general expressions for the coefficients;
it was solved by Gauss, the pre-eminent mathematician of his age. We shall restrict our
discussion to the case of two variables. Before treating this problem, it will be useful to
recall the idea of a total differential associated with a function of more than one variable.
Let u = f(x, y) be a function of two variables, x and y. As x and y vary, the corresponding
values of u describe a surface. For example, if $u = x^2 + y^2$, the surface is a paraboloid of
revolution. The partial derivatives of u are defined by
$$\frac{\partial f(x, y)}{\partial x} = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h} \quad (\text{treat } y \text{ as a constant}), \tag{1.5}$$
and
$$\frac{\partial f(x, y)}{\partial y} = \lim_{k \to 0} \frac{f(x, y + k) - f(x, y)}{k} \quad (\text{treat } x \text{ as a constant}). \tag{1.6}$$
For example, if $u = f(x, y) = 3x^2 + 2y^3$ then
$$\frac{\partial f}{\partial x} = 6x, \quad \frac{\partial^2 f}{\partial x^2} = 6, \quad \frac{\partial^3 f}{\partial x^3} = 0$$
and
$$\frac{\partial f}{\partial y} = 6y^2, \quad \frac{\partial^2 f}{\partial y^2} = 12y, \quad \frac{\partial^3 f}{\partial y^3} = 12, \quad \text{and} \quad \frac{\partial^4 f}{\partial y^4} = 0.$$
If u = f(x, y) then the total differential of the function is
$$du = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy$$
corresponding to the changes: $x \to x + dx$ and $y \to y + dy$.
(Note that du is a function of x, y, dx, and dy, where x and y are the independent variables.)
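The partial derivatives and the total differential of the example $f(x, y) = 3x^2 + 2y^3$ can be estimated numerically. This is a minimal sketch under stated assumptions: it uses central differences with a small finite step instead of the one-sided limits in the definitions, and the evaluation point, the step sizes, and the helper names `partial_x` and `partial_y` are illustrative choices.

```python
def f(x, y):
    """The example function f(x, y) = 3x^2 + 2y^3."""
    return 3 * x**2 + 2 * y**3

def partial_x(func, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative in x (y held constant)."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def partial_y(func, x, y, k=1e-6):
    """Central-difference estimate of the partial derivative in y (x held constant)."""
    return (func(x, y + k) - func(x, y - k)) / (2 * k)

x0, y0 = 1.5, 2.0            # illustrative evaluation point
fx = partial_x(f, x0, y0)    # analytic value 6x = 9
fy = partial_y(f, x0, y0)    # analytic value 6y^2 = 24

dx, dy = 1e-4, -2e-4         # small illustrative changes in x and y
du = fx * dx + fy * dy       # total differential
actual = f(x0 + dx, y0 + dy) - f(x0, y0)   # actual change in u
print(fx, fy, du, actual)
```

The total differential du and the actual change in u differ only by terms of second order in dx and dy.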
1.4 Gaussian coordinates and the invariant line element
Consider the infinitesimal separation between two points P and Q that are
described in either Cartesian or Gaussian coordinates:
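[Figure: two panels, “Cartesian” and “Gaussian”, each showing the points P and Q separated by ds; in the Cartesian panel P is at [x, y] and Q at [x + dx, y + dy]; in the Gaussian panel P is at [u, v] and Q at [u + du, v + dv]]
In the Gaussian system, du and dv do not represent distances. If
$$x = f(u, v) \quad \text{and} \quad y = F(u, v) \tag{1.7 a,b}$$
then, in the infinitesimal limit
$$dx = \frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv \quad \text{and} \quad dy = \frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv.$$
In the Cartesian system, there is a direct correspondence between the mesh-numbers and
distances so that
$$ds^2 = dx^2 + dy^2. \tag{1.8}$$
Now
$$dx^2 = \left(\frac{\partial x}{\partial u}\right)^2 du^2 + 2\,\frac{\partial x}{\partial u}\frac{\partial x}{\partial v}\,du\,dv + \left(\frac{\partial x}{\partial v}\right)^2 dv^2$$
and
$$dy^2 = \left(\frac{\partial y}{\partial u}\right)^2 du^2 + 2\,\frac{\partial y}{\partial u}\frac{\partial y}{\partial v}\,du\,dv + \left(\frac{\partial y}{\partial v}\right)^2 dv^2.$$
We therefore obtain
$$\begin{aligned}
ds^2 &= \left\{\left(\frac{\partial x}{\partial u}\right)^2 + \left(\frac{\partial y}{\partial u}\right)^2\right\} du^2 + 2\left\{\frac{\partial x}{\partial u}\frac{\partial x}{\partial v} + \frac{\partial y}{\partial u}\frac{\partial y}{\partial v}\right\} du\,dv + \left\{\left(\frac{\partial x}{\partial v}\right)^2 + \left(\frac{\partial y}{\partial v}\right)^2\right\} dv^2 \\
&= g_{11}\,du^2 + 2g_{12}\,du\,dv + g_{22}\,dv^2.
\end{aligned} \tag{1.9}$$
If we put $u = u_1$ and $v = u_2$, then
$$ds^2 = \sum_i \sum_j g_{ij}\,du_i\,du_j \quad \text{where } i, j = 1, 2. \tag{1.10}$$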
(This is a general form for an n-dimensional space: i, j = 1, 2, 3, …n).
Two important points connected with this invariant differential line element are:
1. Interpretation of the coefficients gij.
Consider a Euclidean mesh of equispaced parallelograms:
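[Figure: a mesh of equispaced parallelograms with sides du and dv; the points P and Q are separated by du]
$$ds^2 = 1\cdot du^2 + 1\cdot dv^2 + 2\cos\alpha\,du\,dv = g_{11}\,du^2 + g_{22}\,dv^2 + 2g_{12}\,du\,dv \tag{1.11}$$
therefore, $g_{11} = g_{22} = 1$ (the mesh-lines are equispaced) and
$g_{12} = \cos\alpha$, where $\alpha$ is the angle between the u-v axes.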
We see that if the mesh-lines are locally orthogonal then g12 = 0.
2. Dependence of the gij’s on the coordinate system and the local values of u, v.
A specific example will illustrate the main points of this topic: consider a point P
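described in three coordinate systems – Cartesian $P[x, y]$, Polar $P[r, \phi]$, and Gaussian
$P[u, v]$ – and the square $ds^2$ of the line element in each system.
The transformation $[x, y] \to [r, \phi]$ is
$$x = r\cos\phi \quad \text{and} \quad y = r\sin\phi. \tag{1.12 a,b}$$
The transformation $[r, \phi] \to [u, v]$ is direct, namely
$$r = u \quad \text{and} \quad \phi = v.$$
Now
$$\frac{\partial x}{\partial r} = \cos\phi, \quad \frac{\partial y}{\partial r} = \sin\phi, \quad \frac{\partial x}{\partial \phi} = -r\sin\phi, \quad \frac{\partial y}{\partial \phi} = r\cos\phi,$$
therefore
$$\frac{\partial x}{\partial u} = \cos v, \quad \frac{\partial y}{\partial u} = \sin v, \quad \frac{\partial x}{\partial v} = -u\sin v, \quad \frac{\partial y}{\partial v} = u\cos v.$$
The coefficients are therefore
$$g_{11} = \cos^2 v + \sin^2 v = 1, \tag{1.13 a-c}$$
$$g_{22} = (-u\sin v)^2 + (u\cos v)^2 = u^2,$$
$$g_{12} = \cos v\,(-u\sin v) + \sin v\,(u\cos v) = 0 \quad (\text{an orthogonal mesh}).$$
We therefore have
$$ds^2 = dx^2 + dy^2 = du^2 + u^2\,dv^2 = dr^2 + r^2\,d\phi^2. \tag{1.14 a-c}$$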
In this example, the coefficient g22 = f(u).
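The coefficients of the line element can also be estimated numerically from the transformation x = f(u, v), y = F(u, v). This is a minimal sketch under stated assumptions: the partial derivatives are approximated by central differences, and the function names `to_cartesian` and `metric`, the step size, and the sample point are illustrative choices.

```python
import math

def to_cartesian(u, v):
    """Polar-type Gaussian coordinates: x = u cos v, y = u sin v."""
    return (u * math.cos(v), u * math.sin(v))

def metric(transform, u, v, h=1e-6):
    """Estimate (g11, g12, g22) at [u, v] using central differences."""
    # Partial derivatives of (x, y) with respect to u and with respect to v.
    xu = [(a - b) / (2 * h) for a, b in zip(transform(u + h, v), transform(u - h, v))]
    xv = [(a - b) / (2 * h) for a, b in zip(transform(u, v + h), transform(u, v - h))]
    g11 = xu[0] ** 2 + xu[1] ** 2
    g12 = xu[0] * xv[0] + xu[1] * xv[1]
    g22 = xv[0] ** 2 + xv[1] ** 2
    return g11, g12, g22

g11, g12, g22 = metric(to_cartesian, 2.0, 0.6)   # illustrative point u = 2, v = 0.6
print(g11, g12, g22)   # close to 1, 0, and u^2 = 4
```

Repeating the calculation at other values of u shows the dependence of the coefficient multiplying $dv^2$ on position, while the coefficient of the cross term stays at zero for this orthogonal mesh.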
The essential point of Gaussian coordinate systems is that the coefficients, gij ,
completely characterize the surface – they are intrinsic features. We can, in principle,
determine the nature of a surface by measuring the local values of the coefficients as we
move over the surface. We do not need to leave a surface to study its form.
1.5 Geometry and groups
Felix Klein (1849 – 1925) introduced his influential Erlanger Program in 1872. In
this program, Geometry is developed from the viewpoint of the invariants associated with
groups of transformations. In Euclidean Geometry, the fundamental objects are taken to
be rigid bodies that remain fixed in size and shape as they are moved from place to place.
The notion of a rigid body is an idealization.
Klein considered transformations of the entire plane – mappings of the set of all
points in the plane onto itself. The proper set of rigid motions in the plane consists of
translations and rotations. A reflection is an improper rigid motion in the plane; it is a
physical impossibility in the plane itself. The set of all rigid motions – both proper and
improper – forms a group that has the proper rigid motions as a subgroup. A group G is a
set of distinct elements $\{g_i\}$ for which a law of composition “$\circ$” is given such that the
composition of any two elements of the set satisfies:
Closure: if $g_i$, $g_j$ belong to G then $g_k = g_i \circ g_j$ belongs to G for all elements $g_i$, $g_j$,
Associativity: for all $g_i$, $g_j$, $g_k$ in G, $g_i \circ (g_j \circ g_k) = (g_i \circ g_j) \circ g_k$.
Furthermore, the set contains
A unique identity, e, such that $g_i \circ e = e \circ g_i = g_i$ for all $g_i$ in G,
A unique inverse, $g_i^{-1}$, for every element $g_i$ in G, such that $g_i \circ g_i^{-1} = g_i^{-1} \circ g_i = e$.
A group that contains a finite number n of distinct elements gn is said to be a finite group
of order n.
The set of integers Z is a subset of the reals R; both sets form infinite groups under
the composition of addition. Z is a “subgroup” of R.
Permutations of a set X form a group $S_X$ under composition of functions; if a: X → X
and b: X → X are permutations, the composite function ab: X → X given by ab(x) =
a(b(x)) is a permutation. If the set X contains the first n positive integers, the n!
permutations form a group, the symmetric group, Sn. For example, the arrangements of
the three numbers 123 form the group
S3 = { 123, 312, 231, 132, 321, 213 }.
If the vertices of an equilateral triangle are labelled 123, the six possible symmetry
arrangements of the triangle are obtained by three successive rotations through 120°
about its center of gravity, and by the three reflections in the planes I, II, III:
[Figure: an equilateral triangle with vertices labelled 1, 2, 3 and the reflection planes I, II, III]
This group of “isometries” of the equilateral triangle (called the dihedral group, D3) has the
same structure as the group of permutations of three objects. The groups S3 and D3 are
said to be isomorphic.
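The group properties of the permutations of three objects can be verified by exhaustive checking. This is a minimal sketch under stated assumptions: each permutation is stored as a tuple of images, the objects are labelled 0, 1, 2 instead of 1, 2, 3 for convenience, and the helper names `compose` and `inverse` are illustrative.

```python
from itertools import permutations

def compose(a, b):
    """Return the composite permutation ab, defined by ab(x) = a(b(x))."""
    return tuple(a[b[x]] for x in range(len(b)))

def inverse(a):
    """Return the permutation that undoes a."""
    inv = [0] * len(a)
    for x, ax in enumerate(a):
        inv[ax] = x
    return tuple(inv)

S3 = set(permutations(range(3)))   # the 3! = 6 permutations of {0, 1, 2}
identity = (0, 1, 2)

closed = all(compose(a, b) in S3 for a in S3 for b in S3)
associative = all(
    compose(a, compose(b, c)) == compose(compose(a, b), c)
    for a in S3 for b in S3 for c in S3
)
has_inverses = all(
    compose(a, inverse(a)) == identity == compose(inverse(a), a) for a in S3
)
commutative = all(compose(a, b) == compose(b, a) for a in S3 for b in S3)

print(len(S3), closed, associative, has_inverses, commutative)
```

The last flag is False: the group has six elements and satisfies closure, associativity, and the existence of inverses, but the order of composition matters.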
According to Klein, plane Euclidean Geometry is the study of those properties of
plane rigid figures that are unchanged by the group of isometries. (The basic invariants are
length and angle). In his development of the subject, Klein considered Similarity
Geometry that involves isometries with a change of scale, (the basic invariant is angle),
Affine Geometry, in which figures can be distorted under transformations of the form
x´ = ax + by + c (1.15 a,b)
y´ = dx + ey + f ,
where [x, y] are Cartesian coordinates, and a, b, c, d, e, f, are real coefficients, and
Projective Geometry, in which all conic sections are equivalent; circles, ellipses, parabolas,
and hyperbolas can be transformed into one another by a projective transformation.
It will be shown that the Lorentz transformations – the fundamental transformations of
events in space and time, as described by different inertial observers – form a group.
1.6 Vectors
The idea that a line with a definite length and a definite direction — a vector — can
be used to represent a physical quantity that possesses magnitude and direction is an
ancient one. The combined action of two vectors A and B is obtained by means of the
parallelogram law, illustrated in the following diagram
[Figure: the parallelogram formed by the vectors A and B, with diagonal A + B]
The diagonal of the parallelogram formed by A and B gives the magnitude and direction of
the resultant vector C. Symbolically, we write
C = A + B (1.16)
in which the “=” sign has a meaning that is clearly different from its meaning in ordinary
arithmetic. Galileo used this empirically-based law to obtain the resultant force acting on
a body. Although a geometric approach to the study of vectors has an intuitive appeal, it
will often be advantageous to use the algebraic method – particularly in the study of
Einstein’s Special Relativity and Maxwell’s Electromagnetism.
1.7 Quaternions
In 1843, the renowned Hamilton introduced new kinds of
numbers that contain four components, and that do not obey the commutative property of
multiplication. He called the new numbers quaternions. A quaternion has the form
u + xi + yj + zk (1.17)
in which the quantities i, j, k are akin to the quantity $i = \sqrt{-1}$ in complex numbers,
x + iy. The component u forms the scalar part, and the three components xi + yj + zk
form the vector part of the quaternion. The coefficients x, y, z can be considered to be
the Cartesian components of a point P in space. The quantities i, j, k are qualitative units
that are directed along the coordinate axes. Two quaternions are equal if their scalar parts
are equal, and if their coefficients x, y, z of i, j, k are respectively equal. The sum of two
quaternions is a quaternion. In operations that involve quaternions, the usual rules of
multiplication hold except in those terms in which products of i, j, k occur — in these
terms, the commutative law does not hold. For example
j k = i, k j = – i, k i = j, i k = – j, i j = k, j i = – k, (1.18)
(these products obey a right-hand rule),
$i^2 = j^2 = k^2 = -1$. (Note the relation to $i^2 = -1$). (1.19)
The product of two quaternions does not commute. For example, if
p = 1 + 2i + 3j + 4k, and q = 2 + 3i + 4j + 5k
then
pq = – 36 + 6i + 12j + 12k
whereas
qp = – 36 + 8i + 8j + 14k.
Multiplication is associative.
Quaternions can be used as operators to rotate and scale a given vector into a new
vector:
(a + bi + cj + dk)(xi + yj + zk) = (x´i + y´j + z´k)
If the law of composition is quaternionic multiplication then the set
Q = {±1, ±i, ±j, ±k}
is found to be a group of order 8. It is a non-commutative group.
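The multiplication rules for i, j, k can be turned into a short routine for checking quaternion products. This is a minimal sketch under stated assumptions: a quaternion u + xi + yj + zk is stored as the tuple (u, x, y, z), and the function name `qmul` is an illustrative choice.

```python
def qmul(p, q):
    """Product pq of two quaternions stored as (scalar, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,   # scalar part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,   # coefficient of i (uses jk = i, kj = -i)
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,   # coefficient of j (uses ki = j, ik = -j)
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,   # coefficient of k (uses ij = k, ji = -k)
    )

p = (1, 2, 3, 4)   # 1 + 2i + 3j + 4k
q = (2, 3, 4, 5)   # 2 + 3i + 4j + 5k
pq = qmul(p, q)    # (-36, 6, 12, 12)
qp = qmul(q, p)    # (-36, 8, 8, 14)

# The eight elements +/-1, +/-i, +/-j, +/-k and a closure check on them.
units = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
Q8 = set(units) | {tuple(-c for c in u) for u in units}
q8_closed = all(qmul(a, b) in Q8 for a in Q8 for b in Q8)
print(pq, qp, q8_closed)
```

The scalar parts of pq and qp agree, and the vector parts differ because only the terms containing products of i, j, k change sign when the order is reversed.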
Hamilton developed the Calculus of Quaternions. He considered, for example, the
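properties of the differential operator:
$$\nabla = i\,\frac{\partial}{\partial x} + j\,\frac{\partial}{\partial y} + k\,\frac{\partial}{\partial z}. \tag{1.20}$$
(He called this operator “nabla”).
If f(x, y, z) is a scalar point function (single-valued) then
$$\nabla f = i\,\frac{\partial f}{\partial x} + j\,\frac{\partial f}{\partial y} + k\,\frac{\partial f}{\partial z}, \text{ a vector}.$$
If
$$v = v_1 i + v_2 j + v_3 k$$
is a continuous vector point function, where the $v_i$’s are functions of x, y, and z, Hamilton
introduced the operation
$$\begin{aligned}
\nabla v &= \left(i\,\frac{\partial}{\partial x} + j\,\frac{\partial}{\partial y} + k\,\frac{\partial}{\partial z}\right)(v_1 i + v_2 j + v_3 k) \\
&= -\left(\frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} + \frac{\partial v_3}{\partial z}\right) \\
&\quad + \left(\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z}\right) i + \left(\frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x}\right) j + \left(\frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y}\right) k \\
&= \text{a quaternion}.
\end{aligned} \tag{1.21}$$
The scalar part is the negative of the “divergence of v” (a term due to Clifford), and the
vector part is the “curl of v” (a term due to Maxwell). Maxwell used the repeated operator
$\nabla^2$, which he called the Laplacian.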
1.8 3-vector analysis
Gibbs, in his notes for Yale students, written in the period 1881 – 1884, and Heaviside, in
articles published in the Electrician in the 1880’s, independently developed 3-dimensional
Vector Analysis as a subject in its own right — detached from quaternions.
In the Sciences, and in parts of Mathematics (most notably in Analytical and Differential
Geometry), their methods are widely used. Two kinds of vector multiplication were
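introduced: scalar multiplication and vector multiplication. Consider two vectors v and v´
where
$$\mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + v_3\mathbf{e}_3$$
and
$$\mathbf{v}' = v_1'\mathbf{e}_1 + v_2'\mathbf{e}_2 + v_3'\mathbf{e}_3.$$
The quantities $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ are vectors of unit length pointing along mutually orthogonal
axes, labelled 1, 2, and 3.
i) The scalar multiplication of v and v´ is defined as
$$\mathbf{v}\cdot\mathbf{v}' = v_1v_1' + v_2v_2' + v_3v_3', \tag{1.22}$$
where the unit vectors have the properties
$$\mathbf{e}_1\cdot\mathbf{e}_1 = \mathbf{e}_2\cdot\mathbf{e}_2 = \mathbf{e}_3\cdot\mathbf{e}_3 = 1, \tag{1.23}$$
and
$$\mathbf{e}_1\cdot\mathbf{e}_2 = \mathbf{e}_2\cdot\mathbf{e}_1 = \mathbf{e}_1\cdot\mathbf{e}_3 = \mathbf{e}_3\cdot\mathbf{e}_1 = \mathbf{e}_2\cdot\mathbf{e}_3 = \mathbf{e}_3\cdot\mathbf{e}_2 = 0. \tag{1.24}$$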
The most important property of the scalar product of two vectors is its invariance
under rotations and translations of the coordinates. (See Chapter 1).
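ii) The vector product of two vectors v and v´ is defined as
$$\mathbf{v}\times\mathbf{v}' = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ v_1 & v_2 & v_3 \\ v_1' & v_2' & v_3' \end{vmatrix} \quad (\text{where } |\ldots| \text{ is the determinant}) \tag{1.25}$$
$$= (v_2v_3' - v_3v_2')\,\mathbf{e}_1 + (v_3v_1' - v_1v_3')\,\mathbf{e}_2 + (v_1v_2' - v_2v_1')\,\mathbf{e}_3.$$
The unit vectors have the properties
$$\mathbf{e}_1\times\mathbf{e}_1 = \mathbf{e}_2\times\mathbf{e}_2 = \mathbf{e}_3\times\mathbf{e}_3 = 0 \tag{1.26 a,b}$$
(note that these properties differ from the quaternionic products of the i, j, k’s),
$$\mathbf{e}_1\times\mathbf{e}_2 = \mathbf{e}_3, \quad \mathbf{e}_2\times\mathbf{e}_1 = -\mathbf{e}_3, \quad \mathbf{e}_2\times\mathbf{e}_3 = \mathbf{e}_1, \quad \mathbf{e}_3\times\mathbf{e}_2 = -\mathbf{e}_1, \quad \mathbf{e}_3\times\mathbf{e}_1 = \mathbf{e}_2, \quad \mathbf{e}_1\times\mathbf{e}_3 = -\mathbf{e}_2.$$
These non-commuting vector products, or “cross products”, obey the standard right-hand-rule.
The vector product of two parallel vectors is zero even when neither vector is zero.
The non-associative property of a vector product is illustrated in the following
example: the product $\mathbf{e}_1\times\mathbf{e}_2\times\mathbf{e}_2$ depends on the grouping, since
$$(\mathbf{e}_1\times\mathbf{e}_2)\times\mathbf{e}_2 = \mathbf{e}_3\times\mathbf{e}_2 = -\mathbf{e}_1, \quad \text{whereas} \quad \mathbf{e}_1\times(\mathbf{e}_2\times\mathbf{e}_2) = 0.$$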
Important operations in Vector Analysis that follow directly from those introduced
in the theory of quaternions are:
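1) the gradient of a scalar function f(x1, x2, x3)
$$\nabla f = \frac{\partial f}{\partial x_1}\,\mathbf{e}_1 + \frac{\partial f}{\partial x_2}\,\mathbf{e}_2 + \frac{\partial f}{\partial x_3}\,\mathbf{e}_3, \tag{1.27}$$
2) the divergence of a vector function v
$$\nabla\cdot\mathbf{v} = \frac{\partial v_1}{\partial x_1} + \frac{\partial v_2}{\partial x_2} + \frac{\partial v_3}{\partial x_3} \tag{1.28}$$
where v has components v1, v2, v3 that are functions of x1, x2, x3 , and
3) the curl of a vector function v
$$\nabla\times\mathbf{v} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ \partial/\partial x_1 & \partial/\partial x_2 & \partial/\partial x_3 \\ v_1 & v_2 & v_3 \end{vmatrix}. \tag{1.29}$$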
The physical significance of these operations is discussed later.
1.9 Linear algebra and n-vectors
A major part of Linear Algebra is concerned with the extension of the algebraic
properties of vectors in the plane (2-vectors), and in space (3-vectors), to vectors in higher
dimensions (n-vectors). This area of study has its origin in the work of Grassmann (1809 –
77), who generalized the quaternions (4-component hyper-complex numbers), introduced
by Hamilton.
An n-dimensional vector is defined as an ordered column of numbers
$$\mathbf{x}_n = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}. \tag{1.30}$$
It will be convenient to write this as an ordered row in square brackets
$$\mathbf{x}_n = [x_1, x_2, \ldots x_n]. \tag{1.31}$$
The transpose of the column vector is the row vector
$$\mathbf{x}_n^{T} = (x_1, x_2, \ldots x_n). \tag{1.32}$$
The numbers x1, x2, …xn are called the components of x, and the integer n is the
dimension of x. The order of the components is important, for example
$$[1, 2, 3] \neq [2, 3, 1].$$
The two vectors x = [x1, x2, …xn] and y = [y1, y2, …yn] are equal if
xi = yi (i = 1 to n).
The laws of Vector Algebra are
1. x + y = y + x. (1.33 a-e)
2. [x + y] + z = x + [y + z].
3. a[x + y] = ax + ay where a is a scalar.
4. (a + b)x = ax + bx where a, b are scalars.
5. (ab)x = a(bx) where a, b are scalars.
If a = 1 and b = –1 then
x + [–x] = 0,
where 0 = [0, 0, …0] is the zero vector.
The vectors x = [x1, x2, …xn] and y = [y1, y2 …yn] can be added to give their sum or
resultant:
x + y = [x1 + y1, x2 + y2, …, xn + yn]. (1.34)
The set of vectors that obeys the above rules is called the space of all n-vectors or
the vector space of dimension n.
In general, a vector v = ax + by lies in the plane of x and y. The vector v is said
to depend linearly on x and y — it is a linear combination of x and y.
A k-vector v is said to depend linearly on the vectors u1, u2, …uk if there are scalars
ai such that
$$\mathbf{v} = a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + \ldots + a_k\mathbf{u}_k. \tag{1.35}$$
For example
[3, 5, 7] = [3, 6, 6] + [0, –1, 1] = 3[1, 2, 2] + 1[0, –1, 1], a linear combination of
the vectors [1, 2, 2] and [0, –1, 1].
A set of vectors u1, u2, …uk is called linearly dependent if one of these vectors
depends linearly on the rest. For example, if
$$\mathbf{u}_1 = a_2\mathbf{u}_2 + a_3\mathbf{u}_3 + \ldots + a_k\mathbf{u}_k, \tag{1.36}$$
the set u1, …uk is linearly dependent.
If none of the vectors u1, u2, …uk can be written linearly in terms of the remaining
ones we say that the vectors are linearly independent.
Alternatively, the vectors u1, u2, …uk are linearly dependent if and only if there is
an equation of the form
$$c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \ldots + c_k\mathbf{u}_k = 0, \tag{1.37}$$
in which the scalars ci are not all zero.
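Linear dependence can be tested mechanically by reducing the vectors to row-echelon form and counting the non-zero rows that remain. This is a minimal sketch of that elimination approach, which is a swapped-in technique and not a method stated in the text; the function names `rank` and `linearly_dependent` are illustrative, and exact fractions are used to avoid rounding error.

```python
from fractions import Fraction

def rank(vectors):
    """Number of linearly independent vectors, found by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0   # index of the next pivot row
    for col in range(len(rows[0])):
        # Find a row at or below r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Subtract multiples of the pivot row to clear the column below it.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def linearly_dependent(vectors):
    """True if some non-trivial combination of the vectors is the zero vector."""
    return rank(vectors) < len(vectors)

# [3, 5, 7] = 3[1, 2, 2] + 1[0, -1, 1], so these three vectors are dependent.
print(linearly_dependent([[3, 5, 7], [1, 2, 2], [0, -1, 1]]))   # True
print(linearly_dependent([[1, 2, 2], [0, -1, 1]]))              # False
```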
Consider the vectors $\mathbf{e}_i$ obtained by putting the ith-component equal to 1, and all
the other components equal to zero:
$$\mathbf{e}_1 = [1, 0, 0, \ldots 0]$$
$$\mathbf{e}_2 = [0, 1, 0, \ldots 0]$$
$$\vdots$$
$$\mathbf{e}_n = [0, 0, 0, \ldots 1]$$
then every vector of dimension n depends linearly on $\mathbf{e}_1, \mathbf{e}_2, \ldots \mathbf{e}_n$, thus
$$\mathbf{x} = [x_1, x_2, \ldots x_n] = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \ldots + x_n\mathbf{e}_n. \tag{1.38}$$
The $\mathbf{e}_i$’s are said to span the space of all n-vectors; they form a basis. Every basis of an n-space
has exactly n elements. The connection between a vector x and a definite
coordinate system is made by choosing a set of basis vectors ei.
1.10 The geometry of vectors
The laws of vector algebra can be interpreted geometrically for vectors of
dimension 2 and 3. Let the zero vector represent the origin of a coordinate system, and
let the 2-vectors, x and y, correspond to points in the plane: P[x1, x2] and Q[y1, y2]. The
vector sum x + y is represented by the point R, as shown
[Figure: the points P[x1, x2], Q[y1, y2], and R[x1+y1, x2+y2] plotted against the 1st and 2nd components, with origin O[0, 0]]
R is in the plane OPQ, even if x and y are 3-vectors.
Every vector point on the line OR represents the sum of the two corresponding vector
points on the lines OP and OQ. We therefore introduce the concept of the directed vector
lines OP, OQ, and OR, related by the vector equation
OP + OQ = OR . (1.39)
A vector V can be represented as a line of length OP pointing in the direction of the unit
vector v, thus
V = v.OP
A vector V is unchanged by a pure displacement:
$$\mathbf{V}_1 = \mathbf{V}_2$$
where the “=” sign means equality in magnitude and direction.
Two classes of vectors will be met in future discussions; they are
1. Polar vectors: the vector is drawn in the direction of the physical quantity being
represented, for example a velocity,
2. Axial vectors: the vector is drawn parallel to the axis about which the physical quantity
acts, for example an angular velocity.
The associative property of the sum of vectors can be readily demonstrated.
We see that
V = A + B + C = (A + B) + C = A + (B + C) = (A + C) + B . (1.40)
The process of vector addition can be reversed; a vector V can be decomposed into the
sum of n vectors of which (n – 1) are arbitrary, and the nth vector closes the polygon. The
vectors need not be in the same plane. A special case of this process is the decomposition
of a 3-vector into its Cartesian components.
[Figure: two vector polygons. A general case: V1, V2, V3, V4 are arbitrary and V5 closes the polygon. A special case: the decomposition of V into Cartesian components, in which Vz closes the polygon.]
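The vector product of A and B is an axial vector, perpendicular to the plane containing A
and B:
$$\mathbf{A} \times \mathbf{B} = AB\sin\theta\,\hat{\mathbf{n}} = -\,\mathbf{B} \times \mathbf{A}, \tag{1.41}$$
where $\theta$ is the angle between A and B, and $\hat{\mathbf{n}}$ is a unit vector perpendicular to the A, B plane.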
1.11 Linear Operators and Matrices
Transformations from a coordinate system [x, y] to another system [x´, y´],
without shift of the origin, or from a point P[x, y] to another point P´[x´, y´], in the same
system, that have the form
x´ = ax + by
y´ = cx + dy
where a, b, c, d are real coefficients, can be written in matrix notation, as follows
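$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}, \tag{1.41}$$
or, symbolically,
x´ = Mx, (1.42)
where x = [x, y] and x´ = [x´, y´] are both column 2-vectors, and
$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
is a 2 × 2 matrix operator that “changes” [x, y] into [x´, y´].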
In general, M transforms a unit square into a parallelogram:
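[Figure: the unit square with corners [0, 0], [1, 0], [1, 1], [0, 1] in the x-y plane, and its image under M, a parallelogram whose corner opposite the origin is [a + b, c + d].]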
This transformation plays a key rôle in Einstein’s Special Theory of Relativity (see
later discussion).
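The action of M on the unit square can be illustrated with a few lines of code. This is a sketch under stated assumptions: the matrix entries a = 2, b = 1, c = 1, d = 3 are arbitrary sample values, and the names `apply` and `polygon_area` are hypothetical helpers. The area check uses the standard result that a 2 × 2 matrix scales areas by |ad – bc|.

```python
def apply(M, point):
    """Return M acting on the column 2-vector [x, y]."""
    (a, b), (c, d) = M
    x, y = point
    return (a * x + b * y, c * x + d * y)

def polygon_area(points):
    """Area of a polygon from its corners, taken in order (shoelace formula)."""
    total = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

M = ((2, 1), (1, 3))                        # sample values of a, b, c, d
square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # corners of the unit square
image = [apply(M, p) for p in square]

print(image)                 # [(0, 0), (2, 1), (3, 4), (1, 3)]
print(polygon_area(image))   # 5.0, equal to |ad - bc|
```

The corner [1, 1] is carried to [a + b, c + d] = [3, 4], and opposite sides of the image remain parallel.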
1.12 Rotation operators
Consider the rotation of an x, y coordinate system about the origin through an angle φ:
[Figure: axes x, y and rotated axes x´, y´ with common origin O, O´; the point P has coordinates [x, y] in the unprimed system and [x´, y´] in the primed system.]
From the diagram, we see that
x´ = xcos+ ysin
y´ = – xsin+ ycos
x´ cossinx
= .
y´ – sincosy
P´ = c()P (1.43)
c() = is the rotation operator.
The subscript c denotes a rotation of the coordinates through an angle +.
The inverse operator, c
–1(), is obtained by reversing the angle of rotation: +–.
We see that matrix product
–1() c() = c
T() c() = I (1.44)
where the superscript T indicates the transpose (rows columns), and
1 0
I = is the identity operator. (1.45)
0 1
This is the defining property of an orthogonal matrix.
If we leave the axes fixed and rotate the point P[x, y] to P´[x´, y´], then we have
[Figure: fixed axes x, y with origin O; the point P[x, y] is rotated through the angle φ to P´[x´, y´].]
From the diagram, we see that
x´ = x cos φ – y sin φ, and y´ = x sin φ + y cos φ
or
$$P' = \mathcal{R}_v(\phi)\,P, \tag{1.46}$$
where
$$\mathcal{R}_v(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}$$
is the operator that rotates a vector through +φ.
1.13 Components of a vector under coordinate rotations
Consider a vector V [vx, vy], and the same vector V´ with components [vx’,vy’], in a
coordinate system (primed), rotated through an angle +φ.
[Figure: the vector V = V´ drawn with respect to axes x, y and rotated axes x´, y´ with common origin O, O´, showing the components vx and vx´.]
We have met the transformation [x, y] → [x´, y´] under the operation $\mathcal{R}_c(\phi)$;
here, we have the same transformation but now it operates on the components of the
vector, vx and vy,
$$[v_{x'}, v_{y'}] = \mathcal{R}_c(\phi)\,[v_x, v_y]. \tag{1.47}$$
1-1 i) If $u = 3^{x/y}$ show that $\partial u/\partial x = (3^{x/y}\ln 3)/y$ and $\partial u/\partial y = (-3^{x/y}\,x\ln 3)/y^2$.
ii) If $u = \ln\{(x^3 + y)/x^2\}$ show that $\partial u/\partial x = (x^3 - 2y)/(x(x^3 + y))$ and $\partial u/\partial y = 1/(x^3 + y)$.
1-2 Calculate the second partial derivatives of
$f(x, y) = (1/\sqrt{y})\exp\{-(x - a)^2/4y\}$, a = constant.
1-3 Check the answers obtained in problem 1-2 by showing that the function f(x, y) in
1-2 is a solution of the partial differential equation $\partial^2 f/\partial x^2 - \partial f/\partial y = 0$.
1-4 If $f(x, y, z) = 1/(x^2 + y^2 + z^2)^{1/2} = 1/r$, show that f(x, y, z) = 1/r is a solution of Laplace’s equation
$\partial^2 f/\partial x^2 + \partial^2 f/\partial y^2 + \partial^2 f/\partial z^2 = 0$.
This important equation occurs in many branches of Physics.
1-5 At a given instant, the radius of a cylinder is r(t) = 4 cm and its height is h(t) = 10 cm.
If r(t) and h(t) are both changing at a rate of 2 cm·s⁻¹, show that the instantaneous
increase in the volume of the cylinder is 192π cm³·s⁻¹.
1-6 The transformation between Cartesian coordinates [x, y, z] and spherical polar
coordinates [r, , ] is
M A T H E M A T I C A L P R E L I M I N A R I E S 2 9
x = rsincos, y = rsinsin, z = rcos.
Show, by calculating all necessary partial derivatives, that the square of the line
element is
ds2 = dr2 + r2sin2d2 + r2d2.
Obtain this result using geometrical arguments. This form of the square of the line
element will be used on several occasions in the future.
1-7 Prove that the inverse of each element of a group is unique.
1-8 Prove that the set of positive rational numbers does not form a group under division.
1-9 A finite group of order n has n² products that may be written in an n × n array, called
the group multiplication table. For example, the 4th-roots of unity {e, a, b, c} = {±1, ±i},
where i = √–1, form a group under multiplication (1·i = i, i(–i) = 1, i² = –1, (–i)² = –1,
etc.) with a multiplication table
       e = 1   a = i   b = –1   c = –i
e        1       i       –1       –i
a        i      –1       –i        1
b       –1      –i        1        i
c       –i       1        i       –1
In this case, the table is symmetric about the main diagonal; this is a characteristic feature
of a group in which all products commute (ab = ba) — it is an Abelian group.
If G is the dihedral group D3, discussed in the text, where G = {e, a, a2, b, c, d},
where e is the identity, obtain the group multiplication table. Is it an Abelian group?
Notice that the three elements {e, a, a2} form a subgroup of G, whereas the three
elements {b, c, d} do not; there is no identity in this subset.
The group D3 has the same multiplication table as the group of permutations of
three objects. This is the condition that signifies group isomorphism.
1-10 Are the sets
i) {[0, 1, 1], [1, 0, 1], [1, 1, 0]}
ii) {[1, 3, 5, 7], [4, –3, 2, 1], [2, 1, 4, 5]}
linearly dependent? Explain.
1-11 i) Prove that the vectors [0, 1, 1], [1, 0, 1], [1, 1, 0] form a basis for Euclidean space R³.
ii) Do the vectors [1, i] and [i, –1], (i = √–1), form a basis for the complex space C²?
1-12 Interpret the linear independence of two 3-vectors geometrically.
1-13 i) If X = [1, 2, 3] and Y = [3, 2, 1], prove that their cross product is orthogonal to
the X-Y plane.
ii) If X and Y are 3-vectors, prove that X × Y = 0 iff X and Y are linearly dependent.
1-14 If
$$T = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ 0 & 0 & 1 \end{pmatrix}$$
represents a linear transformation of the plane under which distance is an invariant,
show that the following relations must hold:
$a_{11}^2 + a_{21}^2 = a_{12}^2 + a_{22}^2 = 1$, and $a_{11}a_{12} + a_{21}a_{22} = 0$.
1-15 Determine the 22 transformation matrix that maps each point [x, y] of the plane
onto its image in the line y = x3 (Note that the transformation can be considered as
the product of three successive operations).
1-16 We have used the convention that matrix operators operate on column vectors “on
their right”. Show that a transformation involving row 2-vectors has the form
(x´, y´) = (x, y)MT
where $M^T$ is the transpose of the 2 × 2 matrix, M.
1-17 The 22 complex matrices (the Pauli matrices)
1 0 0 1 0 –i 1 0
I = , 1 = , 2 = , 3 =
0 1 1 0 i 0 0 –1
play an important part in Quantum Mechanics. Show that they have the properties
12 = i3, 23 = i1, 31 = i2,
ik + ki = 2ikI (i, k = 1, 2, 3) where ik is the Kronecker delta. Here,
the subscript i is not –1.

2.1 Velocity and acceleration
The most important concepts in Kinematics — a subject in which the properties of
the forces responsible for the motion are ignored — can be introduced by studying the
simplest of all motions, namely that of a point P moving in a straight line.
Let a point P[t, x] be at a distance x from a fixed point O at a time t, and let it be at
a point P´[t´, x´] = P´[t + Δt, x + Δx] at a time Δt later. The average speed of P in the
interval Δt is
⟨vp⟩ = Δx/Δt. (2.1)
If the ratio Δx/Δt is not constant in time, we define the instantaneous speed of P at time
t as the limiting value of the ratio as Δt → 0:
$$v_p = v_p(t) = \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} = \frac{dx}{dt} = \dot{x} = v_x.$$
The instantaneous speed is the magnitude of a vector called the instantaneous
velocity of P:
v = dx/dt , a quantity that has both magnitude and direction. (2.2)
A space-time curve is obtained by plotting the positions of P as a function of t:
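[Figure: the space-time curve x(t), with origin O, showing the points P and P´ and the instantaneous speeds vp and vp´.]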
34 K I N E M A T I C S : T H E G E O M E T R Y O F M O T I O N
The tangent of the angle made by the tangent to the curve at any point gives the value of
the instantaneous speed at the point.
The instantaneous acceleration, a , of the point P is given by the time rate-of-change
of the velocity
$$a = \frac{dv}{dt} = \frac{d}{dt}\left(\frac{dx}{dt}\right) = \frac{d^2x}{dt^2} = \ddot{x}. \tag{2.3}$$
A change of variable from t to x gives
a = dv/dt = (dv/dx)(dx/dt) = v(dv/dx). (2.4)
This is a useful relation when dealing with problems in which the velocity is given as a
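function of the position. For example:
[Figure: the curve v(x), with origin O; P is the point on the curve at which the speed is vP, and N and Q are points on the x-axis, NQ being the subnormal.]
The gradient of the curve at P is dv/dx, which equals the tangent of the angle that the tangent line at P makes with the x-axis; therefore
NQ, the subnormal, = v(dv/dx) = ap, the acceleration of P. (2.5)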
The area under a curve of the speed as a function of time between the times t1
and t2 is
[A][t1,t2] = [t1,t2] v(t)dt = [t1,t2] (dx/dt)dt = [x1,x2] dx = (x2 – x1)
= distance travelled in the time t2 – t1. (2.6)
The solution of a kinematical problem is sometimes simplified by using a graphical
method, for example:
A point A moves along an x-axis with a constant speed vA. Let it be at the origin O
(x = 0) at time t = 0. It continues for a distance xA, at which point it decelerates at a
constant rate, finally stopping at a distance X from O at time T.
A second point B moves away from O in the +x-direction with constant
acceleration. Let it begin its motion at t = 0. It continues to accelerate until it reaches a
maximum speed $v_B^{\max}$ at a time $t_B^{\max}$, when at $x_B^{\max}$ from O. At $x_B^{\max}$, it begins to decelerate
at a constant rate, finally stopping at X at time T. We wish to prove that the maximum speed of B
during its motion is
$$v_B^{\max} = v_A\{1 - (x_A/2X)\}^{-1},$$
a value that is independent of the time at which the maximum speed is reached.
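The velocity-time curves of the points are
[Figure: velocity-time curves. A moves at constant speed vA from t = 0 (x = 0) to tA (x = xA) and then decelerates uniformly to rest at time T (x = X). A possible path for B rises uniformly to its maximum speed at $t_B^{\max}$ ($x = x_B^{\max}$) and then falls uniformly to rest at T.]
The areas under the curves give $X = v_A t_A + v_A(T - t_A)/2 = v_B^{\max}T/2$, so that
$v_B^{\max} = v_A(1 + (t_A/T))$, but $v_A T = 2X - x_A$, therefore $v_B^{\max} = v_A\{1 - (x_A/2X)\}^{-1} \neq f(t_B^{\max})$.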
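The graphical argument can be checked with sample numbers. The sketch below is illustrative only: the values vA = 10, xA = 40 and X = 100 (in arbitrary consistent units) and the function name `max_speed_B` are assumptions made for the example.

```python
def max_speed_B(v_A, x_A, X):
    """Maximum speed of B from the result quoted in the text."""
    return v_A / (1.0 - x_A / (2.0 * X))

v_A, x_A, X = 10.0, 40.0, 100.0

# Motion of A: constant speed up to x_A, then uniform deceleration to rest at X.
t_A = x_A / v_A                     # time spent at constant speed
T = t_A + 2.0 * (X - x_A) / v_A     # area of the deceleration triangle is X - x_A

# Motion of B: the triangle under its velocity-time curve has area X.
v_B_from_area = 2.0 * X / T

print(T)                                           # 16.0
print(v_B_from_area, max_speed_B(v_A, x_A, X))     # both close to 12.5
```

The time at which B reaches its maximum speed never enters the calculation, which is the point of the result.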
2.2 Differential equations of kinematics
If the acceleration is a known function of time then the differential equation
a(t) = dv/dt (2.7)
can be solved by performing the integrations (either analytically or numerically)
a(t)dt = dv (2.8)
If a(t) is constant then the result is simply
at + C = v, where C is a constant that is given by the initial conditions.
Let v = u when t = 0 then C = u and we have
at + u = v. (2.9)
This is the standard result for motion under constant acceleration.
We can continue this approach by writing:
v = dx/dt = u + at.
Separating the variables,
dx = udt + atdt.
Integrating gives
x = ut + (1/2)at² + C´ (for constant a).
If x = 0 when t = 0 then C´ = 0, and
x(t) = ut + (1/2)at². (2.10)
Multiplying this equation throughout by 2a gives
2ax = 2aut + (at)²
= 2aut + (v – u)²
and therefore, rearranging, we obtain
v² = 2ax – 2aut + 2vu – u²
= 2ax + 2u(v – at) – u²
= 2ax + u². (2.11)
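The three standard results for constant acceleration, equations (2.9), (2.10) and (2.11), can be checked against one another with sample numbers. This is a minimal sketch; the function names and the values u = 3, a = 2, t = 4 are arbitrary choices made for illustration.

```python
def velocity(u, a, t):
    """Equation (2.9): v = u + at."""
    return u + a * t

def position(u, a, t):
    """Equation (2.10): x = ut + (1/2)at^2, with x = 0 at t = 0."""
    return u * t + 0.5 * a * t ** 2

u, a, t = 3.0, 2.0, 4.0
v = velocity(u, a, t)
x = position(u, a, t)

print(v, x)                        # 11.0 28.0
print(v ** 2, u ** 2 + 2 * a * x)  # 121.0 121.0, as required by (2.11)
```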
In general, the acceleration is a given function of time or distance or velocity:
1) If a = f(t) then
a = dv/dt =f(t), (2.12)
dv = f(t)dt,
v = f(t)dt + C(a constant).
This equation can be written
v = dx/dt = F(t) + C,
dx = F(t)dt + Cdt.
Integrating gives
x(t) = F(t)dt + Ct + C´. (2.13)
The constants of integration can be determined if the velocity and the position are known
at a given time.
2) If a = g(x) = v(dv/dx) then (2.14)
vdv = g(x)dx.
Integrating gives
v² = 2∫g(x)dx + D,
v² = G(x) + D, where G(x) = 2∫g(x)dx,
so that
v = (dx/dt) = ±√(G(x) + D). (2.15)
Integrating this equation leads to
±∫dx/√(G(x) + D) = t + D´. (2.16)
Alternatively, if
a = d2x/dt2 = g(x)
then, multiplying throughout by 2(dx/dt) gives
2(dx/dt)(d²x/dt²) = 2(dx/dt)g(x).
Integrating then gives
(dx/dt)² = 2∫g(x)dx + D, etc.
As an example of this method, consider the equation of simple harmonic motion (see later):
d²x/dt² = –ω²x. (2.17)
Multiply throughout by 2(dx/dt), then
2(dx/dt)(d²x/dt²) = –2ω²x(dx/dt).
This can be integrated to give
(dx/dt)² = –ω²x² + D.
If dx/dt = 0 when x = A then D = ω²A², therefore
(dx/dt)² = ω²(A² – x²) = v²,
so that
dx/dt = ±ω√(A² – x²).
Separating the variables, we obtain
–dx/√(A² – x²) = ωdt. (The minus sign is chosen because dx and dt have
opposite signs.)
Integrating gives
cos⁻¹(x/A) = ωt + D´.
But x = A when t = 0, therefore D´ = 0, so that
x(t) = A cos(ωt), where A is the amplitude. (2.18)
3) If a = h(v), then (2.19)
dv/dt = h(v)
dv/h(v) = dt,
∫dv/h(v) = t + B. (2.20)
Some of the techniques used to solve ordinary differential equations are discussed
in Appendix A.
2.3 Velocity in Cartesian and polar coordinates
The transformation from Cartesian to Polar Coordinates is represented by the
equations
x = r cosφ and y = r sinφ, (2.21 a,b)
or
x = f(r, φ) and y = g(r, φ).
The differentials are
dx = (∂f/∂r)dr + (∂f/∂φ)dφ and dy = (∂g/∂r)dr + (∂g/∂φ)dφ.
We are interested in the transformation of the components of the velocity vector under
[x, y] [r, ]. The velocity components involve the rates of change of dx and dy with
respect to time:
dx/dt = (f/r)dr/dt + (f/)d/dt and dy/dt = (g/r)dr/dt + (g/)d/dt
x = (f/r)r + (f/)and y = (g/r)r + (g/). (2.22)
f/r = cos, f/= –rsin, g/r = sin, and g/= rcos,
therefore, the velocity transformations are
x = cosr – sin(r ) = vx (2.23)
y = sinr + cos(r ) = vy. (2.24)
These equations can be written
vx cos–sindr/dt
= .
vy sincosrd/dt
Changing –, gives the inverse equations
K I N E M A T I C S : T H E G E O M E T R Y O F M O T I O N 41
dr/dt cossinvx
rd/dt –sincosvy
vr vx
= c() . (2.25)
The velocity components in [r, ] coordinates are therefore
|v| = r = rd/dt |vr| = r =dr/dt
P[r, ]
r +, anticlockwise
O x
The quantity d/dt is called the angular velocity of P about the origin O.
2.4 Acceleration in Cartesian and polar coordinates
We have found that the velocity components transform from [x, y] to [r, φ]
coordinates as follows
$$v_x = \cos\phi\,\dot{r} - \sin\phi\,(r\dot{\phi}) = \dot{x}$$
$$v_y = \sin\phi\,\dot{r} + \cos\phi\,(r\dot{\phi}) = \dot{y}.$$
The acceleration components are given by
ax = dvx/dt and ay = dvy/dt.
We therefore have
$$a_x = \frac{d}{dt}\{\cos\phi\,\dot{r} - \sin\phi\,(r\dot{\phi})\} \tag{2.26}$$
$$= \cos\phi\,(\ddot{r} - r\dot{\phi}^2) - \sin\phi\,(2\dot{r}\dot{\phi} + r\ddot{\phi})$$
$$a_y = \frac{d}{dt}\{\sin\phi\,\dot{r} + \cos\phi\,(r\dot{\phi})\} \tag{2.27}$$
$$= \cos\phi\,(2\dot{r}\dot{\phi} + r\ddot{\phi}) + \sin\phi\,(\ddot{r} - r\dot{\phi}^2).$$
These equations can be written
ar cossinax
= . (2.28)
The acceleration components in [r, ] coordinates are therefore
|a| = 2r + r •••
|ar| = r – r 2
P[r, ]
r 
O x
These expressions for the components of acceleration will be of key importance in
discussions of Newton’s Theory of Gravitation.
We note that, if r is constant, and the angular velocity ω is constant then
$$a_\phi = r\ddot{\phi} = r\dot{\omega} = 0, \tag{2.29}$$
$$a_r = -r\dot{\phi}^2 = -r\omega^2 = -r(v_\phi/r)^2 = -v_\phi^2/r, \tag{2.30}$$
and
$$v_\phi = r\dot{\phi} = r\omega. \tag{2.31}$$
These equations are true for circular motion.
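The polar components of the acceleration can be checked in two ways: in the special case of circular motion, and against a direct numerical differentiation of a Cartesian coordinate. The sketch below is illustrative; the function `polar_acceleration`, the circular-motion values, and the sample path r = 1 + 0.5t, φ = t² are all assumptions made for the example.

```python
import math

def polar_acceleration(r, r_dot, r_ddot, phi_dot, phi_ddot):
    """Radial and transverse acceleration components in [r, phi] coordinates."""
    a_r = r_ddot - r * phi_dot ** 2
    a_phi = 2 * r_dot * phi_dot + r * phi_ddot
    return a_r, a_phi

# Circular motion: r and the angular velocity omega are constant.
r, omega = 2.0, 3.0
a_r, a_phi = polar_acceleration(r, 0.0, 0.0, omega, 0.0)
v_phi = r * omega
print(a_r, -v_phi ** 2 / r, a_phi)   # -18.0 -18.0 0.0

# General check on a sample path r = 1 + 0.5 t, phi = t^2, at t = 1.
def x_of_t(t):
    return (1 + 0.5 * t) * math.cos(t * t)

t, h = 1.0, 1e-4
ax_numeric = (x_of_t(t + h) - 2 * x_of_t(t) + x_of_t(t - h)) / h ** 2

a_r2, a_phi2 = polar_acceleration(1.5, 0.5, 0.0, 2.0, 2.0)
ax_formula = math.cos(1.0) * a_r2 - math.sin(1.0) * a_phi2

print(ax_numeric, ax_formula)        # agree to about six decimal places
```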
2-1 A point moves with constant acceleration, a, along the x-axis. If it moves distances x1
and x2 in successive intervals of time t1 and t2, prove that the acceleration is
a = 2(v2 – v1)/T
where v1 = x1/t1, v2 = x2/t2, and T = t1 + t2.
2-2 A point moves along the x-axis with an instantaneous deceleration (negative
acceleration)
$a(t) \propto -v^{n+1}(t)$
where v(t) is the instantaneous speed at time t, and n is a positive integer. If the
initial speed of the point is u (at t = 0), show that
$k_n t = \{(u^n - v^n)/(uv)^n\}/n$, where $k_n$ is a constant of proportionality,
and that the distance travelled, x(t), by the point from its initial position is
$k_n x(t) = \{(u^{n-1} - v^{n-1})/(uv)^{n-1}\}/(n - 1)$.
2-3 A point moves along the x-axis with an instantaneous deceleration kv³(t), where v(t) is
the speed and k is a constant. Show that
v(t) = u/(1 + kux(t))
where x(t) is the distance travelled, and u is the initial speed of the point.
2-4 A point moves along the x-axis with an instantaneous acceleration
d²x/dt² = –μ²/x²
where μ is a constant. If the point starts from rest at x = a, show that the speed of
the particle is
dx/dt = –μ{2(a – x)/(ax)}^{1/2}.
Why is the negative square root chosen?
2-5 A point P moves with constant speed v along the x-axis of a Cartesian system, and a
point Q moves with constant speed u along the y-axis. At time t = 0, P is at x = 0, and
Q, moving towards the origin, is at y = D. Show that the minimum distance, dmin,
between P and Q during their motion is
$d_{min} = D\{1/(1 + (u/v)^2)\}^{1/2}$.
Solve this problem in two ways: 1) by direct minimization of a function, and 2) by a
geometrical method that depends on the choice of a more suitable frame of reference
(for example, the rest frame of P).
2-6 Two ships are sailing with constant velocities u and v on straight courses that are
inclined at an angle . If, at a given instant, their distances from the point of
intersection of their courses are a and b, find their minimum distance apart.
2-7 A point moves along the x-axis with an acceleration a(t) = kt², where t is the time the
point has been in motion, and k is a constant. If the initial speed of the point is u,
show that the distance travelled in time t is
x(t) = ut + (1/12)kt⁴.
2-8 A point, moving along the x-axis, travels a distance x(t) given by the equation
x(t) = aexp{kt} + bexp{–kt}
where a, b, and k are constants. Prove that the acceleration of the point is
proportional to the distance travelled.
2-9 A point moves in the plane with the equations of motion
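$$\begin{pmatrix} d^2x/dt^2 \\ d^2y/dt^2 \end{pmatrix} = \begin{pmatrix} -2 & 1 \\ 1 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}.$$
Let the following coordinate transformation be made
u = (x + y)/√2 and v = (x – y)/√2.
Show that in the u-v frame, the equations of motion have a simple form, and that the
time-dependence of the coordinates is given by
u = A cos t + B sin t,
v = C cos √3 t + D sin √3 t, where A, B, C, D are constants.
This coordinate transformation has “diagonalized” the original matrix:
$$\begin{pmatrix} -2 & 1 \\ 1 & -2 \end{pmatrix} \rightarrow \begin{pmatrix} -1 & 0 \\ 0 & -3 \end{pmatrix}.$$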
The matrix with zeros everywhere, except along the main diagonal, has the
interesting property that it simply scales the vectors on which it acts — it does not
rotate them. The scaling values are given by the diagonal elements, called the
eigenvalues of the diagonal matrix. The scaled vectors are called eigenvectors. A
small industry exists that is devoted to finding optimum ways of diagonalizing large
matrices. Illustrate the motion of the system in the x-y frame and in the u-v frame.
3.1 The Galilean transformation
Events belong to the physical world — they are not abstractions. We shall,
nonetheless, introduce the idea of an ideal event that has neither extension nor duration.
Ideal events may be represented as points in a space-time geometry. An event is
described by a four-vector E[t, x, y, z] where t is the time, and x, y, z are the spatial
coordinates, referred to arbitrarily chosen origins.
Let an event E[t, x], recorded by an observer O at the origin of an x-axis, be
recorded as the event E´[t´, x´] by a second observer O´, moving at constant speed V
along the x-axis. We suppose that their clocks are synchronized at t = t´ = 0 when they
coincide at a common origin, x = x´ = 0.
At time t, we write the plausible equations
t´ = t
x´ = x – Vt,
where Vt is the distance travelled by O´ in a time t. These equations can be written
E´ = GE, (3.1)
where
$$G = \begin{pmatrix} 1 & 0 \\ -V & 1 \end{pmatrix}.$$
C L A S S I C A L A N D S P E C I A L R E L A T I V I T Y 47
G is the operator of the Galilean transformation.
The inverse equations are
t = t´
x = x´ + Vt´
E = G–1E´ (3.2)
where G–1 is the inverse Galilean operator. (It undoes the effect of G).
If we multiply t and t´ by the constants k and k´, respectively, where k and
k´ have dimensions of velocity, then all terms have dimensions of length.
In space-space, we have the Pythagorean form x² + y² = r² (an invariant under
rotations). We are therefore led to ask the question: is (kt)² + x² an invariant under G in
space-time? Direct calculation gives
(kt)² + x² = (k´t´)² + x´² + 2Vx´t´ + V²t´²
= (k´t´)² + x´² only if V = 0!
We see, therefore, that Galilean space-time does not leave the sum of squares invariant.
We note, however, the key rôle played by acceleration in Galilean-Newtonian physics:
The velocities of the events according to O and O´ are obtained by differentiating
x´ = –Vt + x with respect to time, giving
v´= –V + v, (3.3)
a result that agrees with everyday observations.
Differentiating v´ with respect to time gives
dv´/dt´= a´ = dv/dt = a (3.4)
where a and a´ are the accelerations in the two frames of reference. The classical
acceleration is an invariant under the Galilean transformation. If the relationship
v´ = v – V is used to describe the motion of a pulse of light, moving in empty space at
v = c ≈ 3 × 10⁸ m/s, it does not fit the facts. For example, if V is 0.5c, we expect to obtain
v´ = 0.5c, whereas it is found that v´ = c. Indeed, in all cases studied, v´ = c for all
values of V.
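Two of the statements in this section, that the sum of squares is not invariant under G while the acceleration is, can be illustrated with integer sample values. The sketch is an illustration only: the scale constant k is set to 1, the relative speed V = 3 and the path x = t² are arbitrary, and a second difference of three equally spaced positions stands in for the acceleration.

```python
def galilean(event, V):
    """Apply G to an event [t, x]: t' = t, x' = x - V t."""
    t, x = event
    return (t, x - V * t)

def sum_of_squares(event, k=1):
    """The candidate invariant (kt)^2 + x^2."""
    t, x = event
    return (k * t) ** 2 + x ** 2

def second_difference(xs):
    """Discrete stand-in for acceleration from three equally spaced positions."""
    return xs[2] - 2 * xs[1] + xs[0]

V = 3
E = (2, 5)
E_prime = galilean(E, V)
print(sum_of_squares(E), sum_of_squares(E_prime))   # 29 5: not invariant

# A point with x = t^2, observed at t = 0, 1, 2 in both frames.
events = [(t, t * t) for t in (0, 1, 2)]
primed = [galilean(e, V) for e in events]
print(second_difference([x for _, x in events]),
      second_difference([x for _, x in primed]))    # 2 2: invariant
```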
3.2 Einstein’s space-time symmetry: the Lorentz transformation
It was Einstein, above all others, who advanced our understanding of the nature of
space-time and relative motion. He made use of a symmetry argument to find the changes
that must be made to the Galilean transformation if it is to account for the relative motion
of rapidly moving objects and of beams of light. Einstein recognized an inconsistency in
the Galilean-Newtonian equations, based as they are, on everyday experience. The
discussion will be limited to non-accelerating, or so-called inertial, frames.
We have seen that the classical equations relating the events E and E´ are
E´ = GE, and the inverse E = G⁻¹E´, where
$$G = \begin{pmatrix} 1 & 0 \\ -V & 1 \end{pmatrix} \quad\text{and}\quad G^{-1} = \begin{pmatrix} 1 & 0 \\ V & 1 \end{pmatrix}.$$
These equations are connected by the substitution V ↔ –V; this is an algebraic statement
of the Newtonian principle of relativity. Einstein incorporated this principle in his theory.
He also retained the linearity of the classical equations in the absence of any evidence to
the contrary. (Equispaced intervals of time and distance in one inertial frame remain
equispaced in any other inertial frame). He therefore symmetrized the space-time
equations as follows:
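$$\begin{pmatrix} t' \\ x' \end{pmatrix} = \begin{pmatrix} 1 & -V \\ -V & 1 \end{pmatrix}\begin{pmatrix} t \\ x \end{pmatrix}. \tag{3.5}$$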
Note, however, the inconsistency in the dimensions of the time-equation that has now
been introduced:
t´ = t – Vx.
The term Vx has dimensions of [L]²/[T], and not [T]. This can be corrected by introducing
the invariant speed of light, c — a postulate in Einstein’s theory that is consistent with the
result of the Michelson-Morley experiment:
ct´ = ct – Vx/c
so that all terms now have dimensions of length.
Einstein went further, and introduced a dimensionless quantity γ instead of the
scaling factor of unity that appears in the Galilean equations of space-time. This factor
must be consistent with all observations. The equations then become
ct´ = γct – βγx
x´ = –βγct + γx, where β = V/c.
These can be written
E´ = LE, (3.6)
where
$$L = \begin{pmatrix} \gamma & -\beta\gamma \\ -\beta\gamma & \gamma \end{pmatrix},$$
and E = [ct, x].
L is the operator of the Lorentz transformation.
The inverse equation is
E = L⁻¹E´ (3.7)
where
$$L^{-1} = \begin{pmatrix} \gamma & \beta\gamma \\ \beta\gamma & \gamma \end{pmatrix}.$$
This is the inverse Lorentz transformation, obtained from L by changing β → –β
(V → –V); it has the effect of undoing the transformation L. We can therefore write
LL–1 = I (3.8)
Carrying out the matrix multiplications, and equating elements gives
2 – 22 = 1
= 1/(1 – 2) (taking the positive root). (3.9)
As V 0, 0 and therefore 1; this represents the classical limit in which the
Galilean transformation is, for all practical purposes, valid. In particular, time and space
intervals have the same measured values in all Galilean frames of reference, and
acceleration is the single fundamental invariant.
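The condition that the inverse operator undoes the Lorentz transformation, and the classical limit, can be checked numerically. The sketch below assumes the standard 2 × 2 form of the operator acting on [ct, x], with the scaling factor written `gamma` and the speed ratio V/c written `beta`; the function names and the sample value 0.6 are illustrative.

```python
import math

def gamma(beta):
    """Scaling factor 1/sqrt(1 - beta^2), positive root."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

def lorentz(beta):
    """2 x 2 operator acting on the column vector [ct, x]."""
    g = gamma(beta)
    return [[g, -beta * g],
            [-beta * g, g]]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

beta = 0.6
L = lorentz(beta)
L_inv = lorentz(-beta)        # inverse obtained by reversing the sign of beta

print(gamma(beta))            # close to 1.25
print(matmul(L, L_inv))       # close to the identity [[1, 0], [0, 1]]
print(gamma(1e-6))            # close to 1: the classical limit
```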
3.3 The invariant interval: contravariant and covariant vectors
Previously, it was shown that the space-time of Galileo and Newton is not
Pythagorean under G. We now ask the question: is Einsteinian space-time Pythagorean
under L ? Direct calculation leads to
(ct)2 + x2 = 2(1 + 2)(ct´)2 + 42x´ct´
+2(1 + 2)x´2
(ct´)2 + x´2 if > 0.
Note, however, that the difference of squares is an invariant:
(ct)2 – x2 = (ct´)2 – x´2 (3.10)
2(1 – 2) = 1.
Space-time is said to be pseudo-Euclidean. The negative sign that characterizes Lorentz
invariance can be included in the theory in a general way as follows.
We introduce two kinds of 4-vectors
$x^\mu = [x^0, x^1, x^2, x^3]$, a contravariant vector, (3.11)
and
$x_\mu = [x_0, x_1, x_2, x_3]$, a covariant vector, where
$x_\mu = [x^0, -x^1, -x^2, -x^3]$. (3.12)
The scalar (or inner) product of the vectors is defined as
$x^{\mu T}x_\mu = (x^0, x^1, x^2, x^3)[x^0, -x^1, -x^2, -x^3]$, to conform to matrix multiplication (row × column),
$= (x^0)^2 - ((x^1)^2 + (x^2)^2 + (x^3)^2)$. (3.13)
The superscript T is usually omitted in writing the invariant; it is implied in the form $x^\mu x_\mu$.
The event 4-vector is
$E^\mu = [ct, x, y, z]$ and the covariant form is
$E_\mu = [ct, -x, -y, -z]$
so that the invariant scalar product is
$E^\mu E_\mu = (ct)^2 - (x^2 + y^2 + z^2)$. (3.14)
A general Lorentz 4-vector xμ transforms as follows:
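$x'^\mu = Lx^\mu$ (3.15)
where
$$L = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$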
This is the operator of the Lorentz transformation if the motion of O´ is along the x-axis of
O’s frame of reference, and the initial times are synchronized (t = t´ = 0 at x = x´ = 0).
Two important consequences of the Lorentz transformation, discussed in 3.6, are
that intervals of time measured in two different inertial frames are not the same; they are
related by the equation
Δt´ = γΔt (3.16)
where Δt is an interval measured on a clock at rest in O’s frame, and distances are given by
l´ = l/γ (3.17)
where l is a length measured on a ruler at rest in O’s frame.
3.4 The group structure of Lorentz transformations
The square of the invariant interval s, between the origin [0, 0, 0, 0] and an
arbitrary event $x^\mu = [x^0, x^1, x^2, x^3]$ is, in index notation
$s^2 = x^\mu x_\mu = x'^\mu x'_\mu$, (sum over μ = 0, 1, 2, 3). (3.18)
The lower indices can be raised using the metric tensor $\eta_{\mu\nu}$ = diag(1, –1, –1, –1), so that
$s^2 = \eta_{\mu\nu}x^\mu x^\nu = \eta_{\mu\nu}x'^\mu x'^\nu$, (sum over μ and ν). (3.19)
The vectors now have contravariant forms.
In matrix notation, the invariant is
$s^2 = x^T\eta\,x = x'^T\eta\,x'$. (3.20)
(The transpose must be written explicitly).
The primed and unprimed column matrices (contravariant vectors) are related by the
Lorentz matrix operator, L:
x´ = Lx.
We therefore have
$x^T\eta\,x = (Lx)^T\eta\,(Lx)$
$= x^TL^T\eta\,Lx$.
The x’s are arbitrary, therefore
$L^T\eta\,L = \eta$. (3.21)
This is the defining property of the Lorentz transformations.
The set of all Lorentz transformations is the set $\mathcal{L}$ of all 4 × 4 matrices that satisfy
the defining property
$\mathcal{L} = \{L : L^T\eta L = \eta;\ L \text{ all } 4 \times 4 \text{ real matrices};\ \eta = \mathrm{diag}(1, -1, -1, -1)\}$.
(Note that each L has 16 (independent) real matrix elements, and therefore belongs to
the 16-dimensional space, R¹⁶.)
Consider the result of two successive Lorentz transformations L1 and L2 that
transform a 4-vector x as follows
xx´ x´´
x´ = L1x ,
x´´ = L2x´.
The resultant vector x´´ is given by
x´´ = L2(L1x)
= L2L1x
= Lcx
Lc = L2L1 (L1 followed by L2). (3.22)
If the combined operation Lc is always a Lorentz transformation then it must satisfy
$L_c^T\eta L_c = \eta$.
We must therefore have
$(L_2L_1)^T\eta(L_2L_1) = \eta$
or
$L_1^T(L_2^T\eta L_2)L_1 = \eta$
so that
$L_1^T\eta L_1 = \eta$, $(L_1, L_2 \in \mathcal{L})$
therefore
$L_c = L_2L_1 \in \mathcal{L}$. (3.23)
Any number of successive Lorentz transformations may be carried out to give a resultant
that is itself a Lorentz transformation.
If we take the determinant of the defining equation of L,
$\det(L^T\eta L) = \det\eta$
we obtain
$(\det L)^2 = 1$ ($\det L = \det L^T$)
so that
detL = ±1. (3.24)
Since the determinant of L is not zero, an inverse transformation L–1 exists, and the
equation L–1L = I, the identity, is always valid.
Consider the inverse of the defining equation
$(L^T\eta L)^{-1} = \eta^{-1}$,
or
$L^{-1}\eta^{-1}(L^T)^{-1} = \eta^{-1}$.
Using $\eta = \eta^{-1}$, and rearranging, gives
$L^{-1}\eta(L^{-1})^T = \eta$. (3.25)
This result shows that the inverse L–1 is always a member of the set L.
The Lorentz transformations L are matrices, and therefore they obey the
associative property under matrix multiplication.
We therefore see that
1. If L1 and L2 L , then L2 L1 L
2. If L L , then L–1 L
3. The identity I = diag(1, 1, 1, 1) L
4. The matrix operators L obey associativity.
The set of all Lorentz transformations therefore forms a group.
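The group properties listed above can be spot-checked for particular members of the set. The sketch below assumes boosts along the x-axis as the sample Lorentz transformations and the metric diag(1, –1, –1, –1), called `ETA` in the code; the function names, the speed ratios 0.6 and 0.8, and the tolerance are illustrative choices. A numerical check of a few cases illustrates the argument but does not replace it.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def boost_x(beta):
    """4 x 4 Lorentz operator for motion along the x-axis."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -beta * g, 0, 0],
            [-beta * g, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def is_lorentz(L, tol=1e-9):
    """Test the defining property: transpose(L) ETA L equals ETA."""
    lhs = matmul(transpose(L), matmul(ETA, L))
    return all(abs(lhs[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))

L1, L2 = boost_x(0.6), boost_x(0.8)

print(is_lorentz(L1), is_lorentz(L2))   # True True
print(is_lorentz(matmul(L2, L1)))       # True: closure
print(is_lorentz(boost_x(-0.6)))        # True: the inverse of L1 is in the set
print(is_lorentz(IDENTITY))             # True: the identity is in the set
```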
3.5 The rotation group
Spatial rotations in two and three dimensions are Lorentz transformations in which
the time-component remains unchanged. In Chapter 1, the geometrical properties of the
rotation operators are discussed. In this section, we shall consider the algebraic structure
of the operators.
Let be a real 33 matrix that is part of a Lorentz transformation with a constant
1 0 0 0
0 (3.26)
L = 0 .
C L A S S I C A L A N D S P E C I A L R E L A T I V I T Y 57
In this case, the defining property of the Lorentz transformations leads to
1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0
0 0 -1 0 0 0 0 -1 0 0
0 T 0 0 -1 0 0 = 0 0 -1 0 (3.27)
0 0 0 0 -1 0 0 0 0 -1
so that
T = I , the identity matrix, diag(1,1,1).
This is the defining property of a three-dimensional orthogonal matrix. (The related
two-dimensional case is treated in Chapter 1.)
If x = [x1, x2, x3] is a three-vector that is transformed under $\mathcal{R}$ to give x´ then
$x'^Tx' = x^T\mathcal{R}^T\mathcal{R}x = x^Tx = x_1^2 + x_2^2 + x_3^2$ = invariant under $\mathcal{R}$. (3.28)
The action of $\mathcal{R}$ on any three-vector preserves length. The set of all 3 × 3 orthogonal
matrices is denoted by O(3),
$O(3) = \{\mathcal{R} : \mathcal{R}^T\mathcal{R} = I,\ r_{ij} \in \text{Reals}\}$.
The elements of this set satisfy the four group axioms.
3.6 The relativity of simultaneity: time dilation and length contraction
In order to record the time and place of a sequence of events in a particular
inertial reference frame, it is necessary to introduce an infinite set of adjacent “observers”,
located throughout the entire space. Each observer, at a known, fixed position in the
reference frame, carries a clock to record the time and the characteristic property of every
event in his immediate neighborhood. The observers are not concerned with non-local
events. The clocks carried by the observers are synchronized — they all read the same
time throughout the reference frame. The process of synchronization is discussed later. It
is the job of the chief observer to collect the information concerning the time, place, and
characteristic feature of the events recorded by all observers, and to construct the world
line (a path in space-time), associated with a particular characteristic feature (the type of
particle, for example).
Consider two sources of light, 1 and 2, and a point M midway between them. Let
E1 denote the event “flash of light leaves 1”, and E2 denote the event “flash of light leaves
2”. The events E1 and E2 are simultaneous if the flashes of light from 1 and 2 reach M at
the same time. The fact that the speed of light in free space is independent of the speed
of the source means that simultaneity is relative.
The clocks of all the observers in a reference frame are synchronized by correcting
them for the speed of light as follows:
Consider a set of clocks located at x0, x1, x2, x3, … along the x-axis of a reference
frame. Let x0 be the chief’s clock, and let a flash of light be sent from the clock at x0 when
it is reading t0 (12 noon, say). At the instant that the light signal reaches the clock at x1, it
is set to read t0 + (x1/c), at the instant that the light signal reaches the clock at x2, it is set
to read t0 + (x2/c) , and so on for every clock along the x-axis. All clocks in the reference
frame then “read the same time” — they are synchronized. From the viewpoint of all other
inertial observers, in their own reference frames, the set of clocks, synchronized using the
above procedure, appears to be unsynchronized. It is the lack of symmetry in the
synchronization of clocks in different reference frames that leads to two non-intuitive results,
namely, length contraction and time dilation.
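The synchronization procedure, and the way it fails for a second inertial observer, can be sketched numerically. This is an illustration under stated assumptions: the clock positions and the value V = 0.6c are hypothetical, and the time in S´ is taken from the standard Lorentz transformation ct´ = γct – βγx, the inverse of the form used later in this section.

```python
import math

C = 299_792_458.0  # speed of light in free space, m/s

def synchronized_reading(t0, x):
    """Reading set on the clock at x when the flash that left x = 0 at t0 arrives."""
    return t0 + x / C

def t_prime(t, x, beta):
    """Time in S' of the event [ct, x] in S, from ct' = gamma*ct - beta*gamma*x."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (t - beta * x / C)

# Hypothetical clock positions along the x-axis of S, in metres.
positions = [0.0, 1.0 * C, 2.0 * C]
readings = [synchronized_reading(0.0, x) for x in positions]

# The events "each clock of S reads 2 s" are simultaneous in S,
# but an observer in S' (beta = 0.6) assigns them different times.
t_common = 2.0
times_in_s_prime = [t_prime(t_common, x, 0.6) for x in positions]

print(readings)
print(times_in_s_prime)
```

The second list is not constant: clocks that all read the same time in S are, according to S´, out of step by an amount proportional to their separation.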
Length contraction: an application of the Lorentz transformation.
Consider a rigid rod at rest on the x-axis of an inertial reference frame S´. Because it is at
rest, it does not matter when its end-points x1´ and x2´ are measured to give the rest-, or
proper-length of the rod, L0´ = x2´ – x1´.
Consider the same rod observed in an inertial reference frame S that is moving with
constant velocity –V with its x-axis parallel to the x´-axis. We wish to determine the
length of the moving rod; we require the length L = x2 – x1 according to the observers in
S. This means that the observers in S must measure x1 and x2 at the same time in their
reference frame. The events in the two reference frames S, and S´ are related by the
spatial part of the Lorentz transformation:
$$x' = -\beta\gamma\,ct + \gamma x$$
and therefore
$$x_2' - x_1' = -\beta\gamma\,c(t_2 - t_1) + \gamma(x_2 - x_1),$$
where $\beta = V/c$ and $\gamma = 1/\sqrt{1 - \beta^2}$.
Since we require the length (x2 – x1) in S to be measured at the same time in S, we must
have t2 – t1 = 0, and therefore
$$L_0' = x_2' - x_1' = \gamma(x_2 - x_1),$$
or
$$L_0'\ (\text{at rest}) = \gamma L\ (\text{moving}). \tag{3.29}$$
The length of a moving rod, L, is therefore less than the length of the same rod measured
at rest, L0´, because $\gamma > 1$.
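The argument can be followed numerically. In this sketch the speed V = 0.6c, the common measurement time, and the end-point positions in S are hypothetical values chosen only to make the arithmetic simple.

```python
import math

def gamma_factor(beta):
    return 1.0 / math.sqrt(1.0 - beta ** 2)

def x_prime(ct, x, beta):
    """Spatial part of the Lorentz transformation: x' = -beta*gamma*ct + gamma*x."""
    g = gamma_factor(beta)
    return -beta * g * ct + g * x

beta = 0.6            # hypothetical speed, V = 0.6c
ct = 5.0              # both ends are measured at the same time in S (ct in metres)
x1, x2 = 2.0, 2.8     # hypothetical end-point positions in S, in metres

L_moving = x2 - x1                                        # length measured in S
L_rest = x_prime(ct, x2, beta) - x_prime(ct, x1, beta)    # proper length in S'

print(L_moving, L_rest, gamma_factor(beta) * L_moving)
```

Because both end-points share the same value of ct, the term in βγct cancels in the difference, and the proper length comes out as γ times the length measured in S.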
Time dilation
Consider a clock at rest at the origin of an inertial frame S´, and a set of
synchronized clocks at x0, x1, x2, … on the x-axis of another inertial frame S. Let S´ move at
constant speed V relative to S, along the common x-, x´-axis. Let the clocks at x0 and x0´
be synchronized to read t0 and t0´ at the instant that they coincide in space. A proper time
interval is defined to be the time between two events measured in an inertial frame in
which the two events occur at the same place. The time part of the Lorentz
transformation can be used to relate an interval of time measured on the single clock in
the S´ frame, and the same interval of time measured on the set of synchronized clocks at
rest in the S frame. We have
$$ct = \gamma\,ct' + \beta\gamma\,x'$$
or
$$c(t_2 - t_1) = \gamma\,c(t_2' - t_1') + \beta\gamma(x_2' - x_1').$$
There is no separation between a single clock and itself, therefore x2´ – x1´ = 0, so that
$$c(t_2 - t_1)\ (\text{moving}) = \gamma\,c(t_2' - t_1')\ (\text{at rest}) \quad (\gamma > 1). \tag{3.30}$$
A moving clock runs more slowly than a clock at rest.
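A short numerical sketch of equation (3.30) follows. The speed V = 0.8c and the proper interval are hypothetical values; the function simply evaluates the time part of the Lorentz transformation quoted above.

```python
import math

def gamma_factor(beta):
    return 1.0 / math.sqrt(1.0 - beta ** 2)

def ct_in_s(ct_prime, x_prime, beta):
    """Time part of the Lorentz transformation: ct = gamma*ct' + beta*gamma*x'."""
    g = gamma_factor(beta)
    return g * ct_prime + beta * g * x_prime

beta = 0.8                        # hypothetical speed, V = 0.8c
ct1_prime, ct2_prime = 0.0, 3.0   # two readings of the single clock at x' = 0

# Both events occur at the same place in S', so x' = 0 for each of them.
interval_proper = ct2_prime - ct1_prime
interval_in_s = ct_in_s(ct2_prime, 0.0, beta) - ct_in_s(ct1_prime, 0.0, beta)

print(interval_proper, interval_in_s)
```

With these values γ = 5/3, so the interval recorded by the synchronized clocks of S is longer than the proper interval by that factor.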
In Chapter 1, it was shown that the general 2 × 2 matrix operator transforms rectangular
coordinates into oblique coordinates. The Lorentz transformation is a special case of the
2 × 2 matrices, and therefore its effect is to transform rectangular space-time coordinates
into oblique space-time coordinates:
[Figure: The geometrical form of the Lorentz transformation. An event E[ct, x], or E´[ct´, x´], referred to the rectangular ct, x axes and to the oblique ct´, x´ axes.]
The symmetry of space-time means that the transformed axes rotate through equal
angles, $\tan^{-1}\beta$. The relativity of simultaneity is clearly exhibited on this diagram: two
events that occur at the same time in the ct, x -frame necessarily occur at different times in
the oblique ct´, x´-frame.
3.7 The 4-velocity
A differential time interval, dt, cannot be used in a Lorentz-invariant way in
kinematics. We must use the proper time differential interval, $d\tau$, defined by
$$(c\,dt)^2 - dx^2 = (c\,dt')^2 - dx'^2 \equiv (c\,d\tau)^2. \tag{3.31}$$
The Newtonian 3-velocity is
vN = [dx/dt, dy/dt, dz/dt],
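and this must be replaced by the 4-velocity
$$
\begin{aligned}
V^{\mu} &= [\,d(ct)/d\tau,\ dx/d\tau,\ dy/d\tau,\ dz/d\tau\,] \\
&= [\,d(ct)/dt,\ dx/dt,\ dy/dt,\ dz/dt\,]\,(dt/d\tau) \\
&= [\,\gamma c,\ \gamma \mathbf{v}_N\,].
\end{aligned}
\tag{3.32}
$$
The scalar product is then
$$
\begin{aligned}
V^{\mu}V_{\mu} &= (\gamma c)^2 - (\gamma v_N)^2 \quad \text{(the transpose is understood)} \\
&= (\gamma c)^2\,(1 - (v_N/c)^2) \\
&= c^2.
\end{aligned}
\tag{3.33}
$$
The magnitude of the 4-velocity is therefore $|V^{\mu}| = c$, the invariant speed of light.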
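The invariance of the magnitude can be confirmed for any Newtonian 3-velocity with speed less than c. The following is a minimal sketch; the particular velocity components are hypothetical, and the function names are illustrative only.

```python
import math

C = 299_792_458.0  # speed of light in free space, m/s

def four_velocity(v_newton):
    """Return [gamma*c, gamma*vx, gamma*vy, gamma*vz] for a Newtonian 3-velocity."""
    speed_sq = sum(component ** 2 for component in v_newton)
    gamma = 1.0 / math.sqrt(1.0 - speed_sq / C ** 2)
    return [gamma * C] + [gamma * component for component in v_newton]

def invariant_norm_sq(four_vector):
    """Scalar product with the signature (+, -, -, -) used in equation (3.31)."""
    time_part, *space_part = four_vector
    return time_part ** 2 - sum(component ** 2 for component in space_part)

V = four_velocity([0.3 * C, 0.4 * C, 0.0])   # hypothetical 3-velocity, |v| = 0.5c
magnitude = math.sqrt(invariant_norm_sq(V))

print(magnitude / C)   # ratio of the 4-velocity magnitude to c
```

Changing the 3-velocity changes every component of the 4-velocity, but the printed ratio should remain equal to 1 to within rounding error.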
3-1 Two points, A and B, move in the plane with constant velocities |vA| = √2 m.s–1 and
|vB| = 2√2 m.s–1. They move from their initial (t = 0) positions, A(0)[1, 1] and B(0)[6, 2]
as shown:
[Figure: the initial positions A(0) and B(0), and the initial separation R(0), plotted in the x-y plane with both axes in meters.]
Show that the closest distance between the points is |R|min = 2.529882… meters,
and that it occurs 1.40…seconds after they leave their initial positions. (Remember
that all inertial frames are equivalent, therefore choose the most appropriate for
dealing with this problem).
3-2 Show that the set of all standard (motion along the common x-axis) Galilean
transformations forms a group.
3-3 A flash of light is sent out from a point x1 on the x-axis of an inertial frame S, and it is
received at a point x2 = x1 + l. Consider another inertial frame, S´, moving with
constant speed V = $\beta c$ along the x-axis; show that, in S´:
i) the separation between the point of emission and the point of reception of the light
is $l' = l\{(1 - \beta)/(1 + \beta)\}^{1/2}$
ii) the time interval between the emission and reception of the light is
$\Delta t' = (l/c)\{(1 - \beta)/(1 + \beta)\}^{1/2}$.
3-4 The distance between two photons of light that travel along the x-axis of an inertial
frame, S, is always l. Show that, in a second inertial frame, S´, moving at constant
speed V = $\beta c$ along the x-axis, the separation between the two photons is
$\Delta x' = l\{(1 + \beta)/(1 - \beta)\}^{1/2}$.
3-5 An event [ct, x] in an inertial frame, S, is transformed under a standard Lorentz
transformation to [ct´, x´] in a standard primed frame, S´, that has a constant speed V
along the x-axis. Show that the velocity components of the point x, x´ are related by
the equation
$$v_x = (v_x' + V)/(1 + (v_x' V/c^2)).$$
3-6 An object called a K0-meson decays when at rest into two objects called π-mesons
(π±), each with a speed of 0.8c. If the K0-meson has a measured speed of 0.9c when it
decays, show that the greatest speed of one of the π-mesons is (85/86)c and that its
least speed is (5/14)c.
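The values quoted in problem 3-6 can be checked against the velocity relation of problem 3-5. The sketch below assumes that the extreme speeds occur when the π-mesons are emitted along, and against, the line of flight of the K0-meson; speeds are expressed in units of c, and exact rational arithmetic is used so that the fractions can be compared directly.

```python
from fractions import Fraction

def add_velocities(v_prime, V):
    """v_x = (v_x' + V) / (1 + v_x' V / c^2), with all speeds in units of c."""
    return (v_prime + V) / (1 + v_prime * V)

V = Fraction(9, 10)   # speed of the K0-meson in the laboratory
u = Fraction(8, 10)   # speed of each pi-meson in the rest frame of the K0-meson

fastest = add_velocities(u, V)    # emitted along the line of flight
slowest = add_velocities(-u, V)   # emitted against the line of flight

print(fastest, slowest)   # prints: 85/86 5/14
```

This confirms the arithmetic only; the derivation of the velocity relation itself is the subject of problem 3-5.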
Although our discussion of the geometry of motion has led to major advances in our
understanding of measurements of space and time in different inertial systems, we have
yet to come to the crux of the matter, namely — a discussion of the effects of forces on the
motion of two or more interacting particles. This key branch of Physics is called
Dynamics. It was founded by Galileo and Newton and perfected by their followers, most
notably Lagrange and Hamilton. We shall see that the Newtonian concepts of mass,
momentum and kinetic energy require fundamental revisions in the light of Einstein’s
Special Theory of Relativity. The revised concepts come about as a result of Einstein’s
recognition of the crucial rôle of the Principle of Relativity in unifying the dynamics of all
mechanical and optical phenomena. In spite of the conceptual difficulties inherent in the
classical concepts (difficulties that will be discussed later), the subject of Newtonian
dynamics represents one of the great triumphs of Natural Philosophy. The successes of the
classical theory range from accurate descriptions of the dynamics of everyday objects to a
detailed understanding of the motions of galaxies.
4.1 The law of inertia
Galileo (1564-1642) was the first to develop a quantitative approach to the study of
motion. He addressed the question — what property of motion is related to force? Is it
the position of the moving object? Is it the velocity of the moving object? Is it the rate of
change of its velocity? …The answer to the question can be obtained only from
observations; this is a basic feature of Physics that sets it apart from Philosophy proper.
Galileo observed that force influences the changes in velocity (accelerations) of an object
and that, in the absence of external forces (e.g., friction), no force is needed to keep an
object in motion that is travelling in a straight line with constant speed. This
observationally based law is called the Law of Inertia. It is, perhaps, difficult for us to
appreciate the impact of Galileo’s new ideas concerning motion. The fact that an object
resting on a horizontal surface remains at rest unless something we call force is applied to
change its state of rest was, of course, well-known before Galileo’s time. However, the
fact that the object continues to move after the force ceases to be applied caused
considerable conceptual difficulties for the early Philosophers (see Feynman, The
Character of Physical Law). The observation that, in practice, an object comes to rest
due to frictional forces and air resistance was recognized by Galileo to be a side effect, and
not germane to the fundamental question of motion. Aristotle, for example, believed that
the true or natural state of motion is one of rest. It is instructive to consider Aristotle’s
conjecture from the viewpoint of the Principle of Relativity: is a natural state of rest
consistent with this general Principle? According to the general Principle of Relativity, the
laws of motion have the same form in all frames of reference that move with constant
speed in straight lines with respect to each other. An observer in a reference frame moving
with constant speed in a straight line with respect to the reference frame in which the
object is at rest would conclude that the natural state of motion of the object is one of
constant speed in a straight line, and not one of rest. All inertial observers, in an infinite
number of frames of reference, would come to the same conclusion. We see, therefore,
that Aristotle’s conjecture is not consistent with this fundamental Principle.
4.2 Newton’s laws of motion
During his early twenties, Newton postulated three Laws of Motion that form the
basis of Classical Dynamics. He used them to solve a wide variety of problems including
the dynamics of the planets. The Laws of Motion, first published in the Principia in 1687,
play a fundamental rôle in Newton’s Theory of Gravitation (Chapter 7); they are:
1. In the absence of an applied force, an object will remain at rest or in its present state of
constant speed in a straight line (Galileo’s Law of Inertia).
2. In the presence of an applied force, an object will be accelerated in the direction of the
applied force, and the product of its mass and its acceleration is equal to the
applied force.
3. If a body A exerts a force of magnitude |FAB| on a body B, then B exerts a force of
equal magnitude |FBA| on A. The forces act in opposite directions so that
FAB = –FBA .
In law number 2, the acceleration lasts only while the applied force lasts. The applied
force need not, however, be constant in time — the law is true at all times during the
motion. Law number 3 applies to “contact” interactions. If the bodies are separated, and
the interaction takes a finite time to propagate between the bodies, the law must be
modified to include the properties of the “field” between the bodies. This important
point is discussed in Chapter 7.
4.3 Systems of many interacting particles: conservation of linear and angular momentum
Studies of the dynamics of two or more interacting particles form the basis of a key
part of Physics. We shall deduce two fundamental principles from the Laws of Motion; they are:
1) The Conservation of Linear Momentum which states that, if there is a direction in
which the sum of the components of the external forces acting on a system is zero, then
the linear momentum of the system in that direction is constant.
2) The Conservation of Angular Momentum which states that, if the sum of the moments
of the external forces about any fixed axis (or origin) is zero, then the angular momentum
about that axis (or origin) is constant.
The new terms that appear in these statements will be defined later.
The first of these principles will be deduced by considering the dynamics of two
interacting particles of masses m1 and m2 with instantaneous coordinates [x1, y1] and [x2,
y2], respectively. In Chapter 12, these principles will be deduced by considering the
invariance of the Laws of Motion under translations and rotations of the coordinate systems.
Let the external forces acting on the particles be F1 and F2 , and let the mutual
interactions be F21´ and F12´. The system is as shown
[Figure: the masses m1 and m2, acted on by the external forces F1 and F2 and by the mutual interactions F21´ and F12´.]
Resolving the forces into their x- and y-components gives
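[Figure: the x- and y-components of the external forces and of the mutual interactions (Fx21´, Fy21´, Fx12´, Fy12´) acting on m1 and m2.]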
a) The equations of motion
The equations of motion for each particle are
1) Resolving in the x-direction
$$F_{x1} + F'_{x21} = m_1(d^2x_1/dt^2) \tag{4.1}$$
$$F_{x2} - F'_{x12} = m_2(d^2x_2/dt^2). \tag{4.2}$$
Adding these equations gives
$$F_{x1} + F_{x2} + (F'_{x21} - F'_{x12}) = m_1(d^2x_1/dt^2) + m_2(d^2x_2/dt^2). \tag{4.3}$$
2) Resolving in the y-direction gives a similar equation, namely
$$F_{y1} + F_{y2} + (F'_{y21} - F'_{y12}) = m_1(d^2y_1/dt^2) + m_2(d^2y_2/dt^2). \tag{4.4}$$
b) The rôle of Newton’s 3rd Law
For instantaneous mutual interactions, Newton’s 3rd Law gives |F21´| = |F12´|
so that the x- and y-components of the internal forces are themselves equal and opposite,
therefore the total equations of motion are
$$F_{x1} + F_{x2} = m_1(d^2x_1/dt^2) + m_2(d^2x_2/dt^2), \tag{4.5}$$
$$F_{y1} + F_{y2} = m_1(d^2y_1/dt^2) + m_2(d^2y_2/dt^2). \tag{4.6}$$
c) The conservation of linear momentum
If the sum of the external forces acting on the masses in the x-direction is zero, then
Fx1 + Fx2 = 0 , (4.7)
in which case,
$$0 = m_1(d^2x_1/dt^2) + m_2(d^2x_2/dt^2)$$
or
$$0 = (d/dt)(m_1v_{x1}) + (d/dt)(m_2v_{x2}),$$
which, on integration, gives
$$\text{constant} = m_1v_{x1} + m_2v_{x2}. \tag{4.8}$$
The product (mass × velocity) is the linear momentum. We therefore see that if there is
no resultant external force in the x-direction, the linear momentum of the two particles in
the x-direction is conserved. The above argument can be generalized so that we can state:
the linear momentum of the two particles is constant in any direction in which there is no
resultant external force.
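The result can be illustrated with a simple numerical integration of equations (4.1) and (4.2) for the case of zero external force. The sketch below is an illustration under stated assumptions: the masses, initial conditions, and the spring-like form of the mutual interaction are hypothetical, and a simple fixed-step integration scheme is used rather than any method taken from the text.

```python
def simulate(steps=1000, dt=0.001):
    """Two masses on the x-axis coupled only by an internal, spring-like force."""
    m1, m2 = 1.0, 3.0              # hypothetical masses
    x1, x2 = 0.0, 1.5              # x-coordinates
    v1, v2 = 2.0, -1.0             # x-components of velocity
    k, rest_length = 50.0, 1.0     # hypothetical interaction parameters
    momenta = []
    for _ in range(steps):
        # Internal force on m1 due to m2; by Newton's 3rd Law the force
        # on m2 due to m1 is equal in magnitude and opposite in direction.
        f_on_1 = k * ((x2 - x1) - rest_length)
        f_on_2 = -f_on_1
        v1 += (f_on_1 / m1) * dt
        v2 += (f_on_2 / m2) * dt
        x1 += v1 * dt
        x2 += v2 * dt
        momenta.append(m1 * v1 + m2 * v2)   # total linear momentum, as in (4.8)
    return momenta

momenta = simulate()
print(min(momenta), max(momenta))
```

The individual velocities change continually as the masses interact, but the two printed values should agree to within rounding error because the internal impulses cancel at every step.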
4.3.1 Interaction of n-particles
The analysis given in 4.3 can be carried out for an arbitrary number of particles, n,
with masses m1, m2, … mn and with instantaneous coordinates [x1, y1], [x2, y2], … [xn, yn]. The
mutual interactions cancel in pairs so that the equations of motion of the n-particles are, in
the x-direction
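$$F_{x1} + F_{x2} + \dots + F_{xn} = m_1(d^2x_1/dt^2) + m_2(d^2x_2/dt^2) + \dots + m_n(d^2x_n/dt^2), \tag{4.9}$$
where the left-hand side is the sum of the x-components of the external forces acting on the masses,
and, in the y-direction
$$F_{y1} + F_{y2} + \dots + F_{yn} = m_1(d^2y_1/dt^2) + m_2(d^2y_2/dt^2) + \dots + m_n(d^2y_n/dt^2), \tag{4.10}$$
where the left-hand side is the sum of the y-components of the external forces acting on the masses.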
In this case, we see that if the sum of the components of the external forces acting
on the system in a particular direction is zero, then the linear momentum of the system in
that direction is constant. If, for example, the direction is the x-axis then
m1vx1 + m2vx2 + … mnvxn = constant. (4.11)
4.3.2 Rotation of two interacting particles about a fixed point
We begin the discussion of the second fundamental conservation law by considering
the motion of two interacting particles that move under the influence of external forces F1
and F2, and mutual interactions (internal forces) F21´ and F12´. We are interested in the
motion of the two masses about a fixed point O that is chosen to be the origin of Cartesian
coordinates. The system is illustrated in the following figure
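[Figure: the masses m1 and m2 in the x-y plane, with the forces F2, F12´ and F21´, the distances R1 and R2, and the sense of positive moment indicated.]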
a) The moment of forces about a fixed origin
The total moment $\Gamma_{1,2}$ of the forces about the origin O is defined as
$$\Gamma_{1,2} = \underbrace{R_1F_1 + R_2F_2}_{\text{moment of external forces}} + \underbrace{(R'F'_{12} - R'F'_{21})}_{\text{moment of internal forces}} \tag{4.12}$$
A positive moment acts in a counter-clockwise sense.
Newton’s 3rd Law gives
|F21´| = |F12´| ,
therefore the moment of the internal forces about O is zero. (Their lines of action are the same.)
The total effective moment about O is therefore due to the external forces alone. Writing
the moment in terms of the x- and y-components of F1 and F2, we obtain
$$\Gamma_{1,2} = x_1F_{y1} + x_2F_{y2} - y_1F_{x1} - y_2F_{x2}. \tag{4.13}$$
b) The conservation of angular momentum
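If the moment of the external forces about the origin O is zero, the total moment is the time
derivative of the sum of the quantities xpy – ypx for the two particles (the internal
moments cancel, and the terms vxpy – vypx vanish because the velocity is parallel to the
momentum). By integration, we therefore have
$$\text{constant} = x_1p_{y1} + x_2p_{y2} - y_1p_{x1} - y_2p_{x2},$$
where px1 is the x-component of the momentum of mass 1, etc.
Rearranging gives
$$\text{constant} = (x_1p_{y1} - y_1p_{x1}) + (x_2p_{y2} - y_2p_{x2}). \tag{4.14}$$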
The right-hand side of this equation is called the angular momentum of the two particles
about the fixed origin, O.
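Equation (4.14) can be checked numerically for two particles that interact only with each other. The sketch below is an illustration under stated assumptions: the masses, initial conditions, and the spring-like mutual interaction directed along the line joining the particles are hypothetical, and a simple fixed-step integration scheme is used.

```python
def total_angular_momentum(particles):
    """Sum of (x*py - y*px) over the particles, as in equation (4.14)."""
    return sum(p["x"] * p["py"] - p["y"] * p["px"] for p in particles)

def step(particles, dt, k=20.0):
    """Advance two particles coupled by a hypothetical internal force."""
    a, b = particles
    # The internal force on a acts along the line joining the particles;
    # the force on b is equal and opposite (Newton's 3rd Law).
    fx = k * (b["x"] - a["x"])
    fy = k * (b["y"] - a["y"])
    a["px"] += fx * dt
    a["py"] += fy * dt
    b["px"] -= fx * dt
    b["py"] -= fy * dt
    for p in particles:
        p["x"] += p["px"] / p["m"] * dt
        p["y"] += p["py"] / p["m"] * dt

particles = [
    {"m": 1.0, "x": 1.0, "y": 0.0, "px": 0.0, "py": 1.0},
    {"m": 2.0, "x": -1.0, "y": 0.5, "px": 0.5, "py": -1.0},
]

initial = total_angular_momentum(particles)
for _ in range(2000):
    step(particles, dt=0.001)
final = total_angular_momentum(particles)

print(initial, final)
```

No external forces act in this example, so the moment about O is zero throughout, and the two printed values should agree to within rounding error even though each particle's own contribution changes with time.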
Alternatively, we can discuss the conservation of angular momentum using vector
analysis. Consider a non-relativistic particle of mass m and momentum p, moving in the
plane under the influence of an external force F about a fixed origin, O:
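[Figure: a particle of mass m and momentum p, at position r relative to the fixed origin O, acted on by the force F.]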
The angular momentum, L, of m about O can be written in vector form
$$\mathbf{L} = \mathbf{r} \times \mathbf{p}. \tag{4.15}$$
The torque, $\Gamma$, associated with the external force F actin

