In three-space we define a common vector by specifying three coordinate values, like r=(x,y,z) for position, v=(Vx,Vy,Vz) for velocity, etc. And if we write r=xi+yj+zk you know what is meant. But a more general way to write down the vector components might be $x^i = (x^1, x^2, x^3)$. Then we can add a fourth component, time measured in metres, ct, as either $x^0$ or $x^4$. Sadly, it is done both ways (and there is worse to come), but we will use the convention $x^0 = ct$. Now let us write the position four-vector in a number of different ways, all used at one time or another:
We use whichever notation is most convenient at the time, so be prepared. Note that we use Greek symbols like µ to imply a four-vector and Roman like i to imply a "three-vector". Now for some convention that is followed fairly rigorously: first, notice that the elements are denoted by superscripts, so x³ means the z component, not x cubed. It is imperative that you know what is happening here: you must know from context when x² means "y component" and not "x squared". Subscripts will also be used, as in $R_\mu = (x_0, x_1, x_2, x_3)$.
There are a number of "conventional" approaches to operational special relativity (SR), but I think it is a good idea to start right away with a convention used in electrodynamics and general relativity (GR). What follows is an extremely simplistic approach, using two-dimensional flat space; a more detailed account is HERE.
Consider a simple two-dimensional frame with non-collinear reference axes X1 and X2 ("X1 X2" like "X Y" axes). There are a number of ways to label the coordinates of a point p. The "usual" way is via the route x¹ along X1, then x² parallel to the X2 axis. This gives the vector coordinates we learn about in first-year physics. We call such labelling contravariant and denote the coordinates by superscripts. But there is another useful way, running perpendicular to the axes, such as the route x₁ (perpendicular to X2) and x₂. We call this labelling covariant and denote the coordinates by subscripts.
If the axes are orthogonal, there is no difference between the contravariant and covariant coordinates. But if the axes are not orthogonal, or if there is some "funny" relation in the system, there is a very important physical (geometrical) reason to be able to obtain the covariant counterpart of the usual contravariant vector or tensor: the scalar product of two vectors, $P^\mu Q_\mu$, and especially the norm (the length or size of the vector), must not depend on the coordinate system in use. Since this is not a course in differential geometry, I simply state the result: one always takes the product of a contravariant entity with a covariant one. To obtain the norm of a four-vector, you multiply the contravariant vector by its covariant counterpart. In SR this turns out to be very simple, and is often replaced by a "trick", but it is important to get started right!
So how does one obtain the covariant counterpart of a "usual" contravariant vector? From the figure above it would seem to depend somehow on the geometry. It does, and the geometry is usually even more complicated than the simple non-orthogonal example above. The geometry is specified by a metric tensor $g_{\mu\nu}$ which, when multiplied with a contravariant vector $Q^\nu$, gives the covariant counterpart, $Q_\mu = \sum_\nu g_{\mu\nu} Q^\nu$, where the sum is over ν (and of course there are four of these sums, one for each of the four µ components ... "the usual way").
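To make the two labellings concrete, here is a minimal Python sketch for a two-dimensional frame whose axes meet at 60 degrees. The basis vectors `e1` and `e2`, the angle, and the sample point are illustrative assumptions, not values from the text. The metric is taken as g_ij = e_i · e_j, which is the standard definition in flat space.

```python
import math

# Assumed unit basis vectors along X1 and X2, drawn 60 degrees apart.
angle = math.radians(60)
e1 = (1.0, 0.0)
e2 = (math.cos(angle), math.sin(angle))
basis = (e1, e2)

def dot(a, b):
    """Ordinary Cartesian dot product of two 2D vectors."""
    return a[0] * b[0] + a[1] * b[1]

# Metric tensor g_ij = e_i . e_j (off-diagonal terms vanish only for orthogonal axes).
g = [[dot(basis[i], basis[j]) for j in range(2)] for i in range(2)]

def lower(q_contra):
    """Covariant components Q_i = sum_j g_ij Q^j."""
    return [sum(g[i][j] * q_contra[j] for j in range(2)) for i in range(2)]

# Hypothetical point: 2 units along X1, then 3 units parallel to X2.
q_contra = [2.0, 3.0]
q_co = lower(q_contra)

# Squared length: contravariant times covariant, summed over the index.
norm_sq = sum(q_contra[i] * q_co[i] for i in range(2))

# Cross-check against the Cartesian length of the same point.
px = q_contra[0] * e1[0] + q_contra[1] * e2[0]
py = q_contra[0] * e1[1] + q_contra[1] * e2[1]
cartesian_sq = px * px + py * py

print(q_co, norm_sq, cartesian_sq)
```

With these assumed axes the covariant components differ from the contravariant ones, yet the contravariant-times-covariant product agrees with the Cartesian squared length up to rounding.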
The notation used gave Einstein one of his "greatest discoveries" (his words). Indeed, every time an index is repeated, once contravariant and once covariant, in a horrible string of tensors, such a summation is always implied. So why bother with the summation symbol? And now you are in on a little secret, the "Einstein Convention"! So if you see something like $A_{\mu\nu} B_\pi{}^\sigma C_{\pi\sigma}$, you know that there is actually a summation over the σ index. The π index was repeated, but not "upstairs", so no summation.
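Spelled out in code, the summation convention is just an explicit loop over the repeated index. The following is a small Python sketch with made-up component values; the names `P_up`, `Q_down`, `B_upup` and `C_down` are hypothetical and only indicate where each index sits.

```python
# Hypothetical components, chosen only for illustration.
P_up = [1.0, 2.0, 3.0, 4.0]      # P^mu
Q_down = [5.0, 6.0, 7.0, 8.0]    # Q_mu

# P^mu Q_mu: mu appears once upstairs and once downstairs, so it is summed.
scalar = sum(P_up[mu] * Q_down[mu] for mu in range(4))

# B^{mu sigma} C_sigma: sigma is summed, mu stays free,
# so the result D^mu still carries one index.
B_upup = [[float(4 * mu + sigma) for sigma in range(4)] for mu in range(4)]
C_down = [1.0, 0.0, -1.0, 2.0]
D_up = [
    sum(B_upup[mu][sigma] * C_down[sigma] for sigma in range(4))
    for mu in range(4)
]

print(scalar)  # 70.0
print(D_up)    # [4.0, 12.0, 20.0, 28.0]
```

A repeated index disappears from the result, while a free index survives, which is a quick way to check that an expression has been contracted correctly.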
Finally, it is conventional to use Greek superscripts and subscripts for four-component vectors and Roman for spacelike components. (Why nothing special for the timelike component? There is: just use zero!)
The line element ds² (= dx² + dy² + dz² = $dx^i dx_i$ in classical mechanics) becomes $ds^2 = dX^\mu dX_\mu = g_{\mu\nu}\, dX^\nu dX^\mu$.
That was easy, but now what about that metric tensor? In regular Cartesian three-space the metric tensor is simply the 3×3 identity matrix, so there is no difference between covariant and contravariant vectors ($dx^i = dx_i$). That is why you may never have learned that you are not supposed to multiply two contravariant vectors: it doesn't matter. The four-vectors of SR are nearly as simple, except that the timelike element of the metric has the opposite sign to the spacelike elements. Again, there are two choices here: make the timelike element −1 and the spacelike elements +1, which might feel "comfortable", or make the timelike element +1 and the spacelike elements −1, −1, −1. It doesn't matter as long as you are consistent, but I think the + − − − signature is becoming standard, so it is the one used here. The Lorentz metric tensor can be written $g_{\mu\nu}$ =
$$
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}.
$$
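As a sketch of how this metric is used, the Python below lowers the index of a four-vector and forms its square by multiplying contravariant and covariant components. The helper names `lower` and `square` and the sample vector are assumptions made for illustration.

```python
# Lorentz metric with signature (+, -, -, -), written as a nested list.
G = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]

def lower(x_up):
    """Covariant components X_mu = sum_nu g_{mu nu} X^nu."""
    return [sum(G[mu][nu] * x_up[nu] for nu in range(4)) for mu in range(4)]

def square(x_up):
    """Invariant square X^mu X_mu."""
    x_down = lower(x_up)
    return sum(x_up[mu] * x_down[mu] for mu in range(4))

# Hypothetical four-vector (ct, x, y, z), all components in metres.
x_up = [5.0, 3.0, 0.0, 0.0]

print(lower(x_up))   # the spacelike components change sign
print(square(x_up))  # 16.0
```

The "trick" often used in SR is visible here: lowering the index only flips the sign of the three spacelike components, so the square reduces to (ct)² − x² − y² − z².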
If dS² is positive, the events are timelike separated and a signal travelling slower than light can be sent from one to the other. In all frames one event occurs before the other, and there exists a frame in which both events occur in the same place.
If dS² = 0, the event separation is null (lightlike), as for the emission and subsequent absorption of a photon in vacuum.
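The sign test can be written as a short Python sketch. It assumes the + − − − signature used here, events given as (ct, x, y, z), and labels the remaining case (dS² negative) as spacelike, which is the standard name for it. The function names and sample events are hypothetical.

```python
def interval_squared(event_a, event_b):
    """dS^2 between two events (ct, x, y, z), signature (+, -, -, -)."""
    d = [b - a for a, b in zip(event_a, event_b)]
    return d[0] ** 2 - d[1] ** 2 - d[2] ** 2 - d[3] ** 2

def classify(ds2, tol=1e-12):
    """Name the separation from the sign of dS^2 (tol guards against rounding)."""
    if abs(ds2) <= tol:
        return "null"
    return "timelike" if ds2 > 0 else "spacelike"

origin = (0.0, 0.0, 0.0, 0.0)

# Hypothetical events, in metres.
for event in [(5.0, 3.0, 0.0, 0.0), (3.0, 3.0, 0.0, 0.0), (3.0, 5.0, 0.0, 0.0)]:
    ds2 = interval_squared(origin, event)
    print(event, ds2, classify(ds2))
```

The tolerance is a practical choice for floating-point input; with exact arithmetic the null case is simply dS² = 0.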
$$
\begin{pmatrix}
\gamma & -\gamma\beta & 0 & 0 \\
-\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$
which is the "four vector way" of writing the more familiar
X'⁰ = γ[X⁰ − βX¹], that is, ct' = γ[ct − βx],
X'¹ = γ[X¹ − βX⁰], that is, x' = γ[x − βct],
and of course y'=y and z'=z.
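To see that the matrix form and the familiar equations say the same thing, here is a minimal Python sketch that applies both to one event. The helper names `boost_matrix` and `apply`, the value β = 0.6, and the sample event are assumptions for illustration; γ is computed from the standard relation γ = 1/√(1 − β²).

```python
import math

def boost_matrix(beta):
    """4x4 boost along the X1 axis for speed beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(matrix, x_up):
    """X'^mu = sum_nu matrix[mu][nu] * X^nu."""
    return [sum(matrix[mu][nu] * x_up[nu] for nu in range(4)) for mu in range(4)]

beta = 0.6                       # hypothetical speed, giving gamma = 1.25
gamma = 1.0 / math.sqrt(1.0 - beta * beta)
ct, x, y, z = 5.0, 3.0, 1.0, 2.0  # hypothetical event, in metres

via_matrix = apply(boost_matrix(beta), [ct, x, y, z])
via_equations = [gamma * (ct - beta * x), gamma * (x - beta * ct), y, z]

print(via_matrix)
print(via_equations)
```

Both routes give the same primed components up to rounding; the matrix form becomes the more convenient one when boosts are combined or applied to tensors.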
If you "boost" that X=(100,100,100,100) event by β = √2/2 (so γ = 1/√(1 − β²) = √2 ≈ 1.414) you get ct' = 1.414(100 − 70.7), x' = 1.414(100 − 70.7), y' = z' = 100, i.e.,
X' = (41.4, 41.4, 100, 100) and.... well, of course, the zero and one components change by the same amount, so (X')² = −20000 = (X)². The square of the four-vector has the same value in all frames; the components magically adjust themselves to make it so! This is true in the more general transform.
Now try another simple case: Xµ = (30,40,0,0). The square is 900 − 1600 = −700, and with the previous boost the vector goes to (2.43, 26.57, 0, 0), so (X')² = 5.89 − 705.89 = −700. As it must!
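A short numerical check of the two boosts above, as a Python sketch. It derives γ from β through γ = 1/√(1 − β²) rather than taking it as given, and uses the + − − − signature; the function names `boost_x` and `square` are hypothetical.

```python
import math

def boost_x(x_up, beta):
    """Boost (ct, x, y, z) along the x axis with speed beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    ct, x, y, z = x_up
    return [gamma * (ct - beta * x), gamma * (x - beta * ct), y, z]

def square(x_up):
    """Invariant square with signature (+, -, -, -)."""
    ct, x, y, z = x_up
    return ct * ct - x * x - y * y - z * z

beta = math.sqrt(2) / 2

for event in ([100.0, 100.0, 100.0, 100.0], [30.0, 40.0, 0.0, 0.0]):
    boosted = boost_x(event, beta)
    # The squares before and after the boost should agree up to rounding.
    print(event, square(event))
    print([round(c, 2) for c in boosted], round(square(boosted), 6))
```

Because γ and β are tied together by γ²(1 − β²) = 1, the square only comes out invariant when γ is the value that belongs to the chosen β.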