How We Came to Know the Cosmos: Light & Matter


Chapter 17. Schrödinger’s Wave Equation

17.1 Wave functions

In 1926, Erwin Schrödinger reasoned that if electrons behave as waves, then it should be possible to describe them using a wave equation, like the equation that describes the vibrations of strings (discussed in Chapter 1) or Maxwell’s equations for electromagnetic waves (discussed in Chapter 5).[1]

17.1.1 Classical wave functions

A wave equation typically describes how a wave function evolves in time. A function describes a relationship between two values. The expression f(x) = x+1, for example, is a function because every value of x gives exactly one value of f(x).

A wave function describes the behaviour of something that is waving. In the case of Maxwell’s equations, the wave function describes the behaviour of the electric and magnetic fields. In the case of a wave on a string, the wave function describes the displacement of the string. All waves can be described in terms of the sum of sin or cos waves (discussed in Chapter 2), with adjustments to the position of the peak, the wavelength, and the amplitude.

The position of the peak is changed by adding to or subtracting a number from x. The wave produced on a plot of x against y for y(x) = cos(x), for example, can be moved 90° to the right by subtracting 90 from x, or 90° to the left by adding 90 to x, as shown in Figure 17.1.

The wavelength can be changed by multiplying x by a number. The wavelength produced on a plot of x against y for y(x) = cos(x), for example, can be doubled by multiplying x by 1/2 or tripled by multiplying x by 1/3. It can be halved by multiplying x by two or reduced to a third by multiplying x by three, as shown in Figure 17.2.

The amplitude can be changed by multiplying the result by a constant. The amplitude produced on a plot of x against y for y(x) = cos(x), for example, can be doubled by multiplying cos(x) by two. It can be halved by multiplying it by 1/2, as shown in Figure 17.3. For moving waves, these factors are affected by time as well as position, and so y(x) is denoted Ψ(x,t).

A plot of cos(x), cos(x+90°), and cos(x-90°). cos(x-90°) = sin(x). In the first plot, the peak is at 0. The second plot moves backwards by 90°, and the third plot moves forward by the same amount.

Figure 17.1

A plot showing the effect of adding a constant to x before calculating.

A plot of cos(x), cos(x/2), and cos(2x). In the first plot, the wavelength is 360°. In the second plot, the wavelength is twice as long, and in the third, it is half as long.

Figure 17.2

A plot showing the effect of multiplying x by a constant before calculating.

A plot of cos(x), 2cos(x), and (1/2)cos(x). In the first plot, the peak is at 1. In the second, the peak is at 2, and in the third, it is at 1/2.

Figure 17.3

A plot showing the effect of multiplying the result by a constant after calculating.
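
The three adjustments shown in Figures 17.1 to 17.3 can also be checked numerically. The following is a minimal Python sketch; the helper name `wave` and its parameters are assumptions made for illustration, and angles are given in degrees to match the figures.

```python
import math

def wave(x_deg, amplitude=1.0, scale=1.0, shift_deg=0.0):
    """Return amplitude * cos(scale * x + shift), with x and shift in degrees."""
    return amplitude * math.cos(math.radians(scale * x_deg + shift_deg))

# Subtracting 90 from x moves the peak 90 degrees to the right: cos(x - 90) = sin(x).
shifted_peak = wave(90, shift_deg=-90)     # the peak now sits at x = 90

# Multiplying x by 1/2 doubles the wavelength: the wave repeats after 720 degrees.
half_scale_at_360 = wave(360, scale=0.5)   # a trough, not a peak, at 360
half_scale_at_720 = wave(720, scale=0.5)   # the next peak, at 720

# Multiplying the result by 2 doubles the amplitude.
doubled_peak = wave(0, amplitude=2.0)

print(shifted_peak, half_scale_at_360, half_scale_at_720, doubled_peak)
```

Each adjustment is a separate argument, so the effects can be combined or tried one at a time.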

The general equation for a moving wave is,

Ψ(x,t) = Acos(kx-ωt) (17.1)

A is equal to the amplitude. k is multiplied by x to determine the wavelength, and ωt determines where the peak lies.

The wavelength can be doubled by multiplying x by 1/2 or tripled by multiplying x by 1/3. More precisely,

$k = \lambda_{\text{original}}/\lambda_{\text{measured}} = 360^\circ/\lambda = 2\pi/\lambda$ (17.2)

Here, one full cycle of a sin or cos wave is 360°, which is equal to 2π radians. ωt defines where the peak is, and so this depends on the wavelength, which defines how often a peak occurs, and the velocity of the wave, which defines where it is relative to time (t).

$\omega = 2\pi\nu = 2\pi v/\lambda = kv$ (17.3)

Here, ν is the frequency (discussed in Chapter 4) and v is the velocity of the wave, with v = νλ.
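
As a rough illustration of how k and ω fit together, the sketch below builds a travelling wave from an assumed wavelength and frequency and checks that a peak moves at the wave speed (frequency multiplied by wavelength). The numerical values and the names `psi`, `velocity`, and `peak_position` are assumptions chosen for the example.

```python
import math

wavelength = 2.0   # metres (assumed example value)
frequency = 5.0    # hertz (assumed example value)
amplitude = 1.0

k = 2 * math.pi / wavelength        # determines the wavelength, Equation 17.2
omega = 2 * math.pi * frequency     # angular frequency
velocity = frequency * wavelength   # wave speed, so that omega = k * velocity

def psi(x, t):
    """Travelling wave of Equation 17.1: A cos(kx - wt)."""
    return amplitude * math.cos(k * x - omega * t)

# A peak that starts at x = 0 has moved to x = velocity * t after a time t.
t = 0.3
peak_position = velocity * t
print(psi(0.0, 0.0), psi(peak_position, t))
```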

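The equation Ψ(x,t) = Acos(kx-ωt) can also be written in terms of the numbers i and e by using the Swiss mathematician Leonhard Euler’s formula,

$e^{ix} = \cos(x) + i\sin(x)$ (17.4)

This gives $A[\cos(kx-\omega t) + i\sin(kx-\omega t)] = Ae^{i(kx-\omega t)}$. The real part of this expression is the original wave, Acos(kx-ωt), and so the wave function can be written in complex form as,

$\Psi(x,t) = Ae^{i(kx-\omega t)}$ (17.5)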
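
The link between the complex exponential and the cosine wave can be checked with Python's standard `cmath` module. This is only a sketch; the values of the amplitude, k, ω, x, and t are arbitrary assumptions.

```python
import cmath
import math

amplitude, k, omega = 2.0, 3.0, 5.0   # arbitrary example values
x, t = 0.7, 0.2
phase = k * x - omega * t

complex_wave = amplitude * cmath.exp(1j * phase)   # A e^{i(kx - wt)}
real_wave = amplitude * math.cos(phase)            # A cos(kx - wt)

# Euler's formula: the real part is the cosine wave, the imaginary part the sine wave.
print(complex_wave.real, real_wave)
print(complex_wave.imag, amplitude * math.sin(phase))
```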

17.1.2 Quantum wave functions

Schrödinger saw that for an object with E=hν (the Planck relation, where E equals energy and h is Planck’s constant), and λ = h/p (the de Broglie wavelength, where p is momentum), this equation can be rewritten as a quantum wave function.

Using k = 2π/λ, ω = 2πν, λ = h/p, E = hν, and ħ = h/2π gives,

$\Psi(x,t) = Ae^{i(px-Et)/\hbar}$ (17.6)

This is the quantum wave function. The Schrödinger equation shows how the quantum wave function changes over time.
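
The substitution can be written out step by step. The following is a sketch of the algebra, using only the relations listed above:

```latex
\begin{align*}
k &= \frac{2\pi}{\lambda} = \frac{2\pi p}{h} = \frac{p}{\hbar}
  && \text{using } \lambda = h/p \text{ and } \hbar = h/2\pi \\
\omega &= 2\pi\nu = \frac{2\pi E}{h} = \frac{E}{\hbar}
  && \text{using } E = h\nu \\
kx - \omega t &= \frac{px - Et}{\hbar} \\
\Psi(x,t) &= Ae^{i(kx-\omega t)} = Ae^{i(px-Et)/\hbar}
\end{align*}
```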

The numbers e and i

The number i

i is equal to the square root of minus 1.

$i = \sqrt{-1}$   and   $i \times i = -1$ (17.7)

The number i seems impossible. After all, the square root of a number (e.g. 4) is equal to another number that, multiplied by itself, becomes the first (e.g. 2 × 2 = 4, and so the square root of 4 is 2), and any real number multiplied by itself cannot be negative. i multiplied by i is taken to equal -1 because this assumption helped solve mathematical problems like cubic equations.

The Italian mathematician Rafael Bombelli was the first to introduce the laws for multiplying i and -i in 1572.[2] Although the symbol was not introduced until the 18th century,[3] René Descartes first referred to i as an imaginary number in 1637.[4]

That same year, Descartes[4] and Pierre de Fermat[5] independently devised the Cartesian coordinate system, which is used to plot points on a graph.

The number e

The German mathematician Gottfried Leibniz was one of the first people to recognise that a new number, the number e, was special.[6] e is related to the laws of logarithms, which were devised by the Swiss mathematician Jost Bürgi[7] and the Scottish mathematician John Napier in 1614.[8]

Logarithmic scales are used to show quantities that get rapidly larger. Bürgi and Napier showed that,

If   $x = a^y$   then   $\log_a(x) = y$ (17.8)

A quantity that increases from 10 to 100 to 1000, for example, uses a base of 10.

If $x = a^y$ then $\log_a(x) = y$, and so,

If $10 = 10^1$ then $\log_{10}(10) = 1$

If $100 = 10^2$ then $\log_{10}(100) = 2$

If $1000 = 10^3$ then $\log_{10}(1000) = 3$...

After the invention of Cartesian coordinates, a graph could be drawn that allows quantities from one to one billion, for example, to be plotted on the same axis.

Logarithms to the base of 10 are common, but other bases can be used, and Bürgi and Napier made tables of logarithms in different bases. One base that is of particular interest is the base of about 2.718. Euler first referred to this number as e in 1731.[9]

Plots of numbers from one to one billion. Firstly, on a normal plot showing that it is difficult to see the lower numbers on the same scale as the higher numbers. Secondly, on a logarithmic plot to the base of 10, and thirdly on a logarithmic plot to the base of e. It is much easier to see all the values on these plots.

Figure 17.4

Plots showing the numbers in the table. The middle plot uses a logarithmic scale to the base of 10, and the bottom plot uses a logarithmic scale to the base of e.
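
The way a logarithmic scale compresses a wide range of values can be seen with Python's standard `math` module. This is a small sketch; the variable names are assumptions made for illustration.

```python
import math

# Values from one to one billion span nine powers of ten.
values = [10 ** exponent for exponent in range(10)]
log10_values = [math.log10(v) for v in values]   # logarithms to the base of 10
ln_values = [math.log(v) for v in values]        # math.log with one argument uses base e

# The definition in both directions: if x = a**y then log_a(x) = y.
base, y = 10, 3
x = base ** y
recovered_y = math.log(x, base)

print(log10_values)
print(ln_values)
print(recovered_y)
```

On either logarithmic scale, the values from one to one billion fall within a range of roughly 0 to 21, which is why they fit on a single axis.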

In 1748, Euler showed that e is an irrational number that is fundamentally connected to many laws of mathematics.[10]

Euler showed that,

$e = 1 + \frac{1}{1} + \frac{1}{1 \times 2} + \frac{1}{1 \times 2 \times 3} + \frac{1}{1 \times 2 \times 3 \times 4} + \dots$ (17.9)

...with the sequence going on forever.
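
The sum approaches e quickly, which can be seen by adding up a finite number of terms. The sketch below is illustrative only; the function name `e_partial_sum` is an assumption.

```python
import math

def e_partial_sum(terms):
    """Sum the first `terms` terms of 1 + 1/1 + 1/(1*2) + 1/(1*2*3) + ..."""
    total = 0.0
    factorial = 1
    for n in range(terms):
        if n > 0:
            factorial *= n          # running product 1 * 2 * ... * n
        total += 1 / factorial
    return total

for terms in (2, 5, 10, 15):
    print(terms, e_partial_sum(terms), math.e)
```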

Euler also showed that the number e is connected to the numbers i, π, 1 and 0, and that e and i are connected to trigonometry.

$e^{i\pi} + 1 = 0$ (17.10)
$e^{ix} = \cos(x) + i\sin(x)$ (17.11)

Mathematicians describe results that connect two apparently unrelated concepts as ‘deep’.[11][12]


Differentiation is one branch of calculus (the other being integration). Calculus is a mathematical system developed by Isaac Newton and Gottfried Leibniz in the late 17th century.[13] Velocity (v) is equal to the distance (d) divided by the time taken (Δt), and so,

v = d/Δt = Δx/Δt (17.12)

Here, Δ should be read as ‘change in’ and x is position, where a change in position is equal to a distance. This means that someone’s velocity can be determined by plotting position against time. The velocity is equal to the gradient of the graph. This is equivalent to picking a time period (Δt), determining the change in position over that period (Δx), and then using v = Δx/Δt.

You cannot measure velocity in an instant using this method because both Δx and Δt would be 0, and 0 divided by 0 is undefined.

A plot of x against t for the equation x = 2t. This creates a straight line, where the velocity is equal to the gradient.

Figure 17.5

A plot of x against t for the equation x = 2t. The average velocity is 2 m/s.

This method is accurate if the person is moving at a constant velocity, producing a straight line, as shown in Figure 17.5. If the velocity is not constant, however, then you no longer know if the average velocity you have calculated is accurate. If you measured the velocity at t = 1 to be 2 m/s, and at t = 2 to be 4 m/s, you might assume the average velocity in this period was 3 m/s, for example. But what if the velocity went up to 100 m/s between t = 1.1 and t = 1.9? Then your estimate of 3 m/s would be far from the true average velocity.

You can get a more accurate measurement of the velocity at any particular time by making Δt as small as possible. This is almost the same as measuring the velocity in an instant and is achieved by differentiating the equation, as shown in Figure 17.6.

A plot of x against t for the equation $x = e^{3t}$. This creates a curve, where an approximation of the average velocity can be found from the gradient.

Figure 17.6

A plot of x against t for the equation $x = e^{3t}$. The average velocity varies depending on the size of Δt.
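
The effect of shrinking Δt can be tried numerically for the curve in Figure 17.6. This is a minimal sketch; the function names are assumptions, and the exact gradient used for comparison is the known derivative of e^(3t), which is 3e^(3t).

```python
import math

def position(t):
    """Position for the example curve x = e^(3t)."""
    return math.exp(3 * t)

def average_velocity(t, delta_t):
    """Gradient between t and t + delta_t: change in x over change in t."""
    return (position(t + delta_t) - position(t)) / delta_t

t = 1.0
exact = 3 * math.exp(3 * t)   # known derivative of e^(3t) at time t
for delta_t in (1.0, 0.1, 0.001, 0.00001):
    print(delta_t, average_velocity(t, delta_t), exact)
```

The printed averages move closer to the exact gradient as Δt gets smaller.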

When you differentiate an equation, you calculate what the result would be if you use the smallest Δx and Δt that you can. You can then calculate the almost-instantaneous velocity at any time. When Δx and Δt are very small, they are known as dx and dt. To differentiate x = 3t, for example, you can use,

$\frac{dx}{dt} = \frac{3(t+dt) - 3t}{dt}$ (17.13)
$= \frac{3(t+dt-t)}{dt}$
$= 3$

Here t is the time at the beginning of dt.

In general,
$\frac{d}{dt}(at) = a$ (17.14)
Other rules include,
$\frac{d}{dt}(t^n) = nt^{n-1}$ (17.15)
$\frac{d}{dt}(e^{nt}) = ne^{nt}$ (17.16)
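
These rules can be checked against a numerical estimate of the gradient. The sketch below uses a central difference with a small but finite dt; the helper name `derivative` and the example values are assumptions, and the exponential rule is taken to apply to e raised to the power nt.

```python
import math

def derivative(f, t, dt=1e-6):
    """Central-difference estimate of df/dt, using a small but finite dt."""
    return (f(t + dt) - f(t - dt)) / (2 * dt)

a, n, t = 3.0, 4, 1.5   # arbitrary example values

linear_slope = derivative(lambda s: a * s, t)          # rule: a
power_slope = derivative(lambda s: s ** n, t)          # rule: n * t**(n - 1)
exp_slope = derivative(lambda s: math.exp(n * s), t)   # rule: n * e**(n * t)

print(linear_slope, a)
print(power_slope, n * t ** (n - 1))
print(exp_slope, n * math.exp(n * t))
```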

If you know dx/dt but want to work out what the original equation was before it was differentiated, then you can do this by reversing the process. This is known as integration.

17.1.3 The Schrödinger wave equation

Schrödinger showed how the quantum wave function changes over time using differentiation.

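$\Psi(x,t) = Ae^{i(px-Et)/\hbar}$ (17.17)
$\frac{d\Psi(x,t)}{dx} = \frac{ip}{\hbar}\Psi(x,t)$ (17.18)
$\frac{d^2\Psi(x,t)}{dx^2} = \left(\frac{ip}{\hbar}\right)^2\Psi(x,t)$ (17.19)
$= -\left(\frac{p}{\hbar}\right)^2\Psi(x,t)$
$\frac{d\Psi(x,t)}{dt} = -\frac{iE}{\hbar}\Psi(x,t)$ (17.20)

The total energy (E) is equal to the kinetic energy (KE) plus the potential energy (PE) (discussed in Book I).

$E = KE + PE$ (17.21)
$KE = \frac{1}{2}mv^2 = \frac{m^2v^2}{2m} = \frac{p^2}{2m}$ (17.22)
and so
$\frac{d\Psi(x,t)}{dt} = -\frac{i}{\hbar}\left(\frac{p^2}{2m} + PE(x)\right)\Psi(x,t)$ (17.23)
$i\hbar\frac{d\Psi(x,t)}{dt} = \frac{p^2}{2m}\Psi(x,t) + PE(x)\Psi(x,t)$ (17.24)
Using $\frac{d^2\Psi(x,t)}{dx^2} = -\frac{p^2}{\hbar^2}\Psi(x,t)$,
$i\hbar\frac{d\Psi(x,t)}{dt} = -\frac{\hbar^2}{2m}\frac{d^2\Psi(x,t)}{dx^2} + PE(x)\Psi(x,t)$ (17.25)
$H\Psi(x,t) = E\Psi(x,t)$ (17.26)
$H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + PE(x)$ and $E = i\hbar\frac{d}{dt}$ (17.27)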
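
As a numerical sanity check, the sketch below confirms that the quantum wave function satisfies Equation 17.25 when E = p²/2m + PE. It assumes a constant potential energy and units in which ħ = m = 1; these values, and the use of finite differences in place of exact derivatives, are assumptions made for illustration.

```python
import cmath

hbar = 1.0        # natural units (assumption for illustration)
mass = 1.0
momentum = 2.0
potential = 0.5   # constant potential energy PE (assumed)
energy = momentum ** 2 / (2 * mass) + potential   # E = KE + PE

def psi(x, t, amplitude=1.0):
    """Quantum wave function A e^{i(px - Et)/hbar}."""
    return amplitude * cmath.exp(1j * (momentum * x - energy * t) / hbar)

x, t, h = 0.3, 0.7, 1e-4

# Central finite differences stand in for the derivatives.
dpsi_dt = (psi(x, t + h) - psi(x, t - h)) / (2 * h)
d2psi_dx2 = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h ** 2

left_side = 1j * hbar * dpsi_dt
right_side = -hbar ** 2 / (2 * mass) * d2psi_dx2 + potential * psi(x, t)

print(left_side, right_side)
```

Both sides agree to within the finite-difference error, and both equal the energy multiplied by the wave function.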

This is the time-dependent Schrödinger equation - or wave equation - for a single non-relativistic charged particle moving in an electric field. The time-independent Schrödinger equation is,

$E\Psi(x) = -\frac{\hbar^2}{2m}\frac{d^2\Psi(x)}{dx^2} + PE(x)\Psi(x)$ (17.28)

The time-dependent Schrödinger equation describes all the features of the electron that we can measure and can be extended to include any other object under almost any other force.

The Schrödinger equation can be used to make the exact same predictions as Werner Heisenberg’s matrix mechanics (discussed in Chapter 16). It can calculate where electron waves will be situated within an atom, and predict where spectral lines will occur.

Schrödinger’s equation describes the world in terms of continuously evolving waves, and Heisenberg’s equation describes it in terms of particles that undergo instantaneous ‘jumps’ from one place to another without moving through the space in-between. Many physicists preferred Schrödinger’s approach because it was easier to visualise and used more familiar mathematics.

Schrödinger went on to show that his wave equation is mathematically equivalent to Heisenberg’s matrix mechanics,[14] although they both argued for the superiority of their own approach.[15] Niels Bohr, however, believed that both views were equally valid.[16]

17.1.4 Probability clouds and the Born rule

In classical wave equations, the wave function has a real meaning: it describes something that is physically waving. Schrödinger’s wave equation, however, had no physical interpretation.[17]

In 1926, Schrödinger believed that electron waves were always spread out across all of space and that the square of the wave function gave the charge density of the electron wave in any particular location.[18][19] This was a reasonable assumption since the wave appeared to be densest in the places where Bohr’s theory predicted electrons would be. Yet Schrödinger’s interpretation could not explain quantum tunnelling.

Max Born proposed a different interpretation that same year. Born stated that the square of the wave function does not represent the physical density of electron waves, but their probability density.[20] This is the probability of finding an electron in any particular state, that is, with any particular position, momentum, or energy, at any particular time. The de Broglie model of the atom (discussed in Chapter 15) was now replaced with the idea that electrons exist in a superpositional ‘probability cloud’.
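
Born's interpretation can be illustrated with a wave function sampled on a grid. In the sketch below, the Gaussian-shaped wave function, the grid spacing, and all variable names are hypothetical choices; for a complex wave function, 'the square' is taken to mean the squared magnitude.

```python
import cmath
import math

# Hypothetical wave function sampled on a grid: a Gaussian envelope times a plane wave.
dx = 0.1
positions = [i * dx for i in range(-100, 101)]
psi = [math.exp(-x ** 2 / 2) * cmath.exp(1j * 2.0 * x) for x in positions]

# Born rule: the probability density is the squared magnitude of the wave function.
density = [abs(value) ** 2 for value in psi]

# Normalise so that the probabilities over all grid cells add up to 1.
total = sum(density) * dx
density = [d / total for d in density]

# Approximate probability of finding the particle between x = -1 and x = 1.
prob_centre = sum(d * dx for x, d in zip(positions, density) if -1 <= x <= 1)
print(prob_centre)
```

The density is largest where the wave function has the largest magnitude, which is where the particle is most likely to be found.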

17.2 Quantum superpositions

During the double-slit experiment, it’s the probability density that’s ‘waving’, and the interference pattern is produced by the superposition of possible paths the electron could take.

Anything that can be described by the Schrödinger equation can be described as being in a superpositional state, where it exists in all possible quantum states at once. A superposition is composed of all of the solutions to the Schrödinger equation and - since the Schrödinger equation is linear - there is often an infinite number of solutions.

Linear equations are equations with the form $a_1x_1 + a_2x_2 + \dots + a_nx_n = c$, where c and $a_1 \dots a_n$ are constants, and $x_1 \dots x_n$ vary. A linear equation with one variable, 3x = 9 for example, has one solution, x = 9/3 = 3. Linear equations with two or more variables have an infinite number of solutions.

A linear equation with two variables, y = 3x+3 for example, has possible solutions (x = 1, y = 6), (x = 2, y = 9), (x = 3, y = 12), and so on, and produces a straight line when plotted on a graph. With three variables, 2x+3y-z = 9 for example, possible solutions include (x = 1, y = 2, z = -1), (x = 2, y = 2, z = 1), and (x = 2, y = 1, z = -2), and the equation produces a plane when plotted.
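
Candidate solutions of the two example equations can be checked by substitution. This is a small sketch using a few integer solutions; the function names are assumptions made for illustration.

```python
def satisfies_line(x, y):
    """True if (x, y) solves y = 3x + 3."""
    return y == 3 * x + 3

def satisfies_plane(x, y, z):
    """True if (x, y, z) solves 2x + 3y - z = 9."""
    return 2 * x + 3 * y - z == 9

def z_for(x, y):
    """Rearranged plane equation: any x and y give a valid z."""
    return 2 * x + 3 * y - 9

line_solutions = [(1, 6), (2, 9), (3, 12)]
plane_solutions = [(1, 2, -1), (2, 2, 1), (2, 1, -2)]

print(all(satisfies_line(x, y) for x, y in line_solutions))
print(all(satisfies_plane(x, y, z) for x, y, z in plane_solutions))
print(satisfies_plane(5, 7, z_for(5, 7)))
```

Because `z_for` returns a valid z for any choice of x and y, the supply of solutions never runs out.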

If the position of the electron were measured during the double-slit experiment, however, then a single result would be given with a probability of 100%. All other measurements would confirm this result, and an interference pattern would not form.

17.2.1 The collapse approach

Heisenberg interpreted the process of measurement as invoking a ‘collapse’ of the wave function, from a superpositional state into a single state, with a probability determined by Born’s rule. This is known as the Copenhagen interpretation or collapse approach to quantum mechanics.[21]
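
The statistical side of this picture can be imitated with a weighted random choice. The sketch below is a toy model rather than a physical simulation; the three states, their amplitudes, and the function name `measure` are all hypothetical.

```python
import random

# Hypothetical superposition of three states with complex amplitudes.
amplitudes = {"state_a": 0.6 + 0.0j, "state_b": 0.0 + 0.64j, "state_c": 0.48 + 0.0j}

# Born's rule: each probability is the squared magnitude of the amplitude.
probabilities = {state: abs(amp) ** 2 for state, amp in amplitudes.items()}
total = sum(probabilities.values())
probabilities = {state: p / total for state, p in probabilities.items()}

def measure(rng):
    """Pick one state at random, weighted by its Born-rule probability."""
    states = list(probabilities)
    weights = [probabilities[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]

rng = random.Random(0)   # fixed seed so the sketch is repeatable
counts = {state: 0 for state in probabilities}
for _ in range(10_000):
    counts[measure(rng)] += 1

print(probabilities)
print(counts)
```

No single outcome can be predicted, but over many repeated measurements the counts approach the Born-rule probabilities.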

The collapse approach suggests that the universe must be objectively indeterminate because you cannot predict which state a superposition will collapse into; you can only assign a probability to each possibility. This implies that you cannot know the future of the universe, even if you knew all of the physical laws and everything about its current state. Schrödinger and Albert Einstein did not agree.[22]

17.3 The 1927 Solvay Conference on Physics

The search for the physical meaning behind these new equations was discussed at the 1927 Solvay Conference on Physics. This was attended by 29 scientists, including Erwin Schrödinger, Albert Einstein, Max Planck, Niels Bohr, Werner Heisenberg, Wolfgang Pauli, Louis de Broglie, Paul Dirac, Max Born, Marie Skłodowska-Curie, Charles Thomson Rees Wilson, and Arthur Compton.[19]

In a joint paper delivered to the conference, Heisenberg and Born stated,

we consider quantum mechanics to be a closed theory, whose fundamental physical and mathematical assumptions are no longer susceptible of any modification.[23]

Schrödinger and Einstein disagreed, and argued that quantum mechanics is a statistical approximation of an underlying deterministic theory[22] (discussed in Chapter 18).

17.4 References
