Apples out of Applesauce: a Primer on Quantum Mechanics

This is the start of a small series of posts I'm going to make about quantum mechanics and operator algebras. Later on I'm going to write some more technical posts, but the point of this first post is just to serve as a quick introduction to the subject.[a] The intended audience for this introduction is people who are just starting out with quantum mechanics, and as such I'm going to assume very little mathematical familiarity - only vector algebra and calculus. Please let me know if something is incorrect or unclear.

This first post will focus on introductory material: the idea of quantization, operators and measurement. Future posts will go more in-depth into various aspects of quantum information theory and flesh out mathematical formalism.

Let's take a walk in the idealized playground of physics. We look closely at an atom and notice the positive and negative charges. Our first assumption might be that these charges are spread out like the planets in the sky. As above, so below, right? We take some time to enjoy our new friend - the electron - but it becomes apparent that something's off. Our calculations don't seem to work out well; we can't really explain its interactions with light; it sometimes acts as a ghost, moving through walls; when alone it acts as if it has many friends! Is a new theory needed to explain the small?

When we measure the energy of an electron we notice that it seems to only take on certain specific values. That is to say, the energy of an electron is quantized. We also notice that our measurements seem to give out random results. Instead of using the laws that Kepler found, we use the sort of geometric crystalline laws Kepler wanted. These two aspects, quantization and probability, explain most of the quantum 'weirdness'[b] that one hears about in clickbait ads.

We know that the energies of electrons are quantized, but we notice that the values that their energies can take depend on the system they're in. How do we take the continuous world and make it discrete? It is intuitive how to get from discrete to continuous: to get applesauce we just grind and squash and chop up a bunch of apples. But how do we make apples out of applesauce? Fortunately for us, this question is not new.

Many ancient cultures noticed a certain concordance while plucking stringed instruments - a (perhaps saccharine) sweetness when the lengths of two strings are in a ratio of 3:2. The question from before arises here: from a continuous string how do we get behavior that focuses on certain integers? We know the answer now[c] - the plucked string produces a fundamental frequency $f$ and overtones of frequencies $2f, 3f, 4f, \dots$. We can show how this arises naturally from the wave equation, but for the purposes of this introduction I think it's better to work with a simpler equation: $$f''(x) = \lambda^2 f(x)$$ $$f'(x)=0 \text{ when } x \notin (0,2\pi)$$ This is the equation for an oscillator of length $2 \pi$.

The first thing we'll do is write $\partial_x^2 f(x)$ for $f''(x)$. This rephrases the first equation as $\partial_x^2 f(x) = \lambda^2 f(x)$. From linear algebra we can notice that this takes the form of an eigenvalue equation. If we just solve the first equation on its own, we can see that $c \exp(\pm \lambda x)$ is a solution for any constant $c$. (Plug it in and check if you don't believe me!)

We call the second equation the boundary conditions. It tells us that our oscillator has length $2 \pi$ and is constant outside of that boundary. Barring some technical concerns about differentiability, we can actually rephrase this condition as $f(x) = f(x + 2\pi)$. Intuitively this turns out to be the same, because our solution is uniquely determined by what it does on the interval $(0,2\pi)$. We can think about this as taking our function and 'wrapping' it around the circle. This is now telling us that our oscillator exists on a circle - our function is $2 \pi$ periodic! Putting this with the previous fact we have that $f(x) = c \exp(\pm \lambda x) = c \exp(\pm \lambda (x+2\pi)) = f(x+2\pi)$. Rewriting we have that $\exp(\pm \lambda x) = \exp(\pm \lambda x)\exp(\pm 2 \pi \lambda)$. So $\exp(\pm 2 \pi \lambda) = 1$. The only values of $\lambda$ that can solve this equation are $\lambda = n i$ for an integer $n$.

So when $\lambda = n i$ we have that $f(x) = \exp(\pm nix)$ and when $\lambda \neq n i$ our equation has no solution at all (other than $f = 0$). Rephrased in linear algebra terms, we can say that the operator $\partial_x^2$ defined on the circle of circumference $2 \pi$ has eigenvalues $\lambda_n^2 = - n^2$ with eigenvectors $\exp(\pm nix)$. Note what happened here - when we took an operator with the right boundaries, a condition on the continuous behavior of a function turned into a quantization condition! This will be our blueprint for why we want to talk about operators in quantum mechanics.
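If you'd like to see this quantization condition numerically, here is a minimal sketch in Python (standard library only). It samples $f(x) = \exp(\lambda x)$ on a grid that wraps around at $2\pi$ and measures how badly $\partial_x^2 f = \lambda^2 f$ fails. The function names and the grid size are my own choices for illustration, and the finite difference is only an approximation of $\partial_x^2$.

```python
import cmath
import math

def second_difference(samples, h):
    """Periodic central second difference: a discrete stand-in for d^2/dx^2 on the circle."""
    n = len(samples)
    return [
        (samples[(j + 1) % n] - 2 * samples[j] + samples[(j - 1) % n]) / h**2
        for j in range(n)
    ]

def eigen_residual(lam, grid_points=2000):
    """Largest value of |f'' - lam^2 f| on the grid for f(x) = exp(lam * x).

    The grid wraps around at 2*pi, so the residual can only be small when
    f is 2*pi periodic, which is the boundary condition from the text.
    """
    h = 2 * math.pi / grid_points
    f = [cmath.exp(lam * j * h) for j in range(grid_points)]
    d2f = second_difference(f, h)
    return max(abs(d2 - lam**2 * fj) for d2, fj in zip(d2f, f))

# Integer multiples of i fit on the circle; 2.5i does not.
for lam in (1j, 2j, 3j, 2.5j):
    print(lam, eigen_residual(lam))
```

For $\lambda = i, 2i, 3i$ the residual is tiny (it is just discretization error), while for $\lambda = 2.5i$ the function does not close up when it wraps around the circle and the residual blows up at the seam.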

In particular, the eigenvalues of an operator will determine the possible outcomes of a measurement, and the eigenvectors will determine how likely that outcome is. But before we move on to the meat of the matter, we have a few more things to note about the previous equation.

The first thing to note is that we can take the operator $\exp(a \partial_x)$. What does this do? Well if we formally calculate out the Taylor series $\sum_n \frac{(a \partial_x)^n}{n!}f(x)$ and write it all out we actually end up with the Taylor series for $f(x+a)$. (Try it out if you don't believe me!) That is to say, $\exp(a \partial_x)$ acts as a shift operator and shifts a function by the value $a$. If we want, we can rewrite the boundary condition as $1 = \exp(2\pi \partial_x)$. As an informal calculation, we could take the logarithm and write $2 n \pi i = 2 \pi \partial_x$ and so $(\partial_x)^2 = -n^2$. This sort of calculation is not wrong, but also not entirely rigorous. To make it rigorous we have to start working in the framework of Lie algebras and operator algebras. We can see here the immediate benefits of such an approach!
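The "try it out" step can be checked on a polynomial, where the Taylor series for $\exp(a \partial_x)$ terminates after finitely many terms. The sketch below is only an illustration: the coefficient-list representation and the helper names are assumptions of mine, not anything standard.

```python
from math import factorial

def derivative(coeffs):
    """Derivative of the polynomial c0 + c1*x + c2*x^2 + ... given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def shift_by_series(coeffs, a, x):
    """Apply sum_k (a d/dx)^k / k! to the polynomial and evaluate the result at x."""
    total = 0.0
    term = list(coeffs)
    k = 0
    while term:  # repeated derivatives of a polynomial eventually vanish
        total += a**k / factorial(k) * evaluate(term, x)
        term = derivative(term)
        k += 1
    return total

p = [1, -2, 0, 3]  # 1 - 2x + 3x^3, an arbitrary test polynomial
print(shift_by_series(p, 0.7, 1.3), evaluate(p, 1.3 + 0.7))
```

The two printed numbers agree up to rounding, which is the statement that $\exp(a \partial_x) f(x) = f(x + a)$ for this particular $f$.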

The second thing to note is Fourier analysis. We know that we can create many different functions by just adding up sines and cosines. By taking sums such as $\sum_n a_n \exp(inx)$ we can create nearly any periodic function. But can we create every function? Such a question isn't well-defined unless we define what it means for $\sum_n a_n \exp(inx)$ to converge to a function! With 3-dimensional vectors we can say that $v=w$ when $(v-w)\cdot (v-w) = \sum_i |v_i - w_i|^2 = \|v-w\|^2 = 0$. In a similar way, we can define a notion of equivalence for functions: $f \sim g$ when $\|f-g\|^2 = \int |f(x)-g(x)|^2 d x = 0$. For example, take a function $g(x)$ that's equal to $0$ everywhere except for $g(0)=1$. We cannot create this function just by adding up sines and cosines, but $g(x)\sim 0$ and we can certainly create the function $0$ by adding up sines and cosines! From now on by $f = g$ we will mean $f \sim g$. That is to say, whenever we talk about the function $f$ we will refer to the entire class of functions equivalent to $f$ by the above integral.

The next thing to note is that we can define a dot product for functions of this sort: $f(x) \cdot g(x) = \int f(x)\overline{g(x)} d x$, where the bar represents the complex conjugate. This is parallel to the notion of dot product for finite-dimensional vectors, $v \cdot w = \sum_i v_i \bar{w_i}$. We're at the point to make an important conceptual leap - we're going to think about functions as infinite-dimensional vectors. The space of functions with this dot product is generally called $L_2(X)$ (ell-2 of $X$) where $X$ is the domain of the functions. For our above equation, we can write our vector space as $L_2(\mathbb T)$ where $\mathbb T$ refers to the circle or, if you're fancy, the one-dimensional torus. This is an example of what mathematicians call a Hilbert space, an infinite-dimensional generalization of Euclidean space. We restate Fourier analysis as the fact that the functions $\exp(inx)$ form a basis for our vector space. Moreover, instead of writing $f \cdot g$ in quantum mechanics we generally use bra-ket notation: $\Braket{f|g} = f \cdot g$. By an operator, we refer to a function that takes an element in one Hilbert space to an element in another Hilbert space. (Just as a matrix is a function that takes a vector in one vector space to a vector in another vector space.)
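To make the "functions as vectors" idea concrete, here is a small numerical sketch of the dot product on $L_2(\mathbb T)$, using a plain Riemann sum for the integral. The factor $1/\sqrt{2\pi}$ in `basis` is my own addition so that the basis functions have length one, and the function names and test function are hypothetical.

```python
import cmath
import math

def inner(f, g, points=4096):
    """Approximate <f|g> = integral over [0, 2*pi) of f(x) * conj(g(x)) dx by a Riemann sum."""
    h = 2 * math.pi / points
    return sum(f(j * h) * complex(g(j * h)).conjugate() for j in range(points)) * h

def basis(n):
    """The function exp(i n x) / sqrt(2 pi), normalized to have length one."""
    return lambda x: cmath.exp(1j * n * x) / math.sqrt(2 * math.pi)

def f(x):
    """An arbitrary periodic test function."""
    return math.cos(3 * x)

print(abs(inner(basis(2), basis(2))))  # length squared of one basis function
print(abs(inner(basis(2), basis(5))))  # overlap of two different basis functions
print(inner(f, basis(3)))              # component of f along exp(3ix)
print(abs(inner(f, basis(1))))         # component of f along exp(ix)
```

Different basis functions have zero overlap, and $\cos(3x)$ only has components along $\exp(\pm 3ix)$, just as a finite-dimensional vector only has components along the basis vectors it is built from.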

We also restate a fact from linear algebra: A self-adjoint matrix has eigenvectors that form an orthonormal basis of the vector space. This is called the spectral theorem. Recall that for a matrix $A$, its adjoint is $A^*$ defined by the relation $\Braket{Av|w} = \Braket{v|A^* w}$. A self-adjoint matrix has $A = A^*$. All operators corresponding to observable phenomena in quantum mechanics are self-adjoint, and we'll see why this is an important part of the formalism when we talk about measurement.
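As a sanity check on the spectral theorem, here is a sketch for the smallest interesting case, a $2 \times 2$ self-adjoint matrix, where the eigenvalues and eigenvectors can be written in closed form. The helper names and the example matrix are assumptions for illustration only, and the formula assumes the off-diagonal entry is nonzero.

```python
import math

def braket(v, w):
    """<v|w> = sum_i v_i * conj(w_i), following the convention used in this post."""
    return sum(vi * complex(wi).conjugate() for vi, wi in zip(v, w))

def eigh_2x2(a, b, d):
    """Eigenvalues and unit eigenvectors of the self-adjoint matrix [[a, b], [conj(b), d]].

    a and d are real; b is complex and assumed nonzero.
    """
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    pairs = []
    for lam in (mean - radius, mean + radius):
        v = (b, lam - a)  # solves (A - lam) v = 0 when b != 0
        norm = math.sqrt(braket(v, v).real)
        pairs.append((lam, (v[0] / norm, v[1] / norm)))
    return pairs

(lam1, v1), (lam2, v2) = eigh_2x2(1.0, 1 - 2j, -1.0)
print(lam1, lam2)            # two real eigenvalues
print(abs(braket(v1, v2)))   # overlap of the two eigenvectors
```

The eigenvalues come out real and the two eigenvectors are orthogonal unit vectors, so they form an orthonormal basis of the two-dimensional space.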

The last thing to note is to recall the definition of a co-vector: For a vector $v$, we can write its covector $v^*$ as a function $v^*(w) = \Braket{v|w}$. In the language of quantum mechanics we will use the notation $\bra{v}$ for $v^*$ and $\ket{v}$ for $v$.

After all of this we are finally ready to talk about quantum mechanics! Let's get to it and talk about the rules of quantum mechanics.

(1) A particle is represented by a vector $\ket{\psi}\in \mathcal H$, where $\mathcal H$ is a Hilbert space. This vector is called the wavefunction for the particle. We will always set $\Braket{\psi|\psi}=1$ by dividing or multiplying $\psi$ by the necessary constants. This is called normalization.

(2) Every observable quantity has an associated operator whose eigenvalues represent the possible values we can measure. For a quantity like momentum, $p$, we represent its associated operator by $\hat{p}$. To get the possible values of momentum, we have to solve the equation $\hat{p}\ket{f} = p \ket{f}$ for the possible values of $p$ with appropriate boundary conditions. We often write the eigenvectors of an operator $\hat{p}$ as $\ket{p_n}$ and its eigenvalues as $p_n$ so that the above equation appears as $\hat{p}\ket{p_n} = p_n \ket{p_n}$.

(3) After a measurement, the wavefunction collapses onto the eigenvector associated with our result. So if after a measurement we get the result that the particle has momentum $p_n$, then the wavefunction after the measurement will necessarily be $\ket{\psi} = a_n \ket{p_n}$. This is a statement that measurement is fundamentally an act of transformation. The wavefunction before and after are necessarily different!

(4) To find the probability of a measurement, we have to project the wavefunction onto the eigenvector corresponding to the possible measured value. This forces any operator corresponding to an observable to be self-adjoint. For example, for $\hat{p}$, since $\hat{p}$ is self-adjoint we then have $\ket{p_n}$ is a basis of $\mathcal H$. So we can always write $\ket{\psi} = \sum_n a_n \ket{p_n}$. We can calculate $a_n = \Braket{\psi|p_n}$ since $\ket{p_n}$ forms an orthonormal basis. $|a_n|^2$ is the probability of measuring a momentum of $p_n$.[d]

(5) The operators corresponding to an observable are determined by their symmetries. Noether's theorem tells us that momentum is conserved when our physical system is invariant under position-shift, angular momentum is conserved when our physical system is invariant under rotation, and many other relations of the sort. In quantum mechanics these pairs can tell us how to define our operators. With momentum-position, we have $\hat{x}\hat{p} - \hat{p}\hat{x} = i\hbar$. We call $[\hat{x},\hat{p}] = \hat{x}\hat{p} - \hat{p}\hat{x}$ the commutator of $\hat{x}$ and $\hat{p}$. In general, any two Noether-related observables have a commutator equal to $i \hbar$. We will see later that this also tells us that these pairs of operators act as infinitesimal generators, which means that $\exp(i \hat{p} a / \hbar)\ket{f(x)} = \ket{f(x+a)}$. Notice the similarity to what we've seen earlier! This should arguably be the definition of momentum. All of these commutator relationships are often called the canonical commutation relations or CCR. This in general is enough to determine the structure of all the things we care about in our system - except for time. (Refer to reference 3 for more information.) In later posts we'll talk a lot more about why time is so much different than the other properties of quantum systems. This disparity leads to a split between the informational and the dynamical side of quantum mechanics.

(6) To deal with the disparity of time, we regard the wavefunction as depending on time so that we really have $\ket{\psi(t)}$. To determine $\ket{\psi(t)}$ from $\ket{\psi(0)}$ we can apply the time-evolution operator $U(t)$ so that $\ket{\psi(t)} = U(t) \ket{\psi(0)}$. To calculate the time-evolution operator we have to use the Schrödinger equation, which depends on the Hamiltonian of the system. The Hamiltonian operator, $\hat{H}$, is the operator whose eigenvalues correspond to energy. (Due to convention, we write $\hat H$ rather than $\hat E$, but it is true that $\hat H \ket E = E \ket E$.) Now we can write the Schrödinger equation as $i \hbar \frac{d}{dt}U(t) = \hat{H}U(t)$. Note here that time and energy have the same symmetry relation that momentum and position do, which is why the Hamiltonian appears in the Schrödinger equation. The symmetry relations are the fundamental concept - quantum mechanics at the end of the day is just repeatedly solving equations constrained by these fundamental symmetry relationships. Moreover, these symmetry relations are, in some fundamental sense, definitional: it is not just that energy is preserved by time translation symmetry, it is that energy fundamentally is that thing which measures deviation from time translation symmetry. Likewise with momentum and spatial translation.

(7) Whenever we work with a system of multiple particles, we work on the tensor product of their Hilbert spaces. If you don't know what a tensor product is, don't worry - it is something we'll certainly talk about more in depth in later posts.
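Before moving to functions, rules (1) through (4) can be tried out on a toy two-dimensional Hilbert space. The sketch below is purely illustrative: the observable (the matrix $\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}$, with eigenvalues $\pm 1$), the starting state and all function names are assumptions of mine.

```python
import math

def braket(v, w):
    """<v|w> = sum_i v_i * conj(w_i), following the convention used in this post."""
    return sum(vi * complex(wi).conjugate() for vi, wi in zip(v, w))

def normalize(v):
    """Rule (1): rescale so that <v|v> = 1."""
    norm = math.sqrt(braket(v, v).real)
    return [vi / norm for vi in v]

# Rule (2): a hypothetical observable [[0, 1], [1, 0]] with these eigenpairs.
eigenvalues = [1.0, -1.0]
eigenvectors = [normalize([1, 1]), normalize([1, -1])]

def measurement_probabilities(psi):
    """Rule (4): |a_n|^2 with a_n = <psi|e_n> for each eigenvector e_n."""
    psi = normalize(psi)
    return [abs(braket(psi, e)) ** 2 for e in eigenvectors]

probs = measurement_probabilities([3, 4])
expectation = sum(p * lam for p, lam in zip(probs, eigenvalues))
print(probs, sum(probs), expectation)
```

The probabilities add up to one because the eigenvectors form an orthonormal basis. Rule (3) then says that after reading the result $+1$ the state is the first eigenvector, no matter what state we started from.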

Let's use all of these rules (except #7) to calculate a simple example, a particle in a box!

We have an electron trapped in a one-dimensional box of length $\pi$. We treat the outside of this box as having infinite potential energy, so that $V(x) = \infty$ when $x \notin (0,\pi)$ and $V(x) = 0$ otherwise. The first step in these problems is to find our Hamiltonian. We borrow from classical mechanics the expression for kinetic energy, $p^2 / 2m$, and write it as an operator: $\hat{p}^2 / 2m$. Likewise we write our potential energy as an operator, $V(\hat{x})$. Now our Hamiltonian is just our kinetic plus our potential energy: $\hat H = \hat p^2 / 2m + V(\hat x)$. Inside our box, we just have $\hat H = \hat p^2 / 2m$.

How do we write our operator $\hat{p}$? We just use the commutation relations $[\hat{x},\hat{p}] = i\hbar$. Note that $\hat p = -i \hbar \partial_x$ satisfies this relation.[e] Barring some technicalities (see reference 3), any solution to these commutation relations will give us the right answer.
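The claim that $\hat p = -i \hbar \partial_x$ satisfies $[\hat{x},\hat{p}] = i\hbar$ can be spot-checked numerically by applying both sides to a test function. This is only a sketch under stated assumptions: I set $\hbar = 1$, approximate the derivative with a central difference, and pick an arbitrary smooth test function; none of the names below come from a library.

```python
import cmath

HBAR = 1.0  # assumption: units where hbar = 1

def p_hat(f, h=1e-4):
    """Return the function (-i hbar d/dx) f, with d/dx as a central difference."""
    return lambda x: -1j * HBAR * (f(x + h) - f(x - h)) / (2 * h)

def x_hat(f):
    """Return the function x * f(x)."""
    return lambda x: x * f(x)

def commutator_xp(f):
    """Return the function (x p - p x) f."""
    xp = x_hat(p_hat(f))
    px = p_hat(x_hat(f))
    return lambda x: xp(x) - px(x)

def f(x):
    """An arbitrary smooth test function (a Gaussian times a plane wave)."""
    return cmath.exp(-x**2 / 2) * cmath.exp(2j * x)

x0 = 0.3
print(commutator_xp(f)(x0))  # left-hand side applied to f at x0
print(1j * HBAR * f(x0))     # right-hand side applied to f at x0
```

The two numbers agree up to the finite-difference error. The reason is just the product rule: $\hat p$ acting on $x f(x)$ produces one extra term, $-i\hbar f(x)$, that $\hat x \hat p f$ does not have.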

Now one way to solve this equation is a clever choice of basis. We know that $\hat H \ket{E_n} = E_n \ket{E_n}$ so we will take $\ket{E_n}$ and write our wavefunction at time 0 in terms of this basis. Let's write $-\frac{\hbar^2 \partial_x^2}{2m} \ket{E_n} = E_n \ket{E_n}$. Rewriting, we can say $\partial_x^2 \ket{E_n} = -(2m/\hbar^2) E_n \ket{E_n}$. This is exactly the equation of our oscillator! So we can write $-(2m/\hbar^2) E_n = -n^2$, which gets us that $E_n = n^2 \hbar^2 / 2m$. We just showed that the energies of a particle in a box are discretized! We can do a similar boundary condition analysis to determine $\ket{E_n}$ like we did with the oscillator. We know $\ket{E_n}$ is built out of $\exp(\lambda x)$ with $\lambda = \pm n i$; the boundary conditions here require that the function is $0$ at $x = 0$ and $0$ at $x = \pi$. We can also treat this function as $2 \pi$ periodic, but only considering odd functions so that we get $0$ at the boundaries. When we do this we get $\ket{E_n} = \sin (n x)$.
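Here is a quick numerical check that $\sin(nx)$ really is an eigenfunction of $\hat H = -\frac{\hbar^2}{2m}\partial_x^2$ with energy $n^2\hbar^2/2m$, and that it vanishes at the walls. It is a sketch only: I assume units with $\hbar = m = 1$, use a finite difference in place of $\partial_x^2$, and the function names are hypothetical.

```python
import math

HBAR = 1.0  # assumption: units where hbar = 1
MASS = 1.0  # assumption: units where m = 1

def energy(n):
    """E_n = n^2 hbar^2 / 2m for the box of length pi."""
    return n**2 * HBAR**2 / (2 * MASS)

def box_state(n):
    """The (unnormalized) eigenfunction sin(n x)."""
    return lambda x: math.sin(n * x)

def hamiltonian_on(f, x, h=1e-3):
    """(-hbar^2 / 2m) f''(x) by central differences, valid inside the box."""
    return -HBAR**2 / (2 * MASS) * (f(x + h) - 2 * f(x) + f(x - h)) / h**2

for n in (1, 2, 3):
    f = box_state(n)
    x = 1.0  # any point inside the box
    print(n, hamiltonian_on(f, x), energy(n) * f(x), f(0.0), f(math.pi))
```

For each $n$ the second and third columns agree up to discretization error, and the last two columns are zero up to rounding, which is the boundary condition.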

We can write $\ket{\psi(0)} = \sum_n a_n \ket{E_n}$. How do we get $\ket{\psi(t)}$? Well we know the Schrödinger equation: $i \hbar \frac{d}{dt}U(t) = \hat{H}U(t)$. Since $\hat H$ does not depend on time, we can just write $U(t)=\exp(-i \hat H t/ \hbar)$. (We can plug it in and check!) We can apply $U(t)$ term by term: $U(t) a_n \ket{E_n} = a_n U(t) \ket{E_n}$. Note that on $\ket{E_n}$, we know that $\hat H \ket{E_n} = E_n \ket{E_n}$. So we can just write $a_n U(t) \ket{E_n} = a_n \exp(-i E_n t / \hbar) \ket{E_n}$. The angular frequency of our wave is given by $E_n / \hbar$.

Putting this all together, we get that $\ket{\psi(t)} = \sum_n a_n \exp(-i E_n t/ \hbar) \ket{E_n} = \sum_n a_n \exp(-i E_n t/ \hbar) \sin (nx) = \sum_n a_n \exp(-i (n^2 \hbar^2 / 2m) t/ \hbar) \sin (nx)$.
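The formula for $\ket{\psi(t)}$ can be turned into a few lines of code. The sketch below assumes units with $\hbar = m = 1$, normalizes each $\sin(nx)$ by $\sqrt{2/\pi}$, and uses a hypothetical equal mix of the $n = 1$ and $n = 2$ states as the starting point; the function names are mine.

```python
import cmath
import math

HBAR = 1.0  # assumption: units where hbar = 1
MASS = 1.0  # assumption: units where m = 1

def energy(n):
    """E_n = n^2 hbar^2 / 2m."""
    return n**2 * HBAR**2 / (2 * MASS)

def psi_t(coeffs, x, t):
    """psi(x, t) = sum_n a_n exp(-i E_n t / hbar) sqrt(2/pi) sin(n x), coeffs = {n: a_n}."""
    return sum(
        a * cmath.exp(-1j * energy(n) * t / HBAR) * math.sqrt(2 / math.pi) * math.sin(n * x)
        for n, a in coeffs.items()
    )

def norm_squared(coeffs, t, points=2000):
    """Integral of |psi(x, t)|^2 over the box (0, pi) by the midpoint rule."""
    h = math.pi / points
    return sum(abs(psi_t(coeffs, (j + 0.5) * h, t)) ** 2 for j in range(points)) * h

coeffs = {1: 1 / math.sqrt(2), 2: 1 / math.sqrt(2)}  # hypothetical starting state
for t in (0.0, 1.0, 2.0):
    print(t, abs(psi_t(coeffs, 1.0, t)) ** 2, norm_squared(coeffs, t))
```

The probability density at a fixed point changes with time because the two phases rotate at different rates, while the total probability stays equal to one.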

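Now let's say that we start out with the particle being spread out evenly across the box. That is to say, the probability of finding the particle at any point along the box is the same. Since the probability density is $|\psi(x)|^2$ and the box has length $\pi$, this is just $\ket{\psi(0)} = 1/\sqrt{\pi}$. At time $t$, to calculate the probability of finding the particle at energy $E_n$, we just take $|\Braket{\psi(0)|\exp(-i E_n t/ \hbar) \sin (nx)}|^2$, with $\sin(nx)$ normalized as in rule (1). Note that the phase drops out when we take the absolute value, so this probability does not change with time.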
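Here is a sketch of that probability calculation, assuming that "spread out evenly" means the constant wavefunction $1/\sqrt{\pi}$ on the box (so that $|\psi|^2$ is uniform), that the eigenfunctions are normalized as $\sqrt{2/\pi}\sin(nx)$, and that $\hbar = m = 1$. The integral is done with a simple midpoint rule and all names are my own.

```python
import cmath
import math

HBAR = 1.0  # assumption: units where hbar = 1
MASS = 1.0  # assumption: units where m = 1

def energy(n):
    """E_n = n^2 hbar^2 / 2m."""
    return n**2 * HBAR**2 / (2 * MASS)

def psi0(x):
    """Assumed evenly spread starting state: constant on (0, pi), normalized."""
    return 1 / math.sqrt(math.pi)

def eigenfunction(n):
    """Normalized eigenfunction sqrt(2/pi) sin(n x)."""
    return lambda x: math.sqrt(2 / math.pi) * math.sin(n * x)

def coefficient(n, points=20000):
    """a_n = integral over (0, pi) of psi0(x) * E_n(x) dx (both functions are real)."""
    h = math.pi / points
    e_n = eigenfunction(n)
    return sum(psi0((j + 0.5) * h) * e_n((j + 0.5) * h) for j in range(points)) * h

def probability(n, t):
    """Probability of measuring E_n at time t; the phase has modulus one."""
    phase = cmath.exp(-1j * energy(n) * t / HBAR)
    return abs(coefficient(n) * phase) ** 2

for n in range(1, 6):
    print(n, probability(n, 0.0), probability(n, 0.7))
```

Working the integral by hand gives $|a_n|^2 = 8/(\pi^2 n^2)$ for odd $n$ and $0$ for even $n$, which is what the numbers approach. The two columns are identical, showing that the energy probabilities do not depend on $t$.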

What happens after we measure? Once we do this measurement and read the result of $E_n$, the particle is now in state $\ket{E_n}$. We normalize this wavefunction so that $\Braket{E_n|E_n} = 1$. Now if we want to find the particle's momentum afterward, we find $\ket{p_n}$ and calculate $|\Braket{E_n|p_n}|^2$. And so forth!

This is all quantum mechanics is at the end of the day: we just need to find the eigenvectors of the observable we care about, solve the Schrödinger equation to find the time-evolution operator, and then calculate the brakets to get the probability. For most problems there are no analytic solutions, but each of these steps is something that can be calculated numerically!

Hopefully this example and the rules were instructive. One takeaway is that quantum mechanics in some sense appears more ordered than classical mechanics - it takes our quantities that we see out in the world and abstracts them away into a constellation of symmetries and relations. The magic is taking the continuous world and finding a way to quantize it: seeing the playground of physics not as a set of various swings and spinners already always existing but rather as a nebula of various quantities coagulating from continuity.

This may be a lot to absorb at once, but I hope by reading this you got a taste of what quantum mechanics is and why it is! It's easy to get lost in the symbols but I wanted to show here that the ideas at play are not too far out of the ordinary. The difficulty most people have in the subject is the vector algebra and calculus itself - but we can always reason by analogy, the infinite is not much different than the finite, after all. We can always make applesauce out of apples.

[a] The form of pedagogy here will take on a certain historical logic. This is not meant to be history as such - history is a labyrinth of branches but we will simply follow one path to see where it leads.

[b] We never talk about classical 'weirdness', do we? Quantum mechanics may seem abstract but it's anything but weird.

[c] Although it is true that pitched sounds can be split up into fundamentals and overtones, in mathematical fact there are many, many series of functions one could split up a sine wave into. The fact that we hear pitches and concordance in the way we do is fundamentally a cultural and physiological fact even if mathematics and physics can be used to analyze it. It's unclear if one can get from concordance to consonance with just the harmonic series, taking for example the prominence of the minor triad in music (cf. Milton Babbitt, The Overtone Follies).

[d] In the messy real world, it's hard to prepare a particle so that it is just one wavefunction - instead we usually get an ensemble of wavefunctions which we write as a density matrix $\rho = \sum_n k_n \ket{\psi_n}\bra{\psi_n}$ where the weights $k_n$ add up to 1. This enables us to write the probability of measuring the eigenvalue $\lambda_n$ as $\bra{\lambda_n}\rho \ket{\lambda_n}$.

[e] We do this explicitly by choosing a basis $\ket{x}$ so that $\hat x \ket{x} = x \ket{x}$ and then calculating $[\hat x,\hat p] \ket x$ with $\hat p = -i \hbar \partial_x$. One might wonder how we can choose the basis $\ket{x}$ when, unlike $\ket{p_n}$ or $\ket{E_n}$, $\ket{x}$ is a continuous, uncountably infinite set of vectors. There are a few ways to make this formal, one using rigged Hilbert spaces, another using operator algebras and spectral measures. We'll cover this in a later blog post.


  1. Stephen Barnett, Quantum Information
  2. Gerard Murphy, C*-algebras and Operator Theory
  3. Does the canonical commutation relation fix the form of the momentum operator?
  4. Nicholas Wheeler, Generalized Quantum Information
  5. Ramamurti Shankar, Principles of Quantum Mechanics

NB: This is not an academic paper, and as such the references and citations are not exhaustive. I do not claim originality.