
Friday, March 8, 2024

Stirling's Approximation

I've always been fascinated by Stirling’s approximation,

ln(n!) ≈ n ln(n) − n,

where n! is the factorial. Russ Hobbie and I mention Stirling’s approximation in Appendix I of Intermediate Physics for Medicine and Biology. In the homework problems for that appendix (yes, IPMB does have homework problems in its appendices), a more accurate version of Stirling’s approximation is given as

ln(n!) ≈ n ln(n) − n + ½ ln(2π n) .

There is one thing that’s always bothered me about Stirling’s approximation: it’s for the logarithm of the factorial, not the factorial itself. So today, I’ll derive an approximation for the factorial. 

The first step is easy; just apply the exponential function to the entire expression. Because the exponential is the inverse of the natural logarithm, you get

n! = e^(n ln(n) − n + ½ ln(2π n)) .

Now, we just use some properties of exponents

n! = e^(n ln(n)) e^(−n) e^(½ ln(2π n))

n! = (e^(ln(n)))^n e^(−n) √(e^(ln(2π n)))

n! = n^n e^(−n) √(2π n) .

And there we have it. It’s a strange formula, with a really huge factor (n^n) multiplied by a tiny factor (e^(−n)) times a plain old modestly sized factor (√(2π n)). It contains both e ≈ 2.7183 and π ≈ 3.1416.

Let's see how it works.

  n       n!                n^n e^(−n) √(2π n)   fractional error (%)
  1       1                 0.92214              7.8
  2       2                 1.9190               4.1
  5       120               118.02               1.7
 10       3.6288 × 10^6     3.5987 × 10^6        0.83
 20       2.4329 × 10^18    2.4228 × 10^18       0.42
 50       3.0414 × 10^64    3.0363 × 10^64       0.17
100       9.3326 × 10^157   9.3249 × 10^157      0.083

For the last entry (n = 100), my calculator couldn’t calculate 100^100 or 100!. To get the first one I wrote

100^100 = (10^2)^100 = 10^(2 × 100) = 10^200.

The calculator was able to compute e^(−100) = 3.7201 × 10^(−44), and of course the square root of 200π was not a problem. To obtain the actual value of 100!, I just asked Google.
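
Of course, nowadays you can let a computer do the arithmetic. Here’s a little Octave script (my own quick check, not something from IPMB) that reproduces the fractional errors in the table. It works with ln(n!) = Σ ln(k), so n = 100 doesn’t overflow.

% Compare n! with the approximation n^n e^(-n) sqrt(2 pi n).
% Work with logarithms so that n = 100 does not overflow.
for n = [1 2 5 10 20 50 100]
  lnfact = sum(log(1:n));                      % exact ln(n!)
  lnapprox = n*log(n) - n + 0.5*log(2*pi*n);   % Stirling's approximation
  err = 100*(1 - exp(lnapprox - lnfact));      % fractional error, in percent
  printf("n = %3d   fractional error = %.3g%%\n", n, err);
endfor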

Why in the world does anyone need a way to calculate such big factorials? Russ and I use them in Chapter 3, about statistical mechanics. There you have to count the number of states, which often requires using factorials. The beauty of statistical mechanics is that you usually apply it to macroscopic systems with a large number of particles. And by large, I mean something like Avogadro’s number of particles (6 × 10^23). The interesting thing is that in statistical mechanics you often need not the factorial, but the logarithm of the factorial, so Stirling’s approximation is exactly what you want. But it’s good to know that you can also approximate the factorial itself.

Finally, one last fact from Mr. Google: 1,000,000! = 8.2639 × 10^5,565,708. Wow!


Stirling’s Approximation

https://www.youtube.com/watch?v=IJ5N28-Ujno


Friday, October 27, 2023

The Helmholtz Coil and the Maxwell Coil

To do magnetic resonance imaging, you need a static magnetic field that is uniform and a switchable magnetic field that has a uniform gradient. How do you produce such fields? In this post, I explain one of the simplest ways: using a Helmholtz coil and a Maxwell coil.

Both of these are created using circular coils. The magnetic field Bz produced on the axis of a circular coil can be calculated using the law of Biot and Savart (see Chapter 8 of Intermediate Physics for Medicine and Biology),

Bz = μ0 N I R² / [2 (R² + z²)^(3/2)] ,

where μ0 is the permeability of free space (the basic constant of magnetostatics), I is the coil current, N is the number of turns, R is the coil radius, and z is the distance along the axis from the coil center.

The Helmholtz Coil

The Helmholtz coil consists of two circular coils in parallel planes, having the same axis and carrying the same current in the same direction, separated by a distance d. Our goal will be to find the value of d that gives the most uniform magnetic field. Place one coil at z = d/2 and the other at z = −d/2. By superposition, the magnetic field is

Bz = (μ0 N I R²/2) { 1/[R² + (z − d/2)²]^(3/2) + 1/[R² + (z + d/2)²]^(3/2) } .

To create a uniform magnetic field, we will perform a Taylor expansion of the magnetic field about the origin (z = 0). We will need derivatives of the magnetic field. The first derivative is

dBz/dz = −(3 μ0 N I R²/2) { (z − d/2)/[R² + (z − d/2)²]^(5/2) + (z + d/2)/[R² + (z + d/2)²]^(5/2) } .

(The reader will have to fill in the missing steps when calculating these derivatives.) At z = 0, this derivative goes to zero. In fact, because the magnetic field is an even function of z, all odd derivatives will be zero, regardless of the value of d.

The second derivative is

d²Bz/dz² = (3 μ0 N I R²/2) { [4(z − d/2)² − R²]/[R² + (z − d/2)²]^(7/2) + [4(z + d/2)² − R²]/[R² + (z + d/2)²]^(7/2) } .

At z = 0, the two terms in the brackets are the same. Our goal is to have this term be zero, implying that the second-order term in the Taylor series vanishes. This will happen if

4(d/2)² = R² ,

or, in other words, d = R. This famous result says that for a Helmholtz pair the coil separation should equal the coil radius.

A Helmholtz coil produces a remarkably uniform field near the origin. However, it is not uniform enough for most magnetic resonance imaging machines, which typically have a more complex set of coils to create an even more homogeneous field. If you need a larger homogeneous region, you could always use a larger Helmholtz coil, but then you would need more current to achieve the desired magnetic field at the center. Still, a Helmholtz pair isn’t bad if you want to use only two reasonably sized coils.

The Maxwell Coil

The Helmholtz coil produces a uniform magnetic field, whereas the Maxwell coil produces a uniform magnetic field gradient. It consists of two circular coils, in parallel planes having the same axis, that are separated by a distance d, but which carry current in opposite directions. Again, our goal will be to find the value of d that gives the most uniform magnetic field gradient. The magnetic field is

Bz = (μ0 N I R²/2) { 1/[R² + (z − d/2)²]^(3/2) − 1/[R² + (z + d/2)²]^(3/2) } .

The only difference between this case and that for the Helmholtz coil is the change in sign of the second term in the brackets. At z = 0 the magnetic field is zero. Moreover, the magnetic field is an odd function of z, so all even derivatives also vanish. The first derivative is

dBz/dz = −(3 μ0 N I R²/2) { (z − d/2)/[R² + (z − d/2)²]^(5/2) − (z + d/2)/[R² + (z + d/2)²]^(5/2) } .

This expression gives us the magnitude of the gradient at the origin, but it doesn’t help us create a more uniform gradient. The second derivative is

d²Bz/dz² = (3 μ0 N I R²/2) { [4(z − d/2)² − R²]/[R² + (z − d/2)²]^(7/2) − [4(z + d/2)² − R²]/[R² + (z + d/2)²]^(7/2) } .

This derivative is zero at the origin, regardless of the value of d. So, we have to look at the third derivative,

d³Bz/dz³ = (3 μ0 N I R²/2) { (z − d/2)[15R² − 20(z − d/2)²]/[R² + (z − d/2)²]^(9/2) − (z + d/2)[15R² − 20(z + d/2)²]/[R² + (z + d/2)²]^(9/2) } .

At z = 0, this will vanish if

d² = 3R² ,

or, in other words, d = √3 R = 1.73 R. Thus, the two coils have a greater separation for a Maxwell coil than for a Helmholtz coil. The Maxwell coil would be useful for producing the slice selection gradient during MRI (for more about the need for gradient fields in MRI, see Chapter 18 of IPMB).

Conclusion

Below is a plot of the normalized magnetic field as a function of z for the Helmholtz coil (blue) and the Maxwell coil (yellow). As you can see, the region with a uniform field or gradient is small. It depends on what level of accuracy you need, but if you are more than half a radius from the origin you will see significant deviations from homogeneity.
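
If you’d like to reproduce a plot like this yourself, here’s a rough Octave sketch (mine, in arbitrary units where R = 1 and μ0NIR²/2 = 1); the profile f(u) is the single-coil expression from above.

% On-axis field of a Helmholtz pair (d = R) and a Maxwell pair (d = sqrt(3) R).
R = 1;
z = linspace(-1.5, 1.5, 301);
f = @(u) 1 ./ (R^2 + u.^2).^(3/2);        % single-coil profile (arbitrary units)
d = R;                                    % Helmholtz separation
Bh = f(z - d/2) + f(z + d/2);
d = sqrt(3)*R;                            % Maxwell separation
Bm = f(z - d/2) - f(z + d/2);
plot(z, Bh/Bh(151), z, Bm/max(abs(Bm)));  % normalize; index 151 is z = 0
xlabel('z/R'); legend('Helmholtz', 'Maxwell');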
 

Russ Hobbie and I never discuss the Helmholtz coil in Intermediate Physics for Medicine and Biology. We don’t mention the Maxwell coil by name either, but Problem 33 of Chapter 18 analyzes a Maxwell pair, even if we don’t call it that.

The Maxwell coil is great for producing the magnetic field gradient dBz/dz needed for slice selection in MRI, but how do you produce the gradients dBz/dx and dBz/dy needed during MRI readout and phase encoding? That, my friends, is a story for another post.

Friday, September 29, 2023

Decay Plus Input at a Constant Rate Revisited

In Chapter 2 of Intermediate Physics for Medicine and Biology, Russ Hobbie and I discuss the problem of decay plus input at a constant rate.
Suppose that in addition to the removal of y from the system at a rate −by, y enters the system at a constant rate a, independent of y and t. The net rate of change of y is given by

dy/dt = a − by .    (2.25)

Then we go on to discuss how you can learn things about a differential equation without actually solving it.

It is often easier to write down a differential equation describing a problem than it is to solve it… However, a good deal can be learned about the solution by examining the equation itself. Suppose that y(0) = 0. Then the equation at t = 0 is dy/dt = a, and y initially grows at a constant rate a. As y builds up, the rate of growth decreases from this value because of the −by term. Finally when a − by = 0, dy/dt is zero and y stops growing. This is enough information to make the sketch in Fig. 2.13.

The equation is solved in Appendix F. The solution is

y = (a/b)(1 − e^(−bt)) .    (2.26)

… The solution does have the properties sketched in Fig. 2.13, as you can see from Fig. 2.14.
Figure 2.13 looks similar to this figure
Sketch of the initial slope a and final value a/b of y when y(0) = 0. In this figure, a=b=1.

 And Fig. 2.14 looks like this

A plot of y(t) using Eq. 2.26, with a=b=1.

However, Eq. 2.26 is not the only solution that is consistent with the sketch in Fig. 2.13. Today I want to present another function that is consistent with Fig. 2.13, but does not obey the differential equation in Eq. 2.25:

y = at/(1 + bt) .    (2.26’)

Let’s examine how this function behaves. When bt is much less than one, the function becomes y = at, so its initial growth rate is a. When bt is much greater than one, the function approaches a/b. The sketch in Fig. 2.13 is consistent with this behavior.

Below I show both Eqs. 2.26 and 2.26’ in the same plot.

A plot of y(t) using Eq. 2.26 (blue) and Eq. 2.26' (yellow), with a=b=1.

The function in Eq. 2.26 (blue) approaches its asymptotic value at large t more quickly than the function in Eq. 2.26’ (yellow).
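
Here’s a short Octave script (my own) that reproduces this comparison with a = b = 1.

% Compare Eq. 2.26, y = (a/b)(1 - exp(-b t)), with Eq. 2.26', y = a t/(1 + b t).
a = 1; b = 1;
t = linspace(0, 10, 200);
y26  = (a/b)*(1 - exp(-b*t));   % Eq. 2.26
y26p = a*t ./ (1 + b*t);        % Eq. 2.26'
plot(t, y26, t, y26p);
xlabel('t'); ylabel('y'); legend('Eq. 2.26', "Eq. 2.26'");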

The moral of the story is that you can learn a lot about the behavior of a solution by just inspecting the differential equation, but you can’t learn everything (or, at least, I can’t). To learn everything, you need to solve the differential equation. 

By the way, if Eq. 2.26’ doesn’t solve the differential equation in Eq. 2.25, then what differential equation does it solve? The answer is

dy/dt = (a − by)²/a .

How did I figure that out? Trial and error.

Friday, July 14, 2023

A Short Course in Vector Calculus

Want a short course in vector calculus? You can find one in Intermediate Physics for Medicine and Biology.

Divergence

The divergence is defined in Section 4.1 of IPMB, when discussing the continuity equation. The divergence is one way to differentiate a vector field. In our case, the vector field is the current density (or some other type of flux density), j. Its divergence is defined as

div j = ∇ · j = ∂jx/∂x + ∂jy/∂y + ∂jz/∂z .

When you take the divergence of a vector (a quantity that has both magnitude and direction), you get a scalar (a quantity that has magnitude but no direction). Because electrical charge is conserved, in the steady state the divergence of the electrical current density is zero.

Curl

The curl is defined in Section 8.6, when analyzing electromagnetic induction. It is another way to differentiate a vector,

            | x̂      ŷ      ẑ     |
∇ × E = | ∂/∂x   ∂/∂y   ∂/∂z |
            | Ex     Ey     Ez    |

The symbols x̂, ŷ, and ẑ are unit vectors, and the vertical lines indicate that you follow the rules for determinants when expanding this expression. The curl appears often when analyzing the magnetic field. In our case, the curl of the electric field equals the negative of the time derivative of the magnetic field (Faraday’s law of induction).

Gradient

The gradient is a way to differentiate a scalar field, V, to get a vector,

∇V = ∂V/∂x x̂ + ∂V/∂y ŷ + ∂V/∂z ẑ .

You can think of the gradient operator, ∇, as representing the vector x̂ ∂/∂x + ŷ ∂/∂y + ẑ ∂/∂z. The divergence is then found by taking the dot product of the gradient with a vector, and the curl is found by taking the cross product of the gradient with the vector. In electrostatics, V represents the electric potential (a scalar) and E represents the electric field (a vector). The two are related by

E = −∇V .

Laplacian

The Laplacian, ∇², is just the dot product of the gradient operator with itself. In other words,

∇²V = ∂²V/∂x² + ∂²V/∂y² + ∂²V/∂z² .

You can apply the Laplacian to a vector, but it is more commonly applied to a scalar (such as electrical potential, temperature, or concentration). The Europeans use ∆ to represent the Laplacian, but that’s just weird and we Americans know better than that.
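
If you want to convince yourself that these definitions hang together, here’s a quick Octave check (my own toy example, using the arbitrary field V = x²y + z): compute the gradient and the Laplacian by finite differences and compare with the analytical answers.

% Finite-difference check of grad and the Laplacian for V = x^2*y + z.
% Analytically: grad V = [2xy, x^2, 1] and Laplacian V = 2y.
V = @(x,y,z) x.^2.*y + z;
h = 1e-5; x = 1; y = 2; z = 3;
gx = (V(x+h,y,z) - V(x-h,y,z))/(2*h);   % dV/dx, expect 2xy = 4
gy = (V(x,y+h,z) - V(x,y-h,z))/(2*h);   % dV/dy, expect x^2 = 1
gz = (V(x,y,z+h) - V(x,y,z-h))/(2*h);   % dV/dz, expect 1
lap = (V(x+h,y,z) - 2*V(x,y,z) + V(x-h,y,z))/h^2 ...
    + (V(x,y+h,z) - 2*V(x,y,z) + V(x,y-h,z))/h^2 ...
    + (V(x,y,z+h) - 2*V(x,y,z) + V(x,y,z-h))/h^2;  % Laplacian, expect 2y = 4
printf("grad V = [%g %g %g], Laplacian V = %g\n", gx, gy, gz, lap);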

Other Coordinate Systems

We have written the divergence, curl, gradient, and Laplacian in Cartesian coordinates. These operators are more complicated in other coordinate systems. Appendix L of IPMB provides expressions for these operators in cylindrical and spherical coordinates.

The Divergence Theorem

The divergence theorem says that the volume integral of div j is equal to the surface integral of the normal component of j. We don’t dwell on this theorem in IPMB, but we do ask the reader to derive it in Homework Problem 4 of Chapter 4.

Stokes’ Theorem

We don’t discuss Stokes’ theorem in IPMB, but I’ve pointed out how we might include a homework problem about it in a previous blog post. Stokes’ theorem says that the line integral of a vector around a closed loop is equal to the surface integral of the curl of that vector over any surface bounded by the loop.

div, grad, curl, and all that, by h. m. schey.
So, almost all the big concepts of vector calculus are presented in IPMB. If, however, you want a little more detail, Russ and I recommend the wonderful book div, grad, curl, and all that, by Harry Schey. I learned vector calculus from the first edition of that book as an undergraduate physics major at the University of Kansas. Schey died five years ago, but his book lives on.

Friday, July 7, 2023

Integral of the Bessel Function

Have you ever been reading a book, making good progress with everything making sense, and then you suddenly stop and say “wait… what?” That happened to me recently as I was reading Homework Problem 31 in Chapter 12 of Intermediate Physics for Medicine and Biology. (Wait… what? I’m a coauthor of IPMB! How could there be any surprises for me?) The problem is about calculating the two-dimensional Fourier transform of 1/r, and it supplies the following Bessel function identity

∫₀^∞ J0(kr) dr = 1/k .

The function J0 is a Bessel function of the first kind of order zero. What surprised me is that if you let x = kr, you get that the integral of the Bessel function is one,

∫₀^∞ J0(x) dx = 1 .

Really? Here’s a plot of J0(x).

A plot of the J0(x) Bessel function versus x.

It oscillates like crazy and the envelope of those oscillations falls off very slowly. In fact, an asymptotic expansion for J0 at large x is

J0(x) ≈ √(2/(πx)) cos(x − π/4) .

The leading factor of 1/√x decays so slowly that its integral from zero to infinity does not converge. Yet, when you include the cosine so the function oscillates, the integral does converge. Here’s a plot of

∫₀^x J0(x′) dx′ .

A plot of the integral of the J0 Bessel function.

The integral approaches one at large x, but very slowly. So, the expression given in the problem is correct, but I sure wouldn’t want to do any numerical calculations using it if I had to truncate the upper limit of the integral at something less than infinity. That would be a mess!
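
To see just how slow the convergence is, here’s a quick Octave experiment (mine) using the built-in besselj function and simple trapezoidal integration.

% Integrate J0(x) from 0 to X for increasing X; the result creeps toward 1.
for X = [10 100 1000 10000]
  x = linspace(0, X, 100*X);     % fine grid (J0 oscillates with period ~2 pi)
  I = trapz(x, besselj(0, x));
  printf("X = %6d   integral = %.4f\n", X, I);
endfor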

Here’s another interesting fact. Bessel functions come in many orders—J0, J1, J2, etc.—and they all integrate to one.

Who’s responsible for these strangely-behaved functions? They’re named after the German astronomer Friedrich Bessel but they were first defined by the Swiss mathematician Daniel Bernoulli (1700–1782), a member of the brilliant Bernoulli family. The Bernoulli equation, mentioned in Chapter 1 of IPMB, is also named for Daniel Bernoulli. 

There was a time when I was in graduate school that I was obsessed with Bessel functions, especially modified Bessel functions that don’t oscillate. I’m not so preoccupied by them now, but they remain my favorite of the many special functions encountered in physics.

Friday, June 23, 2023

The Partition Function

Any good undergrad statistical mechanics class analyzes the partition function. However, Russ Hobbie and I don’t introduce the partition function in Intermediate Physics for Medicine and Biology. Why not? It’s a long story.

Russ and I do discuss the Boltzmann factor. Suppose you have a system that is in thermal equilibrium with a reservoir at absolute temperature T. Furthermore, suppose your system has discrete energy levels with energy Ei, where i is just an integer labeling the levels. The probability Pi of the system being in level i is proportional to the Boltzmann factor, exp(–Ei/kBT),

Pi = C exp(–Ei/kBT),

where exp is the exponential function, kB is the Boltzmann constant and C is a constant of proportionality. How do you find C? Any probability must be normalized: the sum of the probabilities must equal one,

Σ Pi = 1 ,

where Σ indicates a summation over all values of i. This means

Σ C exp(–Ei/kBT) = 1,

or

C = 1/[Σ exp(–Ei/kBT)] .

The sum in the denominator is the partition function, and is usually given the symbol Z,

Z = Σ exp(–Ei/kBT) .

In terms of the partition function, the probability Pi of being in state i is simply

Pi = (1/Z) exp(–Ei/kBT) .

An Introduction to Thermal Physics, by Daniel Schroeder.
Here is what Daniel Schroeder writes in his excellent textbook An Introduction to Thermal Physics,
The quantity Z is called the partition function, and turns out to be far more useful than I would have suspected. It is a “constant” in that it does not depend on any particular state s [he uses “s” rather than “i” to count states], but it does depend on temperature. To interpret it further, suppose once again that the ground state has energy zero. Then the Boltzmann factor for the ground state is 1, and the rest of the Boltzmann factors are less than 1, by a little or a lot, in proportion to the probabilities of the associated states. Thus, the partition function essentially counts how many states are accessible to the system, weighting each one in proportion to its probability.
To see why it’s so useful, let’s define β as 1/kBT. The Boltzmann factor is then

exp(–βEi)

and the partition function is

Z = Σ exp(–βEi) .

The average energy, <E>, is

<E> = [Σ Ei exp(–βEi)]/[Σ exp(–βEi)] .

The denominator is just Z. The numerator can be written as the negative of the derivative of Z with respect to β, −dZ/dβ (try it and see). So, the average energy is

<E> = −(1/Z) dZ/dβ .
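
If you don’t feel like doing the algebra, here’s an Octave check (my own made-up example: a two-level system with energies 0 and 1, in units where kB = 1) comparing the direct average with −(1/Z) dZ/dβ computed by a finite difference.

% Two-level system: E = 0 and E = 1, with kB = 1.
E = [0, 1];
beta = 2;                                % inverse temperature, 1/(kB T)
Z = sum(exp(-beta*E));                   % partition function
P = exp(-beta*E)/Z;                      % Boltzmann probabilities
Eavg = sum(E.*P);                        % direct average energy
db = 1e-6;                               % finite-difference step
dZdb = (sum(exp(-(beta+db)*E)) - sum(exp(-(beta-db)*E)))/(2*db);
printf("direct <E> = %.6f,  -(1/Z) dZ/dbeta = %.6f\n", Eavg, -dZdb/Z);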

I won’t go on, but there are other quantities that are similarly related to the partition function. It’s surprisingly useful.

Is the partition function hidden in IPMB? You might recognize it in Eq. 3.37, which determines the average kinetic energy of a particle at temperature T (the equipartition of energy theorem). It looks a little different, because there’s a continuous range of energy levels, so the sum is disguised as an integral. You can see it again in Eq. 18.8, when evaluating the average value of the magnetic moment during magnetic resonance imaging. The partition function is there, but it’s nameless.

Why didn’t Russ and I introduce the partition function? In the Introduction of IPMB Russ wrote: “Each subject is approached in as simple a fashion as possible. I feel that sophisticated mathematics, such as vector analysis or complex exponential notation, often hides physical reality from the student.” Like Russ, I think that the partition function is a trick that makes some equations more compact, but it hides the essential physics. So we didn’t use it.

Friday, June 9, 2023

Is Quantum Mechanics Necessary for Understanding Magnetic Resonance?

Hanson, L., “Is Quantum Mechanics Necessary for Understanding Magnetic Resonance?” Concepts Magn. Reson., 32:329–340, 2008.
In Chapter 18 of Intermediate Physics for Medicine and Biology, Russ Hobbie and I discuss magnetic resonance imaging. Like many authors, we derive an expression for the magnetization of the tissue using quantum mechanics. Must we use quantum theory? In an article published in Concepts in Magnetic Resonance, Lars Hanson asks the same question: “Is Quantum Mechanics Necessary for Understanding Magnetic Resonance?” (Volume 32, Pages 329–340, 2008). The abstract is given below.
Educational material introducing magnetic resonance typically contains sections on the underlying principles. Unfortunately the explanations given are often unnecessarily complicated or even wrong. Magnetic resonance is often presented as a phenomenon that necessitates a quantum mechanical explanation whereas it really is a classical effect, i.e. a consequence of the common sense expressed in classical mechanics. This insight is not new, but there have been few attempts to challenge common misleading explanations, so authors and educators are inadvertently keeping myths alive. As a result, new students’ first encounters with magnetic resonance are often obscured by explanations that make the subject difficult to understand. Typical problems are addressed and alternative intuitive explanations are provided.
How would IPMB have to be changed to remove quantum mechanics from the analysis of MRI? Quantum ideas first appear in the last paragraph of Section 18.2 about the source of the magnetic moment, where we introduce the idea that the z-component of the nuclear spin in a magnetic field is quantized and can take on values that are integral multiples of the reduced Planck’s constant, ℏ. Delete that paragraph.

Section 18.3 is entirely based on quantum mechanics. To find the average value of the z-component of the spin, we sum over all quantum states, weighted by the Boltzmann factor. The end result is an expression for the magnetization as a function of the magnetic field. We could, alternatively, do this calculation classically. Below is a revised Section 18.3 that uses only classical mechanics and classical thermodynamics.
18.3 The Magnetization

The MR [magnetic resonance] image depends on the magnetization of the tissue. The magnetization of a sample, M, is the average magnetic moment per unit volume. In the absence of an external magnetic field to align the nuclear spins, the magnetization is zero. As an external magnetic field B is applied, the spins tend to align in spite of their thermal motion, and the magnetization increases, proportional at first to the external field. If the external field is strong enough, all of the nuclear magnetic moments are aligned, and the magnetization reaches its saturation value.

We can calculate how the magnetization depends on B. Consider a collection of spins of a nuclear species in an external magnetic field. This might be the hydrogen nuclei (protons) in a sample. The spins do not interact with each other but are in thermal equilibrium with the surroundings, which are at temperature T. We do not consider the mechanism by which they reach thermal equilibrium. Since the magnetization is the average magnetic moment per unit volume, it is the number of spins per unit volume, N, times the average magnetic moment of each spin: M=N<μ>, where μ is the magnetic moment of a single spin.

To obtain the average value of the z component of the magnetic moment, we must average over all spin directions, weighted by the probability that the magnetic moment points in each direction. Since the spins are in thermal equilibrium with the surroundings, the probability is proportional to the Boltzmann factor of Chap. 3, e^(−U/kBT) = e^(μB cosθ/kBT), where kB is the Boltzmann constant and θ is the angle between the magnetic moment and the magnetic field. The denominator in Eq. 18.8 normalizes the probability:

<μz> = [∫₀^π μ cosθ e^(μB cosθ/kBT) sinθ dθ] / [∫₀^π e^(μB cosθ/kBT) sinθ dθ] .

The factor of sinθ arises when calculating the solid angle in spherical coordinates (see Appendix L). At room temperature μB/(kBT) ≪ 1 (see Problem 4), and it is possible to make the approximation e^x ≈ 1 + x. The integral in the numerator then has two terms:

μ ∫₀^π cosθ sinθ dθ + (μ²B/kBT) ∫₀^π cos²θ sinθ dθ .

The first integral vanishes. The second is equal to 2/3 (hint: use the substitution u = cosθ). The denominator is

∫₀^π sinθ dθ + (μB/kBT) ∫₀^π cosθ sinθ dθ .

The first integral is 2; the second vanishes. Therefore we obtain

<μz> = μ²B/(3kBT) .

The z component of M is

Mz = Nμ²B/(3kBT) ,

which is proportional to the applied field.
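
As a numerical aside (my own check, not part of the revised section), you can test this small-argument result against the exact classical average, <μz>/μ = coth(x) − 1/x with x = μB/kBT, which should approach x/3 for small x.

% Compare the exact classical (Langevin) average with the approximation x/3.
x = [0.001 0.01 0.1 1];
L = coth(x) - 1./x;                   % exact <mu_z>/mu
printf("x = %.3f   exact = %.6f   x/3 = %.6f\n", [x; L; x/3]);
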
The last place quantum mechanics is mentioned is in Section 18.6 about relaxation times. The second paragraph, starting “One way to analyze the effect…”, can be deleted with little loss of meaning; it is almost an aside.

So, to summarize, if you want to modify Chapter 18 of IPMB to eliminate any reference to quantum mechanics, then 1) delete the last paragraph of Section 18.2, 2) replace Section 18.3 with the modified text given above, and 3) delete the second paragraph in Section 18.6. Then, no quantum mechanics appears, and Planck’s constant is absent. Everything is classical, just the way I like it.

Friday, February 24, 2023

A Simple Mathematical Function Representing the Intracellular Action Potential

In Problem 14 of Chapter 7 in Intermediate Physics for Medicine and Biology, Russ Hobbie and I chose a strange-looking function to represent the intracellular potential along a nerve axon, vi(x). It’s zero everywhere except in the range −a < x < a, where it’s

vi(x) = 1 − 2(x/a)²    for |x| ≤ a/2 ,

vi(x) = 2(1 − |x|/a)²    for a/2 ≤ |x| ≤ a .

Why this function? Well, it has several nice properties, which I’ll leave for you to explore in this new homework problem.
Section 7.4

Problem 14 ¼. For the intracellular potential, vi(x), given in Problem 14
(a) show that vi(x) is an even function,
(b) evaluate vi(x) at x = 0,
(c) show that vi(x) and dvi(x)/dx are continuous at x = 0, a/2 and a, and
(d) plot vi(x), dvi(x)/dx, and d2vi(x)/dx2 as functions of x, over the range −2a < x < 2a.
This representation of vi(x) has a shape like that of an action potential. Other functions also have a similar shape, such as a Gaussian. But our function is nice because it’s non-zero over only a finite region (−a < x < a) and it’s represented by a simple, low-order polynomial rather than a special function. An even simpler function for vi(x) would be a triangular waveform, like that shown in Figure 7.4 of IPMB. However, that function has a discontinuous derivative, and therefore its second derivative is infinite at discrete points (delta functions), making it tricky (but not too tricky) to deal with when calculating the extracellular potential (Eq. 7.21). Our function in Problem 14 ¼ has a discontinuous but finite second derivative.
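
Here’s a short Octave script (mine) for part (d) of Problem 14 ¼, using the piecewise form quoted above with a = 1; for simplicity the derivatives are computed numerically.

% Plot vi(x) and its first two derivatives for the Problem 14 waveform, a = 1.
a = 1;
x = linspace(-2*a, 2*a, 801);
vi = (abs(x) < a/2).*(1 - 2*(x/a).^2) ...
   + (abs(x) >= a/2 & abs(x) < a).*2.*(1 - abs(x)/a).^2;
dvi  = gradient(vi, x);    % numerical first derivative
d2vi = gradient(dvi, x);   % numerical second derivative
plot(x, vi, x, dvi, x, d2vi);
xlabel('x'); legend('v_i', 'dv_i/dx', 'd^2v_i/dx^2');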

The main disadvantage of the function in Problem 14 ¼ is that the depolarization phase of the “action potential” has the same shape as the repolarization phase. In a real nerve, the upstroke is usually briefer than the downstroke. The next new homework problem asks you to design a new function vi(x) that does not suffer from this limitation.
Section 7.4

Problem 14 ½. Design a piecewise continuous mathematical function for the intracellular potential along a nerve axon, vi(x), having the following properties. 
(a) vi(x) is zero outside the region −a < x < 2a
(b) vi(x) and its derivative dvi(x)/dx are continuous. 
(c) vi(x) is maximum and equal to one at x = 0. 
(d) vi(x) can be represented by a polynomial bi + ci x + di x², where i refers to four regions: 
        i = 1,    −a < x < −a/2 
        i = 2, −a/2 < x < 0 
        i = 3,      0 < x < a
        i = 4,      a < x < 2a.
Finally, here’s another function that I’m particularly fond of.
Section 7.4

Problem 14 ¾. Consider a function that is zero everywhere except in the region −a < x < 2a, where it is

(a) Plot vi(x) versus x over the region −a < x < 2a,
(b) Show that vi(x) and its derivative are each continuous. 
(c) Calculate the maximum value of vi(x).
Simple functions like those described in this post rarely capture the full behavior of biological phenomena. Instead, they are “toy models” that build insight. They are valuable tools when describing biological phenomena mathematically.

Friday, November 25, 2022

Reduced Mass

In Section 14.4 of Intermediate Physics for Medicine and Biology, Russ Hobbie and I discuss molecular energy levels. In particular, we examine translational, rotational, and vibrational levels. When analyzing rotational levels, we consider a simple diatomic molecule and divide the motion into two parts: a uniform translation of the center of mass, and a rotation about the center of mass. We show that the rotational energy can be written as ½Iω², where ω is the rotational angular frequency and I is the moment of inertia. The moment of inertia is I = [m1m2/(m1 + m2)] R², where m1 and m2 are the masses of the two atoms making up the diatomic molecule, and R is the distance between them. In quantum mechanics, the spacing of rotational energy levels depends on I.

Later in the same section, Russ and I consider vibrational motion. However, we don’t do a detailed analysis for a diatomic molecule, like we did for rotational motion. In this blog post, I will remedy that situation and present the analysis of vibrations of a diatomic molecule. Our goal is to derive an expression for the vibration frequency in terms of the masses of the two atoms and the spring constant connecting them.

Let’s do the analysis in one dimension. Consider two atoms with mass m1 and m2 connected by a spring with spring constant k. The position of m1 is x1, and the position of m2 is x2.

First, write down Newton’s second law for each atom.

    m1 d²x1/dt² = − k (x1 − x2) ,

    m2 d²x2/dt² = + k (x1 − x2) .

Next, define two new variables, as we did for rotational motion: x, the position of the center of mass, and X, the distance between the two masses

     x = [m1/(m1+m2)] x1 + [m2/(m1+m2)] x2 ,

    X = x1x2 .

Then, rewrite Newton’s second law in terms of x and X. After some algebra, we get two equations

     (m1+m2) d²x/dt² = 0 ,

     [m1m2/(m1+m2)] d²X/dt² = − kX .

The first equation represents a free particle of mass M, where

     M = m1+m2 ,

and the second equation represents a bound particle with spring constant k and mass m (often called the reduced mass)

     m = m1m2/(m1+m2) . 

The angular frequency of the vibration is therefore

     ω = √(k/m)

(If you don’t follow that last step, see Appendix F of IPMB).

We have reached our goal: the angular frequency of the vibration, ω, written in terms of k, m1, and m2

     ω = √[k (m1+m2)/(m1m2)] .

In quantum mechanics, the energy levels depend on ω, and therefore on the reduced mass m.

If m1 >> m2 then m is approximately m2. Likewise, if m2 >> m1 then m is approximately m1. For example, if you want the vibration frequency of hydrogen chloride (HCl), the reduced mass is close to the mass of the hydrogen atom.

If m1 = m2 = μ (like for molecules such as O2 and N2), then the reduced mass m is equal to μ/2. It’s that factor of two in the denominator that’s the surprise.
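
Here’s a quick Octave illustration (my own, using standard atomic masses in unified atomic mass units) of both limits.

% Reduced mass m = m1*m2/(m1 + m2), in unified atomic mass units (u).
redmass = @(m1, m2) m1*m2/(m1 + m2);
printf("HCl: m = %.3f u (close to hydrogen's 1.008 u)\n", redmass(1.008, 35.45));
printf("N2:  m = %.3f u (half of nitrogen's 14.007 u)\n", redmass(14.007, 14.007));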

Quantum Physics of Atoms, Molecules, Solids, Nuclei and Particles, by Eisberg and Resnick.

The relationship between m, m1, and m2 can be written

    1/m = 1/m1 + 1/m2 .

This looks just like the equation for adding resistors in parallel.

If you want to learn more, I suggest looking at Chapter 12 of Quantum Physics of Atoms, Molecules, Solids, Nuclei, and Particles, by Robert Eisberg and Robert Resnick, often cited in IPMB.

Friday, September 2, 2022

Numerical Integration

A homework problem in Chapter 14 of Intermediate Physics for Medicine and Biology states
Problem 28. Integrate Eq. 14.33 over all wavelengths to obtain the Stefan-Boltzmann law, Eq. 14.34. You will need the integral
∫₀^∞ x³/(e^x − 1) dx = π⁴/15 .
Equation 14.33 is Planck’s blackbody radiation law, and Eq. 14.34 gives the total power emitted by a blackbody.

Suppose Russ Hobbie and I had not given you that integral. What would you do? Previously in this blog I explained how the integral can be evaluated analytically and perhaps you’re skilled enough to perform that analysis yourself. But it’s complicated, and I doubt most scientists could do it. If you couldn’t, what then?

You could integrate numerically. Your goal is to find the area under the curve shown below.
A plot of the function to be integrated, as a function of x.
Unfortunately x ranges from zero to infinity (the plot shows the function up to only x = 10). You can’t extend x all the way to infinity in a numerical calculation, so you must either truncate the definite integral at some large value of x or use a trick.

A good trick is to make a change of variable, such as

t = x/(1 + x) ,    or equivalently    x = t/(1 − t) .

When x equals zero, t is also zero; when x equals infinity, t is one. The integral becomes

∫₀¹ t³/[(e^(t/(1−t)) − 1)(1 − t)⁵] dt .
Although this integral looks messier than the original one, it’s actually easier to evaluate because the range of t is finite: zero to one. The integrand now looks like this: 

A plot of the function to be integrated, as a function of t. 
The colored stars in these two plots are to guide the reader’s eye to corresponding points. The blue star at t = 1 is not shown in the first plot because it corresponds to x = ∞.

We can evaluate this integral using the trapezoid rule. We divide the range of t into N subregions, each extending over a length of Δt = 1/N. Ordinarily, we have to be careful dealing with the two endpoints at t = 0 and 1, but in this case the function we are integrating goes to zero at the endpoints and therefore contributes nothing to the sum. The approximation is shown below for N = 4, 8, and 16. 

Plots of three different approximations of the integral, for N=4, 8, and 16.
 
The area of the purple rectangles approximates the area under the red curve. This approximation gets better as N gets bigger. In the limit as N goes to ∞, you get the exact integral.

I performed the calculation using the software Octave (a free version of Matlab). The program is:
N = 8;                                        % number of subregions
dt = 1/N;                                     % width of each subregion
s = 0;                                        % running sum
for i = 1:N-1                                 % endpoints contribute zero, so skip them
     t = i*dt;
     s = s + dt*t^3/((exp(t/(1-t))-1)*(1-t)^5);
endfor
printf("N = %d, integral = %.4f\n", N, s);
I found the results shown below. The error is the difference between the numerical integration and the exact result (π⁴/15 = 6.4939…), divided by the exact result, and expressed as a percent difference.

   N         I         % error
   2      1.1640     –82
   4      6.2823      –3.26
   8      6.6911       3.04
  16      6.5055       0.178
  32      6.4940       0.000282
  64      6.4939       0.00000235
 128      6.4939       0.00000000174

These results show that you can evaluate the integral accurately without too much effort. You could even imagine doing this by hand if you didn’t have access to a computer—using, say, N = 16—and getting an answer accurate to better than two parts per thousand.

For many purposes, a numerical solution such as this one is adequate. However, 6.4939… doesn’t look as pretty as π⁴/15. I wonder how many people could calculate 6.4939 and then say “Hey, I know that number; it’s π⁴/15”!

Friday, September 10, 2021

Is Shot Noise Also White Noise?

In Chapters 9 and 11 of Intermediate Physics for Medicine and Biology, Russ Hobbie and I discuss shot noise.

9.8.1 Shot Noise

The first (and smallest) limitation [on our ability to measure current] is called shot noise. It is due to the fact that the charge is transported by ions that move randomly and independently through the channels....

11.16.2 Shot Noise

Chapter 9 also mentioned shot noise, which occurs because the charge carriers have a finite charge, so the number of them passing a given point in a circuit in a given time fluctuates about an average value. One can show that shot noise is also white noise [my italics].
Introduction to Membrane Noise, by Louis DeFelice.
How does one show that shot noise is white noise (independent of frequency)? I’m going to follow Lou DeFelice’s explanation in his book Introduction to Membrane Noise (cited in IPMB). I won’t give a rigorous proof. Instead, I’ll first state Campbell’s theorem (without proving it), and then show that the whiteness of shot noise is a consequence of that theorem.

Campbell’s Theorem

To start, I’ll quote DeFelice, but I will change the names of a few variables.
Suppose N impulses i(t) arrive randomly in the time interval T. The sum of these will result in a random noise signal I(t). This is shown qualitatively in Figure 78.1.

Below is my version of Fig. 78.1.

A diagram illustrating the sum of N impulses, i(t), each shown in red, arriving randomly in the time interval T. The blue curve represents their sum, I(t), and the green dashed line represents the average, <I(t)>. Adapted from Fig. 78.1 in Introduction to Membrane Noise by Louis DeFelice.

DeFelice shows that the average of I(t), which I’ll denote <I(t)>, is

<I(t)> = (N/T) ∫ i(t) dt .
Here he lets T and N both be large, but their ratio (the average rate at which the impulses arrive) remains finite.

He then shows that the variance of I(t), called σI², is

σI² = (N/T) ∫ i²(t) dt .
Finally, he writes

In order to calculate the spectral density of I(t) from i(t) we need Rayleigh’s theorem [also known as Parseval’s theorem]…
∫ i²(t) dt = ∫ |î(f)|² df ,
where î(f) is the Fourier transform of i(t) [and f is the frequency].

He concludes that the spectral density SI(f) is given by

SI(f) = (2N/T) |î(f)|² .

These three results (for the average, the variance, and the spectral density) constitute Campbell’s theorem.

Shot Noise

Now, let’s analyze shot noise by using Campbell’s theorem assuming the impulse is a delta function (zero everywhere except at t = 0 where it’s infinite). Set i(t) = q δ(t), where q is the charge of each discrete charge carrier.

First, the average <I(t)> is simply Nq/T, or the total charge divided by the total time. 

Second, the variance is the integral of the delta function squared. When any function is multiplied by a delta function and then integrated over time, you get that function evaluated at time zero. So, the integral of the square of the delta function gives the delta function itself evaluated at zero, which is infinity. Yikes! The variance of shot noise is infinite.

Third, to get the spectral density of shot noise we need the Fourier transform of the delta function, which is just a constant: î(f) = q. Therefore

SI(f) = (2N/T) q² = 2q<I> .
The key point is that SI(f) is independent of frequency; it’s white.

DeFelice ends with

This [the expression for the spectral density] is the formula for shot noise first derived by Schottky (1918, pp. 541-567) in 1918. Evidently, the variance defined as
σI² = ∫₀^∞ SI(f) df
is again infinite; this is a consequence of the infinitely small width of the delta function.
As DeFelice reminds us, shot noise is white because the delta function is infinitely narrow. As soon as you assume i(t) has some width (perhaps the time it takes for a charge to cross the membrane), the spectrum will fall off at high frequencies, the variance won’t be infinite (thank goodness!), and the noise won’t be white. The bottom line is that shot noise is white because the Fourier transform of a delta function is a constant.
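
If you’d like to see the whiteness numerically, here’s a toy Octave simulation (my own, with arbitrary parameters): randomly timed unit impulses produce a current whose power spectrum is flat, apart from statistical scatter.

% Toy shot-noise simulation: random impulses in discrete time.
q = 1; T = 2^16; rate = 0.05;       % charge, record length, impulses per sample
I = q*(rand(1, T) < rate);          % impulse train (0 or q each sample)
I = I - mean(I);                    % remove the DC component
S = abs(fft(I)).^2 / T;             % periodogram (power spectral density)
semilogy(S(2:T/2));                 % flat spectrum, up to statistical scatter
xlabel('frequency index'); ylabel('S_I');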

Conclusion

Perhaps you’re thinking I haven’t helped you all that much. I merely changed your question from “why is shot noise white” to “how do I prove Campbell’s theorem.” You have a point. Maybe proving Campbell’s theorem can be the story of another post.

I met Lou DeFelice in 1984, when I was a graduate student at Vanderbilt University and he came to give a talk. In the summer of 1986, my PhD advisor John Wikswo and I traveled to Emory University to visit DeFelice and Robert DeHaan. During that trip, Wikswo and I were walking across the Emory campus when Wikswo decided he knew a short cut (he didn’t). He left the sidewalk and entered a forest, with me following behind him. After what seemed like half an hour of wandering through a thicket, we emerged from the woods at a back entrance to the Yerkes Primate Research Center. We’re lucky we weren’t arrested.

DeFelice joined the faculty at Vanderbilt in 1995, and we both worked there in the late 1990s. He was a physicist by training, but spent most of his career studying electrophysiology. Sadly, in 2016 he passed away.