Friday, May 14, 2010

Single-Pool Exponential Decomposition Models: Potential Pitfalls in their Use in Ecology Studies

In Section 11.2 of the 4th edition of Intermediate Physics for Medicine and Biology, Russ Hobbie and I discuss fitting data using nonlinear least squares. Our first example in this section is a fit using a single exponential decay, y(x) = a e^(–bx), where a and b are to be determined. We suggest that the reader
“take logarithms of each side of the equation

log y = log a – b x log e

v = a' – b' x.

This can be fit by the linear [least squares] equation, determining constants a' and b' using Eqs. 11.5.”
This method works fine for ideal data, but in almost any real application the data will be corrupted by noise. In that case, fitting a linear equation to the logarithm of the data may not be wise. I discussed this issue last year in the May 22 entry to this blog, but wish to explore it in more detail this week.
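For ideal data, the log-transform procedure can be sketched in a few lines of Python. This is only an illustration, not code from the book: the function name fit_log_linear and the values a = 2 and b = 0.5 are made up for the example. It uses natural logarithms (so log e = 1 and b' = b) and the standard closed-form expressions for a straight-line least squares fit, which are assumed here to play the role of Eqs. 11.5.

```python
import math

def fit_log_linear(xs, ys):
    """Fit y = a*exp(-b*x) by linear least squares on ln(y).

    Returns (a, b). Every y must be strictly positive.
    """
    n = len(xs)
    vs = [math.log(y) for y in ys]            # v = ln y = a' - b' x
    x_mean = sum(xs) / n
    v_mean = sum(vs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxv = sum((x - x_mean) * (v - v_mean) for x, v in zip(xs, vs))
    slope = sxv / sxx                          # slope of the line is -b'
    intercept = v_mean - slope * x_mean        # intercept is a' = ln a
    return math.exp(intercept), -slope

# Ideal, noise-free data generated with a = 2 and b = 0.5 (illustrative values).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * math.exp(-0.5 * x) for x in xs]

a_fit, b_fit = fit_log_linear(xs, ys)
print(a_fit, b_fit)   # close to the values used to generate the data
```

With noise-free data the fit recovers a and b to within rounding error. Additive noise changes the picture, because a fixed error in y becomes a large error in log y wherever y is small, so the late, small data points can dominate the straight-line fit.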

Russ recently coauthored a paper on this topic, published in the journal Ecology (Volume 91, Pages 1225–1236). In collaboration with his daughter Sarah Hobbie (Associate Professor in the Department of Ecology, Evolution and Behavior at the University of Minnesota) and her former postdoc E. Carol Adair (currently with the National Center for Ecological Analysis and Synthesis at the University of California Santa Barbara), Russ examined “Single-Pool Exponential Decomposition Models: Potential Pitfalls in their Use in Ecology Studies.” The abstract to the paper is given below.
The importance of litter decomposition to carbon and nutrient cycling has motivated substantial research. Commonly, researchers fit a single-pool negative exponential model to data to estimate a decomposition rate (k). We review recent decomposition research, use data simulations, and analyze real data to show that this practice has several potential pitfalls. Specifically, two common decisions regarding model form (how to model initial mass) and data transformation (log-transformed vs. untransformed data) can lead to erroneous estimates of k. Allowing initial mass to differ from its true, measured value resulted in substantial over- or underestimation of k. Log-transforming data to estimate k using linear regression led to inaccurate estimates unless errors were lognormally distributed, while nonlinear regression of untransformed data accurately estimated k regardless of error structure. Therefore, we recommend fixing initial mass at the measured value and estimating k with nonlinear regression (untransformed data) unless errors are demonstrably lognormal. If data are log-transformed for linear regression, zero values should be treated as missing data; replacing zero values with an arbitrarily small value yielded poor k estimates. These recommendations will lead to more accurate k estimates and allow cross-study comparison of k values, increasing understanding of this important ecosystem process.
The authors performed a massive review of the literature, reading and analyzing nearly 500 papers about litter decomposition, most of which fit data to an exponential decay, e^(–kt). The bottom line is that doing a linear least squares fit to the logarithm of the data can cause significant errors; a nonlinear least squares fit is much better. The manuscript concludes “We suggest that careful selection of fitting methods, as we have described above, will lead to more accurate and comparable k estimates, thereby increasing our understanding of this important ecosystem process.” Of course, my favorite thing about their paper is that it cites the 4th edition of Intermediate Physics for Medicine and Biology!
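A nonlinear fit of the untransformed data need not be complicated. The sketch below fixes the initial mass at its measured value, as the abstract recommends, and finds the single remaining parameter k by golden-section search on the sum of squared residuals. The function names, the search bounds, and the data are all hypothetical, and golden-section search is just one convenient way to minimize a function of one variable, not necessarily the method used in the paper.

```python
import math

def sum_sq(k, ts, ms, m0):
    """Sum of squared residuals for the model m(t) = m0 * exp(-k t)."""
    return sum((m - m0 * math.exp(-k * t)) ** 2 for t, m in zip(ts, ms))

def fit_k(ts, ms, m0, k_lo=0.0, k_hi=5.0, tol=1e-8):
    """Estimate k by golden-section search, with the initial mass m0 held fixed.

    Assumes the sum of squares has a single minimum between k_lo and k_hi.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = k_lo, k_hi
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if sum_sq(c, ts, ms, m0) < sum_sq(d, ts, ms, m0):
            hi = d        # the minimum lies in [lo, d]
        else:
            lo = c        # the minimum lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2.0

# Made-up example: fraction of mass remaining, generated with k = 0.3 per
# unit time and a measured initial mass m0 = 1.
m0 = 1.0
ts = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ms = [m0 * math.exp(-0.3 * t) for t in ts]

k_fit = fit_k(ts, ms, m0)
print(k_fit)   # close to the k used to generate the data
```

Because the fit works on the masses themselves, nothing in it requires the data to be positive, so a zero measurement can be left in the data set as it is.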

One pitfall can be illustrated by considering measurements of the voltage across a resistor in an RC circuit. The voltage decays exponentially with the RC time constant. However, thermal (Johnson) noise is also present (see Section 9.8). Once the voltage decays to less than the Johnson noise, the measured voltage fluctuates between positive and negative values. The logarithm of a negative voltage is undefined, so you can’t do a linear least squares fit to the logarithm of the data if the data can be negative. The problem remains even when the data are nonnegative (as in Hobbie’s paper) if the data can be zero. However, if you make a nonlinear least squares fit of the data itself (rather than the logarithm of the data), the problem vanishes.
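The RC example can be made concrete with a short Python sketch. The voltage samples below are invented for illustration (they are not measurements), the initial voltage is assumed to be known and fixed at 1, and the coarse grid scan over the time constant is simply the easiest way to show that the sum of squares remains well defined. A real analysis would use a proper minimizer.

```python
import math

# Made-up voltage samples (arbitrary units) from a decay with V0 = 1 that
# has sunk into the noise at late times; these are not real measurements.
ts = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
vs = [1.00, 0.38, 0.13, 0.07, -0.01, 0.0, -0.01, 0.01]

def undefined_log_indices(values):
    """Return the indices of samples whose logarithm cannot be taken."""
    bad = []
    for i, v in enumerate(values):
        try:
            math.log(v)
        except ValueError:        # raised for zero and negative inputs
            bad.append(i)
    return bad

def sum_sq(tau, ts, vs, v0=1.0):
    """Sum of squared residuals for the model V(t) = v0 * exp(-t / tau)."""
    return sum((v - v0 * math.exp(-t / tau)) ** 2 for t, v in zip(ts, vs))

# Samples that a fit to log V would have to discard or alter.
bad_indices = undefined_log_indices(vs)

# The untransformed fit uses every sample: scan tau from 0.5 to 2.0 in
# steps of 0.01 and keep the value with the smallest sum of squares.
taus = [0.5 + 0.01 * i for i in range(151)]
tau_fit = min(taus, key=lambda tau: sum_sq(tau, ts, vs))

print(bad_indices, tau_fit)
```

Three of the eight samples have no logarithm, yet all eight contribute to the nonlinear fit, which returns a time constant near the value of 1 that the early samples suggest.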

In order to explain these observations in Intermediate Physics for Medicine and Biology, Russ and I (mainly Russ) wrote an Addendum available at the book’s website. It lists what changes are needed to properly explain least squares fitting of exponential data. Enjoy!

1 comment:

  1. Very interesting... not the particular problem I am facing, but I might gain some insight by working through this one.
