Global Warming - How to Lie with Statistics

I usually try to avoid getting involved in debates about global warming these days - it tends to suck up huge amounts of my research time when I try to get things right while the other "debaters" just slap together a few random links they got from Google within half a minute. However, sometimes I just cannot resist, and the following examination is the latest result.

It started out with this debate. I was given a link to this article by one Christopher Monckton as evidence that global temperatures in the last decade had not in fact increased, but decreased. So I took a look at the provided graph and was fascinated - by the sheer amount of blatant manipulation I encountered:

Fortunately, the author provided a source for his data - the HadCRUT3 data set. So I set out to recreate the graph, and managed to do so:

First of all, while the data he used did show a cooling trend, my linear fit (done with gnuplot) produced a cooling of only

-0.00156427 °C/month.

Extrapolated for an entire decade, like the author has done, this would translate into:

-0.00156427*12*10 °C/decade = -0.1877124 °C/decade

This is not even half as much as the 0.4 °C/decade the author claimed. But wait, it gets better!
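For anyone who wants to check the conversion, here is a minimal Python sketch. It uses NumPy's `polyfit` rather than the gnuplot fit behind my graphs, and the helper name `trend_per_decade` is just something I made up for illustration; the only number taken from the analysis above is the fitted slope.

```python
import numpy as np

def trend_per_decade(monthly_anomalies):
    """Least-squares slope of a monthly series, expressed in °C per decade."""
    y = np.asarray(monthly_anomalies, dtype=float)
    months = np.arange(y.size)               # 0, 1, 2, ... one step per month
    slope_per_month = np.polyfit(months, y, 1)[0]
    return slope_per_month * 12 * 10         # 12 months/year, 10 years/decade

# The slope reported above, converted the same way.
reported_slope = -0.00156427                 # °C/month, from the gnuplot fit
reported_decadal = reported_slope * 12 * 10
print(f"{reported_decadal:.7f} °C/decade")
print(f"fraction of the claimed 0.4 °C/decade: {abs(reported_decadal) / 0.4:.2f}")
```

Feeding the actual HadCRUT3 monthly anomalies for a chosen period into `trend_per_decade` should give the same kind of figure, though I have not re-run the original fit this way.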

It seems that the author of the article deliberately started his use of the 2001-2008 HadCRUT3 data with one of the hottest months in this period, January 2002 (so he didn't use any 2001 data after all, despite the caption of the graph), and ended it with the very coldest month in this period, the abnormally cold January 2008:

With such a self-selected data set to confirm his bias, is it any wonder that he got a significant cooling trend?

And what's with using only six years and one month to extrapolate a "cooling per decade" value - especially when the same data set goes back for far more than a decade and a real value could be easily calculated?
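To show how much the choice of end points matters, here is a small sketch on entirely synthetic data (not HadCRUT3). I assume a steady warming of 0.0015 °C/month with a slow 12-year oscillation on top; both numbers are invented for the illustration. Fitting the whole series gives a warming trend, while fitting only the 73 months from a peak of the oscillation to the following trough gives a cooling trend.

```python
import numpy as np

def slope_per_decade(t, y):
    """Least-squares slope in units per decade, for t measured in months."""
    return np.polyfit(t, y, 1)[0] * 120

# Synthetic series: invented values, for illustration only.
months = np.arange(288)                              # 24 years of monthly data
true_trend = 0.0015                                  # °C/month of steady warming
cycle = 0.1 * np.sin(2 * np.pi * months / 144)       # slow 12-year oscillation
series = true_trend * months + cycle

full = slope_per_decade(months, series)

# Start at a peak of the oscillation (month 36) and stop at the next
# trough (month 108): 73 months, i.e. six years and one month.
window = slice(36, 109)
cherry = slope_per_decade(months[window], series[window])

print(f"whole series:         {full:+.3f} °C/decade")
print(f"peak-to-trough window: {cherry:+.3f} °C/decade")
```

The underlying trend is identical in both fits; only the window differs.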

No wonder that this guy apparently doesn't have any peer-reviewed papers to his name - with such blatant attempts at cooking the data, the reviewers would laugh him out of town.

So what would temperature trends over the last decade actually look like?

Well, I've used the same data set for the period from April 1998 to March 2008 (the last entry), and the following graph is the result:

The linear fit produced a warming of 0.0349044 °C for the entire decade. That is not much, but as this long-term graph generated from the same data set shows, the period starts with the abnormally warm year 1998 and ends with a particularly harsh winter, both of which pull the fitted trend down:

To sum it up, global temperatures have indeed increased during the last decade, if not as strongly as in the time before that. We will have to continue to watch the long-term trends of global temperatures - and be wary of anyone who attempts to cook the data for his own agenda.


Apr. 18th, 2008 05:29 pm (UTC)
A few questions
Is "global temperature" the same thing as "the least-squares best fit linear trend to the global temperature"? Which have you tested claims about?

On what basis is a linear trend plus noise taken as the appropriate statistical model? Is the data really likely to be linear plus iid Normal errors, as using this statistic would suggest?

In the final graphs, why are the areas under the curve shaded red and blue? Does the area mean something? What smoothing have you used for the black lines, and why is it appropriate?

Given that the L-S linear trend you get varies so much with the selection of end points, how much meaning can be assigned to any of them? Is any answer right or wrong?

When the author wrote the note and constructed the graph, were March 08's numbers out yet?

Aug. 2nd, 2008 03:30 pm (UTC)
Re: A few questions
Re: "A few questions" (Anonymous)

Global average temperature since about 1975 is indistinguishable from a linear trend plus random noise. The noise is not iid: it's autocorrelated, and it appears not to be normally distributed. But iid normal noise is *not* a requirement for using linear least-squares regression; that analysis gives an unbiased estimate of the trend rate even when the random part of the data is *not* iid normal.

The final graph comes from HadCRU, not from Jürgen, so if you want to know what smoothing method is used you'll have to ask them.

To attach "meaning" to trends from L-S regression, one has to compute the probable error of the trend rate. Doing so requires compensating for the fact that the random part of the data is *not* iid but autocorrelated, which inflates the probable errors. In most geophysical analyses, the random process is approximated as an "AR1" process, but for global temperature it's demonstrably *not* AR1, and the probable error in a trend estimate from least-squares regression is even greater than the estimate from assuming an AR1 model.
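As a rough sketch of the AR1 compensation only (which, as noted, understates the error for global temperature), one common approximation replaces the number of data points n with an effective sample size n_eff = n(1 - r1)/(1 + r1), where r1 is the lag-1 autocorrelation of the residuals. The function name and the test series below are invented for illustration; nothing here is fitted to real temperature data.

```python
import numpy as np

def trend_with_ar1_error(y):
    """OLS trend per time step, with naive and AR1-adjusted standard errors."""
    y = np.asarray(y, dtype=float)
    n = y.size
    t = np.arange(n)
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (slope * t + intercept)

    # Naive standard error: treats the residuals as independent.
    sxx = np.sum((t - t.mean()) ** 2)
    se_naive = np.sqrt(np.sum(resid ** 2) / (n - 2) / sxx)

    # Lag-1 autocorrelation of the residuals.
    r1 = np.sum(resid[:-1] * resid[1:]) / np.sum(resid ** 2)

    # AR1 approximation: fewer effectively independent points.
    n_eff = n * (1 - r1) / (1 + r1)
    if n_eff <= 2:
        raise ValueError("series too short or too autocorrelated for this adjustment")
    se_ar1 = se_naive * np.sqrt((n - 2) / (n_eff - 2))
    return slope, se_naive, se_ar1, r1

# Invented example: a small trend plus AR1 noise (phi = 0.6), 120 monthly values.
rng = np.random.default_rng(0)
noise = np.zeros(120)
for i in range(1, 120):
    noise[i] = 0.6 * noise[i - 1] + rng.normal(0.0, 0.1)
slope, se_naive, se_ar1, r1 = trend_with_ar1_error(0.001 * np.arange(120) + noise)
print(f"slope={slope:.5f}  naive SE={se_naive:.5f}  AR1 SE={se_ar1:.5f}  r1={r1:.2f}")
```

With positively autocorrelated residuals the adjusted error comes out larger than the naive one. For a process that is not AR1, even this adjusted figure should be read as a lower bound.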

I don't know whether or not March 2008 data were available when the erroneous graph was produced. But certainly 2001 data were available. And just as certainly, -0.1877124 is not even approximately -0.4.


Jürgen Hubert