Trend That Is Not A Trend: Changes in Data Definition or Measurement Technology

This chart illustrates a data analysis mistake that is endemic to many of the most famous climate charts. Marc Morano screencapped this from a new EPA web site (update: actually, it is originally from Pat Michaels at Cato).

The figure below is a portion of a screen capture from the “Heat-Related Deaths” section of the EPA’s new “Climate Change Indicators” website. It is labeled “Deaths Classified as ‘Heat-Related’ in the United States, 1979–2010.”

The key is in the footnote, which says:

Between 1998 and 1999, the World Health Organization revised the international codes used to classify causes of death. As a result, data from earlier than 1999 cannot easily be compared with data from 1999 and later.

So, in other words, this chart is totally bogus. There is an essentially flat trend up to the 1998 switch in data definition and an essentially flat trend after 1998. There is a step change upwards in 1998 due to the data redefinition. This makes the chart useless unless your purpose is to fool generally uninformed people into thinking there is an upwards trend, and then it is very useful. It is not, however, good science.
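The step-change argument can be checked numerically. The sketch below uses made-up numbers, not the EPA data: a series that is perfectly flat before a hypothetical 1999 recoding and perfectly flat at a higher level afterwards. A single least-squares line fitted across the whole period still reports an upward slope, while each segment on its own has none. The helper name `ols_slope` and the levels 1.0 and 2.0 are assumptions chosen purely for illustration.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic, illustrative series (NOT the EPA figures): flat at 1.0 under
# the old definition, flat at 2.0 once the new definition starts in 1999.
years = list(range(1979, 2011))
values = [1.0 if year < 1999 else 2.0 for year in years]

old = [(y, v) for y, v in zip(years, values) if y < 1999]
new = [(y, v) for y, v in zip(years, values) if y >= 1999]

slope_all = ols_slope(years, values)                              # one line through everything
slope_old = ols_slope([y for y, _ in old], [v for _, v in old])   # old definition only
slope_new = ols_slope([y for y, _ in new], [v for _, v in new])   # new definition only

print(f"whole period: {slope_all:.4f} per year")
print(f"before 1999:  {slope_old:.4f} per year")
print(f"1999 onward:  {slope_new:.4f} per year")
```

The apparent trend in the whole-period fit comes entirely from the jump between the two segments, which is the pattern the footnote warns about.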

Other examples of this step change in a metric occurring at a data redefinition or change in measurement technique can be found in

  • The hockey stick  (and here)
  • Ocean heat content  (sorry, can't find the link but the shift from using thermometers in pails dipped from ships to the ARGO floats caused a one time step change in ocean heat content measurements)
  • Tornadoes
  • Hurricanes
28 Comments

Climate Audit covered the bucket/float issue quite comprehensively.

Or rather, bucket/engine intake.

And it wasn't a step change; the algorithm used to merge was just based on faulty assumptions, which led to too steep a climb, IIRC. A more gentle climb over a longer period was real though - buckets measure cooler than engine intakes.

I used to work at a small airport. One of the duties of that job was to log the temperature, barometric pressure and wet / dry bulb thermometer readings several times a day. The logbooks went back to 1929 using the same thermometer. When they installed an Automated Weather Station they threw the logbooks in the trash. I would love to have those books now to see what the temperature trend would have been.

Coyote, correct me if I am wrong, but does one not have to round all data to the least significant integer?

Thus, if one uses data from a thermometer outside MIT that says today at noon in Boston it is 77.227654319 degrees Fahrenheit, and also uses a reading at the Wichita airport at noon on a different day that is 79 degrees, plus or minus a degree, one has to round the MIT observation, to 77 degrees before drawing any conclusions.

Similarly, if one uses ice cores or tree rings as a proxy for part of your data, one needs to determine the precision of those temperature measurements, allowing for their reliability. Let's assume that those figures are precise to the nearest ten. That means you have to use 80 degrees for both MIT and Wichita, not 80.00000000.

So when I read in ExxonMobil's Lamp that they do not mind agreeing that global temperatures have risen about a half a degree or so, I will buy that as a reasonable assumption. When I read that Coyote believes it too, given that he appears to be skeptical of just about everything, who am I to argue?

Nor will I argue that a CO2 molecule permits x amount of radiation to reach the earth's surface, and traps y heat from escaping.

I will also buy the assertion that CO2 is approximately 400 parts per million today.

However, start making comparisons with what the air's CO2 over Greenland was twenty years ago or 200 years ago, and then say it was 400 parts per million then, and the next question is: how do you know? How do you test the ice core, how do you know how much CO2 the snow absorbed and at what rate, how far did the snow fall, and how useful are Greenland ice cores for anything except saying how much precipitation the ice core site got?

Sorry, I did not read your post sooner. And airports were not around much before 1930. What was the temperature in Wink, Texas, on July 4? Very hot.

I think 1934 or 1936 still set the record for heat deaths in the US. Before air conditioning hot spells were serious business.

That's probably why Conservatives can't get wrapped up in AGW - even as the world gets warmer we'll have more ways to deal with it as technology gets better.

Harry: Thus, if one uses data from a thermometer outside MIT that says today at noon in Boston it is 77.227654319 degrees Fahrenheit, and also uses a reading at the Wichita airport at noon on a different day that is 79 degrees, plus or minus a degree, one has to round the MIT observation, to 77 degrees before drawing any conclusions.

When dealing with large ensembles of data, that is not correct.

Butler County Airport in Western Pennsylvania was built to train pilots for World War I; it opened in 1917.

When dealing with large ensembles of data? What is a large ensemble, and what is an ensemble of temperature data?

The problem (one problem) is that the further back one goes in the temperature record, the data gets less precise.

So, we have climatologists armed with an ensemble of terabytes of temperature readings, some fraction of which are precise to a hundredth of a degree, from instruments calibrated to the same standard. I would expect them to include less precise data, as in airport data, where it was not critical to be precise to a tenth of a degree to get airplanes off the ground, so some of that data had to depend on how tall the guy reading the thermometer was. So if you are going to use that data in your ensemble to make it large, you still have to round the rest of the data to that degree of precision; alternatively, you can throw out the airport data and be left with a smaller ensemble.

Now, one can ignore the rounding problem and say it does not matter when predicting that the ice caps will melt by 2075. Hey, it's a free country. But at least it would be fair to be clear about how you do science.

Furthermore, even imprecise temperature data can be useful. Every seed catalog and garden book has a map showing zones appropriate for planting everything in the book. I do not know whether those maps have been moved north a bit in response to IPCC reports, but then those maps are not used to predict melting ice caps and rising oceans.

Finally, while there is abundant data -- maybe not abundant enough -- about the temperature in North America, there is much less data in many large places in the world.

Harry: The problem (one problem) is that the further back one goes in the temperature record, the data gets less precise.

Sure.

Harry: So if you are going to use that data in your ensemble to make it large, you still have to round the rest of the data to that degree of precision; alternatively, you can throw out the airport data and be left with a smaller ensemble.

That is incorrect.

Here's a basic question from statistics. If a single measurement is ±0.5, if we take a hundred measurements, what is our standard error?

Reply is above.

That does not answer the question, Z. Not rounding to the least significant integer flunks the lab, unless you specify clearly what version of science you are using.

It was a question. If a single measurement is ±0.5, if we take a hundred measurements, what is our standard error?

This relates directly to how statisticians deal with ensembles of data that have varying amounts of error.

So how do statisticians deal with the rounding problem? By assigning homework? And what are the definitions of "large" and "ensemble"?

Zach, if you do not want to go to the trouble of answering, that is OK. If the answer is that statisticians do not round their data, and instead state the margin of error, however they determine it, OK. My chemistry and physics professors said to round to the least significant integer, and the lab assistant helped us regardless of what ensemble we were wearing. Now, some of our weighing was done with the scales in glass cases with the little weights, so maybe the elements of the scientific method have changed. I'd appreciate anyone's answer on these questions.

What I suspect is that many "climate scientists" who speak for the IPCC rely on conclusions about the same data, a big "ensemble" of data that is difficult to check. It is an epistemological problem where past temperature is but one variable. Another problem is that many of the parties are not disinterested people, so it is reasonable to ask whether they are rigging the numbers.

Harry: So how do statisticians deal with the rounding problem?

There are a lot of methods, but you might want to start with simple cases, such as multiple measurements. The standard error decreases in inverse proportion to the square root of the number of measurements. So, if we have individual measurements which are ±0.5, then a hundred measurements (assuming no systematic bias and a normal distribution about the true value) will have a standard error of ±0.05.
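The arithmetic in that comment can be sketched in a few lines. This is a minimal illustration, assuming the ±0.5 is read as the standard deviation of a single measurement and that the errors are independent and unbiased; the function name `standard_error`, the true value of 70.0, and the number of simulated trials are all made up for the example.

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """Standard error of the mean of n independent readings, each with spread sigma."""
    return sigma / math.sqrt(n)

# Theory: 0.5 / sqrt(100)
se_theory = standard_error(0.5, 100)

# Simulation check: repeat the "take a hundred measurements" experiment many
# times and look at how much the resulting averages scatter.
random.seed(0)
true_value = 70.0  # hypothetical true temperature
means = []
for _ in range(2000):
    readings = [random.gauss(true_value, 0.5) for _ in range(100)]
    means.append(statistics.mean(readings))
se_simulated = statistics.stdev(means)

print(f"theoretical standard error: {se_theory:.3f}")
print(f"simulated standard error:   {se_simulated:.3f}")
```

The simulated scatter of the averages should land close to the theoretical figure; a systematic bias shared by every reading would not shrink this way, which is why the "no systematic bias" assumption matters.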

I follow you, but that does not negate the principle of rounding to the least significant integer. Say you have two thermometers that are calibrated with one another, one calibrated to .5 degrees and the other more precise, say, to a thousandth of a degree, and you take a hundred measurements, fifty from each thermometer. You still have to round all your data to the nearest half degree. The more observations you add, the more confident one becomes about whether it is, say, 70.5 degrees, as opposed to 70.0, but you cannot infer from your data that it was somewhere halfway in between.

Now, if you want a more precise reading, use the more precise thermometer and take a hundred readings, and do not use the other thermometer.

So, nobody is arguing about whether the earth has warmed over the past century by something like a half degree centigrade, in part because that is beside the point.

The argument the climate alarmists make is that an increase to roughly 400 parts per million of CO2, a product of combustion resulting from burning carbon fuels (in particular the man-made part that runs machines and heats and cools homes), will produce disastrous consequences soon, and sooner than ignorant people think. Even though the energy trapped by this trace gas may be small, its effect is amplified when water evaporates and condenses to form clouds, which are a big influence over the weather. This argument accounts for the coincidence between fossil fuel use and an observed rise in temperature. Therefore we have to End Coal, tax energy more to raise the price, subsidize Tesla, live in smaller houses, paint the slate roof white, and live a much more humble life.

As Coyote and others have observed, there are problems with the computer models that predict the global climate: they attempt to explain the past, about which we have only fuzzy ideas of how warm it was, and then infer what it will be years hence. One of the big assumptions is the magnitude of the feedback in cloud formation from barely detectable rises in surface temperature.

As questions about AGW arise, the alarmists fit whatever happens into their (crackpot?) model if they can, or simply dismiss whatever does not fit, often simultaneously attacking skeptics as being deniers of the reality of climate change.

So it matters how they round the numbers, among other things. It matters whether vegetation grows faster in an environment where the CO2 is 450 ppm.

Harry: You still have to round all your data to the nearest half degree.

It's usual to report in tenths of a degree with an uncertainty of ±0.5.

Harry: you cannot infer from your data that it was somewhere halfway in between.

Yes, actually you can. Even when reporting rounded figures, the mean will be some value in the middle, and the standard error will decrease with increasing numbers of measurements. So even though each measurement is reporting ±0.5, multiple measurements will decrease that uncertainty.
http://en.wikipedia.org/wiki/Standard_error
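The claim that an average of rounded readings can land "in between" is easy to try out. The sketch below is illustrative only: the true value of 70.3, the scatter of 1.0 degree, and the sample size are assumptions, and each simulated reading is rounded to the nearest whole degree before averaging.

```python
import random
import statistics

random.seed(1)
true_temp = 70.3   # hypothetical true value, deliberately between whole degrees
noise_sd = 1.0     # assumed reading-to-reading scatter, in degrees

# Every recorded reading is a whole number of degrees.
rounded = [round(random.gauss(true_temp, noise_sd)) for _ in range(10_000)]
mean_rounded = statistics.mean(rounded)

print(f"distinct recorded values: {sorted(set(rounded))}")
print(f"mean of rounded readings: {mean_rounded:.2f} (true value {true_temp})")
```

This works because the scatter between readings is at least as large as the rounding step, so the readings fall on both sides of the true value in the right proportions. If the scatter were much smaller than the rounding step, every reading would round to the same number and averaging would not recover the extra digit.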

It has been 15 years since I took statistics in college but there was one line from the text book that caused me to laugh out loud and I will never forget:

There are three types of lies: lies, damned lies, and statistics.

With or without the activity of man, the world's climate has always been in a constant state of flux and always will be. To put man's activity in perspective, in 1815 a volcano in the Indian Ocean erupted and spewed all sorts of chemicals into the atmosphere. The year 1816 was known as the "year without a summer" as a result. Global temperatures were significantly reduced, and in June 1816 it snowed in Albany, NY. This single volcanic eruption did more to alter global climate than 250 years of the industrial revolution. And that is using the most drastic claims of the effect of man-made activity on climate over that period of time.

The ironic thing about Feline's comment is that there are fewer climate related deaths today as a result of man's activity.

I suppose you can infer that, but a statistical inference is not the same thing as a fact. And yes, many thermometer data are available precise to a tenth of a degree. So when presenting the historical record for thousands of locations on land and sea, you have to use thermometers that were precise to tenths of a degree. That then would allow one legitimately to comment on the temperature for a region and points in between, assuming you knew what was going on with the rest of the weather in between those points. The operative verb here is "infer". But I am not going to argue that all inferences are unreasonable, and am pleased to discover you agree that one's data have to be rounded to the least significant integer.

What you cannot do is use a proxy like tree rings, whose size depends on several variables, including temperature and available water and soil fertility, to arrive at how hot it was that year, plus or minus a tenth of a degree, just because no better data are available. You have to account for all of that, including the precision of your instrumentation. As Dirty Harry said, a man has to know his limitations.

By the way, you have not explained what an ensemble is to statisticians. Is it a short-sleeved shirt with a white pocket protector combined with Levi's and Chucks with no socks and a pearl necklace?

Harry: I suppose you can infer that, but a statistical inference is not the same thing as a fact.

Oh gee whiz. Statistics underlies all measurements, from astronomy to zoology.

Harry: So when presenting the historical record for thousands of locations on land and sea, you have to use thermometers that were precise to tenths of a degree.

No. As we have pointed out many times, the standard error decreases with the square root of the number of measurements, and it is quite possible to combine records with differing degrees of accuracy and precision.

Harry: But I am not going to argue that all inferences are unreasonable, and am pleased to discover you agree that one's data have to be rounded to the least significant integer.

That is not correct. Pick up a standard text in statistics and look up standard error.

Harry: I suppose you can infer that, but a statistical inference is not the same thing as a fact.

Statistics informs all measurement, from astronomy to zoology.

Harry: So when presenting the historical record for thousands of locations on land and sea, you have to use thermometers that were precise to tenths of a degree.

No, you don't. As explained above, the standard error decreases with the square root of the number of measurements.

No, you are wrong, Z, and you miss my point about inferences, and perhaps I have not been clear.

If a sportsman/statistician were to fire a rifle from point A to point B, he goes to the range and sights his rifle. Depending on how steady he is, it may take ten or a thousand times to know where he should aim at a known distance to hit the bulls eye. If he wants to, he can derive an equation that will tell him with great precision where that bullet is, theoretically, along its path, thanks to Newton. All the while, his practical experience improves his aim and confirms that Newton was right about how things work.

When he goes home, he pulls out a book on the trajectories of blunderbusses, hoping to improve on what he learned at the range, and discovers a large ensemble of data just in volume 23. So he puts all that data into his computer to find out precisely where to aim a blunderbuss as well as he aims his deer rifle.

Please do not refer me to any statistics textbook to argue that one can refine the blunderbuss data, and don't say we do not have enough data to refine. Talk about tree rings and what you do to make sense of them, and how you avoid the post hoc ergo propter hoc fallacy. If the latter problem is unfamiliar to you, I might explain it further and not refer you to a logic textbook.

What I suspect is that you are not what people refer to as a skeptic.

And what is a large ensemble? Has it occurred to you that it may be an elusive word loosely defined? I would like to know what it means to you. If you woke up in the hospital in a body cast and the doctor said, what we have here is a large ensemble of data about you, and therefore we know exactly what your ten problems are, would you ask him to explain himself further?

Harry: Please do not refer me to any statistics textbook to argue that one can refine the blunderbuss data, and don't say we do not have enough data to refine.

Good example. A gun will not fire exactly the same every time. Let's say we want to determine the true center of a gun's firing pattern, and we've noticed that at a certain distance the shot ranges a centimeter or so around the bulls eye.

To test this, we fire a single shot from a fixed rifle. It may be on the bulls eye, or a centimeter or two away. We really can't say whether this is the true sight of the rifle or not. However, if we fire many shots, they will form a pattern around a fixed point. The pattern of shots will provide a high confidence of the exact center of the rifle pattern (the mean), and the spread (the standard deviation). So even though each measurement may be ±1 centimeter, after many shots, our estimate can be very precise.

Absent bias, measurements scatter around the true value with some standard deviation. Because of this, we can make a number of important claims about measurement. We can combine measurements with differing accuracy to obtain a standard error which is less than that of any single measurement.
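One textbook way to combine measurements of differing accuracy is inverse-variance weighting, sketched below. The comment does not say which method it has in mind, so treat this as an assumption; it also assumes the measurements are independent and unbiased. The function name `combine` and the two example readings are hypothetical.

```python
import math

def combine(measurements):
    """Inverse-variance weighted mean.

    measurements: list of (value, sigma) pairs, sigma being each reading's
    standard error. Returns (weighted mean, standard error of that mean).
    """
    weights = [1.0 / sigma ** 2 for _, sigma in measurements]
    total = sum(weights)
    mean = sum(w * value for w, (value, _) in zip(weights, measurements)) / total
    return mean, 1.0 / math.sqrt(total)

# Hypothetical readings: a coarse instrument (±0.5) and a finer one (±0.1).
readings = [(70.4, 0.5), (70.1, 0.1)]
combined_value, combined_sigma = combine(readings)

print(f"combined estimate: {combined_value:.3f} ± {combined_sigma:.3f}")
```

The coarse reading gets a small weight rather than being thrown away, and the combined standard error comes out slightly below that of the better instrument alone.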

Harry: And what is a large ensemble? Has it occurred to you that it may be an elusive word loosely defined?

Standard deviation is *precisely* defined in statistics. And yes, it does depend on N, the number of measurements.
http://www.zachriel.com/blog/standard-deviation-formula.gif

This is critical to understanding any science that relies upon quantitative measurement (which is nearly all of science).

Standard deviation is clearly defined, but "large" and "ensemble" are loose words that can mean anything.

So we have much more precision, with more data and better instrumentation today than we had years ago, and we can improve on how hot it was at Wichita airport ever since it was an airport. I am not sure what that tells you about the weather in Dodge City or Harper.

What I would like to know is whether you can take tree-ring data from a hundred years ago, and derive any meaningful information from them at all that can be further refined with statistics. Are the trees all pin oaks? Or are they apple trees and orange trees (mixing apples and oranges)? How do you control the many other variables, including whether the pin oak was growing five feet away from another tree?

These data are speculative at best, and should not be used for any argument about the global temperature and the trend over the past hundred years. One can improve one's data by getting more thermometer data from more places.

For that reason we Deniers do not dispute at all that it has warmed about a half a degree in the last century, or that the climate changes. That is the straw man that alarmists use to discredit critics. Rather, Deniers argue there is some debate about where we are going, and when the IPCC puts out a forecast to a tenth of a degree, it is fair to ask why they do not use a less precise number.

The proper response to that question is not that 99 percent agree with us, so shut up and get with the program, nor is it to digress into particle physics theory. Rather, one needs to understand and explain one's limitations.

Harry: Standard deviation is clearly defined, but "large" and "ensemble" are loose words that can mean anything.

Climate data is a large ensemble by any reasonable definition. We provided the explicit relationship regardless of the size of the sample.

We just wanted to correct the misimpression that it was not possible to combine data from many sources in order to garner a more precise estimate, even if some of the data has large error bars. We discussed standard statistical methods, but there are much better tools for large ensembles, such as a Monte Carlo or kriging approach.