Stupid Math Tricks

James Hansen, head of NASA's GISS and technical adviser on An Inconvenient Truth, wrote recently:

Thus there is no need to equivocate about the summer heat waves in Texas in 2011 and Moscow in 2010, which exceeded 3σ – it is nearly certain that they would not have occurred in the absence of global warming. If global warming is not slowed from its current pace, by mid-century 3σ events will be the new norm and 5σ events will be common.

This statement alone should be enough to make any thoughtful person who has heretofore bought into global warming hysteria out of vague respect for "science" question their beliefs.

First, he is basically arguing that a 3σ event proves (makes it "nearly certain") that some shift has occurred in the underlying process.  In particular, he is arguing that one single sample's value is due to a mean shift in the system.  I don't have a ton of experience in process control and quality, but my gut feel is that a 3σ event can be just that, a 3σ event.  One should expect a 3σ event to occur, on average, once in every 300 samples of a system with a normal distribution of outcomes.
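As a rough check on that frequency, here is a minimal sketch of how often a 3σ excursion turns up if outcomes really are normally distributed. The normality assumption and the function names are mine, not Hansen's; the only library call is `math.erfc` from the Python standard library.

```python
import math

def two_sided_tail(k: float) -> float:
    """P(|Z| > k) for a standard normal Z: excursions in either direction."""
    return math.erfc(k / math.sqrt(2.0))

def one_sided_tail(k: float) -> float:
    """P(Z > k) for a standard normal Z: hot excursions only."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

p_two = two_sided_tail(3.0)
p_one = one_sided_tail(3.0)

print(f"either direction: p = {p_two:.5f}, about 1 in {1.0 / p_two:.0f} samples")
print(f"hot side only:    p = {p_one:.5f}, about 1 in {1.0 / p_one:.0f} samples")
```

Counting excursions in both directions gives roughly 1 in 370, and counting only the hot side gives roughly 1 in 740, so "once in every 300" is a round figure of the right order rather than an exact one.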

Second, and a much bigger problem, is that Hansen is gaming the sampling process.  First, he is picking an isolated period.  Let's say, to be generous, that this 3σ event stretched over 3 months and was unprecedented in the last century.  But there are 400 3-month periods in the last hundred years.  So he is saying in these two locations there was a 3σ temperature excursion once out of 400 samples.  Uh, ok.  Pretty much what one would expect.
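To put a number on "pretty much what one would expect", here is a small sketch of the chance of seeing at least one hot 3σ period at a single location among 400 periods. It assumes the periods are independent and normally distributed, which real seasons are not exactly, and the function name is made up for illustration.

```python
import math

def prob_at_least_one(n_samples: int, p_event: float) -> float:
    """Chance that an event of probability p_event shows up at least once
    in n_samples independent draws."""
    return 1.0 - (1.0 - p_event) ** n_samples

# One-sided 3-sigma tail of a standard normal (hot excursions only).
p_hot = 0.5 * math.erfc(3.0 / math.sqrt(2.0))

n_periods = 400  # 3-month periods in a century, as counted in the post
expected_count = n_periods * p_hot
chance = prob_at_least_one(n_periods, p_hot)

print(f"expected hot 3-sigma periods per century: {expected_count:.2f}")
print(f"chance of at least one: {chance:.2f}")
```

Under those assumptions the expected count is about one half per century per location, so a single such period in the record is unremarkable on its own.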

Or, if you don't like the historic approach, let's focus on just this year.  He treats Moscow and Texas like they are the only places being sampled, but in fact they are two of hundreds or even thousands of places on Earth.  Since he does not focus on any of the others, we can assume these are the only two that had so-called 3σ temperature events this summer.

It's hard to know how large to define "Texas"  (since the high temperatures did not cover the whole state) or "Moscow" (since clearly the high temperatures likely reached beyond the suburbs of just that city).

Let's say that each 3σ event occurred in a circular area 500km in diameter.  That is an area of about 196,250 sq km each.  But the land surface area of the Earth (we will leave out the oceans for now, since heat waves there don't tend to make the headlines) is about 150 million sq km.  This means that each of these areas represents about 1/764th of the land surface area of the Earth.  Or, said another way, this summer there were 764 land areas of 500km diameter that we could sample, and 2 had 3σ events.  Again, exactly as expected.
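The arithmetic above can be reproduced in a few lines. This is only a sketch of the back-of-the-envelope calculation: the 500km diameter, the 150 million sq km of land, and the 1-in-300 rate are the rough figures used in this post, and the variable names are my own.

```python
import math

DIAMETER_KM = 500.0        # assumed size of each heat-wave region
LAND_AREA_KM2 = 150e6      # approximate land surface area of the Earth
ROUGH_RATE = 1.0 / 300.0   # the post's rough frequency of a 3-sigma event

# Area of one circular region, then how many such regions fit on land.
region_area_km2 = math.pi * (DIAMETER_KM / 2.0) ** 2
n_regions = LAND_AREA_KM2 / region_area_km2

# Expected number of regions showing a 3-sigma event in a given summer,
# treating the regions as independent samples.
expected_events = n_regions * ROUGH_RATE

print(f"area per region: {region_area_km2:,.0f} sq km")
print(f"regions on land: {n_regions:.0f}")
print(f"expected 3-sigma regions per summer: {expected_events:.1f}")
```

The small difference from 196,250 sq km comes from rounding π to 3.14; the count of about 764 regions is unaffected.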

In other words, Hansen's evidence that something unusual is going on in the system is that he found two 3σ events of a kind that happens once every 300 or 400 samples.  You feeling better about the science yet?

Luboš Motl has a more sophisticated discussion of the same statement, and gets into other issues with Hansen's statement.

Postscript:  One other issue -- the mean shift in temperatures over the last 30 years has been, at most, about 0.5C  (a small number compared to the Moscow temperature excursion from the norm).  Applying that new mean and the historic standard deviation, my guess is that the Moscow event would have still been a 2.5σ event.  So it's not clear how an event that would have been unlikely even with global warming, but slightly more unlikely without it, tells us much of anything about changes in the underlying system, or how Hansen could possibly assign blame for the event with near certainty to anthropogenic CO2.

22 Comments

  1. AMB:

    I'm interested in Hansen's statement that "5σ events will be common". Per the article you linked to by Luboš Motl, Hansen seems to be assuming a normal distribution.

    Correct me if I'm wrong, but 5σ cannot be "more common". The mean can shift, the distribution can migrate towards the "warmer" part of the graph, but 5 standard deviations from the mean will always have the same area under its curve.

    In other words, what qualifies as a 5σ event will be hotter and hotter, but such events, by definition, cannot become "common".

    Unless he means that temperatures that are 5σ events for today's distribution will become more likely. In which case he's just being a little sloppy with his language.

    Caveat: I'm not a mathematician and haven't touched the applicable stats books for years, so please do correct me if I'm misunderstanding the math here.

  2. Roy:

    Tried, but could not resist this (silly, I know) comment: statistics occupies a realm beyond 5th grade math. You gotta be smarter than a 5th grader (well, OK, actually, more educated) to understand standard deviations. Even college students in, say, accounting, who at some point see probability and statistics (Operations Research and Statistics), often have no more than memorized application skills in contrast to comprehension. (Yes, I can relate anecdotes.) This line of observation leads me to expect that even one sigma, much less three sigma, will prove a very hard concept for the general public to follow. They won't do the memorization so that they can do rote application; they certainly won't comprehend. That does not excuse Hansen, who probably speaks at least calculus and should himself have had mental gears grinding when he headed down the 3 sigma 5 sigma path. But it does highlight, maybe even explain, the challenge faced by those who would publicly question Hansen's analysis.

  3. RandomReal[]:

    Well, I learned something today. The effect that Warren describes is called the "look elsewhere effect". I think I first encountered it in 5th grade or so, when my teacher asked "Do you think that two people in this room have the same birthday?" With 30 students, the teacher was pretty sure of a positive result. Later in high school, we actually got the math behind it. So Hansen et al. would point to the two students with the same birthday and say "Look how remarkable! It can't be a coincidence!" But, it's not coincidence; it's expected.

  4. richard:

    Warren,

    A 3s event means that, assuming a Normal distribution (which is very reasonable in most cases), such an event occurs once every ~300 times.

    So if you have 300 independent observations, it is highly likely that one of them will fall out of the population: A 3s event.

    The question is now: Do we have 300 independent observations? I think that over the course of a year, over the world as a whole, we have that. (Let's say that events are independent if they are a week and 2000 km apart. That would split up the world and time into many more 'regions'.)

    I think it would be surprising if we did not have a 3s event.

    Just my 2c.

  5. Don:

    One small nit: He was talking about Texas 2011 and Moscow 2010, so two locations (assuming he used a single area of Texas) in 2 years, not the same year.

    In other words, it's worse than your analysis.

  6. Evil Red Scandi:

    "You keep using that Greek letter. I do not think it means what you think it means."

  7. DrTorch:

    Hansen's a liar. His work would be considered fraud according to the ethics policy at my institution.

    Since he's doing this on public funds, he should be in jail.

  8. Mark:

    There are closer to 1200 3-month periods in the last 100 years if you sample on the month boundaries. Makes it really easy to pick and choose. If Jan, Feb, Mar doesn't work for your stats, switch it to Feb, Mar, Apr. There are actually infinite 3-month periods, but if Alarmists started counting from 1:45 pm on the 17th of some month and then 10:03:14 am on the 8th of another month - people would just start ignoring them.

  9. Eric:

    Maybe this is a stupid question, but: Isn't the "accepted" 0.5C rise over 30 years within the margin of error?

  10. Anon:

    Eric,

    Yes, but hush. You are not a respected climate scientist.

  11. andre:

    Dr. Torch - I think it was Mark Twain (either him or Mencken), who said "first there are lies, then there are damned lies, then there are statistics."

    One of the best ways to use statistics to technically tell the truth, while knowingly and intentionally getting people to believe something that is at best not supported by the very statistics you are quoting or at worst totally refuted by them, is summed up by an old saying - "if you can't dazzle them with brilliance, baffle them with bullsh*t".

    Basically, Hansen's statement first says "I am smarter than you and here's the proof", followed by some mathematical terms that most people do not understand. That is followed by the creation of an impression that is not really supported by the facts. He knows full well that there are people who will see through his "lies". He does not care. Even if one of the non-believers manages to reach the public, they will be skeptical, having already been baffled with the bullsh*t.

    This is no different than Algore "proving" global warming with the two graphs (global temp and CO2 over the centuries). Any mathematically educated person, after a cursory examination of the graphs, sees that CO2 concentration lags the temp graph by several centuries. But 90% of those seeing Algore do not, and rely totally on his explanation.

  12. Bram:

    The most infuriating part - I'm paying Hansen's salary.

  13. andre:

    AMB - you are correct. But IF I were to defend Hansen (which I'm not particularly interested in doing, since I do believe his intention is to lie), I could say that you are splitting semantic hairs. It's much easier to say it the way he said it when speaking to an audience that is mostly non-mathematical in inclination, than to try to explain a shifting distribution curve. Besides, if he even mentions a bell-shaped curve, he will probably be called a racist.

    Now one might legitimately question that if he was so interested in lay people understanding what he is saying, why would he even mention sigmas. Answer is in the above post.

  14. Max:

    I can compare this to statistical research on material fatigue. We also do statistical analysis on sample data, and even though we have a much more thorough understanding of our tools and materials than any climate scientist, I will tell you that we cannot reproduce any statistic with fewer than 3 data points. There is just no possibility of establishing any kind of meaningful (r-correlation satisfied) mean shift at all. More likely the sample point was actually a 3σ event rather than a mean event. The other question would be: do they really assume a Gaussian distribution or an exponential distribution? Do they even have an idea about the distribution?

    To make any meaningful assumption, you'd have to have the same amount of sample points that fit a new distribution that shifts to the right or left of the mean. This is, of course, very, very difficult. It gets worse if you have 100 sample points, because then 1 sample point won't shift the mean by anything, especially when it is an outlier. The tricky part with probability is that it doesn't mean that there isn't necessarily a chance that there will be events outside the 3σ distribution. It only means it might be unlikely IF the underlying assumption is correct!
    In mechanics I feel confident that experts can judge whether something was a mistake due to the testing rig or the process. Can climate scientists claim the same? Can they claim that their sample of a few decades is actually capturing the situation through thousands of years, which are unobserved and had entirely different climates than today? I doubt that very much.

  15. IGotBupkis, Unicorn Fart Entrepreneur:

    >>> Hansen’s a liar.

    Nawww, he's a prognosticator. He's predicting that exactly what we expect to happen, and, by the science, should happen, is going to happen.

    He's couching it in Chicken Little language, however.

    Reminds me of a two-page piece I once encountered on the extreme dangers to the ecosystem and overall human health represented by DiHydrogen Monoxide exposure and how it was appearing everywhere in our environment these days.

    Every word of it was utterly true.

    You can take the sheer unadulterated truth about the most mundane things and, by proper conjugation of adjectives (I am determined, You are stubborn, He/She/It is bullheaded) and adverbs, turn it into the most FUD-inducing diatribe to anyone not smart enough to realize what is being done... which is not less than about 25-40 % of the population.

  16. IGotBupkis, Unicorn Fart Entrepreneur:

    >> if Alarmists started counting from 1:45 pm on the 17th of some month and then 10:03:14 am on the 8th of another month – people would just start ignoring them.

    Only if people knew that was how the data was being twisted to fit the agenda. And
    a) Who is going to notice that? Not the media.
    b) If someone (say, Watts) notices, who is going to tell the people that? Not the media.

  17. Shane:

    This is a human heuristic that is exploited in the trading world. The heuristic is called survivorship bias. Simply put, we need to look at all of the potential weather patterns across the globe (or money managers). If we only pick the top performers, we will not be able to say really anything about those performers. Was it luck or a trend? We cannot know. But our humanness will fool us with this randomness, telling us that this is real information when it is really not.

  18. ben:

    A 3-sigma event should happen once every 370 samples...

  19. caseyboy:

    This is why I follow this blog. Technical information presented concisely and with logic. I don't think Dr. Deming would approve of Mr. Hansen's statistical process measurements in this instance, for all the reasons that Coyote provides.

  20. Flatland:

    Hansen's whole analysis is completely bogus. Analysis like this requires a normal distribution, which clearly there is not. In order to get a normal distribution, he would be relying on a model to determine a difference. (I've done similar things working as an engineer. However, my models are much more reliable.) If the model does not properly account for non-man-made sources of variation, the analysis is flawed.

  21. Pat:

    This is the purest example of "propaganda" infiltration in science. The agenda is "control" of human behavior. The technique being used is unquestionably "very, very, very dubious science" where the suppositions and premises are faulty ... more importantly, the conclusions are faulty. This is what happens when this is clearly an "agenda driven conclusion" ... driven by a compendium of nefarious folks. Even the media has been compromised so that dissension will never see the light of day; they have bought the "party line". The main driver is of course ... Authoritarian government. Welcome to Amerika, Komrade.

  22. Smokey:

    Check out wattsupwiththat.com for a total deconstruction of the climate charlatan James Hansen. Hansen feeds at the public trough, using our tax payments to advance his debunked AGW propaganda.