If you plot a linear trend on a cyclical function, you are guaranteed to make a complete jackass out of yourself.
This stupidity is the standard technique of peer-reviewed climate science.
That is one of your best, Tony.
You can do even better at creating a “trend” and a “forecast” to get more funding if you cherry-pick or omit a few end points. I am getting tired of those who conveniently ignore the last two years of sea ice extent or area, and of reporters believing BS in which forecasts are presented as reality when they are the polar opposite of reality.
But if you do it in a room of people with the same agenda as you, then it’s all okay, right? It could be worse, I used to work with people who kept trying to do regression with ordinal variables.
Can we have that one explained to those of us that slept through statistics at Uni?
The periodic sine function in this example oscillates around the horizontal x-axis. It is bounded by -1 and 1, i.e. it never becomes less than -1 or more than 1, regardless of how far left or right you follow it.
Take a piece of paper, extend the function graph left and right outside of the shown window and ask yourself:
Does the rising trend line as calculated on the single period section of the graph shown in the picture have any meaning?
You may have slept through statistics but if your answer is “no” you understand the problem better than many climatologists. And if you know how to use an Excel spreadsheet, you have a leg up even on the Director of the Climatic Research Unit at the University of East Anglia.
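For anyone who would rather check this with a few lines of code than with a piece of paper, here is a minimal Python sketch. It is not from the original post: the windows chosen below (including -π to π as a stand-in for whatever the picture shows) and the helper names `ols_slope` and `sine_trend` are assumptions made for illustration. It fits an ordinary least-squares line to sin(x) over different windows.

```python
import math

def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def sine_trend(start, stop, n=2001):
    """Fit a straight line to sin(x) sampled evenly on [start, stop]."""
    xs = [start + (stop - start) * i / (n - 1) for i in range(n)]
    ys = [math.sin(x) for x in xs]
    return ols_slope(xs, ys)

# One full period, window assumed to run from -pi to pi: the line rises.
rising = sine_trend(-math.pi, math.pi)

# One full period, window shifted to 0..2*pi: the line falls.
falling = sine_trend(0.0, 2.0 * math.pi)

# Twenty periods: the fitted slope is close to zero.
long_run = sine_trend(-20.0 * math.pi, 20.0 * math.pi)

print(f"rising={rising:.3f} falling={falling:.3f} long_run={long_run:.5f}")
```

The same bounded function gives a rising or a falling "trend" of roughly the same size depending only on where the one-period window starts, and the slope shrinks toward zero as the window covers more periods.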
+1
CW, I got Steve’s original post just fine and your explanation is excellent.
It was the “regression with ordinal variables” bit from DavidS that I had a problem with.
Wikipedia was not much help, surprisingly enough …
I know what happened, TS. The comment nesting is so minuscule on my gadget that I didn’t recognize where it was hanging. It shows correctly on this display.
TS
Try
http://www.ats.ucla.edu/stat/mult_pkg/whatstat/nominal_ordinal_interval.htm
Not something my stat courses covered. We called them ‘Categorical’ Attributes. New Math strikes again.
Yeah, the “regression with ordinal variables” bit kinda went over my head.
Sorry about that. The misuse of statistics is a sore point. In short, regression attempts to use one (or more) variable(s) to predict another by comparing how they “vary”. It can properly do this only with “interval” variables, in which each point is equally distant from the next, such as temperature, i.e. the difference between 2 degrees and 3 degrees is supposedly the same in size as the difference between 3 degrees and 4, and so on. Such variables can have means, standard deviations, etc. calculated for them. Ordinal variables are those that have rank but lack consistent intervals. Thus if, for example, you rate something as High, Medium, Low, or Absent, you can label those as 3, 2, 1, and 0, but there is no clear quantification of how much more “high” is than “medium”, or whether that difference is the same as the difference between “medium” and “low”. You could still calculate a mean but it is mathematically invalid. Going on to analyze such ordinal variables with regression or another form of analysis that requires a valid mean would likewise be mathematically invalid. Nonetheless, there are people who attempt to do this and publish the results, just as there are people applying straight lines to nonlinear relationships (which was the original “Remedial” point being made).
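A small Python sketch of why the mean of an ordinal variable is shaky. Everything in it is made up for illustration: the two codings, the two groups, and the helper name `coded_mean`. Both codings keep the same rank order (Absent < Low < Medium < High); they differ only in how far apart the labels are placed, which is exactly the thing an ordinal scale does not tell you.

```python
from statistics import mean

# Two order-preserving codings of the same ordinal scale (both hypothetical).
EQUAL_STEPS = {"Absent": 0, "Low": 1, "Medium": 2, "High": 3}
STRETCHED = {"Absent": 0, "Low": 1, "Medium": 2, "High": 10}

# Made-up ratings for two groups.
group_a = ["Medium", "Medium", "Medium", "Medium"]
group_b = ["High", "High", "Absent", "Absent"]

def coded_mean(ratings, coding):
    """Mean of the numeric labels assigned to the ratings by a given coding."""
    return mean(coding[r] for r in ratings)

a_equal = coded_mean(group_a, EQUAL_STEPS)
b_equal = coded_mean(group_b, EQUAL_STEPS)
a_stretched = coded_mean(group_a, STRETCHED)
b_stretched = coded_mean(group_b, STRETCHED)

print("equal steps:", a_equal, "vs", b_equal)          # group A ahead
print("stretched:  ", a_stretched, "vs", b_stretched)  # group B ahead
```

Which group "scores higher on average" flips with the choice of labels, even though no rating changed. Anything built on that mean, a regression slope included, inherits the same arbitrariness.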
Interesting, David. So energy itself is very similar. By this I mean a measurement of MPH is consistent: 5 to 6 MPH is an increase of 1 MPH (duh). Likewise, 99 to 100 MPH is also an increase of 1 MPH (again, duh). However, the energy required to move a car from 99 to 100 MPH is far greater than the energy required to move a car from 5 to 8 MPH.
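The speed example can be put into numbers with the kinetic energy formula, KE = ½·m·v². This is only a sketch: the 1500 kg car mass is an assumed figure, drag and friction are ignored, and the function name `kinetic_energy_gain` is made up for the example.

```python
MPH_TO_MS = 0.44704  # metres per second in one mile per hour

def kinetic_energy_gain(v_from_mph, v_to_mph, mass_kg=1500.0):
    """Joules of kinetic energy added when speeding up (no drag, no friction)."""
    v1 = v_from_mph * MPH_TO_MS
    v2 = v_to_mph * MPH_TO_MS
    return 0.5 * mass_kg * (v2 ** 2 - v1 ** 2)

low = kinetic_energy_gain(5, 6)        # +1 MPH at low speed
low_wide = kinetic_energy_gain(5, 8)   # +3 MPH at low speed
high = kinetic_energy_gain(99, 100)    # +1 MPH at high speed

print(f"5->6: {low:.0f} J, 5->8: {low_wide:.0f} J, 99->100: {high:.0f} J")
print(f"99->100 costs {high / low:.1f}x as much as 5->6")
```

Because energy goes with the square of speed, the ratio does not depend on the assumed mass: the same 1 MPH step costs about 18 times more at 99 MPH than at 5 MPH, and still about 5 times more than the whole 5 to 8 MPH step.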
Temperature is similar to energy. The doubling power of CO2 to raise T is likely to diminish. It takes immense energy to accelerate earth’s hydrological cycle.
Ah! It all clicks now. (Spoon + Knife + Fork)/3=Spnirk
Daavid A, temperature is indeed a tricky variable when one digs beneath it. After all, 10 degrees C is not really twice 5 degrees C, since 0 C is set to the freezing point of water and thus does not represent the absence of temperature in the same way that 0 dollars represents the absence of money. The Kelvin scale theoretically does start at a true zero point, but even that might not be perfectly interval in nature if one instead looks, as you suggest, at the amount of energy required to raise the temperature of any particular substance by each degree.
If I remember it right, ordinal data has arbitrarily assigned values, e.g. evaluating a climate denial survey and assigning the participants to five groups of Flat Earth Ignoramuses, Soldiers, Underbosses, Bosses and Genocidal Climate Criminals, and then representing these groups by integers 1 through 5.
Since these numbers are arbitrary and don’t express any real numerical values, only special methods of ordinal regression can lead to meaningful results of the evaluation of such a denier survey, e.g. how to compute the concentration camps’ capacities for Bosses and Underbosses, and how many holding cells will be needed at the Climate Nuremberg for Genocidal Climate Criminals when everybody is finally rounded up.
If a statistician wants to step forward and explain it, I will eagerly stand down.
Nice try but it breaks down with “ordinal regression” which is as good a statistical oxy-moron as one is likely to hear.
http://www.youtube.com/watch?v=dFaAIylVHfI
Damn Scots, they ruined regression, too!
http://www.strath.ac.uk/aer/materials/5furtherquantitativeresearchdesignandanalysis/unit6/ordinalregression
Colorado W, I was trying to keep this mercifully short but you seem to want a long discussion… Ordinal Regression is indeed an oxy-moron (not the only one in statistics). What is shown in your link and labelled as “Ordinal Regression” is NOT regression with an ordinal variable. They are using it as a label for a hybrid process that differs significantly from regression both in concept and method. It is based on converting an ordinal variable into a series of variables coded as 0 and 1 and calculating a regression coefficient for each of them. One of the 4 categories is used as a baseline and thus its coefficient is 0. The other coefficients show how each of the other 3 separately differs from that. In their example this sleight of hand is also employed with the variable gender, which is a nominal variable but with 2 categories, allowing it to be easily converted into categories of 0 and 1. While this shows differences between the various categories and the category used as a baseline, it does not change the fact that the categories in the variable are not interval in nature. Furthermore, it is certainly not the use of an ordinal variable directly in a regression equation. Note the use of the words “pseudo R2” and “largely linear relationship” in their discussion. Essentially, they are attempting to approximate what regression does and calling it “Ordinal Regression”, which is by definition a misleading label.
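A rough Python sketch of the 0/1 coding described above, for readers who have not seen it. It illustrates the coding idea only and is not the procedure used in the linked course material. The data, the choice of "Absent" as baseline, and the names `dummy_code` and `coefficients` are all assumptions. The sketch leans on one standard fact: in a least-squares fit with an intercept and these indicators as the only predictors, each coefficient equals that category's mean outcome minus the baseline's mean outcome, so the coefficients are computed that way rather than with a solver.

```python
from statistics import mean

LEVELS = ["Absent", "Low", "Medium", "High"]  # first level is the baseline

def dummy_code(level, levels=LEVELS):
    """Return one 0/1 indicator per non-baseline level."""
    return [1 if level == other else 0 for other in levels[1:]]

# Made-up (rating, outcome) pairs.
data = [
    ("Absent", 2.0), ("Absent", 4.0),
    ("Low", 5.0), ("Low", 7.0),
    ("Medium", 6.0), ("Medium", 6.0),
    ("High", 12.0), ("High", 14.0),
]

group_means = {lvl: mean(y for r, y in data if r == lvl) for lvl in LEVELS}
baseline_mean = group_means[LEVELS[0]]

# Coefficient of each category relative to the baseline (baseline itself is 0).
coefficients = {lvl: group_means[lvl] - baseline_mean for lvl in LEVELS}

print(dummy_code("Absent"), dummy_code("High"))
print(coefficients)
```

Note that nothing in the result says how far apart Low, Medium and High are as points on a scale. The coefficients only report how each category's outcomes differ from the baseline's, which is the distinction being drawn above.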
Thanks, David.
I appreciate the time you took to take apart at the example and explain what they are doing there. I hope it is as valuable for others as it is for me.
The Scots clip was not a pushback against your “statistical oxy-moron” comment since I was puzzled why statisticians would label the analysis of such data as “regression” but I am glad it motivated you to expand on your remarks.
I understood your original comment at the level where I saw the silliness of trying to do regression on ordinal variables. In my attempt to explain your remark I focused on the character of the data. I nearly put the blasted “ordinal regression” term—which I’ve seen people use here and there—in quotation marks because I didn’t have a clue how it could be done and that’s why I used the purposely vague “special methods” qualification to distinguish it from regression as commonly understood. I did assume, however, that it was because of my ignorance of specialized statistical terms and that they have some reason why they would choose the label.
I’m glad you took mercy on me and I’m grateful I learned something new again.
correction: … take apart the example …
Colorado W, thanks, I am often too sensitive on such topics. Decades of being neck deep in it has its effects. You correctly allude to an important point, i.e. statistics are ruined by statisticians who write manuals and textbooks that attempt to impress other statisticians (who are usually the reviewers of them) but do not effectively communicate to people who are required to use them and yet are not in a position to spend their lives obsessing over every small aspect thereof.
If you plot with a cubic function you get good results for exactly one period
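A quick Python sketch of the cubic version of the same mistake. Assumptions, all mine: the window is one period from -π to π, and because sin(x) is odd and the window is centered on zero, the even terms of a least-squares cubic vanish, so the sketch fits only a·x + b·x³ through a 2x2 set of normal equations. The name `fit_odd_cubic` is made up.

```python
import math

def fit_odd_cubic(xs, ys):
    """Least-squares fit of y ~ a*x + b*x**3 via the 2x2 normal equations."""
    s2 = sum(x ** 2 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    s6 = sum(x ** 6 for x in xs)
    t1 = sum(x * y for x, y in zip(xs, ys))
    t3 = sum(x ** 3 * y for x, y in zip(xs, ys))
    det = s2 * s6 - s4 ** 2
    a = (t1 * s6 - t3 * s4) / det
    b = (s2 * t3 - s4 * t1) / det
    return a, b

# Sample one period of the sine function.
n = 2001
xs = [-math.pi + 2.0 * math.pi * i / (n - 1) for i in range(n)]
ys = [math.sin(x) for x in xs]
a, b = fit_odd_cubic(xs, ys)

def cubic(x):
    return a * x + b * x ** 3

# Largest miss inside the fitted window versus the miss one period later.
worst_inside = max(abs(cubic(x) - math.sin(x)) for x in xs)
error_outside = abs(cubic(3.0 * math.pi) - math.sin(3.0 * math.pi))

print(f"a={a:.4f} b={b:.4f}")
print(f"worst error inside window: {worst_inside:.2f}")
print(f"error at x = 3*pi:         {error_outside:.1f}")
```

Inside the window the cubic tracks the sine tolerably; one period further out it has run off far beyond the -1 to 1 band that the sine never leaves.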
Excellent example.
Since 97% of punters do not understand maths, is it any wonder that they get away with this fraud?
Stats
Math for the agenda driven.
Old story, contemporary application.
I have a lot of respect for statistics…. when applied correctly. Unfortunately, now that we have computers and any fool can plug in numbers and get ‘An Answer’, statistics is much more likely to be used wrongly. If you had to figure out the standard deviation by doing the calculations by hand, you were much more likely to give it a pass unless you really, really needed to use statistics and you knew how to do it correctly.
Since I was dealing in chemistry the type of statistics DavidS is talking about I never used. We were just taught never to try to apply statistics to ‘attribute data’ things like hair color, eye color, gender.
race?
Now, Now rah I was being very P.C.
OMG… Race is fair game. PC or not, overlooking demographic variables is asking to miss part of the picture.
Wow they told you that? It would seem restricting even in chemistry. It is not really different than comparing say two chemical processes, one with a catalyst and one without, although two categories is always simpler to analyze than 3 or more.
Remember I am from the slide rule era. We were lucky to learn the word statistics.
Yes, I still have a couple of slide rules from the early days. Your point about having to take care when you have to do it the hard way is so valid (be it by hand, calculator, slide rule or a stack of punch cards). The PC and the Graphic User interface are two edged swords.
I know someone who was denied tenure ostensibly on the basis of the average opinion of letters of recommendation.
“If you plot a linear trend on a cyclical function, you are guaranteed to make a complete jackass out of yourself.”
If you look closely at the graph, you will notice that even though they may be making jackasses of themselves, they are making $.
And if we are very lucky soon they will be making roads.
http://1u88jj3r4db2x4txp44yqfj1.wpengine.netdna-cdn.com/wp-content/uploads/2011/05/chaingang1.jpg
But Steve/Tony has got the trend line in the wrong place for Climate work, it has to be on the Up Slope to get the maximum effect.
My personal favourite is climate scientists who express changes of temperature in degrees C as percentages, i.e. from 10 deg C to 11 deg C is a 10% increase in temperature.
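To see why that percentage is meaningless, here is the same one-degree change expressed on three scales. A minimal Python sketch; the helper name `percent_change` is made up, and the conversions are the standard Celsius to Kelvin and Celsius to Fahrenheit formulas.

```python
def percent_change(old, new):
    """Percentage change from old to new, relative to old."""
    return (new - old) / old * 100.0

c_old, c_new = 10.0, 11.0

# The same two temperatures on the Kelvin and Fahrenheit scales.
k_old, k_new = c_old + 273.15, c_new + 273.15
f_old, f_new = c_old * 9.0 / 5.0 + 32.0, c_new * 9.0 / 5.0 + 32.0

pct_c = percent_change(c_old, c_new)
pct_k = percent_change(k_old, k_new)
pct_f = percent_change(f_old, f_new)

print(f"Celsius:    {pct_c:.2f}%")
print(f"Kelvin:     {pct_k:.2f}%")
print(f"Fahrenheit: {pct_f:.2f}%")
```

One physical change, three different "percent increases" (10%, about 0.35%, and 3.6%). A percentage only means something on a scale with a true zero, and on that scale (Kelvin) the increase is a fraction of one percent.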
That is indeed a good one!
Or better yet the warming from the beginning of the year (2014) to the end of the year was a whopping 9%!!
+0.291 (January 2014) to +0.320 (December 2014)
http://www.drroyspencer.com/2015/01/uah-global-temperature-update-for-december-2014-0-32-deg-c/
The largest trend in GISS LOTI is from a starting point of 1992, 1.6°C per century. Because it's not a linear relationship, they then use a larger estimate of the CI to say it is consistent with 3°C per century.
DavidS says:
Colorado W, thanks, I am often too sensitive on such topics. Decades of being neck deep in it has its effects….
David we do not mind an explanation. Around here we are also ‘sensitive’ to the topic of the misuse of statistics.
As I have mentioned I have a (very) slight knowledge of statistics. Just enough to have me run for help from my more knowledgeable friends.
I would like the opinion of someone who does know stat on Kriging. This is the method used by the ClimAstrologists to smear data from one station over 500 km (I think).
This is the website of a geologist who trashes the idea as useless in geology where it originated. http://www.geostatscam.com/about.htm
I think the same reasoning applies. The surface is not uniform.
……..
Interesting, I’ve been meaning to look further into the specifics of smoothing and smearing as it is currently done with climate data. This might be as good a time as any. It will take some time, perhaps a few days, to read with precision through Merks’ pages on the subject. My initial reading is that he has a solid mathematical base for what he is saying. In fact the level of scandal he suggests in mining is rather shocking. Basic sampling methods say that there is a heavy burden of proof on any sampling method whose goal is to take big shortcuts and still make valid generalizations. That much isn’t rocket science: the looser you work, the more error creeps in. The possibility of hugely unrepresentative results and/or fraud is strong.
Thanks David. After reading through his work, I thought he made a good point.
Gail, assessing the precise level of error in the smearing used in the climate data is going to require that I get into specific data sets and see precisely what has been done. I intend to do that but it will take considerable time considering I am already looking at the Urban Heat Island effect with other data. I will be happy to share what I come up with.
I feel safe making the following generalizations that should not be any surprise to any serious observer.
1) If the data locations are fairly dense and uniformly distributed throughout the study area, the estimates should be good assuming the area itself is relatively homogeneous in terms of the relevant characteristics.
2) If the data locations fall in a few clusters with large gaps in between, the estimates will be unreliable. (obviously)
3) Each estimation will underestimate the highs and overestimate the lows.
Given that, it is already possible to say without looking at the data that the “smearing over” we see in temperature maps (such as with Bolivia) using sites that are in geographically and climatically very different regions is completely bogus from the start.
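Points 2) and 3) can be illustrated with a toy interpolation in Python. To be clear about what this is: it uses inverse-distance weighting, a simpler stand-in and not Kriging, and it is not a reconstruction of what any climate data set actually does. The station positions and temperatures are invented, the layout is one-dimensional, and the name `idw_estimate` is made up. Kriging computes its weights differently, so this only shows the general smoothing tendency of weighted-average estimates.

```python
def idw_estimate(x, stations, power=2.0):
    """Inverse-distance-weighted estimate at position x (1-D for simplicity)."""
    weighted = []
    for pos, value in stations:
        d = abs(x - pos)
        if d == 0:
            return value  # exactly on a station: use its reading
        weighted.append((1.0 / d ** power, value))
    total = sum(w for w, _ in weighted)
    return sum(w * v for w, v in weighted) / total

# Invented stations: (position in km, temperature in deg C).
# Two clusters with a 300 km gap between them.
stations = [(0.0, 5.0), (100.0, 25.0), (400.0, 12.0), (500.0, -3.0)]

grid = [float(x) for x in range(0, 501, 10)]
estimates = [idw_estimate(x, stations) for x in grid]
observed = [v for _, v in stations]

print("observed range: ", min(observed), "to", max(observed))
print("estimated range:", round(min(estimates), 2), "to", round(max(estimates), 2))
print("middle of the gap (250 km):", round(idw_estimate(250.0, stations), 2))
```

Every estimate is a weighted average of the station readings, so no estimated point can be hotter than the hottest station or colder than the coldest one. In the middle of the gap the "value" is nothing more than a blend of stations 150 to 250 km away, whatever the terrain in between happens to be.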
David, thanks that is pretty much how I see it. Relatively homogeneous is the key and the earth is anything but.
Only an ivory tower type who has never been outside a city could think so.