Algorithm For Counting Records

Every time I post a graph showing temperature records, all kinds of wild theories are made claiming that early years show more records simply because they are early.  This is complete nonsense. The algorithm below is the basis of the counting.

# temperature holds the highest reading for each year, indexed by year
max_temp = -1000.0
max_year_list = []

for year in range(1895, 2019):    # 1895 through 2018 inclusive
    if temperature[year] > max_temp:
        # Start a new list if the old record is beaten
        max_year_list = [year]
        max_temp = temperature[year]
    elif temperature[year] == max_temp:
        # Append to the list if the old record is tied
        max_year_list.append(year)

The only way a record gets counted is if it is the highest temperature from 1895 to 2018.
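
As a minimal illustration (a sketch with made-up numbers, assuming temperature is indexed by year as in the snippet above), the same logic wrapped in a function shows that ties are counted alongside the year that first set the record:

def record_years(temperature, start=1895, end=2018):
    """Return the years that set or tied the all-time maximum between start and end."""
    max_temp = -1000.0
    max_year_list = []
    for year in range(start, end + 1):
        if temperature[year] > max_temp:
            max_year_list = [year]        # new record: restart the list
            max_temp = temperature[year]
        elif temperature[year] == max_temp:
            max_year_list.append(year)    # tie: add the year to the list
    return max_year_list

# Hypothetical data: every year reads 100.0F except 1936 and 2012, which both hit 114.0F.
example = {year: 100.0 for year in range(1895, 2019)}
example[1936] = 114.0
example[2012] = 114.0
print(record_years(example))              # prints [1936, 2012]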

55 Responses to Algorithm For Counting Records

  1. steve case says:

    On that note I was just going through the Wisconsin Climatology office website:

    http://www.aos.wisc.edu/~sco/clim-history/state/wihitemp.html

    Graphed out, it looks like this:

  2. Griff says:

    That’s absolutely the case. The longer a temp record runs, the less chance of getting a new record from a site

    • Rah says:

      Hey! Griff denies there is global warming! If there were GW then there would be an increase in recent daytime highs and high nighttime lows no matter how long the record of the reporting station was. AGW is an extraordinary claim and requires extraordinary evidence. The only extraordinary things the alarmists produce are their lies and excuses.

    • spike55 says:

      Griff shows his absolute IGNORANCE yet again.

      Did you ever even pass primary school, you poor mindless twerp ??

      You are just posting idiotic nonsense to try to get some attention, aren’t you, trollette.

      Must hurt you to be so PATHETIC, DESPERATE and LONELY

    • Jack Miller says:

      I’m having a difficult time thinking that you are really serious about that remark; you absolutely missed the point of what Tony is demonstrating here.

    • Griff

      Tony’s graphs always include TIES!!

      (You would be right if ties were not included)

      • arn says:

        That’s the thing Griff is not willing to understand.

        When you discover a planet today, one of the next two days will by default be the coldest ever and the other the hottest ever, yet no scientist would consider them records at all, since they would 99.9% turn out to be average in the long run. The share of days that count as the hottest or coldest ever declines from 100% of the first two days to well under 0.1% the longer the timeframe runs and the more data we get.

        But that share would dramatically increase with global warming. It is not increasing, just as sea level rise is not. The only thing that is increasing is the number of fake records thrown at us by the MSM on a regular basis (to keep us emotionalised), which in most cases turn out not to be records at all but the result of ignoring older temperature records.

    • pmc47025 says:

      Griff, there are no “new” records. A record documents something that happened in the past. For example, if it was 150F on July 23 and 151F on July 24, the July 23 temperature could not be counted as a record max.

    • tonyheller says:

      In a 125 year long record, 1936 has a 1:125 chance of being the hottest. 2016 also has a 1:125 chance of being the hottest. There is no time bias to my graphs.

      Your point is completely irrelevant in the context of this discussion.

      If this were the year 1900 and I were posting graphs from 1895-1900, then 1895 would have a lot more records than it does in the current graph. However, this is the year 2018, Griff.
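
      One quick way to check the no-time-bias point (a minimal sketch with made-up numbers, not actual station data): draw 125 independent years from the same distribution many times over and tally which year ends up holding the all-time record. Early and late years come up about equally often.

      import random
      from collections import Counter

      random.seed(0)
      counts = Counter()
      for _ in range(20_000):
          temps = [random.gauss(101, 6) for _ in range(125)]    # 125 independent years, no trend
          counts[temps.index(max(temps))] += 1                  # which year holds the record

      # With no trend, each year index shows up roughly 20000/125 = 160 times,
      # whether it sits early or late in the series.
      print(counts[0], counts[62], counts[124])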

  3. wolvesjoe says:

    If the temperature cycle is governed by natural variability, then yes, the number of records will diminish over time.

    If the temperature cycle is governed by increasing amounts of CO2, then there should be new records being set on a regular basis.

    Is it any more complicated than that?

  4. MGJ says:

    Thanks for clearing that up Tony.
    I think it is a nice example of where rational thought can overcome (false) intuition.

  5. Scott Scarborough says:

    No,

    If there is no trend, the odds of the highest value being in the second half of the record are exactly as high as the odds of it being in the first half of the record. I don’t understand why this is hard for people to understand.

  6. Phil. says:

    Every time I post a graph showing temperature records, all kinds of wild theories are made claiming that early years show more records simply because they are early.

    No problem with your algorithm, the problem arises when you post something like this:
    “The number of daily and all-time record temperatures (maximums and minimums) in the US has been at or near a record low in 2018.”
    and implying that it means something.
    Take, for example, the Wisconsin data posted above: it has a mean of 101ºF and a standard deviation of about 6ºF.
    Once you’ve hit the all-time record of 114ºF, the probability of tying or breaking that record in a given year would be ~1.5%, assuming a normal distribution of variation.
    If the mean has increased for some reason in the meantime by 1ºF, then the probability increases to ~2%.
    The low occurrence of new records in a given year under such circumstances tells us nothing.
    In addition, quoting the statistic for a number of stations (for example the Wisconsin set) implies that those stations have existed throughout the period without change.
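
    Phil’s figures are easy to reproduce with a normal tail probability (a sketch using his stated mean of 101, standard deviation of 6, and record of 114; these are his illustrative numbers, not official Wisconsin statistics):

    from math import erfc, sqrt

    def p_at_least(x, mean, sd):
        """P(X >= x) for a normal distribution, via the complementary error function."""
        return 0.5 * erfc((x - mean) / (sd * sqrt(2)))

    print(p_at_least(114, 101, 6))   # ~0.015 -> tie or break the 114F record
    print(p_at_least(114, 102, 6))   # ~0.023 -> same record after a 1F rise in the mean
    print(p_at_least(115, 101, 6))   # ~0.010 -> beat the record by at least 1F (ties excluded)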

    • Gator says:

      Let’s see if we can find an instance where Phil goes to alarmist sites and has a “problem” with any of their wildly alarmist and fantastical statements.

      • Anon says:

        Gator,

        I have spent some time today making sure I understand Phil, and he makes sense. As scientists, we always want to make sure that what we put out holds water; that is the foundation of the peer review process. Your ideas need to survive attacks from the skeptics.

        That said:

        If the CAGW community were subject to the same rigor, they would never survive it. And because of that, they rely on smearing, suppressing and silencing dissenters.

        It is my opinion that this is the primary reason why they are losing the debate politically.

        How Government Twists Climate Statistics
        Former (OBAMA) Energy Department Undersecretary Steven Koonin on how bureaucrats spin scientific data.

        https://www.wsj.com/video/opinion-journal-how-government-twists-climate-statistics/80027CBC-2C36-4930-AB0B-9C3344B6E199.html

        After seeing that video (and numerous others like it) I stopped teaching CAGW. I could not, in good conscience, continue to teach students this stuff; it just is NOT ETHICAL.

        The response has often been: “Don’t you think preserving the environment is more important than your ethics? ”

        *I don’t even know how to respond to that, or how, by what mental gymnastics, they even formulated the question. (face palm)

        • Gator says:

          Agreed, I would just like to have an honest person on the other side of the debate for once. Phil is not honest with us, or himself, and that is why they are losing the debate. Well that, and being horribly wrong.

        • Disillusioned says:

          Anon said, “The response has often been: “Don’t you think preserving the environment is more important than your ethics? ”

          *I don’t even know how to respond to that, or how, by what mental gymnastics, they even formulated the question. (face palm)”

          Yeah, they are masters of manipulation using rhetorical fallacies. Guess which one that was…

          https://informationisbeautiful.net/visualizations/rhetological-fallacies/

          • Anon says:

            Dis…

            Reading that just short-circuited my mind. It still does not help me understand how 1+2 = 2, but I guess it helps, sort of. :(

          • Disillusioned says:

            I think it fits neatly with the very last one. The Straw Man Fallacy.

            I feel ya, my friend. I have them in my life too. These people are reprehensible. They have called me a racist. I voted for Obama (ignorantly) in 2008.

          • Disillusioned says:

            I guess that attack was the one five doors up the street: The Ad Hominem Fallacy.

            They thrive on derision and the use of fallacies.

          • Anon says:

            Dis. I was an HRC “early” voter in 2016. I just thought Game Show Host vs Senator & SoS for President. Is there a choice? Then Wikileaks broke in October and I thought, what have I done? (face palm)

            I swear to God, I had no idea what Breitbart was, anything about Climate Deniers, Nigel Farage, etc.

            But as every media source I had been consuming said HRC’s chance of winning was 97%, I knew that the “truth” must be elsewhere… and so my migration began.

            *My best algorithm for finding new content now is to watch what the MSM attacks and denigrates, and immediately go and check it out. I discovered Jordan Peterson that way. (lol)

          • neal s says:

            I might suggest visiting The Conservative Tree House

            https://theconservativetreehouse.com/

          • Disillusioned says:

            Anon,
            To paraphrase from that old Virginny Slims commercial, “you’ve come a long way baby.” (And in a very short period of time.) Good on you.

          • Anon says:

            neal s, “The Last Refuge” (lol) Thanks!

    • pmc47025 says:

      Phil, your point on a possible change in the number of reporting stations seems valid. A % record max/min vs total samples might be interesting.

      The rest of your post makes no sense. There are no “new records” in a data set that already exists. A temperature of 180F in 1936 would not be counted as a record max if 2018 had a temperature of 200F. As Tony said, there is no time bias.

      • Phil. says:

        Perhaps you should have read this:
        “No problem with your algorithm, the problem arises when you post something like this:
        “The number of daily and all-time record temperatures (maximums and minimums) in the US has been at or near a record low in 2018.”
        and implying that it means something.”

        • Gator says:

          Perhaps you should learn to fairly criticize both sides, and not silently enable ridiculous alarmist statements and their wildly incorrect projections.

          Still no issue with all the failed predictions by the grantologists? Ten years of ice-free Arctic predictions that never came close, and not one criticism from Phil.

          • Phil. says:

            Perhaps you should learn to fairly criticize both sides,

            Like you do?

          • Gator says:

            Well smartass, if you can find wildly wrong predictions made by those of us who do not deny natural climate change, I will gladly address them.

            Now back to you, Phil.

            Perhaps you should learn to fairly criticize both sides, and not silently enable ridiculous alarmist statements and their wildly incorrect projections.

            Still no issue with all the failed predictions by the grantologists? Ten years of ice-free Arctic predictions that never came close, and still not one criticism from Phil.

        • pmc47025 says:

          Phil, I mostly agree with that based on the likely (unspecified) change in number of reporting stations and valid samples. A % min/max instead of an absolute count would be better.

          Your time bias implication is completely invalid.

          • Phil. says:

            Your time bias implication is completely invalid.

            You appear to still misunderstand the point.
            Tony makes a similar point, he says:
            “In a 125 year long record, 1936 has a 1:125 chance of being the hottest. 2016 also has a 1:125 chance of being the hottest.”
            So if you compile a new dataset by adding 2018 to the old dataset you’d expect a certain number of records to be broken. As I pointed out above, based on the stats of the Wisconsin data, there’s about a 1.5% chance of the record being broken.

            Tony introduced the idea of new records when he says:
            “The number of daily and all-time record temperatures (maximums and minimums) in the US has been at or near a record low in 2018.”

          • pmc47025 says:

            The point is… Max US temperatures were higher before dangerous CO2 levels than after. Use only USHCN stations with at least 60 TMAX entries per summer for every year from 1928 thru 2017, average the daily temperatures for those 345 stations, and graph the number of days over 91F; it looks like this:

          • pmc47025 says:

            Or, maybe Tony’s updated graphs would be more convincing:
            https://realclimatescience.com/2018/08/latest-graphing-conspiracy-theory/

            Or, maybe the EPA heatwave index graph is a better representation:

          • Phil. says:

            The point is… Algorithm For Counting Records.

          • pmc47025 says:

            The algorithm accurately counts the number of records set or tied in each year (with no time bias). When applied to a continuously reporting USHCN station set the results are solid evidence that the US 2018 summer (almost over) will have a lower TMAX record count than most preceding years. 1936 (before dangerous CO2 levels) by far had the highest.

          • Phil. says:

            The algorithm accurately counts the number of records set or tied in each year (with no time bias). When applied to a continuously reporting USHCN station set the results are solid evidence that the US 2018 summer (almost over) will have a lower TMAX record count than most preceding years.

            Which is exactly what you’d expect from statistics.
            Each station has made 125 selections from its distribution, and the all-time record is the least likely of those values to recur. Add another year, and the probability of exceeding the record is lower than the probability of setting that record in the first place.
            Say you have a station with a mean of 101 and a standard deviation of 6, and after 125 years the record is 114. The probability of reaching that value in any one year is ~1.5%, so hitting it once in 125 tries is reasonable. Now what’s the probability that you’ll set a new record (+1)? It’s about 1% (tying it is again ~1.5%).
            What the graph shows you is that 1936 was a freak year; that’s all. Notice that the daily records don’t show a similar distribution.

          • pmc47025 says:

            If you “add another year” to the data set, a broken all-time station tMax record in the added year would reduce the tMax count from a previous year. A tie in the added year would not affect previous year counts.

            If the included stations were reporting every year, and the tMax values were increasing over time, one would expect higher tMax counts in the later years, which is not the case.

          • pmc47025 says:

            Update: A tie in the added year would not affect previous year counts, but it would be counted in the “added” year.

          • Phil. says:

            If the included stations were reporting every year, and the tMax values were increasing over time, one would expect higher tMax counts in the later years, which is not the case.

            I’m afraid you’re missing the point: as the maximum value in the distribution increases, the chance of exceeding it goes down, so you’d expect fewer new records with time.
            I just ran a simulation of a thousand samples from a Gaussian distribution, mean 101, sd 6.
            series 1:
            record: year set
            108:1
            111:15
            112:21
            119:22
            121:344
            never beaten
            Series 2
            102:1
            107:6
            112:10
            115:12
            117:29
            never beaten
            Series 3
            104:1
            107:8
            114:9
            117:33
            118:615
            121:679
            never beaten
            Series 4
            98:2
            102:3
            103:4
            104:9
            105:18
            106:23
            107:37
            119:39
            121:164
            never beaten

            In series 2 the ties were in years 82, 171, 251, and 811.
            To emulate Tony’s plot, all-time records among the first 100 years of the 4 series were set in years:
            22, 29, 33, 39 (only 1 tie)
            It took another 64 years before another all-time record was set!
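
            For anyone who wants to repeat the exercise, a minimal sketch of that kind of simulation (independent Gaussian years with mean 101 and sd 6, rounded to whole degrees so ties are possible; the seed and output are illustrative, not Phil’s actual runs):

            import random

            random.seed(1)                            # illustrative seed, not Phil's runs
            mean, sd, n_years = 101, 6, 1000
            record = float("-inf")
            new_records, ties = [], []

            for year in range(1, n_years + 1):
                t = round(random.gauss(mean, sd))     # whole-degree reading, so ties can occur
                if t > record:
                    record = t
                    new_records.append((year, t))     # a new all-time record is set
                elif t == record:
                    ties.append(year)                 # the standing record is tied

            print(new_records)                        # record-setting years cluster early
            print(ties)                               # ties become scarce as the record climbs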

          • pmc47025 says:

            Phil says “as the maximum value in the distribution increases the chance of [exceeding] it goes down”.

            And I agree. However, a [tie] would have the same chance as the existing maximum value, right? The algorithm counts ties.

            In your simulation, does “never beaten” really mean “never beaten or tied”?

          • Phil. says:

            And I agree. However, a [tie] would have the same chance as the existing maximum value, right? The algorithm counts ties.

            In your simulation, does “never beaten” really mean “never beaten or tied”?

            Never beaten in 1,000 attempts; I didn’t have the patience to go through all 4,000 data points looking for ties.
            However, in the first 100 years of all four series there was only one tie.
            In series 2 I did check all 1,000 points and there were 4 ties in the whole series, in years 82, 171, 251, and 811, the all-time record for that series having been set in year 29. In series 1 and 3 there were no ties in the first 300 years, and in series 4 no ties before a new record was set in year 164. After ~100 years, new records and ties become scarce.

            HTH

          • pmc47025 says:

            So, for series 2, applying the algorithm would result in?
            117:29 Record max
            117:82 Record max
            117:171 Record max
            117:251 Record max
            117:811 Record max

            If your simulation doesn’t discard the entries before the first maximum, and does not count ties, it doesn’t simulate the algorithm. The data set can be “scanned” from first entry to last, or last entry to first, and the results would be the same.
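
            One way to see why the scan direction cannot matter (a sketch with made-up values): the list the algorithm ends up with is exactly the set of years whose temperature equals the overall maximum, and that set does not depend on the order in which the data is read.

            temps = {1934: 113, 1936: 114, 1988: 110, 2012: 114}    # made-up station values
            hottest = max(temps.values())
            record_years = sorted(y for y, t in temps.items() if t == hottest)
            print(record_years)    # [1936, 2012], whichever direction the data is scanned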

          • pmc47025 says:

            I did a quick and dirty 250 station TP simulation. Each station temperature data set is a cosine wave with a 10 year period:

          • Phil. says:

            So, for series 2, applying the algorithm would result in?
            117:29 Record max
            117:82 Record max
            117:171 Record max
            117:251 Record max
            117:811 Record max

            If your simulation doesn’t discard the entries before the first maximum, and does not count ties, it doesn’t simulate the algorithm. The data set can be “scanned” from first entry to last, or last entry to first, and the results would be the same.

            Yes, exactly. I only applied it for the first hundred for all four, which gave: 22, 29, 33, 39, 82*
            *=tie
            Gives the same result no matter which direction.

          • Phil. says:

            Each station temperature data set is a cosine wave with a 10 year period:
            All in sync too!

    • Anon says:

      Phil,

      I think I understand most of what you posted. However, if the mean ticked up by 1ºF, changing the probability to ~2%, would this imply that years with the 1ºF increase are more likely to record a record high temperature (and, conversely, less likely to record a record low)? Is that it?

      • Phil. says:

        Correct, but because the change in the mean is small compared to the variation, the change in probability is hard to detect.
        As Paul Homewood pointed out, things change a little if ties aren’t included. If in the Wisconsin data you have to beat the previous record by 1ºF to set a new record, the probability is 1%.
        Similar arguments would apply to minimum data. One thing that I noticed looking at the Wisconsin data is that there appeared to be a ‘floor’ to the annual maximum, at 94/95ºF, so the distribution may be asymmetric (lognormal?).

  7. Anon says:

    Phil, Thanks…

    /If in the Wisconsin data you have to beat the previous record by 1ºF to set a new record the probability is 1%./

    I’ve got this too.

    Okay, but where does the CO2–temperature linkage come in, then? If the linkage is linear, would that not be nudging the mean up each year? Thus affecting the subsequent probabilities in turn? Sorry to be so plodding here. I am just trying to fully grasp what you are saying. It is making sense so far.

    • Phil. says:

      No problem.
      If the linkage is linear, would that not be nudging the mean up each year?

      Yes, that’s right; that’s why I added in the 1ºF, to illustrate what the overall effect would be. As you say, the effect will be cumulative over time.
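
      As a rough illustration of that cumulative effect (a sketch reusing the earlier mean-101, sd-6, record-114 numbers with an assumed drift of 0.02ºF per year; the drift rate is purely illustrative):

      from math import erfc, sqrt

      mean0, sd, record, trend = 101.0, 6.0, 114.0, 0.02    # assumed 0.02F/yr drift in the mean
      for years_on in (0, 25, 50, 100):
          mean = mean0 + trend * years_on
          p = 0.5 * erfc((record - mean) / (sd * sqrt(2)))   # chance one year reaches the record
          print(years_on, round(p, 3))                       # 0.015, 0.019, 0.023, 0.033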

  8. spike55 says:

    WOW, look at all those temperature records in the 1930’s,

    Yet to be cancelled out.

    And even if they are tied…

    It means NO WARMING IN NEARLY 80 YEARS
