Download link

I have made quite a few updates to my GHCN code, which is primarily targeted at the US HCN database. The release includes seven files, compressed into ghcn.tgz:

getdaily  GHCN.cpp  GHCN.h  ghcnd-stations.txt  Main.cpp  Makefile  stations.txt

getdaily is a tcsh script for downloading and organizing the latest HCN daily temperature database from NOAA. It takes about an hour to run over a very fast link. It generates 49 .txt files – one for the US and one for each of Obama’s 57 states.

The code is C++ and consists of three files: Main.cpp, GHCN.cpp and GHCN.h. It should compile without warnings on g++ or Visual Studio compilers. The Makefile works on any GNU system, like Linux. On Cygwin, the suffix “.exe” gets added. To use Visual Studio, you will have to create a project and import the three code files.

There are a lot of new options:

[Screenshot: list of the new options]

stations.txt must be in the same directory as the executable. Typical usage would be:

./ghcn.exe NY.txt > NY.csv

That reads all NY records and outputs the results to the file NY.csv.
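For anyone who wants a feel for the input before opening GHCN.cpp, here is a minimal sketch of reading one record. It assumes the state files keep NOAA's fixed-width GHCN-Daily layout: an 11-character station ID, a 4-digit year, a 2-digit month, a 4-character element such as TMAX, then 31 groups of a 5-character value followed by the one-character MFLAG, QFLAG and SFLAG, with -9999 marking a missing day. The names DailyValue, MonthRecord and parseRecord are made up for this illustration and are not taken from the released code.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// One day's observation (hypothetical struct, for illustration only).
struct DailyValue {
    int day;    // day of month, 1..31
    int value;  // raw integer; tenths of a degree C for TMAX/TMIN
    char mflag; // measurement flag
    char qflag; // quality flag
    char sflag; // source flag
};

// One station/element/month line of the daily file.
struct MonthRecord {
    std::string station;
    std::string element;
    int year = 0;
    int month = 0;
    std::vector<DailyValue> days; // missing days are left out
};

// Parse one fixed-width line. Returns false if the line is too short
// to hold the 21-character header.
bool parseRecord(const std::string& line, MonthRecord& out) {
    const std::size_t kHeader = 21; // ID(11) + year(4) + month(2) + element(4)
    const std::size_t kGroup = 8;   // value(5) + three single-character flags
    if (line.size() < kHeader) return false;

    out.station = line.substr(0, 11);
    out.year = std::atoi(line.substr(11, 4).c_str());
    out.month = std::atoi(line.substr(15, 2).c_str());
    out.element = line.substr(17, 4);
    out.days.clear();

    for (std::size_t d = 0; d < 31; ++d) {
        const std::size_t pos = kHeader + d * kGroup;
        if (pos + kGroup > line.size()) break; // truncated line
        const int v = std::atoi(line.substr(pos, 5).c_str());
        if (v == -9999) continue;              // missing-value marker
        out.days.push_back({static_cast<int>(d + 1), v,
                            line[pos + 5], line[pos + 6], line[pos + 7]});
    }
    return true;
}
```

Dividing a TMAX or TMIN value by 10.0 gives degrees Celsius. This sketch carries the three flags along without interpreting them.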

The rest falls into the category of “either you can figure it out now, or you need two years of education to figure it out.”

My critics will enjoy looking through this and seeing that (unlike them) I am not doing anything to bias the output data one way or another. That is because (unlike them) I am an actual scientist.

8 Responses to GHCN Code

  1. Kate Blodgett says:

    I love this…I know exactly how to do this now. So easy. Thank you and I mean it.

  2. Phil Jones says:

    If we could only see Mikey Mann’s code… Or NOAA’s….

    • ron says:

      Mann doesn’t use a code. He takes satellite maps and fills in the missing data with red crayons.

      • Actually, the crayons have been the province of Hansen, Steig, and Schmidt.

        Mann’s notorious by-product was the Hockey Schtick, which included such “delights” as:

        – Hide the Decline, in which declining tree-ring proxy data were truncated and replaced with a spliced line of fraudulent temperature numbers, without disclosing this action;

        – Spurious principal components analysis in which red noise fed into the code produces a HS every time;

        – Plugging Tiljander’s lake varve data into the algorithm upside-down, then denying it, then submitting a letter of retraction in a different journal and falsely claiming to have been exonerated of the charge of using data upside-down;

        – Getting journal editors fired if they published, or tried to publish, any work that called his work into question;

        – The infamous use of a single, outlier tree series in Yamal, Russia (YAD061) in such a way that the entire blade of the HS that is in excess of the Medieval Warm Period was caused by this one tree;

        – Eliminating the vast majority of the available 20th Century tree-ring data from his analysis, because they were “not the droids he’s looking for”;

        – The continued incorporation of American stripbark Bristlecone pines long after he knew, or should have known, that they were completely useless as a source of temperature signal; and finally, last but not least,

        – Sectioning (i.e., cutting down) one of the oldest known living trees in existence on the absurd grounds that he could not work with this tree’s rings by the standard means of taking a sample core via hand drill.


  3. nsomos says:

    Bless you for supporting Cygwin.

    Cygwin was a lifesaver for me when for the first time in
    decades, I found myself at a job with no access to unix or linux,
    but windows only. It enabled me to be productive in a way that
    would have been nearly impossible without something like Cygwin.

  4. Tel says:

    Steve, I’m putting together some code that allows import of GHCN into the “R” program. I’m too old and lazy to write statistical packages by hand these days. After studying your code, here’s a few points:

    The wget program has a recursive mode with lots of nice features, like checking timestamps (can re-start the download partway through the process and skip files already completed). It also has bandwidth management, adjustable random delays between downloads to avoid hammering your link.

    My personal choice is perl for data conversion, simply because perl has excellent string manipulation libraries and you end up with a very short conversion program. It isn’t ultra-fast, but all things considered not too bad. Probably C++ will beat perl for speed if you have a long time to spend tuning the code. Perl will win if programmer time is at a premium and you just do the obvious first level optimizations.

    The converted file is large, but that’s in an ASCII format for R to read… internally R will convert to a binary format and allow you to save your workspace on exit. There are some tricks built into R that apply compression (e.g. tokenizing the station IDs and converting dates to simple integers) so your saved workspace file is much smaller than the original ASCII and loads fast when you go back into the program.

    This should run equally well on Microsoft, Linux, Apple, etc. Both perl and R have been widely ported. The Microsoft Windows port of R takes advantage of many of the local windowing features so it “feels” kind of native.

    So to ask a question here, I see you cut up the fields into a value plus: MFLAG, QFLAG, SFLAG each of which is a single character. Are those explained anywhere? Looking at your code it seems at first glance that you skip over these flags and don’t do anything with them. Am I right? Or is there something subtle at work here?

    Once I get the import working cleanly I’ll put some R scripts together to extract some averages, graphs, etc. Anyone who hasn’t tried R should at least give it a go. Very good for post-processing engineering data, financial data, or scientific plotting.

  5. Corak says:

    People don’t think!
