[Update 14th January – we are still keeping the links to the database registration down, as there have been a lot of changes and updates made in the last week. Please be patient.]
People have been asking for ages, “Why use Fortran/Python/C for climate data?” Put the station files in a database: a searchable, relational database. As E.M. Smith put it, “flat files are so 1970…”. I believe several such databases are now emerging. Here’s the gen on one of them.
Back in October, frustrated by my limited ability with GISS data, I downloaded the GHCN v2.mean.Z file, which is an input for GIStemp. I had the idea of putting it into MS Access so that I could at least search for files quickly, but I never got much past importing the few initial large tables – only partly due to other priorities. However, I got a lot of encouragement from others and about a month later an email from TonyB introducing me to computer professional KevinS, who also had an interest in creating a database. After a few exchanges with Kevin a plan formed.
My own lack of knowledge and ability was plain and limited my vision; the database created and its associated utilities now go way beyond what I thought possible. I should stress my input has been very small, but I have been made to feel useful despite Kevin doing all the work 😉
The Technical Stuff
The source for GISS (and, it seems, Hadley CRU) climate analysis is the Global Historical Climatology Network files: aggregated files of temperature data from stations worldwide. Kevin initially downloaded the GHCN v2.mean.Z and v2.mean_adj.Z files from here, then imported and normalised this raw mean and adjusted mean temperature data, along with all the country code and station inventory data, into an MS Access database. He then emulated GISS methods for combining overlapping series to form a single series for each WMO station. GISS combined/unadjusted and combined/homogenised data have now been added in their own database (this is the updated set after GISS started using USHCN data in mid November). Eventually other data sets can be added.
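For readers who want to try something similar, here is a minimal sketch of the import step in Python with SQLite instead of MS Access. It is not Kevin's code. It assumes the fixed-width v2.mean record layout as I understand it (11-character station ID, 1-digit duplicate number, 4-digit year, then twelve 5-character monthly values in tenths of a degree C, with -9999 marking a missing month). The table name `monthly_mean` and the function names are made up for illustration.

```python
import sqlite3

MISSING = -9999       # assumed sentinel for a missing month
RECORD_LENGTH = 76    # 11 + 1 + 4 + 12 * 5 characters (assumed layout)


def parse_v2_mean_line(line):
    """Split one fixed-width record into (station_id, duplicate, year, temps).

    temps is a list of 12 values in degrees C, with None for missing months.
    """
    station_id = line[0:11]
    duplicate = int(line[11])
    year = int(line[12:16])
    temps = []
    for month in range(12):
        start = 16 + 5 * month
        raw = int(line[start:start + 5])
        temps.append(None if raw == MISSING else raw / 10.0)
    return station_id, duplicate, year, temps


def load_records(conn, lines):
    """Insert one row per station/duplicate/year/month; return the row count."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS monthly_mean (
               station_id TEXT,
               duplicate  INTEGER,
               year       INTEGER,
               month      INTEGER,
               temp_c     REAL,      -- NULL when the month is missing
               PRIMARY KEY (station_id, duplicate, year, month))"""
    )
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if len(line) < RECORD_LENGTH:
            continue  # skip blank or truncated lines
        station_id, duplicate, year, temps = parse_v2_mean_line(line)
        for month, temp in enumerate(temps, start=1):
            rows.append((station_id, duplicate, year, month, temp))
    conn.executemany(
        "INSERT OR REPLACE INTO monthly_mean VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)
```

Storing one row per month, with NULL for a missing value, keeps the gaps visible in the table, so a later query can decide how to treat incomplete years instead of having that decision baked into the import.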
The database (“TEKtemp”) is freely available: anyone who is interested in becoming a user may register here*. [Update 6th Jan. the link to the TEKtemp database was incorrect and has been fixed] *[Update 8th Jan. link temporarily removed while some of the data is reworked.]
Kevin has since adapted his own software to chart and tabulate these raw/adjusted temperatures – it can plot graphs and create tables rapidly from the data stored in the database; this too will be made available once the ‘toolbox’ is complete. Having done self checks that the coding is not in error, we are in the middle of ‘unit testing’ to demonstrate there are no serious errors. After that we can be confident of any results that TEKtemp produces for nearly 4,500 WMO stations. Not a bad achievement in the space of a couple of weeks.
I should say a little about data quality. In trying to emulate the seasonal and annual means, we realised there was something funny with the way both GHCN and GISS calculate annual mean temperatures when data is missing for one or more individual months. We will have a specific post on this, as there were many averages that we should have been able to replicate but could not. This implies a deviation from the published method. Kevin is fastidious about using unmanipulated data as far as possible, and his response has been to exclude from his analysis any year in which one or more months of data are missing. This has altered trends slightly, but it uses only the actual data available for the station and not some ‘filled in product’.
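The "complete years only" rule is simple enough to show in a few lines. This is a sketch of the idea as described above, not the code used in TEKtemp. The function names and the input shape (a dict mapping each year to a list of 12 monthly values, with `None` for a missing month) are assumptions made for the example.

```python
def annual_mean(monthly):
    """Mean of 12 monthly values, or None if any month is missing."""
    if len(monthly) != 12 or any(value is None for value in monthly):
        return None
    return sum(monthly) / 12.0


def complete_year_means(records):
    """Map year -> annual mean, keeping only years with all 12 months.

    records: dict of year -> list of 12 monthly temperatures (None = missing).
    """
    means = {}
    for year in sorted(records):
        mean = annual_mean(records[year])
        if mean is not None:
            means[year] = mean  # incomplete years are dropped, never infilled
    return means
```

A year with even a single missing month simply does not appear in the result, so any trend fitted afterwards rests only on years where all twelve monthly values were actually reported.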
In drawing temperature trends for each WMO station, he has fitted a first order polynomial (i.e. linear regression) trend to the combined series for all available data post-1880. For adjusted data he has fitted a trend line if there are at least 50 years (not necessarily consecutive years) of data. Maps for NOAA/GHCN data are here. Maps for GISS/GHCN data are here.
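As a rough illustration of the trend step, the sketch below fits an ordinary least-squares straight line to annual means and refuses to fit when there are too few years. It is a plain-Python stand-in, not Kevin's software. The function name, the `min_years` and `start_year` parameters, and the choice to treat "post-1880" as 1880 onwards are all assumptions. The 50-year threshold is the one quoted above for adjusted data.

```python
def linear_trend(annual_means, min_years=50, start_year=1880):
    """Least-squares line through (year, annual mean) points.

    annual_means: dict of year -> mean temperature in degrees C.
    Returns (slope_per_year, intercept), or None when fewer than
    min_years years are available from start_year onwards.
    The years do not need to be consecutive.
    """
    points = [(y, t) for y, t in annual_means.items() if y >= start_year]
    if len(points) < max(min_years, 2):
        return None  # too few years for a meaningful trend line

    n = len(points)
    mean_x = sum(y for y, _ in points) / n
    mean_y = sum(t for _, t in points) / n
    sxx = sum((y - mean_x) ** 2 for y, _ in points)
    sxy = sum((y - mean_x) * (t - mean_y) for y, t in points)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Multiplying the returned slope by 100 gives the trend in degrees C per century, which is a convenient unit for comparing stations on a map.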
Analysis and Development
So far, aside from previous posts here and here, we have been looking initially at the distribution and magnitude of the adjustments made by GHCN. Some of them really are unbelievable. Kevin has also used the database to produce unique color-coded, zoomable and clickable maps of worldwide temperature trends for GHCN raw and adjusted data using DIY Maps (again freeware). A link to a specific post will be added here when available.
Climate analysis has just got colorful again!