Canada – Top of the Hockey League (Part 1)

The Shape of the Data

How many thermometers do you need to report accurately on climate change in a country the size of Canada?  Let me remind you that Canada covers 3.5 million square miles – 6.7% of the land area of the Earth – spanning latitudes from 45N to 85N.  If I said 500, would that sound about right?  As long as they are representative of changes in the whole area and track the changing climate, perhaps we could manage with fewer – half that, perhaps? A third? Ten percent?

With over 600 individual temperature series, and more than 540 combined series having records longer than 20 years, the thermometer record in Canada peaked around 1975 (see map, left), but has since been decimated by station dropout.

By 2009 fewer than 30 locations reporting temperature were used by the Global Historical Climatology Network (GHCN) prepared by the U.S. National Climatic Data Center (NCDC); this data is also used as the input to NASA’s GIStemp program.
You can see the locations of the stations on the map (left) and the most obvious ‘hole’ is the lack of stations above latitude 60N.  Yukon, Nunavut and the Northwest Territories make up 39% of Canada, but between them have only four stations: Dawson and Whitehorse (Yukon), and Eureka and Coral Harbour (Nunavut).

However, much of what is strange about Canada’s temperature record in GHCN is not immediately obvious. E.M. Smith’s marathon effort of examining records by country produced a “hair graph” for Canada using his dT method. Here is a simplified version of it:

This shows both the gradual increase in the number of temperature stations included in the GHCN record and the change in the data over time.  Note how, when the number of thermometers in the station record falls off a cliff in 1990, there is a massive increase in the rate of change dT.  It just suddenly takes off.  It is also worth looking at E.M. Smith’s original graph (here), as he also plots the monthly data (the “hair”); in 1990 Canada gets a ‘haircut’ – suddenly the temperature record has less variability.  This dT method is useful for looking at the data, although the data is not weighted in any way in this analysis.
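For readers who want to experiment, here is a minimal sketch of a first-differences calculation in the spirit of the dT method, as I understand E.M. Smith's description: for each station and calendar month, difference against the last valid reading of that same month, then average the differences within each year. The function name and the toy station data are my own illustration, not his code.

```python
# Minimal first-differences ("dT") sketch, assuming the method works
# roughly as described: difference each month against the last valid
# reading of the same calendar month, then average within each year.

def monthly_dT(series):
    """series: {year: [12 monthly temps or None]} for one station.
    Returns {year: mean same-calendar-month difference}."""
    last_valid = [None] * 12     # last valid value per calendar month
    out = {}
    for year in sorted(series):
        diffs = []
        for m in range(12):
            v = series[year][m]
            if v is not None:
                if last_valid[m] is not None:
                    diffs.append(v - last_valid[m])
                last_valid[m] = v
        if diffs:
            out[year] = sum(diffs) / len(diffs)
    return out

# Toy station: a flat year followed by a year 0.5 deg C warmer
station = {1990: [0.0] * 12, 1991: [0.5] * 12}
print(monthly_dT(station))   # {1991: 0.5}
```

A missing month simply carries the last valid value forward, so a gap does not create a spurious difference; the real method also accumulates the yearly dT values into a running dT/dt, which is omitted here.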

Now it’s not just one sudden loss of stations.  There is a gradual rise in the station count, but some of the stations are very short-lived and drop out after 20-30 years of reporting (GIStemp has a cut-off of 20 years, so stations reporting for any shorter period will not be used in the final GIStemp output). You can see the effect of this in the graph below – a few stations are dropped each year, particularly in the 70s, then there is that sudden loss of almost 200 stations after 1989.

Note that the graph also reports whether the stations have a warming or cooling trend (having the data in a database [TEKtemp] is very useful). Deriving a trend can be a quick and dirty way of seeing ‘shape’ in the data, but a trend can be very sensitive to the start and end years used and can be misleading. For example, 1940-1970 is generally acknowledged as a period when global temperatures fell, and records ending within or shortly after this period may be disproportionately affected by it.  On the other hand, it is worth noting that a large proportion of the records that dropped out in 1989/90 had an overall cooling trend.  Their loss from the record leaves behind those with a warming trend. Is that what allows the dT to ‘take off’ after 1990? (It is worth reading what E.M. Smith says about “The Reveal” here.) So just how important are those stations that had a cooling trend up to 1989/90? Were they in any way representative of their locality at the time? Would they still be showing a continuing cooling trend now? Of this, more, later.
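To illustrate how sensitive a fitted trend is to the window chosen, here is a small sketch using a synthetic series that cools from 1940 to 1970 and warms thereafter; the data and numbers are made up purely for demonstration.

```python
# Synthetic demonstration of trend sensitivity to the chosen window:
# annual anomalies that cool 1940-1970 then warm to 2000 (made-up data).
def ols_slope(years, temps):
    """Ordinary least-squares slope in deg C per year."""
    n = len(years)
    mx, my = sum(years) / n, sum(temps) / n
    num = sum((x - mx) * (y - my) for x, y in zip(years, temps))
    den = sum((x - mx) ** 2 for x in years)
    return num / den

years = list(range(1940, 2001))
temps = [-0.01 * (y - 1940) if y <= 1970 else -0.3 + 0.02 * (y - 1970)
         for y in years]

full  = ols_slope(years, temps)            # whole record: warming
early = ols_slope(years[:31], temps[:31])  # ending in 1970: cooling
print(f"1940-2000: {full:+.4f} C/yr   1940-1970: {early:+.4f} C/yr")
```

The same synthetic station looks like a cooler or a warmer depending purely on when its record happens to end – exactly the trap described above for records that dropped out soon after 1970.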

This climate stuff is challenging me to come up with alternative ways of looking at the shape of the data.  Scanning the datasets, it looked as if there were a lot of short-lived data sets in the record for Canada, but how to show this visually?  One of the graphs I came up with is this one, which plots the year when a station begins reporting against the length of time it reports in the record:

I find this quite revealing. The GIStemp use of the records in GHCN v2.mean starts in 1880, and many of the stations that commence in the period 1880-1900 remain active until 1989/90, but only four of these survive in the current set for 2009.  Does this matter?  On the other hand, many stations report for a short period of less than 40 years. It is notable that a lot of short-lived stations start reporting between 1950 and 1970, only to drop out rapidly after 20-30 years. What value do these add to the record?
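For anyone wanting to reproduce this sort of plot, here is a sketch of how start year and record length could be pulled from a v2.mean-style file, assuming the GHCN v2 fixed-width layout (11-character station ID, one duplicate digit, 4-digit year, then twelve monthly values in tenths of a degree C, -9999 = missing). The two data lines are fabricated for illustration.

```python
# Sketch of extracting first year, last year and span per station
# from GHCN v2.mean-style fixed-width lines (country code 403 = Canada).
# The sample lines below are fabricated for illustration.

sample = """\
4037182600001881 -210 -180  -90   10  110  160  190  170  110   40  -60 -170
4037182600001882 -200 -170  -80   20  120  150  180  160  100   30  -70 -160
"""

def station_spans(lines):
    """Return {station_id: (first_year, last_year, span_in_years)}."""
    years = {}
    for line in lines:
        if len(line) >= 16:
            sid, year = line[:11], int(line[12:16])
            years.setdefault(sid, []).append(year)
    return {sid: (min(ys), max(ys), max(ys) - min(ys) + 1)
            for sid, ys in years.items()}

print(station_spans(sample.splitlines()))
# {'40371826000': (1881, 1882, 2)}
```

Plotting first_year against span for every station gives the start-year vs record-length scatter discussed above; note this counts calendar span, not complete years, so missing years inside a record would need separate handling.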

I thought the proportion of short-lived stations did not seem normal, and, compared with the rest of the world, it is not:

The dataset for Canada seems to have proportionally fewer long-lived stations and more short-lived ones than in the complete global GHCN dataset. This strikes me as odd. Why so many short-lived stations in Canada? I mean why bother including them if you only need a small number as representative of the whole country?

And what of the Adjustments?
We know there have to be adjustments. Their purpose in GIStemp is Urban Heat Island correction and increased homogeneity.  This is how the GIStemp documentation explains it:

“The goal of the homogenization effort is to avoid any impact (warming or cooling) of the changing environment that some stations experienced by changing the long term trend of any non-rural station to match the long term trend of their rural neighbors,..”

Correction for any warming due to the growth of an urban area warms the older part of the record, rather than adjusting current temperatures.  This reduces the slope of the graph and decreases the warming trend.

“If no such neighbors exist or the overlap of the rural combination and the non-rural record is less than 20 years, the station is completely dropped; if the rural records are shorter, part of the non-rural record is dropped.”

So, adjustment can cause truncation of the station data if the adjusting rural record is shorter, and this can affect trend (which I have been looking at). However, there is very little truncation in the Canadian record – lots of rural stations (<10,000 population) are present to adjust the urban records.  What surprised me is that a lot of rural stations are adjusted as well.  What’s happening here – increasing the homogeneity?  OK, a few examples (from TEKtemp).  The table below lists the most extreme adjustments – either increasing or decreasing the trend of the station data (thumbnails of the graphs are below):

In just those eight stations there are five rural stations that get major adjustment, presumably because other rural stations in the local area tell a different story. Well, if you have two relatively close rural stations, one cooling and one warming, unless you examine them in great detail, how can you say which one is representative of the area? Hmm, I might come back to that.


Here is the overall shape of the adjustments (graph right). Note I have highlighted the adjustments that increase the warming trend in the data.  There are a lot of very small adjustments – in fact 279 stations have either no adjustment or one that makes only a very minor difference to the slope (between -0.01 and +0.01 deg C / decade) – but there are 78 that have an adjustment of at least 0.05 deg C / decade (0.5 deg C / century). That is more than 10% of all the stations in the Canadian record, and remember we are looking at a GLOBAL increase of only slightly more than this over the last century. Now I know I’m not a climate scientist, and perhaps I’m just being stupid, but I really can’t see how or why you can validly make adjustments to data that cause an increase in the warming trend of, say, 2 deg C per century [comments please…].
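Expressing an adjustment as a change in trend slope can be sketched like this: fit a least-squares slope to the raw and to the adjusted annual series for a station, difference them, and scale to degrees per decade. The two series here are made up; the 0.05 deg C / decade cut mirrors the threshold discussed above.

```python
# Sketch: size of an adjustment expressed as the change in fitted trend
# between raw and adjusted series for one station, in deg C per decade.
# Both series are fabricated for illustration.
def ols_slope(years, temps):
    n = len(years)
    mx, my = sum(years) / n, sum(temps) / n
    num = sum((x - mx) * (y - my) for x, y in zip(years, temps))
    den = sum((x - mx) ** 2 for x in years)
    return num / den                            # deg C per year

years    = list(range(1950, 2000))
raw      = [0.01 * (y - 1950) for y in years]   # +0.10 deg C / decade
adjusted = [0.02 * (y - 1950) for y in years]   # +0.20 deg C / decade

delta = (ols_slope(years, adjusted) - ols_slope(years, raw)) * 10
print(f"adjustment changed the trend by {delta:+.2f} deg C / decade")
exceeds = abs(delta) >= 0.05    # would count among the larger adjustments
```
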

The effects? Well, while Toronto gets an appropriate adjustment for urban growth, others, such as Prince Albert, Saskatchewan, do not. Other stations in Canada seem to suffer this ‘wrong way’ adjustment too, but some of these have already been corrected with the reworking of the data following GISS’ updated nightlights adjustment (more here).

In summary, it looks as if the shape of the climate data for Canada differs somewhat from that of many of the other areas I have looked at (but not yet written about).  The ‘oddness’ includes: an overabundance of short-lived stations reporting into the dataset; lots of gaps and missing years (which I have not covered yet); quite a lot of wrong-way adjustment; and a shifting base of stations with drop-out that reduces the current numbers to less than 5% of all stations.  I have started to look at the data that is available from Environment Canada and this will be the focus of Part 2 (when I can get to it).

This entry was posted in Mapping, Station Data, Trends. Bookmark the permalink.

9 Responses to Canada – Top of the Hockey League (Part 1)

  1. bahamamamma says:

I have been corresponding with Environment Canada and NOAA with the idea of finding out how stations are selected for the GHCN v2 database.

    Thus far I have been advised that there are 37 weather stations in the Canadian Arctic that meet GCN/WMO standards but only four of these (Eureka, Whitehorse, Dawson and Coral Harbour) show up in the GHCN v2 raw data.

    To date there has been no hint of “stone walling” so it is entirely possible that some sort of explanation may be forthcoming.

  2. VJones says:

    @Bahamamamma,
Thanks for the comment. Yes, I’ve found the Arctic stations on the Environment Canada site. There is a good record for Alert as well as for Eureka. Alert shows cooling temperatures, but its absence from GHCN data means its effect on Arctic data is lost:
    http://boballab.wordpress.com/2010/03/09/giss-infilling-the-true-hypothetical-cow/

It would be great to have some explanation of this. In my own field, when data is not used there is usually a good reason for it. I have read various explanations on blogs for the lack of coverage after 1990, but nothing definitive. It is great that you are enquiring of the sources directly. Please do come back and share any explanation you receive.

    I’ve been really surprised in the data comparisons I’ve done so far – with the GHCN data (and GIStemp adjusted data) both differing from that provided by Environment Canada. Again, although it is tempting to point at it and say “look at this!”, I am only too aware of my own ignorance of some of the necessary adjustments of data. I am working up a post on it and hope that others can fill in with such knowledge.

  3. Sean Peake says:

bahamamamma, FYI, Whitehorse, Dawson and Coral Harbour are not technically in the Arctic (at or above 66 degrees 30 minutes N lat.).

  4. bahamamamma says:

    My correspondent at NCDC failed to address my specific questions and encouraged me to wait for .v3 scheduled for release in June. Now I am feeling stone-walled!

    Sean Peake,
    Yes, those three stations are between 60N and 65N leaving only the Eureka and Alert stations in the GHCN v2 data set. I can’t remember when Alert “dropped out” (2002?) but in 2010 there is only one.

    It is truly wonderful what NASA can do with such a sparse data set. For example, just last month they assured us that the Arctic was warming dramatically. After all, one would expect to find global warming (or cooling) trends to be magnified at high latitudes:

    http://data.giss.nasa.gov/cgi-bin/gistemp/do_nmap.py?year_last=2010&month_last=3&sat=4&sst=0&type=anoms&mean_gen=03&year1=2010&year2=2010&base1=1951&base2=1980&radius=1200&pol=pol

  5. VJones says:

    Phew – yes – an anomaly of more than +4 degC is a lot. Might have to look into that!

June for V3 – well, it is good to have a timescale for it. Funny thing though – I’ve been seeing a lot of changes in the GISS station data in the last couple of months. The reasons for some of it are clear; for others it is subtle. I documented some of it here: http://diggingintheclay.blogspot.com/2010/04/nightlights-and-shifting-sands.html

The latest thing from Canada, noticed while comparing GISS stations with those from Env. Canada, is that some of the stations have a break for a year and then come back with a new series digit (e.g. Resolute here: http://data.giss.nasa.gov/cgi-bin/gistemp/findstation.py?datatype=gistemp&data_set=1&name=resolute). Now a new series – the last digit(s) of the ID – is usual with a change of location or equipment, but normally this will be dealt with as the individual series are combined in GIStemp Step 1. This is showing up in the “after combining sources at same location” data option. The “new” station does not then make it through GIStemp Step 2, as it has less than 20 years of data. I plan to keep an eye on this.

  6. bahamamamma says:

A temperature of -9 degrees Celsius was recorded on March 29th at Eureka. Three days later (April 1st) NASA’s chart might have looked quite different! Compared to the long-term average for that date (-37), that is a huge anomaly. The average for the entire month of March at Eureka was -32.2 Centigrade. With so few stations the effect of a single reading could be startling!

    According to the link you sent me, “Resolute” is back but with a different Station ID. I have not been at this long enough to pick up on things like that! That makes 2 stations in the Canadian Arctic or did I miss some more?

My correspondent at NCDC mentioned the changing station numbers, but this appears to be a fairly recent problem (2004/2005?), so it did not explain the sudden drop in stations around 1990.

    I tried your method using Open Office instead of Excel but I had to break up the GHCN v2 files into pieces that my spreadsheet could digest! Not a problem as I plan to concentrate on Canada and Russia owing to their huge land areas at high latitudes.
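If it helps, pre-filtering v2.mean by country code before loading into a spreadsheet can be done with a few lines of Python rather than splitting the file by hand. This assumes the first three characters of each line are the GHCN country code (403 = Canada); the sample station IDs below are made up, and the codes for other countries are listed in the v2.country.codes file that ships with the data.

```python
# Sketch: keep only one country's lines from a GHCN v2.mean-style file
# before handing it to a spreadsheet. Country code 403 = Canada; the
# sample station IDs are fabricated for illustration.
def filter_country(lines, code="403"):
    return [ln for ln in lines if ln.startswith(code)]

sample = [
    "4037182600001881 -210 -180  -90   10  110  160  190  170  110   40  -60 -170",
    "9999912345001881 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100",
]
canada = filter_country(sample)
print(len(canada))   # 1
```

For a real file, reading with open("v2.mean") and writing the filtered lines back out keeps each per-country file comfortably within spreadsheet row limits.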

    I had a scheduled teaching trip to North Carolina that would have allowed me to visit NCDC in Asheville next week. Sadly it was cancelled so now I will have to wait for the third week of September for my next scheduled course in NC.

    I really appreciate your taking the time to reply to someone who is a real novice in the climate business.

  7. VJones says:

    @Bahamamamma,
    I really appreciate your taking the time to reply to someone who is a real novice in the climate business.
LOL – 9 months ago I was a real novice in the climate business (in fact 3 years ago I was an avid AGW proponent*). I’ve learned from a combination of interest, obsessiveness and collaboration. In fact most of what I have learned has been from other bloggers. E.M. Smith (Chiefio – see blogroll) calls it “climate barn-raising” (from his family’s Amish roots).

    I have been planning a stand-alone ‘methods and resources’ page with lots of links to sources and other pages of explanation on specific areas at other blogs.

    If you are familiar with this page: http://data.giss.nasa.gov/gistemp/station_data/
when you get a graph for a particular station you can click on ‘download monthly data as text’ below the graph. In the past Kevin (collaborator on this blog) and I have compared this source to the GHCN v2.mean data and they were identical. So you can regard it as the same as v2.mean but with individual stations combined (sorry if you know this already).
    Verity

    *Altered 01Oct2011 in line with new policy: https://diggingintheclay.wordpress.com/2011/10/01/cleaning-house/ VJ.

  8. bahamamamma says:

    As luck would have it there is a post on WUWT today:

    http://wattsupwiththat.com/2010/04/22/dial-m-for-mangled-wikipedia-and-environment-canada-caught-with-temperature-data-errors/#more-18812

    This shows how a reading on July 13, 2009 had an amazing effect. Could it be that the anomalous reading on March 29 was something similar?

  9. Pingback: weltklima - Seite 174 - Aktienboard

Comments are closed.