The trouble with anomalies… Part 1

The use of anomalies is the normal defense against any critical assessment of the variability of numbers of stations across the climate record – the so-called ‘station drop-out’. The rationale is that it should not matter if stations are located in warm or cool areas, as it is the anomaly, the departure from a defined ‘normal’ period that matters. From NCDC:

Using reference values computed on smaller [more local] scales over the same time period establishes a baseline from which anomalies are calculated. This effectively normalizes the data so they can be compared and combined to more accurately represent temperature patterns with respect to what is normal for different places within a region.

Ask yourself this: are all anomalies created equal? The answer is of course that they are not. The presence or absence of water vapour, and its phase changes, have an effect such that we cannot expect a desert to respond to warming in the same way as a tropical rainforest, or a high-altitude mountain station to warm like a coastal town.

Comparing the effect of climate variation on Reno, SFO and Sacramento

Take three stations at approximately the same latitude. San Francisco Airport (SFO) is affected by the sea (and fogs), which gives mild winters and less extreme heat in summer. Sacramento, ~100km inland, is cooler on average in January (Figure 1a) and experiences lower record highs and lower record lows. Reno, ~100km further inland and at high altitude, has cooler winter averages again and extreme record lows, but record highs similar to those of Sacramento. All are likely to have been affected to some degree by city growth and urban heat island (UHI) development, but let’s forget that for the purpose of this example. When the temperature ranges are plotted as anomalies (Figure 1b), Reno has the greatest range of temperatures, or volatility.

Figure 1. Comparison of monthly mean temperatures in January for three stations
(SFO, Sacramento, Reno) illustrating the increased range or volatility typically
found in stations with lower mean temperatures. [a] Actual temperatures (°F);
[b] Temperature anomalies (°C).

E.M.Smith’s Volatility Surmise is simply that a highly volatile station can have a greater effect on local anomalies, and contribute more to the change in the global average, than a low volatility station. The most important part is WHEN it is in the record. During warm periods Reno (and other volatile stations) will warm more. This is also exemplified in the Arctic. In cold periods such stations will experience more extreme lows. Less volatile stations can never get as low as the highly volatile ones, so station dropout matters, particularly when it is the highly volatile stations that drop out.

Figure 2 shows how this works for the three example stations. In the warmer period 1925-1944, the presence of Reno makes the averages slightly warmer; in the cooler period after 1950, Reno makes the averages cooler. (It would be prudent to check that there are no uncorrected station moves in the warm period. However, this is GISS ‘unadjusted’ data, which uses USHCNv2 with TOBS and SHAP adjustments already applied, so such moves should have been corrected, whether we agree with the correction or not.) Note that it could also be interpreted that population growth and the increase in UHI may be less in Reno than in SFO and Sacramento, which means it will respond to a period of cooling with lower anomalies than the cities that are warming.

Figure 2. Average anomalies for SFO and Sacramento, with and without Reno
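The mechanism behind Figure 2 can be illustrated with a toy calculation. This is only a sketch, not the station data: the regional signal of +0.5 and -0.5 and the `SENSITIVITY` factors are made-up numbers chosen so that one hypothetical station responds more strongly than the others.

```python
from statistics import mean

# Hypothetical sensitivities: how strongly each station's anomaly responds
# to the same regional signal. Illustrative values only, not fitted to data.
SENSITIVITY = {"SFO": 0.5, "Sacramento": 1.0, "Reno": 2.0}

def average_anomaly(signal, stations):
    """Mean anomaly across the named stations for a given regional signal."""
    return mean(SENSITIVITY[name] * signal for name in stations)

def effect_of_station(signal, stations, extra):
    """Change in the average anomaly when `extra` is added to `stations`."""
    without = average_anomaly(signal, stations)
    with_extra = average_anomaly(signal, stations + [extra])
    return with_extra - without

base = ["SFO", "Sacramento"]
warm_shift = effect_of_station(+0.5, base, "Reno")  # assumed warm period
cool_shift = effect_of_station(-0.5, base, "Reno")  # assumed cool period
print(f"warm period shift: {warm_shift:+.3f}")
print(f"cool period shift: {cool_shift:+.3f}")
```

Under these assumptions, adding the volatile station pushes the average up in the warm period and down by the same amount in the cool period, so whether it is present in the record at a given time changes the apparent trend.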

The issue then is how widespread is this effect and, most importantly, is the effect likely to be significant? The question is how to quantify this quickly and easily.  The use of variance, with some caveats and still a few concerns, seems to be valid (Figure 3):

Figure 3. Frequency distribution of annual mean temperatures for Reno and SFO

Going further, Figure 4 shows a frequency plot of monthly anomaly data at Reno. This resolves into two separate peaks when frequencies for warm and cool periods are plotted separately. Analysing the warm 1925-1944 and cool 1945-1964 periods individually shows separation of the periods; however, this analysis has not been taken any further and the difference may not be statistically significant.

Figure 4. Comparison of frequency analysis for temperature anomalies during the warm (1925-1944) and cool (1945-1964) periods for Reno.
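A frequency analysis of this kind amounts to binning the anomalies and counting how many fall in each bin. The following is a minimal sketch under stated assumptions: the 0.5 degree bin width and the `warm` and `cool` lists are invented for illustration and are not the Reno values.

```python
import math
from collections import Counter

def frequency_table(anomalies, bin_width=0.5):
    """Count anomalies per bin; each key is the lower edge of its bin."""
    counts = Counter()
    for a in anomalies:
        lower_edge = math.floor(a / bin_width) * bin_width
        counts[round(lower_edge, 6)] += 1
    return dict(sorted(counts.items()))

# Made-up anomaly values for two periods, for illustration only.
warm = [0.2, 0.7, 1.1, 0.4, -0.3, 0.9]
cool = [-0.6, -1.2, 0.1, -0.4, -0.9, 0.3]

for label, values in (("warm", warm), ("cool", cool)):
    print(label, frequency_table(values))
```

Plotting the two tables on the same axis gives the kind of separated peaks described above, if the periods really do differ.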

Figures 5 and 6 show comparisons of frequency analysis of anomalies for all three stations during the warm and cool periods described previously.  These suggest that SFO has a more stable climate that is likely more affected by sea surface temperature and is less susceptible to warming or cooling than the inland stations.

Figure 5. Comparison of frequency analysis for temperature anomalies during the warm (1925-1944) period for SFO, Sacramento and Reno.

Figure 6. Comparison of frequency analysis for temperature anomalies during the cool (1945-1964) period for SFO, Sacramento and Reno. Note the ‘cold shoulder’ for Reno.

Using Variance as an indicator of volatility

Since variance and standard deviation are measures of the spread of a distribution, is the use of variance a potentially valuable tool in visualising how stations respond differently to climate? Note that it is not intended to do any analysis of variance, merely to use the values as a means of approximating volatility.

Variance is initially investigated in preference to standard deviation due to the greater range of values. Variance, $\sigma^2 = \frac{1}{N}\sum (X-\mu)^2$, where $\mu$ is the mean and $N$ is the number of values, is sensitive to the size of $N$ and this may limit its use. As well as a large spread of data, a small value of $N$ can distort the variance, because each extreme value then carries more weight.
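As a quick check on the formula above, the sketch below computes the population variance by hand and compares it with Python's standard library. The `stable` and `volatile` series are made-up numbers, not station data; they only show why variance spans a wider range of values than standard deviation.

```python
import math
from statistics import pstdev, pvariance

def population_variance(values):
    """sigma^2 = sum((x - mu)^2) / N, dividing by N rather than N - 1."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

# Made-up monthly temperatures for a 'stable' and a 'volatile' station.
stable = [12.0, 13.0, 14.0, 13.0, 12.0, 14.0]
volatile = [2.0, 10.0, 22.0, 12.0, 0.0, 20.0]

for name, series in (("stable", stable), ("volatile", volatile)):
    var = population_variance(series)
    # Cross-check the hand-written formula against the standard library.
    assert math.isclose(var, pvariance(series))
    print(f"{name}: variance={var:.2f}, std dev={pstdev(series):.2f}")
```

Because variance is the square of the standard deviation, the gap between the two series is much larger in variance terms, which is the property being exploited here.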

The first look was at the annual data only. Calculating a variance from the annual means gave a low variance for each station, in the range 0.45-0.65. Sacramento (0.45), despite a warmer overall average temperature, had a lower variance than Reno (0.49), while SFO (0.65) had the highest variance of the three stations.

The use of the full monthly data, however, produced acceptable differences between the stations, which were representative of their internal variation in temperature (Table 1: All years).  Reducing the period of examination to 20 years to capture the peak of the 1930s warm period did not increase the variance, and this holds true for all three stations for sequential 20 year periods up to 2004 (Table 1).  This suggests that, within a 20 year period at least, there is a sufficient number of data points to capture variability and ‘average’ data.  A 30 year period would be better, however this analysis intends to focus on shorter periods of more extreme conditions.

Table 1. Average temperatures and variance of temperature data for SFO, Sacramento and Reno.

              SFO                 Sacramento          Reno
Temperature:  Average  Variance   Average  Variance   Average  Variance
All years     13.60    7.32       15.93    33.48      11.26    55.80
1925-1944     13.63    4.24       16.06    33.11      11.55    56.00
1945-1964     13.47    8.57       15.79    35.08      10.93    53.37
1965-1984     14.21    8.74       17.09    31.95      12.00    47.33
1985-2004     14.54    8.96       16.41    34.94      11.75    59.84

The results of repeating this analysis with anomaly data are shown in Table 2. Monthly anomaly values were calculated as deviations from monthly average values, and all monthly anomaly values for each period in question were averaged. This shows, for each station, the relative warmth or coolness of each period; the variance is within a relatively narrow band for each station, and that of Reno is much greater than for either of the other stations.

Table 2. Average anomaly values and variance of anomaly values for SFO, Sacramento and Reno.

              SFO                 Sacramento          Reno
Anomaly:      Average  Variance   Average  Variance   Average  Variance
All years     0.08     1.83       0.04     1.89       0.00     3.26
1925-1944     0.10     1.45       0.16     1.62       0.29     3.28
1945-1964     0.00     1.49       -0.08    1.60       -0.28    2.36
1965-1984     0.36     1.58       0.45     1.80       -0.18    2.51
1985-2004     1.09     1.47       0.53     1.91       0.48     2.93
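The calculation behind a table like this can be sketched as follows. This is an illustration under assumptions rather than the exact procedure used: the baseline here is each calendar month's mean over all supplied years (the post does not state the baseline period), and `records` is a tiny made-up data set, not one of the three stations.

```python
from collections import defaultdict
from statistics import mean, pvariance

def monthly_anomalies(records):
    """records: list of (year, month, temperature) tuples.
    Returns (year, month, anomaly), where the anomaly is the departure
    from that calendar month's mean over all supplied years."""
    by_month = defaultdict(list)
    for _, month, temp in records:
        by_month[month].append(temp)
    normals = {m: mean(temps) for m, temps in by_month.items()}
    return [(y, m, t - normals[m]) for y, m, t in records]

def period_summary(anomalies, start, end):
    """Mean and population variance of anomalies for start <= year <= end."""
    values = [a for y, _, a in anomalies if start <= y <= end]
    return mean(values), pvariance(values)

# Tiny made-up record: January and July values for four years.
records = [
    (1930, 1, 6.0), (1930, 7, 24.0),
    (1931, 1, 7.0), (1931, 7, 25.0),
    (1950, 1, 4.0), (1950, 7, 23.0),
    (1951, 1, 3.0), (1951, 7, 24.0),
]
anoms = monthly_anomalies(records)
warm_mean, warm_var = period_summary(anoms, 1930, 1931)
cool_mean, cool_var = period_summary(anoms, 1950, 1951)
print(f"1930-1931: mean={warm_mean:+.2f}, variance={warm_var:.2f}")
print(f"1950-1951: mean={cool_mean:+.2f}, variance={cool_var:.2f}")
```

Subtracting the monthly normal removes the seasonal cycle, which is why the anomaly variances are so much smaller than the temperature variances in Table 1.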

Why could quantifying volatility be important?

If stations with more extreme values than Reno or more stable temperatures than SFO conform to the same pattern, we have a means of quantifying station variability. This is a possible means to look at the GHCN data set as a whole and allow examination of bias, intentional or otherwise, in the data set as a result of the dramatic reduction of stations in 1990. If more volatile stations do respond to climate cycles in the same way as Reno, and it is an expectation that they should, then is there a case for saying that their uneven distribution in the data set could lead to quantifiable bias? More in Part 2.


4 Responses to The trouble with anomalies… Part 1

  1. E.M.Smith says:

    Interesting to note in Table 1 that when Reno variance is lowest, SFO is high, while when SFO is lowest, Reno is high. That suggests both some ideas on cause and an amplification of the effective differences at extremes. I think we’ll find “water matters”…

  2. Graeme says:

    I have seen several posts from carrick on the Blackboard pointing out that coastal sites are not good measurement points for land temperatures. Given the mountains between Reno and San Fran, I think you are amplifying the point he has made. On the other hand, Sacramento seems weird because it is so far inland. I really would not expect it to be so different from an inland desert location such as Reno.

    • Verity Jones says:

      It is altitude that makes the difference between Sacramento and Reno and that is the point. The cities were chosen for their relative proximity and familiarity.
