Critical debating points answered – Part 1

Environmental Modeling & Assessment

Wherein we rebut Points 1, 2 & 3

In What Mullan actually says on 7 November I answered the Hot Topic post Danger Dedekind heartbreak blah blah of 5 November, in which Mr Gareth Renowden, presumably advised by Dr Brett Mullan, principal climate scientist at NIWA, had levelled criticisms at the recently published reanalysis of the NZ temperature record. I set out to identify clear, falsifiable statements that Gareth Renowden (or Brett Mullan) was making. There were nine debating points, which you can find in What Mullan actually says. We promised every one would be rebutted.

Point 1

Let me pose a question. What does Dedekind think Rhoades and Salinger were doing in their 1993 paper? Indulging in a purely theoretical exercise? In fact, they developed their techniques by working on what became the Seven Station Series (7SS), and from 1992 onwards the 7SS was compiled using RS93 methods properly applied.

Debating point 1. From 1992 onwards the 7SS was recalculated using the Rhoades & Salinger (1993) measurement techniques.

Point 2

Just to be clear, when I said in the original post that the use of one or two year periods is not adequate, I was using the RS93 terminology of k=1 and k=2; that is, k=2 means 2 years before and after a site change (so 4 years in total, but a 2-year difference series which is tested for significance against another 2-year difference series).

Page 3 in the 1992 NZ Met Service Salinger et al report. The final paragraph clearly states k=2 and k=4 were used.

Top of page 1508 in Peterson et al 1998: “Homogeneity adjustments of in situ atmospheric climate data: a review”, International J. Climatology, 18: 1493-1517. Clearly states k=1, 2 and 4 were considered.

Debating point 2. From 1992 onwards the 7SS was recalculated using k=2 and k=4 for the comparison periods.

Point 3

During the discovery process before the High Court proceedings, Barry Brill and Vincent Gray examined a set of storage boxes at NIWA — dubbed the “Salinger recall storage boxes” — that contained (amongst other things) all of Jim Salinger’s original calculations for the 1992 reworking of the 7SS.

Debating point 3. All of Jim Salinger’s original calculations for the 1992 version were made available during discovery in the High Court proceedings.

We deal with all three points together.


In 1991-2, a Salinger-led team at the NZ MetService received a grant to research the New Zealand historical temperature record. Using Salinger’s 1981 thesis, it produced a booklet containing detailed station records for most of the long-standing weather stations and decided that only 24 were suitable for climate studies. As three of these were offshore, the selected stations became known as the 21+3 group.

An internal MetService technical report (Salinger92) says the data from the 21+3 group was homogenised and various averages and compilations were derived. No details of the adjusted data survived and no 21+3 time series was ever recorded.

When NIWA later came into existence, the MetService work seemed to disappear. No mention of it is made in NIWA’s records until 2011, when the NZ Climate Science Education Trust (NZCSET) Court proceedings were under way.

Changing positions

In late 2009, NIWA said in response to two Official Information Act (OIA) requests that it had not a single relevant paper to produce because the 7SS was based on Salinger’s thesis—which was copyright.

The answers from NIWA’s Minister to written Parliamentary questions were also clear that the 7SS adjustments were taken from the thesis, which had never been published but could be read in the reserved section of the Victoria University library. It was then disclosed that the actual calculations were not in the thesis but had been lost in a University computer upgrade during 1983.

After Salinger had been fired in April 2009, NIWA had no idea how the 7SS adjustments had been calculated. As a first step in filling the void, Brett Mullan embarked upon a six-week project of comparing Salinger’s 7SS spreadsheet with the thesis, identifying the dates, locations and causes of each of the 35 adjustments. By February 2010, NIWA was able to publish a “Schedule of Adjustments”, which the Coalition had been asking for. This was accompanied by a “Hokitika Example” that described an adjustment methodology and included actual calculations.

NIWA’s Minister (the Hon. Dr Mapp) tabled these two documents in Parliament on 18 February 2010 and promised a larger project that would extend the Hokitika rework to the adjustments at all seven stations. He said it would be peer-reviewed by BOM in Australia and the detailed methodology would be published in an international science journal by June 2012.

However, when NIWA came to file its defence to the Court proceedings in September 2010, it shifted its position markedly. It now said the 7SS had not been derived directly from the 1981 thesis, but from the 21+3 time series of 1992. This was strange because nobody had ever seen such a series and there was no evidence it had ever existed.


NIWA’s discovery process was a classic example of the “blizzard of documents” trick. Barry Brill, NZ Climate Science Coalition chairman and solicitor for the NZCSET, was led into a mid-size room at NIWA’s Wellington premises where climate data spreadsheets were stacked 2-3 cartons deep around all the walls, to a height of about 5 feet. However, he found some cartons marked “Salinger” which he was told had been sitting in Jim Salinger’s office for untold years.

He finally came across one that seemed to relate to the MetService work. It was a total mess. There was no way in the world anybody could have found data relating to the seven stations, let alone identified the adjustments or the calculation techniques. The only references to adjustments were updates done by Maunder and Finkelstein. The lack of adjustment calculations is unsurprising: Salinger92 says they “set the computers rolling” rather than do them manually.


The claim that NIWA’s pre-2010 7SS adjustments were taken from the non-existent 21+3 series AND used RS93 adjustment techniques is ridiculous. The evidence is overwhelming:

  1. The 7SS adjustments were spread over 1853-1975. The 21+3 work was confined to 1920-90.
  2. The Rhoades & Salinger paper was published in November 1993. Salinger92, the internal MetService technical report, went to print before June 1992.
  3. A couple of years later, Folland & Salinger (1995) (submitted 27/10/1994) uses the thesis as its source of homogenised data, not 21+3.
  4. The 21+3 adjustments were allegedly computerised while RS93 requires manual work.
  5. Salinger92 says the thesis was used in 21+3.
  6. The Hokitika Example (taken from the contemporary 2009 7SS) clearly uses thesis techniques and makes no mention of RS93—this is the clincher.


NIWA now claims it has the schedule of adjustments for the original 7SS and the calculations that produced them, yet it has never published them. Worse, it implied to us that it did not have them. It obtained public funds to develop M10 in order to “reconstruct” the adjustments, but didn’t use RS93. It is impossible to know why, when first we asked for the schedule of adjustments and the calculations behind it, NIWA did not simply produce what it already had.

However, now that we know these calculations exist, we formally request that they be published.

As for using k=4, RS93 uses k=1 and k=2, and their example in section 2.4 uses k=2. RS93 does not use k=4 for comparison tests. Renowden has acknowledged this, after his initial confusion between k=2 and a four-year total period.

NIWA remains free at any point to perform its own modified RS93 analysis using k=2 and k=4 and to publish the results, but it has not done so—not even when it was given funds to produce M10. It is hard to avoid the conclusion that NIWA is aware that doing so would challenge its previous finding of 0.9°C/century warming.

Dr Mullan seems to want to bypass the significance tests, perhaps to ensure that as many adjustments as possible are made, because of the reasons laid out in Why Automatic Temperature Adjustments Don’t Work.

Increasing k values has two effects—it reduces the likelihood of a false shift being rejected and increases the likelihood that sheltering or UHI effects at other stations introduce a spurious warming trend after adjustment. RS93, however, recognising the problem, limits the k values to the lowest that achieve statistical significance for a genuine shift, namely k=1 and k=2. RS93 is quite clear on this and the peer reviewers of our paper understood it.
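To make the role of k concrete, here is a minimal sketch of a symmetric before-and-after comparison on a target-minus-neighbour difference series. It is an illustration only, not the RS93 procedure itself: the function name `shift_test`, the use of monthly data, and the plain pooled-variance t statistic are all assumptions made for the example.

```python
from math import sqrt
from statistics import mean, variance


def shift_test(target, neighbour, change_index, k):
    """Estimate a step shift at a site change and a t statistic for it.

    target, neighbour: monthly mean temperatures covering the same months.
    change_index: index of the first month after the site change.
    k: years compared on each side of the change (k=1 -> 12 months a side).
    """
    months = 12 * k
    start, end = change_index - months, change_index + months
    if start < 0 or end > len(target) or len(target) != len(neighbour):
        raise ValueError("series too short for k years either side")

    # The difference series cancels climate signal common to both stations.
    diff = [t - n for t, n in zip(target, neighbour)]
    before, after = diff[start:change_index], diff[change_index:end]

    shift = mean(after) - mean(before)
    # Equal-sized samples, so the pooled variance is the simple average.
    pooled = (variance(before) + variance(after)) / 2
    if pooled == 0:
        raise ValueError("difference series has no variability")
    t_stat = shift / sqrt(2 * pooled / months)
    return shift, t_stat
```

In this sketch a larger k puts more months into each sample, which shrinks the denominator and makes a given shift more likely to pass a significance threshold. It also widens the window over which any gradual drift at the neighbour station can leak into the estimated shift.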

Why not use RS93?

NIWA accepts that the NZ Climate Science Coalition Statistical Audit of the NIWA 7-Station Review (pdf, 2.8 MB), reviewing Mullan10, correctly applied RS93. Its criticism is that the application was too strict and ought to have been more flexible—particularly in the length of comparison periods.

So NIWA knows that a strict application of RS93 leads to positive and negative adjustments, broadly speaking, balancing out, which means no increase in the warming trend. That is why NIWA won’t now publish a 7SS that uses RS93 techniques (although it claims to have secretly done the calculations). It also proves that a pre-2010 7SS could not possibly have been calculated with RS93 methods.

8 Thoughts on “Critical debating points answered – Part 1”

  1. Richard C (NZ) on November 15, 2014 at 10:07 am said:

    >”Why Automatic Temperature Adjustments Don’t Work.”

    A message for BOM, BEST, and GISS.

    And NIWA.

    Mullan, A.B; Stuart, S.J; Hadfield, M.G; Smith, M.J (2010) in particular.

  2. Yes, definitely. What I like about this little lesson in statistics is that it came from none other than James Hansen himself. So the warmists ought to pay attention!

  3. Richard C (NZ) on November 15, 2014 at 1:57 pm said:

    >”NIWA accepts that the NZ Climate Science Coalition Statistical Audit of the NIWA 7-Station Review (pdf, 2.8 MB), reviewing Mullan10, correctly applied RS93.”

    [Renowden] – “….the 7SS was compiled using RS93 methods properly applied”

    Hmm… I think we need an adjudicator. Like a judge, and a court hearing.

    Or perhaps not.

  4. Richard C (NZ) on November 16, 2014 at 12:44 pm said:

    [Renowden/Mullan?] – “….from 1992 onwards the 7SS was compiled using RS93 methods properly applied.”


    Correlation and weighting comparisons for Masterton 1920, ‘Statistical Audit’ SI vs M10:

    SI, M10, correlation comparison
    0.73, N/A
    Albert Park
    0.58, N/A
    Christchurch Gardens
    0.68, N/A
    0.88, N/A

    SI, M10, weighting comparison
    0.24, 0.00
    Albert Park
    0.09, 0.00
    Christchurch Gardens
    0.18, 0.00
    0.49, 0.00

  5. Richard C (NZ) on November 16, 2014 at 2:59 pm said:

    >”As for using k=4, RS93 uses k=1 and k=2, and their example in section 2.4 uses k=2. RS93 does not use k=4 for comparison tests.”

    From what I can gather, USHCN uses the RS93 equivalent of k=0.5.

    ‘Understanding adjustments to temperature data’

    by Zeke Hausfather

    Pairwise Homogenization Algorithm (PHA) Adjustments

    The Pairwise Homogenization Algorithm [hotlink #1 – see below] was designed as an automated method of detecting and correcting localized temperature biases due to station moves, instrument changes, microsite changes, and meso-scale changes like urban heat islands.

    The algorithm (whose code can be downloaded here [hotlink]) is conceptually simple: it assumes that climate change forced by external factors tends to happen regionally rather than locally. If one station is warming rapidly over a period of a decade a few kilometers from a number of stations that are cooling over the same period, the warming station is likely responding to localized effects (instrument changes, station moves, microsite changes, etc.) rather than a real climate signal.

    To detect localized biases, the PHA iteratively goes through all the stations in the network and compares each of them to their surrounding neighbors. It calculates difference series between each station and their neighbors (separately for min and max) and looks for breakpoints that show up in the record of one station but none of the surrounding stations. These breakpoints can take the form of both abrupt step-changes and gradual trend-inhomogenities that move a station’s record further away from its neighbors.

    Hotlink #1:

    ‘Homogenization of Temperature Series via Pairwise Comparisons’


    NOAA/National Climatic Data Center, Asheville, North Carolina
    (Manuscript received 2 October 2007, in final form 2 September 2008)

    Page 4 pdf,

    a. Selection of neighbors and formulation of difference series

    Next, time series of differences Dt are formed between all target–neighbor monthly temperature series.
    To illustrate this, take two monthly series Xt and Yt, that is, a target and one of its correlated neighbors.

    Following Lund et al. (2007), these two series can be represented as

    $$X_{mT+\nu} = \mu_{\nu}^{X} + \beta^{X}(mT+\nu) + S_{mT+\nu}^{X} + \varepsilon_{mT+\nu}^{X} \qquad (1)$$

    $$Y_{mT+\nu} = \mu_{\nu}^{Y} + \beta^{Y}(mT+\nu) + S_{mT+\nu}^{Y} + \varepsilon_{mT+\nu}^{Y} \qquad (2)$$

    $\mu$ represents the monthly mean anomaly at the specific series,
    $T = 12$ represents the months in the annual cycle,
    $\nu = (1, \ldots, 12)$ is the monthly index,
    $m$ = the year (or annual cycle) number,

    and the $\varepsilon_t$ terms denote mean zero error terms at time t for the two series.
    The $S_t$ terms represent shift factors caused by station changes, which are thought to be step functions.

    Is this not a rolling (iterative) 12 month comparison, X to Y?
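The difference-series step quoted above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the PHA code: `simulate_series` and `difference_series` are hypothetical names, the error terms are omitted, and months are indexed from zero rather than as mT+v.

```python
def simulate_series(n_months, monthly_means, trend, shift_at=None, shift_size=0.0):
    """Build a noise-free monthly series: seasonal mean + linear trend + step."""
    if len(monthly_means) != 12:
        raise ValueError("monthly_means must hold 12 values")
    series = []
    for t in range(n_months):
        # Step term: zero before the station change, shift_size afterwards.
        step = shift_size if shift_at is not None and t >= shift_at else 0.0
        series.append(monthly_means[t % 12] + trend * t + step)
    return series


def difference_series(target, neighbour):
    """Return target minus neighbour, month by month."""
    if len(target) != len(neighbour):
        raise ValueError("series must cover the same months")
    return [x - y for x, y in zip(target, neighbour)]
```

If target and neighbour share the same seasonal means and trend, those terms cancel in the difference and only the step remains, which is why a break at one station stands out against its neighbours. Where the seasonal means or trends differ, the difference series keeps a residual annual cycle or slope alongside the step.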

  6. Richard C (NZ) on November 16, 2014 at 4:41 pm said:

    >”Following Lund et al. (2007)”

    I assume that is a typo and that the paper is actually:

    Lund, R., and J. Reeves, 2002: Detection of undocumented changepoints: A revision of the two-phase regression model. J. Climate, 15, 2547–2554.

    This seems to be the 95th Percentile Matching (PM-95) method that BOM uses for ACORN-SAT except it’s not an X to Y neighbour comparison as in BOM’s method.

    2. An Fmax test statistic

    We start with the simple two-phase linear regression
    scheme for a climatic series {Xt} considered by Solow
    (1987), Easterling and Peterson (1995), and Vincent
    (1998; among others). This model can be written in the form

    Xt = [model] (2.1)

    where {et} is mean zero independent random error with
    a constant variance.

    The model in (2.1) is viewed as a classic simple linear
    regression that allows for two phases. This allows for
    both step- (μ1 ≠ μ2) and trend- (α1 ≠ α2) type changepoints.
    Specifically, the time c is called a changepoint
    in (2.1) if μ1 ≠ μ2 and/or α1 ≠ α2. In most cases, there
    will be a discontinuity in the mean series values at the
    changepoint time c, but this need not always be so (Fig.
    10 in section 5 gives a quadratic-based example where
    the changepoint represents more of a slowing of rate of
    increase than a discontinuity).
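A rough sketch of how a two-phase regression scan might be computed follows. The quoted passage does not give the test statistic, so the nested-model F ratio used here (one line with 2 parameters against two lines with 4) is an assumption, as are the names `f_max` and `min_segment`. No critical values are included; they would have to come from the paper's Table 1.

```python
def _sse_line(ts, xs):
    """Sum of squared residuals from an ordinary least-squares line."""
    n = len(ts)
    t_bar, x_bar = sum(ts) / n, sum(xs) / n
    stt = sum((t - t_bar) ** 2 for t in ts)
    slope = sum((t - t_bar) * (x - x_bar) for t, x in zip(ts, xs)) / stt
    intercept = x_bar - slope * t_bar
    return sum((x - intercept - slope * t) ** 2 for t, x in zip(ts, xs))


def f_max(xs, min_segment=3):
    """Scan candidate changepoints c; return (largest F, its c).

    c is the last time index (1-based) of the first phase.
    """
    n = len(xs)
    if n < 2 * min_segment or n <= 4 or min_segment < 2:
        raise ValueError("series too short for the requested segments")
    ts = list(range(1, n + 1))
    sse_null = _sse_line(ts, xs)  # single line: no changepoint

    best_f, best_c = float("-inf"), None
    for c in range(min_segment, n - min_segment + 1):
        # Separate lines before and after c: the two-phase model.
        sse_alt = _sse_line(ts[:c], xs[:c]) + _sse_line(ts[c:], xs[c:])
        if sse_alt == 0:
            f_c = float("inf")
        else:
            f_c = ((sse_null - sse_alt) / 2) / (sse_alt / (n - 4))
        if f_c > best_f:
            best_f, best_c = f_c, c
    return best_f, best_c
```

Because every admissible c is tried, the statistic is a maximum over many correlated F values and cannot be compared with ordinary F tables. The `min_segment` guard only keeps each phase long enough to fit a line; it says nothing about how long a segment should be for temperature series.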

    # # #

    Fmax test statistic series lengths (n) range from 10 (e.g. 10 months, RS93 k=0.4) to 5000 in Table 1, but I can’t see any recommendation for length n in respect to temperature series. There’s nothing said about n in equation 2.2 for example. What happens when a break (c) occurs at time 5 months of n = 100 months for example? Isn’t this just effectively n = 10?

    First look at this so I’ve probably misunderstood completely. I suppose I’ll have to school up on F-test:

    Worked example is n = 6 (RS93, k=0.25)
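The k equivalents quoted in this comment appear to follow from treating a series of n months as the whole comparison window, k years before the break plus k years after it. That reading is an assumption, and `rs93_equivalent_k` is a hypothetical helper written only to show the arithmetic.

```python
def rs93_equivalent_k(n_months, months_per_year=12):
    """Return k such that k years before plus k years after span n_months."""
    if n_months <= 0:
        raise ValueError("n_months must be positive")
    return n_months / (2 * months_per_year)


# The three series lengths mentioned in the comments.
for n in (6, 10, 12):
    print(n, "months ->", round(rs93_equivalent_k(n), 2))
```

On the same reading, a window of 24 months either side of a break (48 months in total) corresponds to k=2.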

  7. Richard C (NZ) on November 17, 2014 at 6:24 pm said:

    >”Is this not a rolling (iterative) 12 month comparison, X to Y?” [Menne & Williams (2009)]

    Possibly for F-test, but the actual adjustments are by +/- 24 months from the break i.e. the break must be known to do this. Although I cannot see this process from the statistics sequence above.

    de Freitas et al (2014) page 5 pdf:

    “We note that [14] (p. 1206) also used ±24-month
    comparison periods by default for their algorithm based on
    pairwise comparisons”

    Actually p. 1706, and [14] is Menne & Williams (2009):

    “Adjustments are determined by calculating multiple
    estimates of D using segments from neighboring series
    that are homogeneous for at least 24 months before and
    after the target changepoint”

    Didn’t see that until directed by deF et al but how, without this note, is anyone to know? Lund & Reeves (2002) is even less helpful.
