*One way to approach the controversies in climate science is to get the raw data and analyze it in a way that one trusts, in a way that makes sense, if there are suspicions about the methods used. Even better would be to collect the raw data oneself, but that's not possible.*

The conclusion I reach with this exercise is that GISS's (Goddard Institute for Space Science at NASA) analytical methods for processing weather station data don't introduce a significant amount of bias at least where the data available is concerned. The other issue is the heat island effect. This analysis shows that it exists and was not accounted for by the GISS analysis; there's no evidence that it was subtracted from the grand mean of global temperatures. This becomes important when attempting to extrapolate available weather station data onto the rest of earth's surface. Not removing the heat island effect from data extrapolated out to the rest of the globe would tend to make the earth look too hot.

The conclusion I reach with this exercise is that GISS's (Goddard Institute for Space Science at NASA) analytical methods for processing weather station data don't introduce a significant amount of bias at least where the data available is concerned. The other issue is the heat island effect. This analysis shows that it exists and was not accounted for by the GISS analysis; there's no evidence that it was subtracted from the grand mean of global temperatures. This becomes important when attempting to extrapolate available weather station data onto the rest of earth's surface. Not removing the heat island effect from data extrapolated out to the rest of the globe would tend to make the earth look too hot.

As Steve McIntyre has shown, Jim Hansen's methods [at GISS] for processing station temperatures into global temperatures don't really make sense.

The burning question for me therefore has been: If GISSs methods don't make sense and perhaps introduce bias, then what do we get when proper methods are used? What, exactly, is the truth of the matter?

Steve pointed out that the job of processing this data and eliminating the effects of various factors to come up with an estimation of the temperature anomalies over time could be done using well understood mixed effects statistical modeling algorithms. I thought it rather astounding that a statistical method seemingly completely different from Hansen's methods would come up with much the same results. Steve used this in modeling the combination of location data into grid cells, but why not apply it to the whole data set?

And so I tried that.

giss.dset0, you may recall is temperature data taken from different weather stations. In some cases these stations are all together in one location, as in 4 or 5 stations in a given city area. Hansen first combined these stations together for a location, then he did a lot of complicated smoothing and adjustment of the data in a series of steps. dset0 is raw data, dset1 includes some adjustments and combination of data of stations at given locations, dset2 includes more adjustments within a given location, and gridding of the data does more of the same, smoothing data over geometrical areas, between locations. With all of the adjusting and smoothing going on the introduction of some sort of bias seems likely. For example, Hansen always used the longest records to start with and then increased or decreased the shorter records to match the longer. But aren't the longer records usually from urban stations? Would this not cement the heat island effect into the combined location data?

And so, working in the R Project statistical package and following McIntyre's lead, I first reorganized the data in giss.dset0 into a simple long data frame with location information folded in from giss.info.

extr = function(giss.dset0, giss.info) {

year = 0; temp = 0; alt=0; ur=0; loc=0; lxs=0; sta=0

Ni = (1:length(giss.dset0))[!is.na(giss.dset0)]

for(i in Ni) #if(sum(is.na(giss.dset0[

*])) > 0) next else*

for(j in 1:length(giss.dset0[

for(j in 1:length(giss.dset0[

*])) if(is.na(giss.dset0[**][[j]])) next else*

{

N = length(giss.dset0[{

N = length(giss.dset0[

*][[j]][[1]])*

sta = c(sta, rep(names(giss.dset0[sta = c(sta, rep(names(giss.dset0[

*])[j],N))*

year = c(year, giss.dset0[year = c(year, giss.dset0[

*][[j]][[1]])*

temp = c(temp, giss.dset0[temp = c(temp, giss.dset0[

*][[j]][[18]])*

ur = c(ur, rep(giss.info$urbanur = c(ur, rep(giss.info$urban

*,N))*

alt = c(alt, rep(250*(giss.info$alt.interpalt = c(alt, rep(250*(giss.info$alt.interp

*%/%250),N))*

tmp=substr(names(giss.dset0[tmp=substr(names(giss.dset0[

*])[j],1,8)*

loc = c(loc, rep(tmp, N))

lxs = c(lxs, rep(paste(tmp, "i", names(giss.dset0[loc = c(loc, rep(tmp, N))

lxs = c(lxs, rep(paste(tmp, "i", names(giss.dset0[

*])[j],sep=""),N))*

}

sta=sta[-1]; year=year[-1]; temp=temp[-1]; alt=alt[-1]; loc=loc[-1]

ur = ur[-1]; lxs=lxs[-1]

return(data.frame(loc=loc,sta=sta,year=year,ur=ur,alt=alt,lxs=lxs,temp=temp))

}

dset0.t = extr(giss.dset0, giss.info)

}

sta=sta[-1]; year=year[-1]; temp=temp[-1]; alt=alt[-1]; loc=loc[-1]

ur = ur[-1]; lxs=lxs[-1]

return(data.frame(loc=loc,sta=sta,year=year,ur=ur,alt=alt,lxs=lxs,temp=temp))

}

dset0.t = extr(giss.dset0, giss.info)

(No, I could not use monthly data without running out of memory.)

Then calculate random effects:

(load the lme4 package)

dset0.fm = lmer(temp~(1|loc)+(1|sta)+(1|ur)+(1|alt)+(1|year), data=dset0.t)

dset0.fm1 = lmer(temp~(1|sta)+(1|year), data=dset0.t)

(loc = location id, sta=station id with location, ur=urban vs suburban vs rural, alt= location altitude in 250 meter bands)

ranef(dset0.fm)$year and ranef(dset0.fm1)$year contain the random effects year over year. The temperature anomaly for each year, in other words. Most of the variance is dumped into other groups. Using more groupings in the model formula makes no difference in these estimates (dset0.fm vs dset0.fm1 R-squared = 0.999).

Here is your heat island effect:

> ranef(dset0.fm)$ur

(Intercept)

R -0.23716400 (rural)

S -0.04312743 (suburban)

U 0.28028908 (urban)

And here is the effect of altitude:

> ranef(dset0.fm)$alt

(Intercept)

0 10.0821297

250 9.5952519

500 8.2160810

750 7.4049196

1000 6.4774994

1250 5.4415439

1500 4.2931860

1750 3.4352673

2000 1.7974976

2250 -0.5588156

2500 -1.6522236

2750 -5.5811832

3000 -5.2383055

3250 -13.6373919

3500 -8.7024097

3750 -13.8671371

4000 1.2833311

4250 -6.0728011

4500 -3.3266110

4750 0.8495323

5000 -1.0044707

5750 -0.4843320

What that yearly random effect looks like compared to the GISSTEMP book numbers:

View attachment 67134866

And this is the difference between the two:

View attachment 67134867

R-squared = 0.73. Close enough for government work. The slope coefficients are nearly identical (a 0.00035 per year difference). And I'm sure that a statistician would do much better. Much of what Hansen did in step 1 and 2 and in gridding the data was smoothing. Application of a smoothing algorithm would no doubt take out the variation in the random effects estimation I got, but that would not improve the fidelity of the results.

dset0 annual averages are as close to the raw data as I can get without running out of computer memory. I went from that data to a final result with one command in R.

If I can come up with the same results with no spacial factors at all other than altitude and station grouping then does gridding mean anything? Can the analysis represent anything more than the 25% of the earth's surface the stations occupy?

This shows that GISS methods, regardless how byzantine, introduce no significant amount of bias where temperature trends are concerned with regard to the station data available.

The reason that the GISS data set runs so much hotter that others may be in the effort to extrapolate station data to areas that don't have stations, that is to say, the other 75% of the earth's surface. This would be using a data set that is perhaps dominated by weather stations in urban areas with a significant heat island effect.

That lack of coverage of the earth's surface in station data, I'm guessing, is the reason for the difference in satellite, radiosonde, and station estimates of global temperature.

The amount of memory and processing power I have in my clearance priced desktop PC would have been duplicated by a computer the size of a city block, if at all, back in the days Hansen started doing this climate business. His multi step process of analyzing station data was perhaps an effort to perform an analysis that would have been impossible using the statistical methods I used here.

Update

Including another grouping factor, latitude in 5 degree bands, has no effect on the temperature anomalies, but it's interesting to look at the random effect for that group:

dset0.fm2 = lmer(temp~(1|ur)+(1|lat)+(1|alt)+(1|sta)+(1|year),

data=dset0.t)

> ranef(dset0.fm2)$lat

(Intercept)

-90 -48.0765255

-85 -30.6208086

-80 -30.5843329

-75 -26.8087508

-70 -17.5837928

-65 -10.6445832

-55 -0.4677458

-50 1.9302840

-45 4.6332613

-40 7.5261542

-35 10.1896612

-30 7.2221295

-25 16.5283583

-20 17.4532507

-15 18.3289638

-10 18.4701555

-5 18.5784181

0 18.7600821

5 19.5108772

10 19.8686941

15 19.7051168

20 17.5278219

25 15.0410194

30 11.2047241

35 7.3644670

40 3.2930061

45 -0.4509212

50 -0.3098546

55 -5.3064547

60 -9.0101544

65 -13.3917199

70 -16.3916299

75 -20.7265894

80 -22.7631131

And including that group gives more consistent altitude and urban estimates:

> ranef(dset0.fm2)$alt

(Intercept)

-1000 7.6009074

0 8.7884377

250 8.2316191

500 5.1136268

750 9.5794809

1000 5.9054092

1250 5.3426933

1500 4.0768528

1750 3.4410112

2000 2.0043036

2250 1.0247369

2500 -0.5251244

2750 -1.8542720

3000 -0.6135464

3250 -8.0035847

3500 -3.3222669

3750 -5.1468110

4000 -5.1510483

4250 -8.3572983

4500 -8.9318326

4750 -5.5925205

5000 -5.9627467

5750 -7.6483137

> ranef(dset0.fm2)$ur

(Intercept)

R -0.89551186

S -0.03941906

U 0.93493366(No, I could not use monthly data without running out of memory.)

Then calculate random effects:

(load the lme4 package)

dset0.fm = lmer(temp~(1|loc)+(1|sta)+(1|ur)+(1|alt)+(1|year), data=dset0.t)

dset0.fm1 = lmer(temp~(1|sta)+(1|year), data=dset0.t)

(loc = location id, sta=station id with location, ur=urban vs suburban vs rural, alt= location altitude in 250 meter bands)

ranef(dset0.fm)$year and ranef(dset0.fm1)$year contain the random effects year over year. The temperature anomaly for each year, in other words. Most of the variance is dumped into other groups. Using more groupings in the model formula makes no difference in these estimates (dset0.fm vs dset0.fm1 R-squared = 0.999).

Here is your heat island effect:

> ranef(dset0.fm)$ur

(Intercept)

R -0.23716400 (rural)

S -0.04312743 (suburban)

U 0.28028908 (urban)

And here is the effect of altitude:

> ranef(dset0.fm)$alt

(Intercept)

0 10.0821297

250 9.5952519

500 8.2160810

750 7.4049196

1000 6.4774994

1250 5.4415439

1500 4.2931860

1750 3.4352673

2000 1.7974976

2250 -0.5588156

2500 -1.6522236

2750 -5.5811832

3000 -5.2383055

3250 -13.6373919

3500 -8.7024097

3750 -13.8671371

4000 1.2833311

4250 -6.0728011

4500 -3.3266110

4750 0.8495323

5000 -1.0044707

5750 -0.4843320

What that yearly random effect looks like compared to the GISSTEMP book numbers:

View attachment 67134866

And this is the difference between the two:

View attachment 67134867

R-squared = 0.73. Close enough for government work. The slope coefficients are nearly identical (a 0.00035 per year difference). And I'm sure that a statistician would do much better. Much of what Hansen did in step 1 and 2 and in gridding the data was smoothing. Application of a smoothing algorithm would no doubt take out the variation in the random effects estimation I got, but that would not improve the fidelity of the results.

dset0 annual averages are as close to the raw data as I can get without running out of computer memory. I went from that data to a final result with one command in R.

If I can come up with the same results with no spacial factors at all other than altitude and station grouping then does gridding mean anything? Can the analysis represent anything more than the 25% of the earth's surface the stations occupy?

This shows that GISS methods, regardless how byzantine, introduce no significant amount of bias where temperature trends are concerned with regard to the station data available.

The reason that the GISS data set runs so much hotter that others may be in the effort to extrapolate station data to areas that don't have stations, that is to say, the other 75% of the earth's surface. This would be using a data set that is perhaps dominated by weather stations in urban areas with a significant heat island effect.

That lack of coverage of the earth's surface in station data, I'm guessing, is the reason for the difference in satellite, radiosonde, and station estimates of global temperature.

The amount of memory and processing power I have in my clearance priced desktop PC would have been duplicated by a computer the size of a city block, if at all, back in the days Hansen started doing this climate business. His multi step process of analyzing station data was perhaps an effort to perform an analysis that would have been impossible using the statistical methods I used here.

Update

Including another grouping factor, latitude in 5 degree bands, has no effect on the temperature anomalies, but it's interesting to look at the random effect for that group:

dset0.fm2 = lmer(temp~(1|ur)+(1|lat)+(1|alt)+(1|sta)+(1|year),

data=dset0.t)

> ranef(dset0.fm2)$lat

(Intercept)

-90 -48.0765255

-85 -30.6208086

-80 -30.5843329

-75 -26.8087508

-70 -17.5837928

-65 -10.6445832

-55 -0.4677458

-50 1.9302840

-45 4.6332613

-40 7.5261542

-35 10.1896612

-30 7.2221295

-25 16.5283583

-20 17.4532507

-15 18.3289638

-10 18.4701555

-5 18.5784181

0 18.7600821

5 19.5108772

10 19.8686941

15 19.7051168

20 17.5278219

25 15.0410194

30 11.2047241

35 7.3644670

40 3.2930061

45 -0.4509212

50 -0.3098546

55 -5.3064547

60 -9.0101544

65 -13.3917199

70 -16.3916299

75 -20.7265894

80 -22.7631131

And including that group gives more consistent altitude and urban estimates:

> ranef(dset0.fm2)$alt

(Intercept)

-1000 7.6009074

0 8.7884377

250 8.2316191

500 5.1136268

750 9.5794809

1000 5.9054092

1250 5.3426933

1500 4.0768528

1750 3.4410112

2000 2.0043036

2250 1.0247369

2500 -0.5251244

2750 -1.8542720

3000 -0.6135464

3250 -8.0035847

3500 -3.3222669

3750 -5.1468110

4000 -5.1510483

4250 -8.3572983

4500 -8.9318326

4750 -5.5925205

5000 -5.9627467

5750 -7.6483137

> ranef(dset0.fm2)$ur

(Intercept)

R -0.89551186

S -0.03941906

U 0.93493366