I’ve moved to WordPress: http://bobtisdale.wordpress.com/

Monday, July 5, 2010

An Overview Of Sea Surface Temperature Datasets Used In Global Temperature Products

I was asked by one of the independent researchers who are reconstructing global temperature data for an overview of Sea Surface Temperature (SST) datasets. I’ve limited the discussion to those SST datasets used in the Hadley Centre, GISS, and NCDC global temperature products.

The Hadley Centre uses HADSST2 data in its HADCRUT data. GISS uses HADISST from January 1880 to November 1981 and Reynolds (OI.v2) from December 1981 to present for their combined GISTEMP dataset. And the NCDC has used its ERSST.v3b data in its global temperature product since July 2009.


Long-term Sea Surface Temperature (SST) datasets are based on the International Comprehensive Ocean-Atmosphere Data Set (ICOADS) data, which is divided into 2 degree grids. ICOADS link:

Sea Surface Temperature (SST) before the advent of buoys and satellites was measured by ships. The majority of ships traversing the oceans kept to shipping lanes. This limited the spatial coverage of SST measurements, especially in the South Pacific, which continues to have limited measurements in recent decades. Figure 1 is a collection of COADS SST data coverage maps available from the National Center for Atmospheric Research (NCAR). And here are the links to the full-size individual maps:

NCAR describes the maps as “Percentage of non-missing data in each time period is plotted. The minimum number of observations needed per month per grid box was 1.” “Non-missing data” is an unusual but apt phrase: it gets the point across that missing data is the norm in many areas. Each map illustrates the monthly SST coverage for a period of roughly 20 years; the first period starts in 1861 and the last ends in 1997. Zero to 10% coverage is in white, while near-complete coverage (90 to 100%) is shown in gold. Even the last period, from 1981 to 1997, shows major gaps in the data for the South Pacific.

http://i48.tinypic.com/2i7spwy.png
Figure 1
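The coverage statistic in the NCAR description can be sketched in a few lines of Python. This is a minimal illustration, not NCAR's code: the function name coverage_percent and the sample counts are hypothetical, and it assumes a grid box counts as non-missing in a month when it has at least one observation, as the quoted description states.

```python
def coverage_percent(monthly_counts, min_obs=1):
    """Percentage of months in which one grid box has at least min_obs observations.

    monthly_counts: observation counts, one per month (None means no report).
    min_obs: minimum observations per month per grid box (1 in the NCAR maps).
    """
    if not monthly_counts:
        raise ValueError("need at least one month of counts")
    non_missing = sum(
        1 for n in monthly_counts if n is not None and n >= min_obs
    )
    return 100.0 * non_missing / len(monthly_counts)


# Hypothetical counts for one South Pacific grid box over 12 months
counts = [0, 3, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0]
print(coverage_percent(counts))  # 25.0
```

A box that reports in 3 of 12 months would fall in the 20 to 30% band on such a map.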

The SST data suppliers take the long-term ICOADS data, make corrections for the transitions from one measuring method to another (non-insulated buckets, insulated buckets, ship intakes, buoys, satellites) and infill missing data using methods that vary from one dataset to another.

The Hadley Centre takes two approaches to infill the missing data. The simplest is the approach they use for their HADSST2 data, which is available monthly from 1850 to present. The HADSST2 data is presented in 5 degree grids, so basically, the Hadley Centre has taken the ICOADS data that’s in 2 degree grids and expanded its coverage by converting it to 5 degree grids. But for the HADSST2 data, that’s as far as the Hadley Centre takes the infilling. There are large gaps in the coverage of HADSST2 data even to current times. Refer to Figure 2, which shows two maps of HADSST2 SST anomaly data. Cell a represents January 1900 and Cell b shows March 2010. Missing data is in white. While coverage has improved greatly in recent years, there are still large areas of the Southern Hemisphere where SST measurements are missing.
Figure 2

In one respect, the Hadley Centre provides the most realistic presentation of SST with their HADSST2 data, since they only present data in grids where measurements exist. But HADSST2 data has a curious upward bias after 1998. The Hadley Centre changed SST data suppliers in 1998. The merging of the two datasets appears to have created an upward shift in the HADSST2 data that appears in no other SST dataset. This can be seen in comparisons to other datasets; that is, when other global SST datasets are subtracted from HADSST2 global data. An example is shown in Figure 3. There are other examples in my post Met Office Prediction: “Climate could warm to record levels in 2010”.
Figure 3
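The comparison described above, subtracting another global SST series from HADSST2 and looking for a step, could be sketched as follows. This is only an illustration under stated assumptions: the function names, the break position, and the sample values are hypothetical, and both series are assumed to cover the same months.

```python
def difference_series(a, b):
    """Element-by-element difference a - b of two equally long series."""
    if len(a) != len(b):
        raise ValueError("series must cover the same months")
    return [x - y for x, y in zip(a, b)]


def mean_shift(diff, break_index):
    """Mean of the difference after the break minus the mean before it."""
    before, after = diff[:break_index], diff[break_index:]
    if not before or not after:
        raise ValueError("break_index must split the series in two")
    return sum(after) / len(after) - sum(before) / len(before)


# Hypothetical anomalies (deg C); the break sits at index 2
hadsst2 = [0.125, 0.25, 0.5, 0.75]
other = [0.125, 0.25, 0.25, 0.5]
diff = difference_series(hadsst2, other)
print(diff)                 # [0.0, 0.0, 0.25, 0.25]
print(mean_shift(diff, 2))  # 0.25
```

A difference that sits near zero before the break and at a steady offset afterwards is the kind of signature that points to a shift in one dataset rather than a real change in the ocean.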

The HADSST2 dataset is described in Rayner et al (2006) “Improved analyses of changes and uncertainties in sea surface temperature measured in situ since the mid-nineteenth century: the HadSST2 data set.”

A link to the application required to download raw HADSST2 data from the Met Office can be found at:

Note that the Met Office restricts to whom they make data available. See:

Under the heading of Restricted Data Access, they write, “The online application for access to the Met Office SST data includes the Met Office Agreement to be electronically accepted. Please note that the Met Office data sets are available for bona fide academic research only (sorry no undergraduates), on a per person per project basis (i.e. all members on a same project who will be using the data must individually apply for access to the data). If you wish to access the Met Office data for commercial or personal purposes, please contact the Met Office directly.”


The second of the Hadley Centre SST datasets is HADISST, which is “globally complete” monthly from January 1871 to present. Refer to Figure 4.
Figure 4

HADISST uses satellite data from 1982 to present that is supplemented by buoy and ship readings. Prior to 1982, they basically use ICOADS data and infill missing SST data using what they describe as Reduced Space Optimal Interpolation (RSOI) in Rayner et al (2003):

They write in Rayner et al (2003): “These historical data sets all use data reconstruction techniques based on empirical orthogonal functions (EOFs), which are used to capture the major modes of SST variability and are then projected onto the available gridded SST observations to form quasi-globally complete fields.” And they continue, “In HadISST1, broad-scale fields of SST are reconstructed using one of these EOF-based techniques, reduced space optimal interpolation (RSOI). RSOI is described by Kaplan et al. [1997], who show that it is more reliable than EOF projection, which was used in GISST and by Smith et al. [1996]. We adapt RSOI into a two-stage process: first reconstructing the global pattern of long-term change and then the residual interannual variability. This results in a better representation of trends than does a single application of RSOI as used by Kaplan et al. [1998, 2003]. Also, we augment the reconstructions by blending with quality improved in situ SST to recapture local variance lost in the broad-scale RSOI.”

The last sentence is important as it MAY separate HADISST data from the long-term NCDC dataset that follows. After manufacturing data to create the appropriate patterns of SST and SST variability, the Hadley Centre reinserts (blends back in) the actual readings.

A link to the referenced Kaplan et al (1997) “Reduced Space Approach To The Optimal Analysis Of Historical Marine Observations: Accomplishments, Difficulties, And Prospects”:

Note: I wrote above that the Hadley Centre “basically use(s) ICOADS data.” In their description of HADISST data…
…the Hadley Centre states, “The SST data are taken from the Met Office Marine Data Bank (MDB), which from 1982 onwards also includes data received through the Global Telecommunications System (GTS). In order to enhance data coverage, monthly median SSTs for 1871-1995 from the Comprehensive Ocean-Atmosphere Data Set (COADS) (now ICOADS) were also used where there were no MDB data.”

But since the Met Office MDB data is a part of ICOADS data, “they basically use ICOADS data.” Refer to Woodruff (2001) “COADS Updates Including Newly Digitized Data and the Blend with the UK Meteorological Office Marine Data Bank.”

The Hadley Centre also requires an application be submitted prior to their release of their HADISST data, and the same restrictions apply:


ERSST.v3b is a product of the NOAA National Climatic Data Center (NCDC). It is also globally complete over the term of the dataset, from January 1854 to present. Refer to the January 1900 and March 2010 SST anomaly maps, Figure 5.
Figure 5

There have been a number of versions of the NCDC Extended Reconstruction Sea Surface Temperature (ERSST) dataset since it was first released in 2003. Version 1 is presented in Smith and Reynolds (2003) “Extended Reconstruction of Global Sea Surface Temperatures Based on COADS Data (1854-1997).” Version 2 appeared a year later and was accompanied by Smith and Reynolds (2004) “Improved Extended Reconstruction of SST (1854-1997).” Those papers provide detailed descriptions of the methods used to reconstruct SST data for the early ERSST datasets and the improvements made with the new version.

Version 3 was released early in 2008 and was accompanied by the Smith et al (2008) paper “Improvements to NOAA's Historical Merged Land-Ocean Surface Temperature Analysis (1880-2006)”. Much of the improvements in the ERSST.v3 data were due to the inclusion of satellite data starting in 1985. In the Appendix, Smith et al (2008) write, “The ERSST.v3 is improved by explicitly including bias-adjusted satellite infrared SST estimates. In ERSST.v2 and ERSST.v3, information from satellites is indirectly included because the HF analyses are based on modes computed from the Reynolds et al. (2002) analysis, which includes the satellite data. In ERSST.v3 the Pathfinder infrared SST estimates are introduced in the analysis by combining those SST data with ship and buoy data. Satellite SSTs are bias adjusted relative to the ship and buoy data as previously discussed. The SST estimates from satellite, ships, and buoys are merged using a weighted sum of the different inputs, with weights inversely proportional to the noise estimate for each type (see section 2d). The merged SSTs are used in the ERSST.v3 analysis. In ERSST.v2 only in situ SSTs are used. The greatest influence of the satellite data is to produce greater variability south of 45S beginning in 1985. In most other regions the influence of satellite data is small because of generally sufficient in situ monthly sampling in the recent period.”
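The merging step quoted above, a weighted sum with weights inversely proportional to the noise estimate for each data type, can be illustrated with a short Python sketch. It is not the NCDC's code. The function name merge_sst and the sample numbers are hypothetical, and the "noise estimate" is treated here as a generic positive number; Section 2d of Smith et al (2008) defines the actual quantity.

```python
def merge_sst(estimates):
    """Weighted mean of SST estimates with weights proportional to 1 / noise.

    estimates: list of (sst_value, noise_estimate) pairs, one per data type.
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    if any(noise <= 0 for _, noise in estimates):
        raise ValueError("noise estimates must be positive")
    weights = [1.0 / noise for _, noise in estimates]
    weighted_sum = sum(w * value for w, (value, _) in zip(weights, estimates))
    return weighted_sum / sum(weights)


# Hypothetical values for one grid box and month: (SST in deg C, noise)
ship = (15.0, 1.0)
buoy = (14.0, 0.5)
satellite = (14.5, 0.25)
print(round(merge_sst([ship, buoy, satellite]), 3))  # 14.429
```

With these made-up numbers the weights are 1, 2, and 4, so the least noisy input pulls the merged value hardest. That is also why a residual bias in a low-noise input can carry through into the merged product.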

In fact, a major portion of the SST portion of Smith et al (2008) discussed the lengths taken to include satellite data and benefits of including it. Refer to Section 2d.

The NCDC made the ERSST.v3 data available in a number of latitude bands on monthly and annual bases through a webpage linked to their ERSST Version 3/3b webpage. (The link to the ERSST.v3 data is no longer operational). I used it in a few early posts, including ERSST.v3 Version of Southern Ocean SST Anomaly. Then, after only a few months, the NCDC stopped updating the new ERSST.v3 data (last updated April 2008). About 6 months later, the data in numerous latitude bands was made available again through the ERSST Version 3/3b webpage, and the data was identified as Version 3b. NCDC provided an explanation about the switch to version 3b. Reynolds, Smith and Liu write in the third paragraph, “In the ERSST version 3 on this web page we have removed satellite data from ERSST and the merged product. The addition of satellite data caused problems for many of our users. Although, the satellite data were corrected with respect to the in situ data as described in reprint, there was a residual cold bias that remained as shown in Figure 4 there. The bias was strongest in the middle and high latitude Southern Hemisphere where in situ data are sparse. The residual bias led to a modest decrease in the global warming trend and modified global annual temperature rankings.”

What is curious, though, is that there were differences between the ERSST.v3 and .v3b versions before 1985, Figure 6, which is from my post Unheralded Changes in ERSST.v3 Data.
Figure 6

There was no explanation provided for the changes to the earlier data.

Gridded ERSST.v3b data is available in two formats, ASCII and NetCDF.
ASCII format: Monthly ERSST. See also readme file
NetCDF format: Monthly NetCDF ERSST.

More on the removal of satellite data in ERSST.v3b later in this post.

Reynolds (OI.v2) [USED BY GISS FOR 1982 TO PRESENT]
The Reynolds (OI.v2) Optimum Interpolation SST dataset is also a product of NOAA. It is globally complete, and its data start in November 1981. Refer to Figure 7 for the March 2010 SST anomaly map for this dataset.
Figure 7

This in situ and satellite-based SST dataset is described on the NCEP Environmental Modeling Center NOAA Optimum Interpolation Sea Surface Temperature Analysis webpage as follows: “The optimum interpolation (OI) sea surface temperature (SST) analysis is produced weekly on a one-degree grid. The analysis uses in situ and satellite SSTs plus SSTs simulated by sea ice cover. Before the analysis is computed, the satellite data is adjusted for biases using the method of Reynolds (1988) and Reynolds and Marsico (1993). A description of the OI analysis can be found in Reynolds and Smith (1994). The bias correction improves the large scale accuracy of the OI.”

There is a long list of reference papers at the bottom of the NOAA Optimum Interpolation Sea Surface Temperature Analysis webpage.

Gridded data is available from the NOAA National Centers for Environmental Prediction (NCEP) Environmental Modeling Center (EMC) in weekly and monthly formats.


Note: HADSST2 data, as far as I can tell, is only available in anomaly form. The other datasets are presented as Sea Surface Temperature. NOAA/NCDC also provides climatologies for calculating anomalies. Refer to:
Reynolds, R. W. and T. M. Smith, 1995: A high resolution global sea surface temperature climatology. J. Climate, 8, 1571-1583.
Smith, T. M. and R. W. Reynolds, 1998: A high resolution global sea surface temperature climatology for the 1961-90 base period. J. Climate, 11, 3320-3323.
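For datasets supplied as absolute Sea Surface Temperature, anomalies are obtained by subtracting the climatological mean for the matching calendar month. A minimal sketch, assuming a 12-value monthly climatology for one grid box or region; the function name and the sample numbers are hypothetical.

```python
def anomalies(sst, climatology, start_month=1):
    """Subtract the calendar-month climatology from a monthly SST series.

    sst: monthly SST values in time order.
    climatology: 12 long-term monthly means, January first.
    start_month: calendar month (1-12) of the first value in sst.
    """
    if len(climatology) != 12:
        raise ValueError("climatology must hold 12 monthly means")
    if not 1 <= start_month <= 12:
        raise ValueError("start_month must be between 1 and 12")
    return [
        value - climatology[(start_month - 1 + i) % 12]
        for i, value in enumerate(sst)
    ]


# Hypothetical climatology (deg C) and a series that starts in December
clim = [20.0, 20.5, 21.0, 21.5, 22.0, 22.5, 23.0, 23.5, 23.0, 22.0, 21.0, 20.5]
series = [21.0, 20.25, 20.5]  # Dec, Jan, Feb
print(anomalies(series, clim, start_month=12))  # [0.5, 0.25, 0.0]
```

The base period of the climatology matters when comparing datasets: anomalies computed against different base periods differ by a constant offset for each calendar month.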


Figure 8 is a comparison graph of global SST anomalies for the four long-term SST datasets discussed above: ICOADS, HADSST2, HADISST, and ERSST.v3b. It is provided to show the magnitude of the difference between the ICOADS data and the other datasets. These differences reflect both the corrections made to account for changes in measurement methods and the additional impacts of data treatment: infilling, interpolation, etc.
Figure 8

In Figure 9, the ICOADS data has been eliminated from the comparison. The variability of the HADISST data is much less than that of the HADSST2 and ERSST.v3b datasets. And HADSST2 has the greatest year-to-year variability, which makes sense, since it has fewer filtering adjustments and is the least spatially complete. Note the dip and rebound from the 1880s to the 1940s is suppressed in the HADISST data. The HADISST SST anomaly in 1878 is approximately the same as the 1976 value, but the HADSST2 and ERSST.v3b readings are significantly higher in 1878 than they are in 1976.
Figure 9

The short-term Reynolds (OI.v2) data has been added to the comparison in Figure 10, and the term of the graph has been shortened accordingly. Again, the HADSST2 data has the greatest month-to-month variability. The impact (upward shift) of the Hadley Centre’s switch of SST suppliers for the HADSST2 data is also visible.
Figure 10

In Figure 11, the datasets have been smoothed with a 13-month running-average filter.
Figure 11
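A 13-month running average can be sketched as below. The post does not say whether the filter is centered, so this sketch assumes it is: each smoothed value averages a month with the six months on either side, and months without a full window are left as None. The function name is hypothetical.

```python
def running_mean(values, window=13):
    """Centered running mean; positions lacking a full window return None."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            smoothed.append(None)  # not enough months on one side
        else:
            chunk = values[i - half:i + half + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed


# Hypothetical series: 15 months rising by 1 each month
monthly = [float(m) for m in range(15)]
print(running_mean(monthly))
```

A 13-month window spans a full year plus one month, so it damps the seasonal cycle and month-to-month noise while keeping each smoothed value centered on a calendar month.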


Figures 12 through 15 are maps of the changes in SST anomalies during the two cooling and two warming epochs that occurred between 1880 and 2005. The period was broken down into four epochs: 1880 to 1910, 1910 to 1945, 1945 to 1975, and 1975 to 2005. To create the maps of the changes in SST anomalies, the first year of an epoch was used as the base year for anomalies and the SST anomalies for the last year of the epoch were then plotted. Note how the patterns north of 30S for the cooling epoch from 1945 to 1975 (Figure 14) are basically the opposite of the warming epoch of 1975 to 2005 (Figure 15). There are also some similarities in the opposing patterns for the earlier cooling and warming epochs, Figures 12 and 13. Note also how much greater the changes are in ERSST.v3b data than the HADISST data in Figures 12 and 13.
Figure 12
Figure 13
Figure 14
Figure 15
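Using the first year of an epoch as the base year and plotting the last year's anomalies amounts to subtracting, grid cell by grid cell, the first-year value from the last-year value. A minimal sketch, assuming each year is already an annual-mean grid stored as nested lists with None for missing cells; the function name and sample grids are hypothetical.

```python
def epoch_change(first_year_grid, last_year_grid):
    """Per-cell change: last-year value minus first-year value.

    Cells missing (None) in either year stay missing in the result.
    """
    if len(first_year_grid) != len(last_year_grid):
        raise ValueError("grids must have the same shape")
    change = []
    for row_first, row_last in zip(first_year_grid, last_year_grid):
        if len(row_first) != len(row_last):
            raise ValueError("grids must have the same shape")
        change.append([
            None if a is None or b is None else b - a
            for a, b in zip(row_first, row_last)
        ])
    return change


# Hypothetical 2x2 grids (deg C) for the first and last year of an epoch
first = [[0.5, None], [1.0, 0.25]]
last = [[0.75, 0.1], [0.5, 0.5]]
print(epoch_change(first, last))  # [[0.25, None], [-0.5, 0.25]]
```

Because the result depends on only two individual years, a strong El Niño or La Niña at either end of an epoch can shape the pattern on such a map.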

The ERSST.v3b and HADISST SST anomalies for the Southern Ocean (90S-60S) are shown in Figure 16. There is little to no variability in the HADISST data prior to 1962, while the ERSST.v3b data shows considerable variation. As shown in Figure 1, there is little to no data in the Southern Hemisphere in early decades. Is the ERSST.v3b data in this portion of the globe a result of the reconstruction methods employed by the NCDC? Or is the curve based on another reconstruction? On a thread at WattsUpWithThat, an observant blogger noted the similarity between the curve of the ERSST.v3b Southern Ocean SST anomaly data and what I recall was a curve of an Antarctic sea ice reconstruction for the same period. He provided a link to the paper. I believe the comment was from 2008 or 2009 but, unfortunately, I have not been able to find it or the paper.
Figure 16

If I do find it, or if the blogger who had linked the Antarctic sea ice paper at WUWT reads this post and links it in the comments below, I will post the graph in an update.

Earlier during the discussion of the HADISST data, I noted that after manufacturing data to create the appropriate patterns of SST and SST variability, the Hadley Centre reinserted (blended back in) the actual readings. I also noted that this step may separate HADISST from the NCDC ERSST.v3b data.

I can find no mention of a similar step in the paper about ERSST.v3b data, Smith et al (2008) “Improvements to NOAA's Historical Merged Land-Ocean Surface Temperature Analysis (1880-2006)”. And since the addition of satellite data was such a major portion of Smith et al (2008), it is difficult to determine what calculations and methods remain. There is mention of fitting the reconstructed SST data to in situ data in ERSST.v1 in Smith and Reynolds (2004) “Improved Extended Reconstruction of SST (1854-1997).” In Smith and Reynolds (2004), they note, “…the ERSST.v1 analysis is always computed by a fit to in situ data, even in the period with satellite data. The U.K. Met Office computed their own analysis (Rayner et al. 2003) using a similar technique but using all available data, both in situ and satellite.” But again, there is no mention of a similar step in the ERSST.v3b data.

The paper for the ERSST.v2 version [Smith and Reynolds (2004) Improved Extended Reconstruction of SST (1854-1997)] notes the perceived value of satellite data. On page 10, they state, “Although the NOAA OI analysis contains some noise due to its use of different data types and bias corrections for satellite data, it is dominated by satellite data and gives a good estimate of the truth.”

Figure 17 is a comparison graph of HADSST2, HADISST, ERSST.v3b, and Reynolds (OI.v2) data for the South Pacific and Southern Ocean (90S-0, 145E-70W), from January 1982 to April 2010. The data have been smoothed with a 13-month filter. This area was chosen because the ICOADS data, upon which the ERSST.v3b and HADSST2 data are based, is incomplete even in current times. Refer back to Figure 1. As discussed earlier, the HADSST2 data is biased upwards by the switch in source data in 1998. With the data shift, the HADSST2 data has the highest trend. HADISST contains satellite data, and the Reynolds (OI.v2) data “is dominated by satellite data and gives a good estimate of the truth.” And since the ERSST.v3b data does not include satellite data, it appears it is biased upward, creating a trend that is approximately 50% to 80% higher than the Reynolds (OI.v2) and HADISST data, respectively, for this area of the globe.
Figure 17
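A statement that one trend is a given percentage higher than another can be checked with an ordinary least-squares slope for each series and a simple ratio. The sketch below assumes evenly spaced monthly values and uses made-up numbers. The function names are hypothetical, and the output is in the same units per time step as the input.

```python
def linear_trend(values):
    """Ordinary least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def percent_higher(trend_a, trend_b):
    """How much higher trend_a is than trend_b, as a percentage of trend_b."""
    if trend_b == 0:
        raise ValueError("reference trend must be non-zero")
    return 100.0 * (trend_a - trend_b) / trend_b


# Hypothetical anomaly series (deg C), not real data
series_a = [0.0, 0.375, 0.75, 1.125]
series_b = [0.0, 0.25, 0.5, 0.75]
print(percent_higher(linear_trend(series_a), linear_trend(series_b)))  # 50.0
```

Percentages like these are sensitive to the reference trend: when the reference slope is small, a modest absolute difference produces a large percentage.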

Data and maps used in the post unless otherwise noted are available through the KNMI Climate Explorer:


Bill Illis said...

Thanks very much Bob. This took a lot of work (and mostly a lot of work building up a knowledge base over time).

I still find the differences in the different datasets to be unusual/random. The more they are reanalyzed/restated, the less we will know about what conditions were really like.

steven said...

I'm not clear on the difference between icoads and the NCAR maps

Bob Tisdale said...

steven: Please provide a link to the NCAR maps you're discussing.

