Data Set Documentation

Historical Arctic Rawinsonde Archive


Summary

Note: This data set is now on FTP so references to CD-ROM are historic and no longer applicable.

The Historical Arctic Rawinsonde Archive (HARA) is on FTP and contains millions of vertical soundings of temperature, pressure, humidity, and wind, representing all available rawinsonde ascents from Arctic land stations poleward of 65 degrees North. HARA includes soundings from the beginning of record through mid-1996. Most stations began recording soundings in the late 1950s, but a few began in 1947 or 1948. Coverage is relatively uniform, except for the interior of Greenland. Typically, 20 to 40 levels are available in each sounding. HARA documentation is provided in the FTP directory and in hard copy (NSIDC Special Report 2, 1992). The FTP directory contains software (Fortran and C) for retrieval of sounding data subsets. The original soundings were obtained from the National Center for Atmospheric Research (NCAR), Boulder, Colorado, and the National Climatic Data Center (NCDC) of NOAA in Asheville, North Carolina.

Citing These Data

Serreze, M., and S. Shiotani, editors. 1997. Historical Arctic rawinsonde archive. Boulder, CO: National Snow and Ice Data Center. Digital media.

Table of Contents

1. Title
2. Investigators
3. Introduction
4. Theory of Measurements
5. Equipment
6. Procedure
7. Observations
8. Data Granularity
9. Data Description
10. Data Manipulations
11. Notes and Problems
12. Notes
13. Application of the Data Set
14. Data Set Plans
15. References
16. Related Software
17. Data Access
18. Output Products and Availability
19. Glossary of Acronyms

1. Title

Data Set Identification: Historical Arctic Rawinsonde Archive

Revision Date: 1997-08-22

2. Investigators

Investigators' Names and Titles

Serreze, M., J. Kahl, and S. Shiotani under support from the National Oceanic and Atmospheric Administration's Climate and Global Change Program (Grant NA85RAH05066), the Electric Power Research Institute (Grant RP2333-07), and the National Science Foundation (Grant DPP-8822472).

Title of Investigation

Historical Arctic Rawinsonde Archive

Contact Information

NSIDC User Services
National Snow and Ice Data Center
CIRES, 449 UCB
University of Colorado
Boulder, CO 80309-0449  USA
phone: +1 303.492.6199
fax: +1 303.492.2468
form: Contact NSIDC User Services
e-mail: nsidc@nsidc.org

3. Introduction

This document describes the Historical Arctic Rawinsonde Archive, compiled by Principal Investigators J. Kahl, M. Serreze, and R. Schnell, under support from the National Oceanic and Atmospheric Administration's Climate and Global Change Program (Grant NA85RAH05066), the Electric Power Research Institute (Grant RP2333-07), the National Science Foundation (Grant DPP-8822472), and the NSIDC Distributed Active Archive Center. The archive comprises millions of vertical soundings of temperature, pressure, humidity, and wind, representing all available rawinsonde ascents from Arctic land stations poleward of 65 degrees North, from the beginning of record through mid-1996. For most stations, the record begins in 1958, but a few begin as early as 1947 or 1948. The archive is presented in ASCII format that may be easily processed by most types of computers. Plans are to periodically update this archive as additional data become available.

Objectives/Purpose

The archive is useful for meteorological and climatological research, including polar studies and investigations of global climate change. It is designed to facilitate climate analyses requiring sounding time series at multiple stations throughout the Arctic.

Summary of Parameters

The archive consists of all available Arctic rawinsonde data from fixed-position (mostly land) stations north of 65 degrees N. Data from drifting ice islands, ships, and aircraft dropsondes have been assembled as a separate archive.

Sounding data were obtained from the National Center for Atmospheric Research (NCAR) in Boulder, Colorado, and the National Climatic Data Center (NCDC) in Asheville, North Carolina. Five original archives were used to compile the database. These sources are assigned identification codes (ID) 1 through 5.

A considerable amount of overlap exists between original data sources. We have made an effort to choose the best available data from each original archive, based on error analyses and completeness of the soundings.

Appendix 1 lists all stations for which soundings are available. In the appendix, the year column indicates that sounding data are available for at least part of a given year. Millions of soundings are available, covering at least a 30-year period for most stations.

The station locations are shown in Figure 1. Note that the coverage is relatively uniform, with the exception of the interior of Greenland. Depending on the station, soundings are available from one to four times daily at 0000 GMT, 0600 GMT, 1200 GMT and 1800 GMT. Most often, reports are available at 0000 GMT and 1200 GMT. Prior to 1952, reporting times were 0800 GMT and 1400 GMT.

4. Theory of Measurements

This section is not applicable.

5. Equipment

Please review Radiosondes.

The instrument type used in sounding (whether Vaisala, Metox, etc.) is only available for ID=1 soundings.

See Additional Quality Codes.

See also Elliot and Gaffen (1991), and Garand et al. (1992).

Instrument Description

Please see Radiosondes.

Platform

Radiosondes are balloon-borne instruments. The balloons, made of natural or synthetic rubber, expand as they ascend and eventually burst.

6. Procedure

Data Acquisition Methods

See Sources and Coverage.

See Data Compilation.

7. Observations

There are no notes at this time.

8. Data Granularity

Data represent one-dimensional vertical soundings taken at land stations north of 65 degrees latitude, each including measurements of pressure, temperature, winds, and humidity at various atmospheric levels.

9. Data Description

Spatial Characteristics

Spatial Coverage

North polar regions:

Spatial Coverage Map

Coverage Map

Temporal Characteristics

Temporal Coverage

The archive extends from the beginning of record through mid-1996. For most stations, the record begins in 1958; a few begin in 1947 or 1948.

Type of Data

Each sounding in the archive consists of a header record, followed by a number of data records. Typically 20 to 40 levels are available in each sounding. Appendix 2 includes an example of a sounding. The database is archived on FTP.

We recognize that users usually desire a database of this type to be sorted in one of two ways:

  1. Synoptic-ordered: One time period -- all stations; followed by the next time period -- all stations, etc.
  2. Station-ordered: A complete time series for station #1; followed by a complete time series for station #2, etc.

As a compromise, the following file structure is used: One file contains a one-year time series of soundings for one station. Users requiring station-ordered data can simply append individual one-year files to obtain a station history of a length that is suitable to their needs. Users requiring synoptic-ordered data may extract files for several stations corresponding to a particular year, then extract from them the necessary dates and times.

The variables and formats of the individual soundings are as follows:

Header Record

1. STATION
World Meteorological Organization (WMO) station identification number. All stations are ascribed a WMO number, with the exception of "SHIP M" (Figure 1; Appendix 1). This was arbitrarily assigned a station code of 80000.

2. LATITUDE
Latitude of the station in degrees and hundredths of a degree (N).

3. LONGITUDE
Longitude from 0 to 360 degrees, in degrees and hundredths of a degree, measured counterclockwise from the Greenwich Meridian as viewed from the pole.

4. YEAR
Year of sounding.

5. MONTH
Month of sounding.

6. DAY
Day of sounding.

7. HOUR
Hour of sounding. This is usually 0000 GMT, 0600 GMT, 1200 GMT, or 1800 GMT. Prior to 1952, however, soundings were reported at 0800 GMT and 1400 GMT.

8.-10. PROC1, PROC2, PROC3
Special processing codes. These are only available for sounding type ID=1. They provide information on special processing, and whether soundings were manually or automatically processed (see Appendix 4). For all other sounding types (ID=2, 3, 4 or 5), PROC1, PROC2, and PROC3 are assigned blanks. Pages 6 and 7 of Office Note 29 (Mulder 1977), which describe these codes for the ID=1 data, are given in Appendix 5.

11. REP
Report type. This will always be assigned the value of 011 or 0. It denotes that the sounding was taken at a fixed land station.

12. ELEVATION
Station elevation in meters above mean sea level. Missing values are 99999.

13. INSTRUMENT
Instrument type used in sounding (e.g., Vaisala, Metox). This information is only available for ID=1 soundings. Page 9 of Office Note 29 (Mulder 1977), describing the instruments for ID=1 soundings, is included in Appendix 5. For all other sounding types, this variable is assigned a value of 0.

14. NLEVELS
The number of levels reported in the sounding; i.e., the number of data records following the header record.

15. ID
An identification code indicating the sounding's original source. Values (1 through 5) correspond to the five original archives itemized in Section 11, Notes and Problems. These identification values are important because the quality codes vary with each sounding type (Appendix 4).

The FORTRAN format of the header record (15 variables) is as follows: FORMAT (A5, 2I5, 1X, 4I2, 1X, 3A1, I3, I5, I2, 1X, I3, 1X, I1).
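The header layout can be read with plain fixed-width slicing. The following is a minimal sketch in Python, not part of the HARA access software; the names `Header`, `parse_header`, and `_int_or_none` are hypothetical, and the column positions are derived only from the FORMAT statement above.

```python
from typing import NamedTuple, Optional


class Header(NamedTuple):
    station: str                # WMO station number, kept as text
    latitude: float             # degrees N
    longitude: float            # degrees, 0 to 360, counterclockwise from Greenwich
    year: int                   # two digits, as stored in the record
    month: int
    day: int
    hour: int
    proc: str                   # PROC1, PROC2, PROC3 as one 3-character string
    rep: Optional[int]
    elevation: Optional[int]    # meters; None when coded 99999
    instrument: Optional[int]
    nlevels: int
    source_id: int              # the ID field (1 through 5)


def _int_or_none(field: str) -> Optional[int]:
    """Convert a fixed-width integer field; blank fields become None."""
    field = field.strip()
    return int(field) if field else None


def parse_header(line: str) -> Header:
    # Slices are 0-based and end-exclusive, following
    # (A5, 2I5, 1X, 4I2, 1X, 3A1, I3, I5, I2, 1X, I3, 1X, I1).
    line = line.rstrip("\n").ljust(44)
    elevation = _int_or_none(line[31:36])
    return Header(
        station=line[0:5],
        latitude=int(line[5:10]) / 100,
        longitude=int(line[10:15]) / 100,
        year=int(line[16:18]),
        month=int(line[18:20]),
        day=int(line[20:22]),
        hour=int(line[22:24]),
        proc=line[25:28],
        rep=_int_or_none(line[28:31]),
        elevation=None if elevation == 99999 else elevation,
        instrument=_int_or_none(line[36:38]),
        nlevels=int(line[39:42]),
        source_id=int(line[43:44]),
    )


# Header of the Thule sample sounding shown in this document
header = parse_header("04202 765229125 59 1 1 0     11   63 0  23 4")
print(header.station, header.latitude, header.longitude, header.nlevels)
```

The station number is kept as text so that leading zeros survive, and the two-digit year is returned as stored rather than expanded to four digits.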

Data Records

The variables are as follows:

1. PRESSURE
Pressure in tenths of millibars. Data from most stations are rounded to the nearest 1 mb. Missing values are 99999.

2. GEOPOTENTIAL HEIGHT
Geopotential height of the pressure level in whole meters. Missing values are 99999.

3. TEMPERATURE
Temperature in tenths of a degree (degrees C). Missing values are 9999.

4. DEWPOINT DEPRESSION
Dewpoint depression in tenths of a degree (degrees C). Missing values are 999.

5. WIND DIRECTION
Wind direction, 0 to 360 degrees, measured clockwise from north (e.g., 90 degrees is east). Missing values are 999.

6. WIND SPEED
Wind speed in whole meters per second. Missing values are 999.

7.-15. QG, QG1, QT, QT1, QD, QD1, QW, QW1, QP
Quality control flags for geopotential height (QG, QG1), temperature (QT, QT1), dewpoint depression (QD, QD1), winds (QW, QW1), and pressure (QP). The values of QG, QT, QD, QW, and QP depend on the sources of the sounding as identified by the value of ID (see Appendix 4). These vary considerably among the sources of the original soundings, and on whether the original processing was accomplished automatically (via computer) or manually (designated by "auto" or "man" in Appendix 4).

QG1, QT1, QD1 and QW1 are additional quality code flags for geopotential height, temperature, dewpoint depression, and winds, based on our error-checking procedure (see Error-Checking in Section 10). Values are either 'P' (passed limits check) or 'F' (failed limits check). No limits check quality flag is given for pressure, as we used pressure as the test variable in the limits check. *(These additional codes are not available on Volume 5. Please see Revision below.)

16. LEVCK
A quality control flag set to 'P' if no errors were detected in the level for any variable, based on the limits check (see Error-Checking in Section 10). If any errors were detected in the limits check, it is set to 'F'. *(This variable is not available on Volume 5. Please see Revision below.)

17. LTYPE
Code for level type (surface, significant, or mandatory). Only relevant for soundings of type ID=4 (see Appendix 4). For sounding types other than ID=4, the value is assigned a blank. *(This variable is not available on Volume 5. Please see Revision below.)

18. LQUAL
Flag for quality of level. Only relevant for soundings of type ID=4 (see Appendix 4). *(This variable is not available on Volume 5. Please see Revision below.)

The FORTRAN format of the data records (18 variables) is as follows: FORMAT (2(I5, 1X), I4, 1X, 3(I3, 1X), 2A1, 1X, 2A1, 1X, 2A1, 1X, 2A1, 1X, 4A1).
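A data record can be decoded in the same way. The sketch below is illustrative only: the names `NUMERIC_FIELDS`, `FLAG_COLUMNS`, and `parse_level` are hypothetical, the column positions are derived from the FORMAT statement above, and the missing-value codes are those listed for each variable. It scales pressure, temperature, and dewpoint depression from tenths to whole units and returns `None` for missing values.

```python
from typing import Dict, Union

# (name, start, end, missing code, divisor); columns are 0-based, end-exclusive
NUMERIC_FIELDS = [
    ("pressure_mb", 0, 5, 99999, 10),
    ("height_m", 6, 11, 99999, 1),
    ("temperature_c", 12, 16, 9999, 10),
    ("dewpoint_depression_c", 17, 20, 999, 10),
    ("wind_direction_deg", 21, 24, 999, 1),
    ("wind_speed_ms", 25, 28, 999, 1),
]

# One character per flag, in the order of the four 2A1 groups and the final 4A1
FLAG_COLUMNS = {
    "QG": 29, "QG1": 30, "QT": 32, "QT1": 33, "QD": 35, "QD1": 36,
    "QW": 38, "QW1": 39, "QP": 41, "LEVCK": 42, "LTYPE": 43, "LQUAL": 44,
}


def parse_level(line: str) -> Dict[str, Union[float, str, None]]:
    """Decode one data record; missing numeric values become None."""
    line = line.rstrip("\n").ljust(45)
    level: Dict[str, Union[float, str, None]] = {}
    for name, start, end, missing, divisor in NUMERIC_FIELDS:
        raw = int(line[start:end])
        level[name] = None if raw == missing else raw / divisor
    for name, column in FLAG_COLUMNS.items():
        level[name] = line[column]
    return level


# The 550-mb level of the Thule sample sounding shown in this document
level = parse_level(" 5500  4260 -416 999 260  16 9P 9P 8  9P 9P90")
print(level["pressure_mb"], level["temperature_c"], level["dewpoint_depression_c"])
```

The sketch assumes every numeric field is populated (missing values carry their codes, as documented); a blank numeric field would raise a `ValueError`.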

*Revision

Experience has shown that the additional quality codes have limited value to the user. Starting with Volume 5, these codes are no longer provided. The columns corresponding to these codes simply appear as blanks. FORTRAN code that performs more comprehensive error-checking is available from M. Serreze.

Sample Data Record

(THULE, GREENLAND: ID=4)

04202 765229125 59 1 1 0     11   63 0  23 4 
10040    31 -329  94  90   3 9P 9P 9P 9P 9P00
10000    62 -308  95  90   3 9P 9P 9P 9P 9P10
 9500   430 -277  79  70   3 9P 9P 9P 9P 9P90
 9000   819 -283  66 105   4 9P 9P 9P 9P 9P90
 8500  1227 -287  83 142   5 9P 9P 9P 9P 9P10
 8000  1660 -301  93 182   5 9P 9P 9P 9P 9P90
 7500  2130 -323  94 225   7 9P 9P 9P 9P 9P90
 7000  2602 -346  97 250   7 9P 9P 9P 9P 9P10
 6500  3130 -370 101 260  11 9P 9P 9P 9P 9P90
 6000  3668 -396 101 260  12 9P 9P 9P 9P 9P90
 5500  4260 -416 999 260  16 9P 9P 8  9P 9P90
 5000  4903 -437 999 270  18 9P 9P 8  9P 9P10
 4500  5600 -488 999 270  20 9P 9P 8  9P 9P90
 4000  6371 -541 999 270  24 9P 9P 8  9P 9P10
 3500  7219 -589 999 270  21 9P 9P 8  9P 9P90
 3000  8180 -620 999 270  18 9P 9P 8  9P 9P10
 2500  9312 -603 999 270  14 9P 9P 8  9P 9P10
 2000 10712 -588 999 260  14 9P 9P 8  9P 9P10
 1750 11545 -620 999 260  15 9P 9P 8  9P 9P90
 1500 12499 -616 999 260  15 9P 9P 8  9P 9P10
 1250 13624 -631 999 260  14 9P 9P 8  9P 9P90
 1000 14984 -670 999 260  10 9P 9P 8  9P 9P10
  800 16321 -698 999 260  10 9P 9P 8  9P 9P90
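Because each header carries NLEVELS, a file can be split into soundings without looking at the data records themselves. The following is a minimal sketch under that assumption; the function name `iter_soundings` is hypothetical, and the example headers are shortened variants of the sample above with NLEVELS edited to match.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def iter_soundings(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (header_line, data_lines) pairs from a stream of records."""
    stream = (line.rstrip("\n") for line in lines)
    for header in stream:
        if not header.strip():
            continue                      # tolerate blank lines between soundings
        nlevels = int(header[39:42])      # NLEVELS field of the header record
        levels = list(islice(stream, nlevels))
        if len(levels) < nlevels:
            raise ValueError("file ended inside a sounding")
        yield header, levels


# Two shortened soundings (00 and 12 GMT) built from the sample record
records = [
    "04202 765229125 59 1 1 0     11   63 0   2 4",
    "10040    31 -329  94  90   3 9P 9P 9P 9P 9P00",
    "10000    62 -308  95  90   3 9P 9P 9P 9P 9P10",
    "04202 765229125 59 1 112     11   63 0   1 4",
    "10040    31 -329  94  90   3 9P 9P 9P 9P 9P00",
]
for header, levels in iter_soundings(records):
    print(header[16:24], len(levels))
```

The same function accepts an open file object, so several one-year files for a station can be processed in sequence to build a station history.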

10. Data Manipulations

Sort/Merge Routines

All original data except those corresponding to ID=1 and ID=5 (Asian data through 1987 and all data from 1988, onward) were received already sorted by decreasing pressure. ID=1 and ID=5, however, were not sorted by decreasing pressure and, therefore, required some special processing. Each original sounding was sorted into five categories:

  1. temperature, dewpoint depression, geopotential height and winds at mandatory pressure levels;
  2. temperature, dewpoint depression and winds at variable pressure levels;
  3. winds at variable pressure levels;
  4. winds at variable geopotential heights (this category was generally reported infrequently); and
  5. tropopause data.

Pressure levels are often repeated in this format. For example, while the first level in categories 2, 3, and 4 is always the surface, each category contains different information. At other times, what was reported as a mandatory level was also repeated as a significant level. Typically, two to four repeat levels were found for each sounding. To make these data compatible with the soundings from other sources, each sounding was passed through a sort/merge program.

The sort/merge program worked as follows: Each sounding was first sorted by decreasing pressure, then by increasing geopotential height. Repeat levels were flagged, based on identical values of pressure or geopotential height. The most complete record possible for each repeat level was then assembled. The first occurrence of any repeat value and its associated quality flag was retained (see Appendix 4). A drawback of the technique is that although two levels may be identified as repeats, temperature or wind values associated with repeat levels are sometimes slightly different. Retaining only the first occurrence will occasionally result in retaining a bad value. While in theory, this problem could have been eliminated by also inspecting each quality flag, the large volume of processing made this impractical. Spot checks indicate that this is only a minor problem. An artifact of the technique, however, is that since some of the quality flags indicate whether or not there is agreement between values at repeat levels (but do not indicate which level is correct), any flags retained relevant to between-category checks should be ignored (see Appendix 4).
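The merge step can be illustrated with a simplified sketch. It is not the original program: it assumes every level has a pressure, identifies repeats by identical pressure only (the original also matched on geopotential height), and represents each variable as a hypothetical (value, flag) pair or `None` when missing. The names `sort_merge` and `FIELDS` are likewise hypothetical.

```python
from typing import Any, Dict, List

Level = Dict[str, Any]
FIELDS = ("height", "temperature", "dewpoint_depression",
          "wind_direction", "wind_speed")


def sort_merge(levels: List[Level]) -> List[Level]:
    """Sort by decreasing pressure and merge levels that repeat a pressure."""
    # sorted() is stable, so levels with equal pressure keep their input order
    ordered = sorted(levels, key=lambda level: -level["pressure"])
    merged: List[Level] = []
    for level in ordered:
        if merged and merged[-1]["pressure"] == level["pressure"]:
            target = merged[-1]
            for name in FIELDS:
                # keep the first occurrence; fill only what is still missing
                if target.get(name) is None and level.get(name) is not None:
                    target[name] = level[name]
        else:
            merged.append(dict(level))
    return merged


# Pressure in tenths of mb; each variable is a (value, quality flag) pair
levels = [
    {"pressure": 8500, "height": (1227, "9"), "temperature": (-287, "9")},
    {"pressure": 10040, "temperature": (-329, "9"), "wind_speed": (3, "9")},
    {"pressure": 8500, "temperature": (-290, "9"), "wind_speed": (5, "9")},
]
for level in sort_merge(levels):
    print(level)
```

In the example, the second 850-mb record contributes its wind speed, but its slightly different temperature is discarded in favor of the first occurrence, which is the drawback noted above.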

*Error-Checking

Data through 1991 were subjected to additional rudimentary error-checking routines. The design of the error-checking procedure was guided by three considerations:

  1. a desire to produce the highest-quality database possible;
  2. recognition that the large amount of raw data (10^6 soundings) prohibits exhaustive error-checking procedures, such as manually inspecting each sounding; and
  3. a desire to retain all data values present in the original soundings, thus allowing users to make their own determination of data quality.

The first step involved flagging obvious errors, such as negative wind speeds and negative geopotential heights. All soundings were then passed through a seasonally adjustable limits check to screen additional errors. No guarantee is made that this routine identifies all errors, and it is recognized that, on occasion, values flagged as errors may in fact be valid. As such, all obvious and probable errors are simply flagged, but the original data are retained.

The seasonally adjustable limits check worked as follows: All data from all stations for 1987 were stratified by season into 15 atmospheric layers bounded by pressure levels. Seasons are defined as December through February (winter), March through May (spring), June through August (summer) and September through November (autumn). Since all checks are based on pressure, no check on the pressure values themselves has been performed.
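The stratification amounts to a lookup by month and by pressure. The sketch below is illustrative: the function names are hypothetical, and the layer bounds in the example are made-up placeholders, since the 15 layers actually used are given in Appendix 3 and are not reproduced here.

```python
from typing import Optional, Sequence

SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def season_of(month: int) -> str:
    """Season as defined in the text (December through February is winter)."""
    return SEASON_BY_MONTH[month]


def layer_index(pressure_mb: float, bounds: Sequence[float]) -> Optional[int]:
    """Index of the layer containing pressure_mb, or None if outside all layers.

    bounds lists the layer boundaries in decreasing pressure; layer i spans
    bounds[i] >= p > bounds[i + 1].
    """
    for i in range(len(bounds) - 1):
        if bounds[i] >= pressure_mb > bounds[i + 1]:
            return i
    return None


# Placeholder bounds for illustration only (not the Appendix 3 layers)
example_bounds = [1050, 850, 700, 500]
print(season_of(1), layer_index(900, example_bounds))
```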

Initial frequency histograms of geopotential height, temperature, wind direction and wind speed were compiled for each layer and for each season in 1987. Extreme outliers were eliminated by discarding values more than four standard deviations from the respective means. Since wind speed does not follow a normal distribution, these values were first converted into log wind speed. Means and standard deviations for the remaining data were then recomputed. Using these data as representative sample means and standard deviations of the complete data set, we then flagged any value that was more than four standard deviations from the sample mean for the respective season and layer. The means and standard deviations used in the limits check are given in Appendix 3.
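The two-pass procedure can be sketched as follows for the values of one variable in one season and layer. This is an illustration, not the original FORTRAN: the function names are hypothetical, the sample standard deviation is assumed (the text does not say which estimator was used), and calm winds are skipped before taking logarithms, which is also an assumption.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def trimmed_limits(values: Sequence[float], n_sigma: float = 4.0) -> Tuple[float, float]:
    """Discard extreme outliers, recompute the statistics, return (low, high)."""
    first_mean, first_sd = mean(values), stdev(values)
    kept = [v for v in values if abs(v - first_mean) <= n_sigma * first_sd]
    final_mean, final_sd = mean(kept), stdev(kept)
    return final_mean - n_sigma * final_sd, final_mean + n_sigma * final_sd


def limits_flag(value: float, low: float, high: float) -> str:
    """'P' if the value passes the limits check, 'F' if it fails."""
    return "P" if low <= value <= high else "F"


def wind_speed_limits(speeds: Sequence[float]) -> Tuple[float, float]:
    """Limits on log wind speed; zero speeds are skipped (an assumption)."""
    return trimmed_limits([math.log(s) for s in speeds if s > 0])


# Thirty well-behaved values plus one gross outlier
values = [-1.0, 0.0, 1.0] * 10 + [1000.0]
low, high = trimmed_limits(values)
print(limits_flag(0.5, low, high), limits_flag(1000.0, low, high))
```

A wind speed would be tested with `limits_flag(math.log(speed), low, high)` against the limits from `wind_speed_limits`.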

An implicit assumption in the error-checking routine is that the layer mean is representative of the mean for any level within that layer. Since a large number of atmospheric levels are used in the check, this is a tolerable assumption for testing temperature, wind direction and wind speed. It was found to be inappropriate for geopotential height, however, because of the logarithmic decay of pressure with increasing elevation.

To check the geopotential heights, we made use of this logarithmic relationship: Taking the log of pressure at the bottom (P1) and top (P2) of the layer in which the observed pressure (P) fell, a weight W was calculated:

W = [LOG(P)-LOG(P2)]/[LOG(P1)-LOG(P2)]. (1)

Next, ZL1 and ZH1, respectively, are defined as the lowest and highest allowable geopotential heights at the base of the layer, and similarly, ZL2 and ZH2 as the lowest and highest allowable geopotential heights at the top of the layer (with limits taken as +/- 4 standard deviations from the mean). We then calculate ZL and ZH, which, by incorporating the weight W, define the limits of allowable geopotential height for the value of P:

ZL = W*[ZL1-ZL2] + ZL2 (2)

ZH = W*[ZH1-ZH2] + ZH2 (3)

With W = 1 at the base of the layer (P = P1) and W = 0 at the top (P = P2), these reduce to the base and top limits, respectively. The allowable value of Z at pressure level P in the sounding is, thus,

ZL <= Z <= ZH. (4)
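The height check can be sketched as below. The sketch assumes that the lower limit is interpolated between ZL1 and ZL2 and the upper limit between ZH1 and ZH2, so that the limits equal the base values at P = P1 and the top values at P = P2; the function names and the numbers in the example are hypothetical and are not taken from Appendix 3.

```python
import math
from typing import Tuple


def height_limits(p: float, p1: float, p2: float,
                  zl1: float, zh1: float,
                  zl2: float, zh2: float) -> Tuple[float, float]:
    """Allowable (low, high) geopotential height at pressure p.

    p1 and p2 are the pressures at the base and top of the layer (p1 > p2).
    """
    w = (math.log(p) - math.log(p2)) / (math.log(p1) - math.log(p2))
    zl = w * (zl1 - zl2) + zl2   # lower limit, log-pressure interpolation
    zh = w * (zh1 - zh2) + zh2   # upper limit, log-pressure interpolation
    return zl, zh


def height_flag(z: float, p: float, p1: float, p2: float,
                zl1: float, zh1: float, zl2: float, zh2: float) -> str:
    """'P' if ZL <= z <= ZH at pressure p, otherwise 'F'."""
    zl, zh = height_limits(p, p1, p2, zl1, zh1, zl2, zh2)
    return "P" if zl <= z <= zh else "F"


# Made-up layer from 850 mb (base) to 700 mb (top) with made-up limits in meters
layer = dict(p1=850.0, p2=700.0, zl1=1000.0, zh1=1800.0, zl2=2500.0, zh2=3400.0)
print(height_limits(800.0, **layer))
print(height_flag(1660.0, 800.0, **layer))
```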

As part of the error-checking routine, the 1000-millibar (mb) mandatory level was deleted if the corresponding geopotential height was less than the reported station elevation. Situations were frequently encountered in which a level was reported (i.e., pressure or geopotential height data were given), but all other variables had missing value codes. These levels were deleted.
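The two deletion rules can be expressed as a single predicate. The sketch below is illustrative: the dictionary keys and the function name `keep_level` are hypothetical, missing values are represented as `None`, pressure is in tenths of millibars as stored in the archive, and any level at exactly 1000 mb is treated as the mandatory level.

```python
from typing import Dict, Optional

Level = Dict[str, Optional[int]]
OBSERVED = ("temperature", "dewpoint_depression", "wind_direction", "wind_speed")


def keep_level(level: Level, station_elevation: Optional[int]) -> bool:
    """Return False for levels that the rules above would delete."""
    height = level.get("height")
    below_station = (
        level.get("pressure") == 10000        # 1000 mb in tenths of mb
        and height is not None
        and station_elevation is not None
        and height < station_elevation
    )
    # A level carrying only pressure and/or height holds no observations
    empty = all(level.get(name) is None for name in OBSERVED)
    return not (below_station or empty)


levels = [
    {"pressure": 10000, "height": 40, "temperature": -308},   # below station
    {"pressure": 9500, "height": 430, "temperature": -277},   # kept
    {"pressure": 9000, "height": 819},                        # no observations
]
kept = [level for level in levels if keep_level(level, station_elevation=63)]
print(len(kept))
```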

Since atmospheric moisture can be highly variable, with considerable uncertainties in cold arctic conditions, the dewpoint depression data were simply screened to flag negative values and any values exceeding an arbitrary high threshold of 35 degrees C. Those wishing to use these data are referred to Elliot and Gaffen (1991), who provide an overview of problems in rawinsonde moisture data.
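The moisture screening is a simple range test. In the sketch below, values are in tenths of a degree as stored in the data records; the function name is hypothetical, and returning `None` for the missing code 999 is an assumption, since the text does not say how missing values were flagged.

```python
from typing import Optional

MISSING = 999                  # missing-value code for dewpoint depression
MAX_DEPRESSION_TENTHS = 350    # 35 degrees C, in tenths of a degree


def dewpoint_depression_flag(value_tenths: int) -> Optional[str]:
    """'P' or 'F' for a dewpoint depression in tenths of a degree C."""
    if value_tenths == MISSING:
        return None            # checked first: 999 would otherwise exceed the threshold
    if value_tenths < 0 or value_tenths > MAX_DEPRESSION_TENTHS:
        return "F"
    return "P"


print(dewpoint_depression_flag(94), dewpoint_depression_flag(-5))
```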

*Experience has shown that these error checks are of limited value to the user. They have therefore been dropped for soundings from 1992 onward. FORTRAN code providing more comprehensive error checks is available from M. Serreze.

11. Notes and Problems

Since each original database was in a different format, some adjustments and changes were required to make all soundings consistent. A number of problems were also noted during processing and as a result of manual spot checks. These are summarized below. Users encountering additional problems are urged to contact the authors of this document. All soundings subsequent to 1987 were obtained from NCAR tapes, which are in the ID=1 format.

Through a processing oversight on the part of M. Serreze, the pressure quality codes (QP) in Volumes 1 through 4 are missing and set to '9'. This problem has been corrected in Volume 5. Soundings in earlier volumes also appear to have some remaining problems related to the sort/merge routines. Again, these have been corrected in Volume 5.

ID=1: (Asian Arctic for 1976-1987; All stations for 1988, onward)

  1. Winds were originally in knots and were changed to meters per second.
  2. Longitudes were originally reported 0 to 360 degrees clockwise from the Greenwich Meridian. They were converted to 0 to 360 degrees counterclockwise as viewed from the North Pole.
  3. If categories 2, 3, and 4 were missing (the data categories for soundings of type ID=1 are described under Sort/Merge Routines in Section 10), the surface report was padded with missing values.
  4. The geopotential height of the surface level is not given in categories 2 through 4. The value has been set equal to the station elevation. However, reported station elevations do not always match those listed by WMO.
  5. Geopotential heights are usually available only for mandatory levels.
  6. If the indicated surface pressure is < 930 mb, the surface report is assumed to be missing.
  7. As noted previously, the requirement to merge duplicate levels means that any quality codes referring to agreement/disagreement between duplicate levels reported in different categories should be ignored.
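The unit and longitude conversions in items 1 and 2 can be sketched as follows. The function names are hypothetical, and rounding to the nearest whole meter per second is an assumption; the text states only that archived wind speeds are in whole meters per second.

```python
KNOT_IN_MS = 1852.0 / 3600.0   # one knot is 1852 m per hour


def knots_to_ms(knots: float) -> int:
    """Convert a non-negative wind speed in knots to whole meters per second."""
    return int(knots * KNOT_IN_MS + 0.5)


def clockwise_to_counterclockwise(longitude: float) -> float:
    """Convert 0-360 degrees clockwise from Greenwich to 0-360 counterclockwise."""
    return (360.0 - longitude) % 360.0


# 68.75 degrees clockwise corresponds to 291.25 degrees counterclockwise,
# the longitude in the Thule sample header
print(knots_to_ms(20), clockwise_to_counterclockwise(68.75))
```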

ID=2: (Victor Starr, 1958-63)

Of all the original data, the Victor Starr soundings contained the most frequent errors. As such, we tried to use soundings from other sources whenever possible (i.e., during periods of overlap).

  1. Occasional surface reports with a value of 1000 mb should be present in any large sounding archive. Very few are noted in the original data, however. It is suspected that, at some point in time, 1000-mb surface reports were lost due to a processing error.
  2. Numerous reported surface pressures from 950 to 960 mb were encountered. In most cases, these must be errors. While such levels will often be flagged as errors on the basis of the geopotential height limits check, many were observed to pass this check.
  3. Temperature values were frequently flagged by the limits check algorithm. Surface temperature values in these situations were sometimes above 20 degrees C, with suspiciously high values throughout the profiles.
  4. Winds were originally reported in U and V components. They were converted to speed (meters per second) and direction (0-360 degrees).
  5. At times, the highest reported level had a geopotential height of 0 m. Temperatures also appear bogus in these cases. These are flagged as errors. The problem seems to only occur when 27 levels (the maximum possible) are reported. This appears to be a problem with the original data, and may have resulted from a bad array size.
  6. Errors are particularly common at 1200 GMT. The presence of a systematic error in the original processing is suspected. These soundings should be viewed with caution.
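The conversion in item 4 can be sketched as below. It assumes that U is the eastward component and V the northward component, and that direction is the direction from which the wind blows, measured clockwise from north; the text does not state the component convention of the original data. The function name is hypothetical, and reporting calm winds as direction 0 is also an assumption.

```python
import math
from typing import Tuple


def uv_to_speed_direction(u: float, v: float) -> Tuple[float, float]:
    """Return (speed, direction in degrees) from U and V wind components."""
    speed = math.hypot(u, v)
    if speed == 0.0:
        return 0.0, 0.0
    # atan2(-u, -v) gives the bearing the wind comes from, clockwise from north
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction


# A 3 m/s wind blowing toward the west comes from the east (90 degrees)
print(uv_to_speed_direction(-3.0, 0.0))
```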

ID=3: (Canadian)

  1. The geopotential height of the surface report and the station elevation were supplied in the original sounding. Situations were frequently encountered in which the geopotential height at the surface was lower than the station elevation. Also observed were cases in which the surface geopotential height was greater than the station elevation.
  2. Surface pressure reports are sometimes shown as higher than the adjacent level by approximately 0.1 mb. (Only with ID=3 data are surface pressures commonly given to the nearest 0.1 mb.) However, the adjacent level shows geopotential height lower by 1 to 2 m.
  3. In the original tapes, the five-digit code for the Canadian stations had a '2' or '4' as the second byte (variable STATION in the header record). The second byte was changed to '1' to match WMO codes.
  4. Wind speed was originally in knots, and was converted to meters per second.

ID=4: (Alaska and Thule, Greenland)

  1. As discussed earlier, these soundings have two additional quality codes LTYPE (type of level, e.g., significant, mandatory, surface, etc.) and LQUAL (overall quality code for each level). LQUAL should not be confused with LEVCK, which is the overall quality code for the level based on our limits check used through 1987.
  2. Wind speed was originally in knots, and was converted to meters per second.
  3. Original moisture data were reported as relative humidity, and were converted to dewpoint depression.
  4. In some early soundings, a 1000-mb level is reported as a mandatory level even when the surface pressure is < 1000 mb.

ID=5: (Soviet, 1962-75)

  1. If categories 2, 3, or 4 were missing (the data categories for soundings of type ID=5 are described under Sort/Merge Routines in Section 10), the surface report was padded with missing values.
  2. If categories 2, 3, or 4 were available, but the indicated surface pressure was < 930 mb, the surface report was assumed to be missing, and was padded with missing values.
  3. As for ID=1, longitude values were converted to 0 to 360 degrees counterclockwise as viewed from the pole.
  4. Wind speed was originally in knots, and was converted to meters per second.
  5. Station elevation was not included in the sounding reports until 12/16/63. All station elevations were set to missing for years before 1964.

See Error Checking.

12. Notes

See Notes and Problems.

13. Application of the Data Set

Soundings are typically used in synoptic meteorology and for climatology.

14. Data Set Plans

Plans are to periodically update this archive as additional data become available.

15. References

Elliot, W. P. and D. J. Gaffen. 1991. On the utility of rawinsonde humidity archives for climate studies. Bull. Am. Meteor. Soc. 72(10):1507-1520.

Garand, L., C. Grassotti, J. Halle, and G. Klein. 1992. On differences in radiosonde humidity reporting practices and their implications for numerical weather prediction and remote sensing. Bull. Am. Meteor. Soc. 73(9):1417-1423.

Kahl, J. D., M. C. Serreze, S. Shiotani, S. M. Skony, and R. C. Schnell. 1992. In situ meteorological sounding archives for arctic studies. Bull. Am. Meteor. Soc. 73(11):1824-1830.

Kahl, J. D., M. C. Serreze, R. S. Stone, S. Shiotani, and M. Kisley. 1993. Tropospheric temperature trends in the Arctic: 1958-1986. Journal of Geophysical Research 98(D7):12,825-12,838.

Mulder, P. 1977. Record format for the NCAR archive of NMC upper air ADP data. NOAA Office Note 29, 50 pp.

Nielon, J. R. 1964. Upper-air ADP record format. Office Note No. 20. Computational Branch, National Meteorological Center, U.S. Department of Commerce, Weather Bureau.

Serreze, M. C., J. D. Kahl, and R. C. Schnell. 1992. Low-level temperature inversions of the Eurasian Arctic and comparisons with Soviet drifting station data. Journal of Climate 5(6):615-629.

16. Related Software

Please see HARA Access Software.

See also HARA Volume Structure.

17. Data Access

Access the data via FTP.

18. Output Products and Availability

This data set contains millions of vertical soundings of temperature, pressure, humidity and wind, representing all available rawinsonde ascents from the beginning of record through mid-1996.

19. Glossary of Acronyms

No further information at this time.