Get Data


Data can be downloaded through a Web browser or from the command line via FTP. When using a Web browser, the FTP link first directs you to an Optional Registration Form that, if filled out, allows you to receive notifications about updates or processing changes related to that specific data set. After you complete the Optional Registration Form, the FTP directory becomes available. For additional help downloading data through an FTP client, go to the How to access data using an FTP client support page.


Data Set ID:

Central Asia Temperature and Precipitation Data, 1879-2003, Version 1

This data set provides temperature and precipitation data from 298 meteorological stations in the Northern Tien Shan and Pamir Mountain Ranges of Central Asia, specifically from stations in Kazakhstan, Kyrgyzstan, Tajikistan, Turkmenistan, and Uzbekistan. The period of record covered by each station is variable; however, most stations have almost 100 years of observations, with the earliest record from 1879 and the latest from 2003. The data are stored in tab-delimited ASCII text, Microsoft Excel, and PDF formats, and are available via FTP.

Geographic Coverage

  • Atmospheric Temperature > Air Temperature
  • Precipitation > Precipitation Amount
Spatial Coverage:
  • N: 50, S: 35, E: 85, W: 55

Spatial Resolution: Not Specified
Temporal Coverage:
  • 1 January 1879 to 31 December 2003
Temporal Resolution: 1 month
Data Format(s):
  • Microsoft Excel
  • PDF
  • ASCII Text
Version: V1
Data Contributor(s): M Williams, Vladimir Konovalov

Data Citation

As a condition of using these data, you must cite the use of this data set using the following citation. For more information, see our Use and Copyright Web page.

Williams, M. W. and V. G. Konovalov. 2008. Central Asia Temperature and Precipitation Data, 1879-2003, Version 1. [Indicate subset used]. Boulder, Colorado USA. NSIDC: National Snow and Ice Data Center. doi: [Date Accessed].


This data set updates and expands the NOAA Global Historical Climate Network (GHCN) of quality controlled meteorological records, focusing on the Northern Tien Shan and Pamir Mountain Ranges of Central Asia. It is compiled primarily from meteorological measurements conducted by the National Hydrometeorological Services (NHMS) of the Central Asian countries. Precipitation data are monthly sums, bias corrected for gauge type and for wetting but not for wind. The correction factors (K1 for gauge type, K2 for wind, and K3 for wetting) are included as separate files. Temperature data are monthly means, that is, the mean of daily temperatures for that month, where daily temperature is defined as the average of all observations for each calendar day. For many stations, average maximum and average minimum temperature are supplied as well. These are derived from daily maximum and minimum temperatures. The station metadata for this data set are station histories, population, vegetation, and topography. Data were subjected to rigorous quality control and homogeneity assessment procedures, consistent with those used for the GHCN.

There are records from 298 stations. The period of record covered by each station is variable, and in the period from 1985 to 1995, there was a sharp reduction in the number of operating stations. The earliest record is from 1879. Records are updated through 2003 where data are available. Most stations have almost 100 years of observations. Records are from stations in Kazakhstan, Kyrgyzstan, Tajikistan, Turkmenistan, and Uzbekistan.

The Global Historical Climate Network product and data from Central Asia

Climate research relies heavily on meteorological data from surface stations. Usually, stations are part of a national network, and some or all of these stations may be in the World Meteorological Organization network of stations, reporting observations over the Global Telecommunication System (GTS). The instrumental records from surface stations are generally kept to assess local conditions and are not always suitable for climate studies. For instance, the records are often not digitized or are not readily available outside of the country in which they were measured. An uneven distribution of stations introduces network biases that have significant effects on estimated temperature trends. Instrumental records also often contain data errors resultant from the data recording and archiving processes. These errors take many forms (e.g., outliers, truncations). In addition, instrumental records are subject to inhomogeneities caused by many factors. For example, stations are relocated with attending changes in surrounding topography or vegetation, new instrumentation is introduced, personnel changes may result in subtle changes in operating procedures, and populations may grow up around stations, introducing a heat island effect. Such inhomogeneities introduce non-climatic variation into historical records.

The GHCN of monthly data is often used for climate studies (Jones and Moberg 2003) because the data records it contains have been quality checked. Both historical and near-real-time GHCN data undergo rigorous quality assurance reviews. These reviews include preprocessing checks on source data, time series checks that identify spurious changes in the mean and variance, spatial comparisons that verify the accuracy of the climatological mean and the seasonal cycle, and neighbor checks that identify outliers from both a serial and a spatial perspective (from GHCN Monthly Version 2).

Version 1 of the GHCN was released in 1992 and Version 2 in 1997. Version 2 enhancements include:

  • data for additional stations to improve regional-scale analyses, particularly in previously data sparse areas
  • the addition of maximum/minimum temperature data, to provide climate information not available in mean temperature data alone
  • detailed assessments of data quality to increase the confidence in research results
  • rigorous and objective homogeneity adjustments to decrease the effect of non-climatic factors on the time series
  • detailed metadata such as population, vegetation, and topography that allow more detailed analyses to be conducted
  • an infrastructure for updating the archive at regular intervals so that current climatic conditions can constantly be put into historical perspective (Peterson and Vose, 1997).

There are relatively few high elevation climate station records in the GHCN: most stations are below 500m elevation. This gap is significant, because mountainous areas drive synoptic climatology, and climate cannot be modeled well in these areas without station data. To address this problem for Central Asia, data from state or country official Monthly Reference Books, or from other data collections, were acquired. Most observations were originally recorded on paper at the meteorological station. Most records were provided to the data set authors in digital form, but some required digitization. In some cases digital archives of daily data or operational data from the GTS were used. Appendix 1 lists references for data sources, as well as some related references.

GHCN v.1 has not been updated since 1992, and the Central Asia records generally end in 1988. GHCN v.2 is being updated regularly by the National Climatic Data Center (NCDC), but of the relatively few Central Asia stations in GHCN v.2, most have records only to 1993 or earlier. The authors of this data set have filled a gap in the climatological record by making high elevation station data from Central Asia available after extensive quality control. They have accounted for inhomogeneities in the station air temperature data by standardizing data and removing values more than three standard deviations outside the mean. Precipitation data have been corrected (for gauge type and wetting but not for wind) using standard methods and coefficients. The authors evaluated temperature data for spatial homogeneity by using multifactor equations based on latitude, longitude, and elevation.

Table 1 shows the GHCN stations to which this data set adds data. Table 2 gives an overview of the data coverage in this data set; Appendix 7 has more detailed coverage information. In Table 1 and Table 2, the country codes correspond to these countries: 211 = Kazakhstan, 213 = Kyrgyzstan, 227 = Tajikistan, 229 = Turkmenistan, and 231 = Uzbekistan. In all, this data set adds to 75 stations already existing in GHCN v.1 or v.2, and also provides data from 223 new stations.

In Table 1, a cross marks those stations for which station records were extended or gaps were filled by this data set (from an unpublished report by Williams and Konovalov, 2005).

Table 1. Central Asia stations in GHCN v.1 and GHCN v.2

In Table 2, the NN column has the number of stations that can now be added to the GHCN, the Years/st column is the number of station years for each country, and the % gaps column is the percentage of the time covered by the records for that country where data are absent, from Williams and Konovalov (2005).

Table 2. Data Set Coverage Statistics

Differences with GHCN Data

These data have undergone rigorous quality assessment like data in the GHCN. However, there are some differences. GHCN data have been quality controlled by a variety of methods (Peterson et al., 1998), and data points that fail the QC tests are not included in the data set. In addition to the quality controlled data, GHCN includes a quality controlled and homogeneity adjusted version. This version makes historical data homogeneous with present day observations by adjusting for non-climatic discontinuities. Homogeneity adjusted data can be better for looking at regional scale long term trends, but using adjusted data has some disadvantages. In particular, adjusted station data can be less representative of local microclimate than non-adjusted data (Peterson and Vose, 1997). This Central Asia data set has only one version of temperature data, and these data have not been adjusted for homogeneity following GHCN practices. Instead, temperature values are checked for inconsistencies relative to station record statistics as described in a later section.

For any given station, the GHCN may have more than one temperature time series (Peterson and Vose, 1997), and each time series taken singly may not cover the entire history of the station. One reason for this is that there may be more than one series, and generally the method of calculating the monthly mean temperature at a station is not readily available. Peterson and Vose (1997) point out that differences in temperature attributable to calculating the mean using two different methods at a particular station can be greater than the temperature difference between two neighboring stations. Since it is not known how the differences came to be, all the series are retained. In this Central Asia data set, there is only one record for each station. This is because data were originally recorded following precise instructions, so it was rare to have more than one version of a station record.

The GHCN data set includes raw precipitation as well as adjusted data, while this data set consists of only adjusted precipitation data.

History of Meteorological Observations in the former Soviet Union

All meteorological stations in the former Soviet Union measured climate parameters using the same instruments and sampling frequency. Observations were recorded by hand in field notebooks. Monthly observations books were then mailed to the Computing and Processing Department of the National Hydrometeorological Services (NHMS) for each state, where they were stored. Data were then entered onto magnetic tape using a format that translates as Language of Data Description. The main archival facility for Central Asia prior to 1991 was the Regional Computing and Processing Center at Tashkent, Uzbekistan. Prior to 1991, a subset of data from Central Asia was archived at the Russian Center of Hydrometeorological Information in Obninsk. The data set authors worked with the Regional Computing and Processing Center at Tashkent, Uzbekistan to download archived meteorological information from magnetic tape to ASCII. Other data sources were used as well.

Data are no longer sent to the Russian Center of Hydrometeorological Information in Obninsk nor the center in Tashkent. Access to the meteorological information collected since 1991 differs by country within Central Asia. The following information was current in 2005.

  • Kazakhstan: Meteorological data are free for non-profit entities registered with the Federal Administration. Some but not all of the data are saved in electronic format.
  • Kyrgyzstan: Meteorological information is stored only in paper format, there is only one master copy, and it is available only to employees of Kyrgyzhydromet.
  • Tajikistan: Same as Kyrgyzstan.
  • Turkmenistan: Meteorological data are free for non-profit entities; they are stored only in paper format.
  • Uzbekistan: Same as Kazakhstan.

After the breakup of the Soviet Union in 1991, most meteorological stations continued operation. There have been some reductions since that time. For example, in 1980 there were 312 stations measuring air temperature and precipitation amount in Central Asia. By 1996, the number of stations decreased to 248.

Detailed Data Description


The data are stored in tab-delimited ASCII text, Microsoft Excel, and PDF formats.

Each data sheet contains a table with columns containing the following:

  • Three-digit WMO country code
  • Five-digit WMO station number
  • Station or post name
  • Coordinates (longitude and latitude in decimal degree)
  • Altitude (meters above sea level)
  • Year
  • Temperature (degrees Celsius) or precipitation (mm) values for each month, in separate columns with Roman numerals for months

The metadata sheet for each file contains the same contents as listed above along with the first and last year of record, duration of record in years, percentage of the record with no data (data gaps), and percentage of data completeness for each month for each station.

Missing data are indicated by -999.0.
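The layout described above can be read with a few lines of Python. This is a minimal sketch, not a definitive reader: it assumes each text file has a header row, and the column labels used here ("Station", "Year", and the Roman numerals I to XII) are hypothetical stand-ins for whatever labels the files actually carry. Only the tab delimiter, the Roman-numeral month columns, and the -999.0 missing-data flag come from the description above.

```python
import csv
import io

MISSING = -999.0  # missing-data flag stated in this documentation
MONTHS = ["I", "II", "III", "IV", "V", "VI",
          "VII", "VIII", "IX", "X", "XI", "XII"]


def parse_monthly_rows(text):
    """Parse tab-delimited station-year rows into dictionaries.

    The header labels are assumptions for illustration. Values equal
    to the missing-data flag are returned as None.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for rec in reader:
        monthly = []
        for month in MONTHS:
            value = float(rec[month])
            monthly.append(None if value == MISSING else value)
        rows.append({
            "station": rec["Station"],
            "year": int(rec["Year"]),
            "monthly": monthly,
        })
    return rows


# A made-up record: January missing, every other month 1.5.
SAMPLE = (
    "\t".join(["Station", "Year"] + MONTHS) + "\n"
    + "\t".join(["Example", "1950", "-999.0"] + ["1.5"] * 11) + "\n"
)
```

Converting the flag to None at read time keeps -999.0 from leaking into means or sums computed later.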

File Naming Convention and Contents
Table 3. File Naming Convention and Contents
File Name, Description
Taver_v1.xls Excel file with a Taver sheet containing monthly average temperature by station and year, and an Info_Taver sheet containing information about each station record. The metadata sheet Info_Taver contains the same contents as listed above along with the first and last year of record, duration of record in years, percentage of the record with no data (data gaps), and percentage of data completeness for each month and for each station.
Taver_v1_1.txt The Taver sheet of the Excel file, as a tab delimited text file.
Taver_v1_2.txt The Info_Taver sheet of the Excel file, as a tab delimited text file.
Tmin_Tmax_v1.xls Excel file with four sheets for monthly mean minimum temperature, information about the monthly mean minimum temperature records, monthly mean maximum temperature, and information about the monthly maximum temperature record. The format is the same as for Taver_v1.xls.
Text files for the above sheets.
Precip_v1.xls Excel file with a Precip sheet containing monthly precipitation sums by station and year, and an Info_prec sheet containing information about each record including start and end year and percentage complete. The format is the same as for Taver_v1.xls.
Precip_v1_1.txt The Precip sheet of the Excel file, as a tab delimited text file.
Precip_v1_2.txt The Info_prec sheet of the Excel file, as a tab delimited text file.
Appendix_1.pdf Adobe PDF file with a table of all the sources used for data, and related references, in Russian (Cyrillic) as well as English.
Appendix_2.xls Excel file of station metadata such as location and surroundings. The first six columns are for the data files.

INFO (column G) has PRCP to indicate when the stations’ precipitation gauge with Nipher shield was replaced by a Tretyakov precipitation gauge, with the year, month and day of the change in columns H, I, and J. 

The INFO column also has a MOVE indicator. If MOVE is followed by a year in column H and –9s in columns I and J, 0 in K, and –99 in L, the station has remained in the same spot. -9 is also used for missing data. If MOVE is followed by other information, that information is on the year, month, and day of the move, the cardinal direction of the move, and the distance in km. 

If the Distance column contains 0, the relocation was by less than 100m.

Column M gives information about the surroundings, while columns N-Q give the population for stations in inhabited areas for census years 1959, 1979, 1979, and 1991, in thousands. 
Appendix_2_1.txt A tab delimited text file based on Appendix_2.xls.
Appendix_3.pdf Adobe PDF file with a table of all the sources used for the metadata in Appendix 2, in Russian (Cyrillic) as well as English. These references are also in Appendix 1.
Appendix_4.xls Statistics for Kazakhstan, Kyrgyzstan, Tajikistan, Turkmenistan, and Uzbekistan on long-term mean monthly air temperatures (mean, standard deviation, minimum, maximum, utmin and utmax).
Text files for the sheets in Appendix_4.xls.
Appendix_5.xls Charts of anomalies in mean monthly maximum and minimum temperature, 1961-1990, in a Microsoft Excel file. The plus or minus 3 sigma line is also shown. There is a chart for each month, and each chart is based on all the stations in the data set. Note: There is not a tab-delimited file for this spreadsheet since there are primarily graphs contained in this file.
Appendix_6.1.pdf PDF file with precipitation bias correction coefficients K1 (see note 1).
Appendix_6.2.pdf PDF file with precipitation bias correction coefficients K2 (see note 1).
Appendix_6.3.pdf PDF file with precipitation bias correction coefficients K3 (see note 1).
Appendix_7.xls A summary of the location and data coverage of each station in the database such as:
  • country number
  • country code
  • station name
  • longitude and latitude
  • altitude
  • first and last years
  • duration of years
  • percentage of gaps and the percentage of monthly data completeness for both temperature and precipitation data
Appendix_7_1.txt A tab delimited text file based on Appendix_7.xls.
Note 1: In the Appendix_6 PDF files, the column named Type refers to the site descriptions given in Table 4 (Williams and Konovalov, 2005).


Table 4. Classifications and Site Descriptions of Stations/Posts
Classification, Site Description
1a Site bounded by solid, not ventilated fence. Clearing in dense forest. Site in mountain depression. Most favorable for measurements.
1b Site bounded by openwork vegetation or buildings. Vertical profile of wind is uniform.
2a Half-protected site in city, village, or hilly place. Protection by buildings or vegetation from one, two, or three sides of horizon. Very different influence on precipitation.
2b Outlying district of inhabited locality. Half-protected site, bounded by dense obstacles such as separate buildings or groups of trees. Narrow gorge. Least favorable conditions for measurement of precipitation.
3 Open site among small houses or separate trees. Logarithmic profile of wind is preserved.
4 Open sea shore. Mouth of large river. Island or summit of mountain. Regional features of place are most important.
Measurement History

As a rule, meteorological observations were performed according to the Manual for Hydrometeorological Stations and Posts, Gidrometeoizdat, 1985 (translated from the Russian "Nastavlenie gidrometeorologicheskim stantsyyam i postam." Vypusk 3, chast’ 1). Stations are sites that have a more complete observation program than posts. Stations have a WMO number and posts may not. A standard meteorological site was 26 m by 26 m, and was located on relief typical of the area. It was more than 100 m distant from any bodies of water, and at a distance 20 times the height of any obstruction such as trees or a building.

Changes in methods and instruments for air temperature and precipitation measurements over the period of record that can lead to inhomogeneity in the data record are described by Williams and Konovalov (2005). See that paper, and the references therein, for more information.

Air Temperature

The thermometer type used at each station remained the same throughout the period of record. Minimum temperature was consistently measured with an alcohol thermometer, whereas hourly and maximum temperatures were each collected with separate mercury thermometers. When the air temperature approached the freezing point of mercury (-38.9°C), either an alcohol thermometer, or in some cases a minimum thermometer alcohol column, was used in place of the mercury thermometer. Information on when thermometers were replaced is available from the site histories at each station. Refer to Appendix 2.

The type of shelter or screen was not standardized until about 1930. No guidelines on the type of shelter existed prior to 1912. In 1912 thermometers were housed in Stevenson screens. From 1920 to 1930, depending on the station, the Stevenson screens were replaced with the current screens. The Russian name for these screens translates as "meteorological small house." In 1928, additional guidelines were issued regarding the exact dimensions of the shelters and their mounting heights. From 1930 on, most stations had their thermometers sheltered in roughly the same fashion.

Potential inhomogeneities in the temperature data set were caused by differences in the times when hourly measurements were recorded. Prior to 1936, hourly measurements for computing daily mean temperature were taken at 0700, 1300, and 2100 Local Mean Time (LMT). Because of the lack of nighttime observations, daily mean temperature was probably overestimated at some stations, depending on location. Beginning in 1936, all thermometers (hourly, minimum, and maximum) were checked at 0100, 0700, 1300, and 1900 LMT at most stations. From 1966 to the present, all thermometers were checked at 3-hour intervals beginning at midnight Moscow winter Legal Time (MLT), which is three hours later than Greenwich Mean Time. Table 5 provides a summary of this information.
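The effect of the observation schedule on the daily mean can be illustrated with a short Python sketch. The observation hours per period are taken from the text above; everything else is an assumption made for illustration, in particular the idealized sinusoidal diurnal cycle and its parameters (mean, amplitude, warmest hour), which are not from the data set. The 3-hourly schedule after 1966 is tied to MLT rather than local time, which the sketch ignores.

```python
import math


def observation_hours(year):
    """Hours of day at which thermometers were read, per the text."""
    if year < 1936:
        return [7, 13, 21]            # three readings, none at night
    if year < 1966:
        return [1, 7, 13, 19]         # four evenly spaced readings
    return list(range(0, 24, 3))      # every 3 hours (MLT in reality)


def synthetic_temperature(hour, mean=10.0, amplitude=5.0, warmest_hour=15):
    """Hypothetical diurnal cycle: a cosine peaking at warmest_hour."""
    return mean + amplitude * math.cos(2 * math.pi * (hour - warmest_hour) / 24)


def daily_mean(year, temperature_at=synthetic_temperature):
    """Average of the readings taken under the schedule for that year."""
    hours = observation_hours(year)
    return sum(temperature_at(h) for h in hours) / len(hours)
```

With this synthetic cycle, whose true mean is 10.0, the pre-1936 schedule gives a daily mean above 10.0, while the evenly spaced schedules used from 1936 onward recover the true mean. The size and even the sign of the bias at a real station depend on its actual diurnal cycle.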

Table 5. Summary of Air Temperature Recording Methods and Instrumentation. From Williams and Konovalov (2005).

Precipitation

By the late 1800s there were several hundred precipitation gauges in what was to become the former Soviet Union. In the late 1880s, it was determined that precipitation gauges underestimated the true amount of precipitation, especially precipitation in solid form. In 1891, the Nipher wind shield was introduced, and installation of the shield began.

The need for meteorological observations at airports was responsible for the move of some stations to airports in the 1930s. As new stations came online after the 1930s, many were located at airports. The unobstructed terrain characteristic of airports resulted in the precipitation instruments becoming more exposed to wind, potentially causing undercatch to increase. Even with the Nipher shield, undercatch of precipitation during the winter was still a problem.

Over the period 1946 to 1960, rain gauges with Nipher shields were replaced with the Tretyakov gauge. Tretyakov precipitation gauges reduce errors caused by undercatch in windy conditions and by the blowing of solid precipitation out of the gauges during snowstorms. Nipher gauges record considerably less solid precipitation than Tretyakov gauges. For this reason, data acquired using Nipher shield gauges cannot be used with data acquired using Tretyakov gauges without adjustments to the data record. This correction, K1, is typically applied to monthly totals using coefficients derived for individual stations. The amount of snow recorded at each station increased by 5 to 40 percent after installation of the Tretyakov gauge. The date that this new gauge replaced the old is included in the station history metadata.

At the stations, a meteorologist measured the amount of precipitation one, two or four times every 24 hours. The duration and type of precipitation were observed continuously. The amount of precipitation was measured to a precision of 0.1 mm.

Changes in the time of observation occurred in 1936, 1966, and 1986. Prior to 1936, precipitation was measured once a day at 0700 LMT. From 1936 to 1965, gauges were checked twice daily at 0700 and 1900 LMT. Beginning in 1966, gauges were checked four times a day, relative to MLT, which varied with the time zone of the region. In 1986, observations were again reduced to twice daily.

Concomitant with the increased daily sampling in 1966, a wetting correction was applied. The wetting correction accounts for systematic moisture losses due to moisture remaining on the sides and bottom of the precipitation gauge bucket when precipitation is decanted to the precipitation measurement glass. The amount of precipitation added to the total by the wetting correction depends on the number of observations per day. This is a particularly important correction because a volumetric method of measurement for non-recording precipitation gauges is used. Unfortunately, the wetting correction for 1966 was insufficiently tested. In 1967 a better wetting correction based on controlled laboratory experiments was implemented. The correction adds 0.2 mm for liquid and mixed precipitation and 0.1 mm for solid precipitation for each individual measurement. The data published in reference books since 1967 use this improved wetting correction. The data for 1966 use the less reliable method for that year. Precipitation data published in reference books prior to 1966 do not have a wetting correction. Table 6 provides a summary of the history of changes in precipitation measurement practices.
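As a rough illustration of the arithmetic, the following Python sketch computes the amount that the 1967 wetting correction adds for a given number of individual measurements, as published in the reference books. The function name and arguments are hypothetical. The 1966 method is not specified in this document, so the sketch returns None for that year rather than guessing, and it does not represent the K3 coefficients distributed with the data set.

```python
LIQUID_OR_MIXED_MM = 0.2  # added per measurement of liquid or mixed precipitation
SOLID_MM = 0.1            # added per measurement of solid precipitation


def wetting_correction_mm(year, n_liquid_or_mixed, n_solid):
    """Total wetting correction in mm for a set of gauge measurements.

    Before 1966 the published data carried no wetting correction.
    For 1966 a different, less reliable method was used; it is not
    described here, so None is returned for that year.
    """
    if year < 1966:
        return 0.0
    if year == 1966:
        return None
    total = LIQUID_OR_MIXED_MM * n_liquid_or_mixed + SOLID_MM * n_solid
    return round(total, 1)  # precipitation was measured to 0.1 mm
```

For example, three rain measurements and two snow measurements in a month in 1975 would add 3 × 0.2 + 2 × 0.1 = 0.8 mm. Because the correction is applied per measurement, the total grows with the number of observations per day, as noted above.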

Table 6. Summary of Precipitation Recording Methods and Instrumentation. From Williams and Konovalov (2005).

Station Metadata

The history of each station was acquired from publications such as History and Physical-Geographic Description of Stations and Gauge Posts, published as part of the Reference Book on the Climate of the USSR (1966-1969). Appendix 3 gives specific metadata sources. Historical metadata include station relocation date(s), the distance and direction of any such move(s), and date(s) for any changes in instrumentation such as the switch from Nipher shields to Tretyakov precipitation gauges.

Metadata consisting of information about the present environment of the station are included in Appendix 2, following the protocol in Peterson and Vose (1997). Station name, latitude, longitude, and elevation, following the coding from the current World Meteorological Organization station listings (WMO, 1996), are provided, in addition to:

  • Population: Population metadata provide a valuable tool for climate analysis. Knowing whether a station is located in a rural area or near a large city allows data users to avoid the effects of urban heat islands, one of the criteria in the initial selection of the Global Observing System Surface Network (Peterson et al., 1997). Population values were obtained from 1990 census data. To conform with GHCN protocols, each site is classified as: (i) rural (not associated with a town larger than 10,000 people); (ii) small town (10,000 to 50,000 people); and (iii) urban (city with a population greater than 50,000).
  • Airport locations: Many stations were moved to airport locations over the last several decades. If a station is located at an airport, this information along with the distance from its associated city or town is included.
  • Topography: Each station is classified as flat, hilly, or mountainous. Mountain valley stations and the few ridge or mountaintop stations are distinguished.
  • Vegetation: Vegetation classification in the surrounding area for each station includes forested, agricultural, clear, open, marsh, ice, and desert.
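The population classes in the list above can be expressed as a small Python function. This is a sketch under stated assumptions: the function name is hypothetical, population is taken in thousands (the unit used for the population columns in Appendix_2.xls), a missing value is treated as "not associated with a town", and the handling of the exact boundary values (10,000 and 50,000) is a guess, since the text does not say which class they fall in.

```python
def classify_population(pop_thousands):
    """Return the GHCN-style population class for a station.

    pop_thousands: population of the associated town in thousands,
    or None when the station is not associated with any town.
    """
    if pop_thousands is None or pop_thousands <= 10:
        return "rural"        # no town larger than 10,000 people
    if pop_thousands <= 50:
        return "small town"   # 10,000 to 50,000 people
    return "urban"            # more than 50,000 people
```

A user who wants to limit urban heat island effects could, for instance, keep only the stations for which this function returns "rural".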
Temporal and Spatial Coverage and Resolution

Data are available as monthly values (monthly means for air temperature and monthly sums for precipitation). The temporal coverage varies by station, but the longest range spans 1879 to 2003.

The data fall within this bounding box:

Northernmost Latitude: 50° N
Southernmost Latitude: 35° N
Easternmost Longitude: 85° E
Westernmost Longitude: 55° E

Figure 1 shows the locations of the stations in this data set.

Figure 1. Spatial Coverage Map for Central Asia Precipitation and Air Temperature Stations

Parameter or Variable

Sample Data Record

In each Excel file, there is a data worksheet and an info worksheet. For example, in the average air temperature data file, these worksheets are named Taver and Info_Taver. The worksheets with info in the name contain the metadata that correspond to the data. Refer to Tables 7 through 12 for an example of the beginning sections of the worksheets for the Taver_v1.xls, Precip_v1.xls, and Tmin_Tmax_v1.xls files.

Table 7. Taver Worksheet for the Average Air Temperature Data File

Table 8. Info_Taver Worksheet for the Average Air Temperature Data File

Table 9. Precip Worksheet for the Precipitation Data File

Table 10. Info_Precip Worksheet for the Precipitation Data File

Table 11. Tmax Worksheet for the Maximum Temperature Data File

Table 12. Info_Tmax Worksheet for the Maximum Temperature Data File


Software and Tools


Data Acquisition and Processing

The data were acquired from books, online sources, and publicly available data sets, and keyed or reformatted as needed. Appendix 1 provides a list of the original data sources. Sometimes daily data from the GTS or other sources of digital data were used. The WMO codes for countries and stations, the station coordinates, altitudes, and the English transcription of station names are from reference sources. See Appendix 3 for the metadata sources. In some instances, topographical maps with a scale of 1:100,000 were used to verify coordinates.

The data set authors:

  • Digitized (keyed) the data when needed.
  • Performed quality control on the measurements. This included standardizing air temperature data and removing values more than 3.5 sigma outside the normalized mean.
  • Applied standard Soviet bias corrections to precipitation data.
  • Evaluated air temperature for spatial homogeneity using multifactor equations based on latitude, longitude, and elevation. See Equation 5 in the document GHCN: Regional Data Base on Climate of Central Asia v.1.0 (Williams and Konovalov, 2005).
  • Prepared metadata that are in Appendix 3.
  • Assessed the spatial homogeneity of the data.

NSIDC published the data after making only a few minor changes. The changes include substituting -999.0 for no data (the original data marked no data with a blank cell colored yellow), creating text versions of the Excel files, and removing equations from the Excel files so that they contain only values. In Appendix_2.xls, rows that had MOVE in the INFO column followed by -999.0 in the year column were removed. Appendix_5.xls was linked to the original source data and this link was severed at NSIDC.

Consistency checks

Checks were made for file truncations, formatting errors, and unreadable records. Further checks confirmed that date variables have reasonable values, that each station is chronologically sorted, that the number of lines and the record length are consistent with the time span, and that variable storage locations and variable types (integer, real, character) are consistent and in the correct columns.

Evaluation of Station Time Series

For source data that passed the consistency checks, the individual station time series were evaluated. Information on station location (latitude, longitude, elevation) was checked against the WMO directory and local atlases for each country. The authors have local knowledge that gave an additional measure of rigor to the checks.

Other checks looked for three or more months with values identical to a precision of one decimal place. Gross discontinuities, such as those caused by inappropriate concatenation of two stations into one or by a change in units (for example, from Celsius to Kelvin), were evaluated using two tests. The Cumulative Sum (CUSUM) test developed by van Dobben de Bruyn (1968) tests for a change in the mean of a time series and has been used to determine the homogeneity of climate stations (Rhoades and Salinger, 1993). This test is sensitive to both long-term trends and outliers. Because mountain stations often have anomalous but valid climate measurements, each decade is tested separately. Changes in variance were evaluated using a variant of CUSUM called SCUSUM.
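The two ideas in this paragraph can be illustrated with a short Python sketch. It is a simplified illustration, not the van Dobben de Bruyn procedure itself: it accumulates deviations from the series mean and reports where the cumulative sum is largest in magnitude, without any significance test. The function names are hypothetical, and the repeated-value check assumes the identical months are consecutive, which the text does not state.

```python
def cusum(values):
    """Cumulative sum of deviations from the series mean."""
    mean = sum(values) / len(values)
    total = 0.0
    sums = []
    for v in values:
        total += v - mean
        sums.append(total)
    return sums


def likely_change_index(values):
    """Index at which |CUSUM| peaks; a shift in the mean is most likely just after it."""
    sums = cusum(values)
    return max(range(len(sums)), key=lambda i: abs(sums[i]))


def has_identical_run(values, run_length=3):
    """True if run_length or more consecutive values agree to one decimal place."""
    run = 1
    for prev, cur in zip(values, values[1:]):
        run = run + 1 if round(cur, 1) == round(prev, 1) else 1
        if run >= run_length:
            return True
    return False
```

For a series that steps from one level to another, the cumulative sum drifts steadily in one direction until the step and then returns toward zero, so the peak marks the last value before the change.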

Air temperature

The primary GHCN approach to homogeneity adjustments for air temperature is to create a homogeneous reference series for each station (Peterson and Easterling, 1994). With the reference series created, inhomogeneities are detected by examining the difference series between a station and its reference series (Easterling and Peterson, 1995). Homogeneity adjustments are then made on time series with significant discontinuities (Student's t-test) by using the difference in the means of the difference series (station minus reference) over a 12-year window. An evaluation of monthly air temperature data from Central Asia showed that homogeneity adjustments were not needed.
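The difference-series idea can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the GHCN implementation: it assumes the series hold one value per year, that a candidate break position is already known, and that an unequal-variance (Welch) form of the t statistic is acceptable. The function names and the window handling are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def difference_series(station, reference):
    """Station minus reference, element by element."""
    return [s - r for s, r in zip(station, reference)]


def step_at(diff, break_index, window=12):
    """Shift in the mean of the difference series across a candidate break.

    Returns (shift, t_stat). Each side of the break needs at least two
    values. t_stat is None when both windows have zero variance, because
    the statistic is undefined in that case.
    """
    before = diff[max(0, break_index - window):break_index]
    after = diff[break_index:break_index + window]
    shift = mean(after) - mean(before)
    std_err = sqrt(variance(before) / len(before) + variance(after) / len(after))
    t_stat = shift / std_err if std_err > 0 else None
    return shift, t_stat
```

If the t statistic indicates a significant discontinuity, the shift is the amount by which the earlier or later segment would be adjusted. For the Central Asia data, no such adjustments were found to be necessary.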

Evaluation of Individual Air Temperature Points

Higher elevation stations may have much greater year-to-year variability than stations at lower elevations. A measure of variance, the biweight standard deviation (Lanzante, 1996), was used to normalize the data for each station for each month prior to evaluating the record for outliers. Great care was taken not to discard good data that happen to be extreme, because extreme events represent very important aspects of climate.

The statistics used in the normalization are in Appendix 4. Normalization and computation of standard scores (z-scores) were carried out according to:

zi = (Xi - Xmean)/stdX

Values with a z-score outside ±3.5 were identified as outliers and discarded using:

z < -3.5 or z > 3.5

Normalization and checking for outliers was carried out on a monthly basis for each station. Statistical values for each station are given according to month in Appendix 4. The minimum and maximum z scores (Utmin and Utmax) are given as well.
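The normalization and outlier screening can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: it uses a common formulation of the biweight estimators with a tuning constant c = 7.5 and the median as the initial center, and it uses the biweight mean as Xmean. Those choices, and the function names, are assumptions not confirmed by this document.

```python
from math import sqrt
from statistics import median


def biweight_stats(values, c=7.5):
    """Return (biweight mean, biweight standard deviation) of values."""
    n = len(values)
    m = median(values)
    mad = median(abs(x - m) for x in values)  # median absolute deviation
    if mad == 0:
        raise ValueError("MAD is zero; biweight estimates are undefined")
    num_mean = den_mean = num_var = den_var = 0.0
    for x in values:
        u = (x - m) / (c * mad)
        if abs(u) >= 1.0:
            continue  # points far from the center get zero weight
        w = 1.0 - u * u
        num_mean += (x - m) * w ** 2
        den_mean += w ** 2
        num_var += (x - m) ** 2 * w ** 4
        den_var += w * (1.0 - 5.0 * u * u)
    return m + num_mean / den_mean, sqrt(n * num_var) / abs(den_var)


def flag_outliers(values, limit=3.5):
    """Indices of values whose z-score lies outside +/- limit."""
    center, scale = biweight_stats(values)
    return [i for i, x in enumerate(values) if abs((x - center) / scale) > limit]
```

In keeping with the monthly screening described above, flag_outliers would be applied separately to all Januaries for a station, all Februaries, and so on. Because the biweight estimates are resistant to extreme values, a single bad value does not inflate the scale enough to hide itself.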

See the Control of Tolerance for Data on Air Temperature section in the document GHCN: Regional Data Base on Climate of Central Asia v.1.0 (Williams and Konovalov, 2005) for an explanation of the graphical analysis. The final versions of these graphs are shown in Appendix 5.


Groisman et al. (1991) have developed a protocol for homogeneity adjustments of precipitation measurements from the former Soviet Union. Essentially, three correction coefficients (K1, K2, and K3) have been developed for unbiased and homogeneous precipitation records. Parallel testing of the Nipher shielded gauges and the new Tretyakov gauges made it possible to develop monthly adjusted coefficients for the change in gauges (Hydrometeorological Service of the USSR, 1964). Experience has shown that these monthly adjustment coefficients (K1) have a standard error of estimate of about 10 percent (Shver 1965, 1976) when used to estimate the monthly precipitation that would have been measured by a Tretyakov gauge:


The Reference Book on the Climate of the USSR (1966-1969) contains the estimates of the proportion of the total mean monthly precipitation that adhered to the bucket walls (K3) for every station in the FSU. The values of K3 were estimated based on two observations per day. So, to correct the time series for the unapplied wetting correction prior to 1936 when measurements were made only once a day, the following equation is appropriate:


Where PMOIST is the adjusted data for the wetting corrections and PORIG is the originally measured precipitation amount. For the period 1936-1965, use:


More problematic are the adjustments needed to account for undersampling of precipitation caused by blowing rain and snow. Scale correction multiplication coefficients (K2) were developed as a function of climatological wind speed, temperature, and precipitation from the work of Bogdanova (1966). The use of K2 coefficients at individual stations or for individual months is not recommended. However, all three coefficients can be used to develop mean area estimates of the true precipitation at ground level (Groisman et al., 1991).


References and Related Publications

Contacts and Acknowledgments

M. W. Williams
Institute of Arctic and Alpine Research
Boulder, USA

V. G. Konovalov
Institute of Arctic and Alpine Research
Boulder, USA


The National Science Foundation funded the creation of this data set through Award No. ATM-0118384: Regional updating and expansion of the Global Historical Climate Network (GHCN) database: High-Mountain areas of Central Asia.

Publication of the data set by NSIDC is supported by funding from NOAA's National Environmental Satellite, Data, and Information Service (NESDIS) and the National Geophysical Data Center (NGDC).

Document Information

Document Authors

F. Fetterer and L. Ballagh prepared this document based on correspondence with M. Williams. An unpublished document supplied by M. Williams and Konovalov (2005) provided a majority of the content along with a presentation by Williams, Konovalov, and Dyurgerov given in 2004, and other material provided by M. Williams.

Document Creation Date

February 2008


Questions? Please contact:
NSIDC User Services
Phone: 1 303 492-6199