The Bootstrap Algorithm

J.C. Comiso
Goddard Space Flight Center

Published 1992

Algorithm Description

The bootstrap technique as described by Comiso (1986) and Comiso and Sullivan (1986) uses basic radiative transfer equations and takes advantage of unique multichannel distributions of sea ice emissivity. To derive ice concentration using the Bootstrap algorithm, only two microwave channels are needed, but additional channels may be required to mask the open ocean. An inherent assumption in this algorithm is that there are large regions in the Central Arctic during winter with ice concentrations of 100 percent. In the Arctic region, which is covered predominantly by multiyear ice, scatter plots of emissivities show non-linear clustering of consolidated ice data (Comiso 1983). The non-linearity is due mainly to internal scattering, which varies in effect with frequency and polarization. A welcome exception is the 37 GHz horizontal (H) versus 37 GHz vertical (V) scatter plot, which shows very high correlation between the horizontally and vertically polarized data. This special feature is what the algorithm uses to obtain ice concentration in the Central Arctic region.

Using these two channels has several virtues: (a) it provides the best resolution from the set of channels (i.e., 19 GHz and 37 GHz at both polarizations) used for ice algorithms, (b) the standard deviation across the linear cluster is less than ±2.5 K and may provide accuracies suitable for the Arctic region, where the percentage of open water is usually less than five percent in winter, (c) surface temperature variations are taken into account (see Comiso 1986), and (d) the slope of the line AD, which is used to find the tie point for consolidated ice, is consistently close to 1.0 in every winter data set from either SMMR or SSM/I. This set of channels is most suitable for the Central Arctic region in the middle of winter.

In the seasonal sea ice regions where first-year ice dominates, as in the Antarctic, the sole use of the 37H and 37V channels does not provide consistent identification of consolidated ice data points. Sometimes, consolidated ice is difficult to distinguish from mixtures of ice and water because of snow cover, flooding, or roughness effects. The 19 GHz and 37 GHz data, at vertical polarization, were found to be considerably more consistent and accurate in this regard. This set of data is thus used primarily in the seasonal ice region, including the Antarctic region.

To obtain ice concentration, it is not necessary to invoke the ice type. The ice concentration, Ci, corresponding to an observed brightness temperature Tb, is derived from the mixing formulation [eq. (1) but with all ice types combined] and is given by:

$$ C_i = \frac{T_b - T_o}{T_i - T_o} $$

where To and Ti are reference brightness temperatures for open water and sea ice, respectively. Sea ice in this equation may be any one or a combination of the various ice types. The above equation uses the appropriate values of Ti and To, which are determined from a set of two channels, denoted 1 and 2 (either 37 H and 37 V, or 19 V and 37 V). For any set of channels, the ice concentration for any data point B can be derived from the following equation:

$$ C = \left\{ \frac{(T_{1B} - T_{1o})^2 + (T_{2B} - T_{2o})^2}{(T_{1i} - T_{1o})^2 + (T_{2i} - T_{2o})^2} \right\}^{1/2} $$

where T1i and T2i are the reference brightness temperatures for ice, and T1o and T2o are the reference brightness temperatures for open water. The slopes and offsets for consolidated ice used by the algorithm are given in Table 1. The physics of the technique are explained in more detail in Comiso (1986).
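
As a rough illustration of the two-channel formula, the following Python sketch computes C for a data point B. It rests on two assumptions that are not spelled out above: that the ice reference point (T1i, T2i) is taken where the line from the open-water point through B meets the consolidated ice line, and that a Table 1 entry "Y vs. X" means TY = slope * TX + offset. The function and parameter names are hypothetical.

```python
import math


def bootstrap_concentration(t1_b, t2_b, t1_o, t2_o, slope, offset):
    """Ice concentration (as a fraction) for data point B from two channels.

    Assumptions (not stated in the source text): channel 1 is the x-axis,
    channel 2 is the y-axis, and the consolidated ice line is
    t2 = slope * t1 + offset.
    """
    dx = t1_b - t1_o
    dy = t2_b - t2_o
    if dx == 0 and dy == 0:
        return 0.0  # B coincides with the open-water reference point

    # Intersect the ray from open water (O) through B with the ice line:
    #   t2_o + s * dy = slope * (t1_o + s * dx) + offset
    denom = dy - slope * dx
    if denom == 0:
        raise ValueError("ray from open water is parallel to the ice line")
    s = (slope * t1_o + offset - t2_o) / denom
    if s <= 0:
        raise ValueError("ray from open water through B never meets the ice line")

    # Reference brightness temperatures for ice along this ray
    t1_i = t1_o + s * dx
    t2_i = t2_o + s * dy

    # Ratio of distances O->B and O->I, as in the equation above
    num = dx ** 2 + dy ** 2
    den = (t1_i - t1_o) ** 2 + (t2_i - t2_o) ** 2
    return math.sqrt(num / den)
```

For example, with the Northern Hemisphere winter values in Table 1 (open water at 37V = 202 K and 37H = 130 K, slope 1.000, offset -12.0), a point at 37V = 226 K and 37H = 184 K lies halfway between the open-water point and the ice line, so the sketch returns 0.5.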

The ice concentration maps derived using the Bootstrap technique show a well identified marginal ice zone and very high ice concentration values in the inner pack, except in a few areas. The actual ice concentration gradient near the ice edge may be higher than is exhibited by the data. This is partly because of changes in emissivity from the ice edge (where there is an abundance of new ice) to the inner pack and partly because of the antenna pattern of the sensor, which tends to smear the data when the satellite passes a boundary of two contrasting surfaces.

Ocean Mask/Weather Filter

The brightness temperatures used by the algorithm as the reference point for open water within the ice pack are inferred from the scatter plots of ice-free ocean. The point chosen (i.e., H) is close to the minimum value on account of the usually uniform and smooth surface and low surface temperature within the ice pack. In the open ocean, atmospheric effects, waves, higher temperatures, and foam cause the brightness temperatures to increase substantially in some areas. Application of the algorithm in these areas produces unrealistic ice concentration values in the open ocean.

Fortunately, data from the open ocean regions can be separated from those in ice-covered areas because brightness temperatures in the open ocean change predictably as atmospheric forcing and ocean conditions change. This is especially evident in scatter plots of SMMR 18 GHz vs 37 GHz brightness temperature data at vertical polarization. In the open ocean, the brightness temperatures are linearly related in these plots and are clustered distinctly enough for easy separation of ice-free from ice-covered ocean. An ocean mask that utilizes this set of channels was effectively applied in Comiso et al. (1984), but was first documented in Comiso and Sullivan (1986).

A similar but slightly modified technique has been used by Gloersen and Cavalieri (1986). With SSM/I data, the same method was not as effective because the 19 GHz channels available for this purpose are closer to the water vapor line (22 GHz) than are the 18 GHz SMMR channels. As a result, the 19 GHz open ocean data are no longer linearly related to the 37 GHz data, making it difficult to mask open ocean without simultaneously masking ice-covered areas, especially when the brightness temperatures are high. This problem was easily resolved because the difference between the 22 GHz and 19 GHz data at vertical polarization (i.e., T22V - T19V) separates open ocean from ice-covered areas at moderate to high brightness temperatures. Data in which this difference is greater than 14 K were found to be located in areas where the 19 versus 37 GHz mask does not work well. The slopes and offsets used for the latter set are given in Table 1.
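
The difference test described above can be sketched in a few lines of Python. This covers only the T22V - T19V threshold, not the linear 19 versus 37 GHz mask, and the function names are hypothetical. The 14 K default comes from the text; Table 1 lists other thresholds for other seasons, which would be passed in explicitly.

```python
def is_open_ocean(t19v, t22v, threshold_k=14.0):
    """Return True when (T22V - T19V) exceeds the masking threshold in kelvins."""
    return (t22v - t19v) > threshold_k


def mask_concentration(concentration, t19v, t22v, threshold_k=14.0):
    """Set the derived concentration to zero where the pixel is flagged as ocean.

    Zeroing the value is an assumption made for illustration; the source
    does not say how masked pixels are stored.
    """
    if is_open_ocean(t19v, t22v, threshold_k):
        return 0.0
    return concentration
```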

Error Analysis

Under ideal winter conditions, when only thick ice and open water are present, ice concentration can be derived with the Bootstrap technique at an accuracy of about 5 to 10 percent, based on standard deviations of emissivities as used in the formulation. Errors are higher in the seasonal ice region than in the Central Arctic region because of higher standard deviations of consolidated ice in the 19 vs 37 GHz plots. This is partly because of spatial changes in surface temperature that are not as effectively accounted for by this set of data.

Constantly changing emissivities of some surfaces present unresolved problems. For example, when leads open up during winter, the open water is exposed to the cold atmosphere and grease ice quickly forms at the surface. The surface then metamorphoses from grease ice, to nilas, to young ice and then to first-year ice with snow cover. During these transitions, the emissivity of the surface can change considerably from one stage to another (Grenfell and Comiso 1986). Since such changes in emissivity are not taken into account in ice concentration algorithms, the derived fractions of open water are not strictly those of open water and may include some mixtures of grease ice and new ice. In spring and summer, the emissivity of thick ice also changes with time, especially over the perennial ice region in the Arctic. The slopes and offsets of the consolidated ice line AD in the scatter plots are adjusted to take this into account at the onset of spring in June; the original values are restored at winter freeze-up. Despite this adjustment, the error is still substantial and can be larger than 20 percent because of spatial variations in melt and the effects of meltponding.
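
One way to picture the seasonal adjustment is as a date lookup into Table 1. The sketch below does this for the Northern Hemisphere 37 H vs. 37 V ice line only. The slopes, offsets, and date ranges are copied from Table 1; the function name is hypothetical, and because Table 1 lists 20 September in both the winter and the summer ranges, the sketch assigns that day to winter as an assumption.

```python
from datetime import date

# (start (month, day), end (month, day), slope, offset), values from Table 1
_NH_37H_37V_SUMMER = [
    ((7, 1), (7, 18), 1.226, -70.1),
    ((7, 19), (8, 4), 1.033, -25.0),
    ((8, 5), (9, 19), 0.993, -14.0),  # 20 Sept treated as winter (assumption)
]
_NH_37H_37V_WINTER = (1.000, -12.0)


def nh_37h_37v_ice_line(day):
    """Return (slope, offset) of the consolidated ice line for a given date."""
    key = (day.month, day.day)
    for start, end, slope, offset in _NH_37H_37V_SUMMER:
        if start <= key <= end:
            return slope, offset
    # Every other date falls in the winter range (20 September through June)
    return _NH_37H_37V_WINTER
```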

Several field and aircraft experiments have been performed in both polar regions, and in some of them the basic assumptions about ice types and interpretation of the cluster plots have been confirmed. However, validation of satellite ice concentration data using data from these experiments has not been easy. Field data are difficult to use because of limited coverage compared with the large footprint of the satellite sensors (about 30 x 30 km). Aircraft data are generally easier to interpret because of their fine resolution and the availability of ancillary measurements, but they still need to be validated by ground measurements.

Another strategy has been to utilize high resolution satellite (or space shuttle) data for validation. While the use of high resolution data has its advantages, such strategies degenerate into comparative analysis because the other satellite data also need to be validated. For example, unambiguous discrimination of open water, grease ice, small pancakes, and gray nilas in both visible and microwave channels may be impossible even with high resolution sensors. Generally, however, the passive microwave data provide valuable information about large scale characteristics of the ice cover as well as locations of ice edges, polynyas, and extensive leads. It is also worth noting that some of the comparative studies yielded high correlation coefficients.

Surface Type and Future Directions

With the Bootstrap technique, ice concentration can be derived without an a priori assumption about the type of ice surface within the satellite footprint. This is possible only because consolidated ice data points appear linear in the scatter plot of the set of channels used in the algorithm. The technique described previously can be slightly modified to derive the fraction of each ice type within a data element if there are only two radiometrically different ice surfaces, the average emissivities of which are known. In a 2-D scatter plot, the three radiometrically different surfaces, including open water surfaces, would be represented by three corners of a triangle. The surface type fraction can then be linearly interpolated from this triangle, assuming pure type at each corner point. However, as a rule, it is not easy to find the corner points due to the effects of snow cover, wetness, temperature and flooding. Furthermore, the number of radiometrically different surface types may be more than three, beyond which the set of channels from the radiometers cannot effectively discriminate.
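
The linear interpolation within the triangle can be written with barycentric coordinates. The Python sketch below is illustrative only: the function name is hypothetical, and the corner points passed in are assumed to be known pure-surface brightness temperatures, which, as noted above, are hard to establish in practice.

```python
def surface_fractions(point, water, ice_a, ice_b):
    """Return (f_water, f_ice_a, f_ice_b) for a point in a 2-D scatter plot.

    Each argument is an (x, y) pair of brightness temperatures in the two
    channels. The corners are assumed to represent pure surface types.
    """
    x, y = point
    x1, y1 = water
    x2, y2 = ice_a
    x3, y3 = ice_b

    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("corner points are collinear")

    # Barycentric weights; they sum to 1 by construction
    f_water = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    f_ice_a = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    f_ice_b = 1.0 - f_water - f_ice_a
    return f_water, f_ice_a, f_ice_b
```

The total ice concentration is then the sum of the two ice fractions. A point outside the triangle yields a negative fraction, which is one sign that the chosen corners do not describe the observed surface.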

The ambiguities in the classification of surface types may be minimized by applying the algorithm on a regional basis. Previous analyses of subsets of the Arctic Ocean and vicinity indicate the merit of regional studies (Comiso 1983). However, finding a standard method of establishing the boundaries for each of these subsets for global studies is not trivial because the polar regions are so vast, and also because of the dynamic character of the sea ice cover.

This problem can be addressed using a combination of geographical boundaries and physical property boundaries. Different bays, seas, and oceans may be regarded as distinct geographical regions. The use of geographical boundaries provides an opportunity to take advantage of the time history of the ice pack. For example, if a bay (e.g. Baffin Bay) is ice-free during the summer, it is unlikely that it would have multiyear ice during the subsequent winter. In such a region, the primary ice types would be first-year ice and new ice.

Physical property boundaries can be associated with the different ice regimes, such as the outer, inner, divergence, convergence, and perennial zones. The outer zone, which includes the marginal ice zone, is perhaps the easiest to identify in the passive microwave data because of the marked contrast in emissivity between the ocean and the ice-covered regions at some frequencies. It is also characterized as the region where loose pancakes are located and where the effect of waves dominates. In contrast, the inner zone is the region where the ice cover is practically a continuous sheet of ice except where there are leads and polynyas. The divergence zone is the area of extensive lead formation, while the convergence zone is the area of extensive rafting and ridging. The perennial zone is the area where most of the multiyear ice floes are located. The problem is how these different regimes can be separated consistently, especially in Arctic regions where the history of the ice cover can be so varied and complex.

An unsupervised technique for identifying the ice regimes/types has been developed using multichannel cluster analysis (called isoclass) as described by Comiso (1990). Isoclass utilizes all SSM/I channels and finds data that are clustered together within a specific standard deviation. Data in each cluster are expected to have similar radiative, and therefore physical, characteristics.

Physically distinct regions such as the marginal ice zone and the seasonal ice region, for example, form different clusters. Different clusters in the perennial zone have also been identified (Comiso 1990). However, although the procedure is a viable option, good field data establishing the correct interpretation of each of these clusters are required and have been difficult to come by. Before an ice-type algorithm can be applied regionally, the typical types of ice in the region and their emissivities must be known. The passive microwave data also show strong evidence of surface melt and meltponding effects. These are also surface types but are prevalent only during the spring and summer seasons. Quantification of the areal extent of these and other surface types would be very useful. The results could provide a way to assess the state of the ice cover. Also, if the average albedo for each type is known, the albedo of the polar regions can be estimated with good accuracy. The development of new techniques that may include concurrent use of ancillary data (e.g., SAR or Landsat) would be most welcome.

Analysis of results using the Bootstrap algorithm suggests that the SSM/I data cannot provide accurate information about ice concentration when the value is below 8 percent. Thus 7 percent was used as a flag to indicate that a derived concentration ranging from 0 to 7.99 percent represents open water. Because of the uncertainties associated with the emissivity of 100 percent ice concentration, derived concentration values are capped at a maximum of 108 percent instead of 100 percent. Because the error is approximately ± 8 percent, 100 percent ice concentration may be indicated by any value derived within the 92 percent to 108 percent range.
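
The flagging and capping rules above amount to a small post-processing step, sketched here in Python. The function and constant names are hypothetical, and treating negative derived values as open water is an assumption; the text only mentions the 0 to 7.99 percent range.

```python
OPEN_WATER_FLAG = 7.0      # percent; flag value for open water
MAX_CONCENTRATION = 108.0  # percent; upper limit on derived values


def apply_limits(conc_percent):
    """Apply the open-water flag and the 108 percent cap to a derived value."""
    if conc_percent < 8.0:
        # Below 8 percent the data are not considered reliable; flag as open water
        return OPEN_WATER_FLAG
    return min(conc_percent, MAX_CONCENTRATION)
```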

Table 1. Slopes and offsets used by the Bootstrap Algorithm for SSM/I data

Channel pair                      Months                  Slope   Offset   M0

100% Ice - Winter
37 H vs. 37 V (N. Hemisphere)     20 September - June     1.000   -12.0
19 V vs. 37 V (N. Hemisphere)     16 October - May        0.553   117.0
19 V vs. 37 V (S. Hemisphere)     March - October         0.473   139.0

100% Ice - Summer
37 H vs. 37 V (N. Hemisphere)     1 July - 18 July        1.226   -70.1
                                  19 July - 4 August      1.033   -25.0
                                  5 August - 20 Sept.     0.993   -14.0

19 V vs. 37 V (N. Hemisphere)     June - 12 September     0.560   119.0
                                  13 Sept - 15 Oct        0.607   103.0
19 V vs. 37 V (S. Hemisphere)     November - January      0.473   139.0
                                  8 February - March      0.620   120.0
                                  February 1-7            0.547   120.5
                                  April 1-7               0.547   120.5

Open Ocean Masking - Winter
19 V vs. 22 V (N. Hemisphere)     October - May           0.567   78.0
(22V-19V) (N. Hemisphere)                                                  >14 K
19 V vs. 22 V (S. Hemisphere)     March - October         0.493   93.0
(22V-19V) (S. Hemisphere)

Open Ocean Masking - Summer
19 V vs. 22 V (N. Hemisphere)     June - October          0.580   72.26
(22V-19V) (N. Hemisphere)                                                  >23 K
19 V vs. 22 V (S. Hemisphere)     November - February     0.493   93.0
(22V-19V) (N. Hemisphere)                                                  >16 K

Open Water Tie Points - Winter
Northern Hemisphere  
19V=179 K, 37V=202 K, 37H=130 K
Southern Hemisphere 
19V=179 K, 37V=202 K

Open Water Tie Points - Summer
Northern Hemisphere  
19V=181 K, 37V=203 K, 37H=130 K
Southern Hemisphere 
19V=179 K, 37V=202 K