NSIDC Special Report 6
An Evaluation of the Results from Using Nearest Neighbor and Area Min/Max Resampling Methods to Regrid AARI Digitized Sea Ice Charts to EASE-Grid
Raymond F. Kokaly
26 April 1996
National Snow and Ice Data Center
Cooperative Institute for Research in Environmental Sciences
University of Colorado, Boulder, Colorado, USA
2. AARI Data
2.1 SIGRID Format
2.1a Data Values
2.1b Grid Points
2.2 1972 Data Set
3. Regridding Methods
3.1 Nearest Neighbor
3.2 Area Min/Max
4. Results of Regridding
4.1 Nearest Neighbor
4.1a Data Transfer
4.1b Spatial Coverage of Each Layer
4.1c Regridding Examples of Individual Files
4.2 Area Min/Max
4.2a Data Transfer
4.2b Examples of Regridding
5. Improvements to Regridding Methods
5.1 Nearest Neighbor Regridding to 12.5 km EASE-Grid
5.1a The 12.5 km Resolution Northern Hemisphere Projection of EASE-Grid
5.1b The Results of Nearest Neighbor Regridding to the 12.5 km EASE-Grid
5.2 More Complex Decision Making for Area Min/Max Selection
6. Evaluation of the Effects of Regridding
6.1 Direct Effects on Data Transfer and Values in EASE Images
6.2 Impacts on Computer Resource Requirements
6.3 Indirect Impacts
7. Recommendation for a Regridding Method to Apply to the Entire AARI Data Set
8. Future Work
The Russian Arctic and Antarctic Research Institute (AARI) digitized their operational sea ice charts as part of an international data exchange program. The National Snow and Ice Data Center (NSIDC) is distributing these data in the digitized format known as SIGRID (Sea Ice Grid). Because data in this encoded ASCII format are difficult to work with, NSIDC has developed a gridded form of the AARI data. The Equal Area SSM/I Earth Grid (EASE-Grid) was selected as the map projection scheme for the gridded product. The EASE-Grid is the standard projection and grid for SSM/I (Special Sensor Microwave Imager) and other polar data distributed by NSIDC.
The data are being transferred to a gridded map projection for several reasons:
- Data visualization. In their current format, the data are in an encoded ASCII format that contains latitude and longitude positions with associated data values. In that format, it is difficult to create a sensible "picture" of the data. By regridding to a standard map projection such a picture is created. This resulting image can be used to check the spatial coverage of data or to quickly assess the content of the files visually.
- Ease of extraction. Currently, the AARI data are difficult to extract from the SIGRID format. It requires running a program to interpret the coded ASCII format to get a tabular form of output that lists grid point locations and data values. In contrast, with EASE-Grid the analyst can use standard image processing tools.
- Scientific applications. By having the data in a gridded format scientists will be able to use the data for more applications. For example, a climatological analysis of the entire data set can be made by selecting an area on the map projection and reading the data values in each file. In their current SIGRID form, the scientist would first have to extract the data from the encoded ASCII format and then put them in another format before conducting an analysis. Another use of the data in a gridded form would be for compositing with similar data (for example, with the digitized charts of the National Ice Center or Japan). By having the AARI data already in a map projection, other data can be overlaid or aggregated onto the map. Furthermore, climate and sea ice modelers can more readily use a gridded product as input or for validation.
For some uses of the data, such as validating SSM/I sea ice concentrations, working with the original data is more appropriate. However, the reasons listed for creating a gridded product motivate the work summarized in this document.
This document presents two resampling methods suitable for regridding Russian AARI digitized sea ice charts to EASE-Grid. These two methods, nearest neighbor and area min/max, are applied to the AARI files for 1972. The results of the regridding are presented and each method is evaluated on its ability to precisely, accurately, completely, and consistently transfer the data to EASE-Grid. Based on these evaluations, a resampling method is recommended for regridding the entire AARI data set to the EASE-Grid northern hemisphere projection.
This document assumes the reader has a basic familiarity with the AARI data and EASE-Grid projection. For background documents describing these topics consult the reference section below for links to the World Meteorological Organization (WMO) report on SIGRID, and to EASE-Grid documentation. While the relevant details of regridding data from AARI to EASE using nearest neighbor and area min/max are presented, the reader may consult NSIDC Special Report 5 which discusses general aspects of regridding and evaluates several methods for regridding AARI to EASE.
This section describes the AARI data format, SIGRID. NSIDC currently has AARI data in the SIGRID format for 1967 to 1990. One year, 1972, was selected to test the regridding schemes. This section also describes why this year was selected and the temporal and spatial characteristics of this subset of data.
SIGRID was developed for the World Meteorological Organization. It was designed to place information recorded on operational ice charts into a digitized format more suitable for statistical and climatological use. This was done by assigning numerical values to the ice parameters and recording them at specific grid points on the charts.
All the data values in SIGRID are symbolic. The numbers are codes for a specific value or a range of values. For example, an AARI ice concentration code of 90 refers to 90% ice concentration and a value of 91 refers to the range of concentrations greater than 90% and less than 100%. Furthermore, many of the SIGRID data values are descriptions of the ice rather than numerical measurements of sea ice; for example, an ice form value may indicate icebergs or level ice. The descriptive and symbolic nature of AARI data affects regridding approaches that combine data values (e.g., interpolation or averaging methods). This was a major factor that led to selecting nearest neighbor and area min/max as the only candidate regridding methods.
Although the published format of SIGRID allows for many more parameters, AARI digitized only ten parameters. These parameters are: total sea ice concentration; the partial concentration, stage of development, and form of the 1st thickest ice; the concentration, stage and form of the 2nd thickest ice; and the concentration, stage and form of the 3rd thickest ice.
In the AARI files of digitized sea ice charts, each grid point is coded with a value for either land, unknown or with a value of total sea ice concentration. If a value of total ice concentration is recorded, then more parameters describing the ice at the grid point may also be present. For ice concentrations of 0% (ice free) no additional ice information is digitized. For non-zero ice concentrations, the next set of three ice parameters describe the 1st thickest ice type: its partial concentration, stage of development, and form. Partial concentration and stage of development are always recorded. However, ice form is only recorded if it is fast ice. As a result, some grid points should only have the first two of the three parameters recorded.
If the partial concentration of the 1st thickest ice does not equal the total ice concentration, indicating there is some thinner ice at the grid point, then three more parameters describing the 2nd thickest ice at the grid point will be present. These parameters are again: partial concentration, stage of development and ice form. Finally, if there is still more ice that is thinner than the first two ice types, there will be the same three parameters describing the 3rd thickest ice. As a result, AARI grid points may have up to ten values describing the sea ice at each grid point.
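The layout described above, a total concentration followed by up to three ice types, each with two mandatory parameters and an optional form, can be sketched as a small data structure. This is only an illustration: the class names, field names, and the code values used in the usage note are hypothetical and are not taken from the SIGRID specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IceType:
    """One of up to three ice types at a grid point (hypothetical layout)."""
    partial_concentration: int   # SIGRID code, always recorded
    stage_of_development: int    # SIGRID code, always recorded
    form: Optional[int] = None   # recorded by AARI only for fast ice


@dataclass
class GridPoint:
    """A digitized ice point: total concentration plus 0 to 3 ice types."""
    total_concentration: int
    ice_types: List[IceType] = field(default_factory=list)

    def layer_values(self) -> List[Optional[int]]:
        """Flatten to the ten AARI layers; absent layers are None."""
        values: List[Optional[int]] = [self.total_concentration]
        for i in range(3):
            if i < len(self.ice_types):
                t = self.ice_types[i]
                values += [t.partial_concentration,
                           t.stage_of_development,
                           t.form]
            else:
                values += [None, None, None]
        return values
```

For example, `GridPoint(80, [IceType(50, 7), IceType(30, 4)]).layer_values()` yields ten entries of which five are filled: the total, and the partial concentration and stage of each of the two ice types (the numbers are placeholders, not real codes).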
The SIGRID format uses a "geographical" grid. Essentially, this type of grid defines grid point locations according to latitude and longitude spacing. According to observations of the AARI implementation of SIGRID, the latitude spacing of the grid points is constant at 0.25°. The longitude spacing, in contrast, varies with latitude (Table 4). The longitude spacing in degrees of longitude is determined by multiplying the latitude spacing by a lon/lat ratio. The value of the ratio is different for each of eight defined latitude regions of SIGRID. As shown in Table 4, grid points at higher latitude regions have greater spacing in longitude.
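The spacing rule above can be written as a short calculation. The sketch below assumes a spherical Earth with the radius this report quotes for EASE-Grid (6371.228 km); the lon/lat ratios passed in the usage note (6 and 2) are assumptions chosen because they approximately reproduce the spacings quoted in Section 3.1, and they are not read from Table 4.

```python
import math

EARTH_RADIUS_KM = 6371.228   # spherical radius quoted for EASE-Grid
LAT_SPACING_DEG = 0.25       # constant SIGRID latitude spacing


def lon_spacing_deg(lon_lat_ratio):
    """Longitude spacing in degrees for a given lon/lat ratio."""
    return LAT_SPACING_DEG * lon_lat_ratio


def lon_spacing_km(lat_deg, lon_lat_ratio):
    """East-west distance between neighboring grid points at a latitude."""
    km_per_deg_lon = (math.radians(1.0) * EARTH_RADIUS_KM
                      * math.cos(math.radians(lat_deg)))
    return lon_spacing_deg(lon_lat_ratio) * km_per_deg_lon
```

With an assumed ratio of 6, `lon_spacing_km(85.0, 6)` is close to the 14.55 km quoted for 85° latitude; with an assumed ratio of 2, `lon_spacing_km(50.25, 2)` is close to the 36 km quoted for 50.25°.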
In each AARI file, the locations of the grid points are determined from an origin latitude and longitude. The initial latitude and longitude must be consistently selected to always yield the same grid point locations. The regridding methods have been implemented for the grid point locations defined in the document describing the SIGRID format (1). Unfortunately, the initial longitude for many AARI files is not consistent with the expected value. The result is that the initial longitudes selected by AARI give grid points that match the original grid points only at and below 87° latitude. Thus, these values are correctly regridded to EASE. Above 87° latitude, the AARI grid points do not transfer to the EASE-Grid.
If the initial longitudes selected in the entire AARI data set result in consistent grid point locations then the regridding methods could be redefined to match these points and all data could be properly regridded to EASE. However, problems may arise later when trying to make comparisons with National Ice Center (NIC) data that have been digitized in the SIGRID format with the different grid point locations. This problem with inconsistent initial longitudes should be fixed in the revised implementation of SIGRID. The new format, called SIGRID-2, is described in a report by AARI to the WMO (6). NSIDC plans to verify the quality of the SIGRID-2 data and also plans to compare the contents of SIGRID and SIGRID-2 files.
Typically, AARI produced separate sea ice charts for eastern and western sectors. However, the AARI data for 1972 are compiled on a single chart covering the north polar region. In order to simplify the processing, analysis, and comparisons, the candidate regridding schemes were applied to this one-year subset of the full AARI data set. The coverage and content of the AARI 1972 data in SIGRID are presented in this section in order to later assess how well the regridding schemes transfer the data to EASE-Grid.
The AARI data set has been described as consisting of charts spaced in time by 10 days. While this has been observed for some of the 1990 data, the 1972 data does not follow this sampling. In fact, the data are not consistently spaced in time. Figure 1 shows how these files are distributed through the year. Table 5 lists the 19 files in the 1972 data set.
Not every digitized chart contains data for the entire polar region. The coverage varies from file to file. For the 1972 data, the coverage is primarily in the eastern half of the Arctic. Figure 2 shows, in the EASE-Grid, the spatial coverage of the data. The different shadings discriminate how frequently ice information is recorded. For example, the lightest shading indicates only one file has ice information for the pixel and the darkest shading indicates all 19 files recorded ice information for the pixel. Section 2.2b details the content of the 1972 AARI data set.
The initial longitude selected by AARI when digitizing the sea ice charts affects the location of the SIGRID grid points. In all but two of the 19 files, the initial longitude was -24°. For the last two AARI files, an origin of 108° was used. These origin longitudes are the same as those observed for the 1990 AARI "eastern" and "western" sectors, respectively. As discussed in Section 2.1b, the origin longitude must be carefully selected when digitizing each ice chart in order to ensure that the grid points are always at the same locations. For latitudes greater than 87°, the two origin longitudes observed in the 1972 AARI files result in different grid points than the originally specified grid points for SIGRID (see the WMO report, 1). As a result, the regridding results presented in Section 4 will not show data above 87°. Furthermore, the grid points resulting from the two origin longitudes do not match one another above 87° either. The consequence of this mismatch for production of AARI data in EASE-Grid is that the selected regridding method, for latitudes above 87°, will have to be defined twice, once for each initial longitude.
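Whether two origin longitudes produce the same grid points at a given latitude comes down to whether their difference is a whole number of longitude steps. The sketch below illustrates that test; the function name is hypothetical, and the spacings in the usage note (1.5° and 5°) are assumed values for illustration only, not entries from Table 4.

```python
def same_grid_longitudes(origin_a_deg, origin_b_deg, lon_spacing_deg,
                         tol=1e-9):
    """True if both origins generate the same set of grid longitudes.

    Assumes 360 degrees is a whole number of steps, so that each grid
    closes on itself around a parallel.
    """
    steps = (origin_a_deg - origin_b_deg) / lon_spacing_deg
    return abs(steps - round(steps)) < tol
```

For the two origins observed in the 1972 files, `same_grid_longitudes(-24.0, 108.0, 1.5)` is true, because the 132° difference is exactly 88 steps of 1.5°. With a hypothetical coarser spacing of 5°, the same call returns false, which is the kind of mismatch described above for the highest latitude region.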
As discussed in Section 2.1a, the AARI data has ten layers. Table 6 summarizes how many grid points in each AARI file have values for each of these different layers. The second column lists the total number of SIGRID grid points that are digitized in each file. The next three columns show the breakdown of what these points represent: land, unknown or sea ice concentration. Few of the points are digitized as land. As will be seen in the EASE-Grid images, AARI rarely digitized grid points on land with the code for land. Instead, most of these points were digitized as unknown. This accounts for the high percentage of the total grid points that were digitized as unknown.
In general, all AARI files for 1972 have similar proportions of land, unknown and ice grid points, with the exception of three files:
1) 720813 - many more grid points than average were digitized; however, these additional grid points were digitized as unknown.
2) 721015 - a lesser proportion of grid points were digitized as sea ice. Section 4.1c will show that this is due to AARI digitizing only a small area of sea ice in the eastern Arctic and continuing to digitize points in the western Arctic with unknown values.
3) 721025 - a more extreme example of 721015.
Excluding these three cases, the 1972 AARI files, on average, are digitized in the following proportions:
unknown = 47%
land = 3%
sea ice concentration = 50%
The total number of grid points and the number in each of these three categories are plotted versus time in Figure 3. In general, the total number of digitized grid points is between 13,000 and 20,000. The three exceptions are evident: 720813 as the big increase in total and unknown grid points, 721015 and 721025 as above average number of grid points in the unknown plot and the below normal number of grid points in the plot of grid points with total ice concentration values.
Figure 4 and columns 6-8 in Table 6 show how many grid points were digitized with information on the concentration, stage of development and form of the 1st thickest ice. Because some of the grid points that have information on the total ice concentration have a zero value (i.e., ice free grid points) the number of grid points with information on the 1st thickest ice is less than the number with total ice concentration. As discussed in Section 2.1a, the only grid points that have a value of ice form are those with a form of fast ice. As a result, there are far fewer grid points with this parameter digitized than the other two parameters describing the 1st thickest ice.
Similar to the first thickest ice, columns 9-11 in Table 6 and Figure 5 show the number of grid points with values describing the second thickest ice. Finally, Figure 6 and columns 12-14 in Table 6 present the data content regarding the third thickest ice.
This section describes the two regridding methods: nearest neighbor and area min/max. For general background information concerning regridding AARI data to EASE, consult NSIDC Special Report 5 (5), which includes a discussion of these and other resampling methods.
This method starts by establishing the center locations (latitude and longitude) of grid cells of the target projection, EASE-Grid. Then, the distances from these points to the grid points of the original projection, SIGRID, are calculated. Next, the SIGRID point closest to each EASE grid point is identified. Finally, the value of each EASE grid cell is filled by assigning it the value of its "nearest neighbor" SIGRID point.
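The four steps above can be sketched as a brute-force search. This is a minimal illustration, not the IDL program used for this report: the function names are hypothetical, and the use of great-circle (haversine) distance on a sphere of radius 6371.228 km is an assumption, since the report does not state which distance measure was used.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.228):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_neighbor_regrid(target_centers, source_points):
    """Assign each target cell the value of its closest source point.

    target_centers: list of (lat, lon) cell centers of the target grid.
    source_points:  list of (lat, lon, value) points of the source grid.
    Returns a list of (source_index, value), one per target cell.
    """
    result = []
    for tlat, tlon in target_centers:
        best = min(
            range(len(source_points)),
            key=lambda i: great_circle_km(
                tlat, tlon, source_points[i][0], source_points[i][1]),
        )
        result.append((best, source_points[best][2]))
    return result
```

Because both grids are fixed, the list of source indices depends only on geometry, so in practice it could be computed once and reused for every file; keeping the index alongside the value also makes it possible to trace each EASE cell back to its SIGRID point.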
The grid point spacings of EASE and SIGRID, along latitude and longitude, change as latitude increases from the equator. Figure 7 shows the variation. The greater the grid spacing, the fewer grid points exist along a line of latitude. As a result, at some latitudes there are many more SIGRID points than EASE points; for example, at 85° latitude the SIGRID spacing is 14.55 km and the EASE spacing is 25 km. Conversely, at other latitudes there are far fewer SIGRID points than EASE points; for example, at 50.25° latitude the SIGRID spacing is 36 km and the EASE spacing is 23.5 km.
In areas where there are more SIGRID points than EASE points, some of the SIGRID points are never selected as nearest neighbors. The values of these unselected SIGRID points are never transferred to EASE-Grid because other SIGRID points always lie closer to the surrounding EASE grid points. Where the SIGRID points are fewer in number than the EASE points, more than one EASE grid point may select the same SIGRID point as its nearest neighbor. These two situations result in data loss and data replication, respectively, when regridding the AARI data to the EASE-Grid. Table 7 shows the amount of data loss (11%) and redundancy (20%) when resampling AARI to the 25 km EASE northern hemisphere projection.
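Given the nearest neighbor index chosen by each EASE cell, loss and replication can be counted directly. The sketch below is an illustration with a hypothetical function name; it assumes "loss" means the fraction of SIGRID points never selected and "redundancy" means the fraction selected by more than one EASE cell, which may not match the exact definitions behind Table 7.

```python
from collections import Counter


def transfer_statistics(nearest_index, n_source):
    """Return (lost_fraction, replicated_fraction) of source points.

    nearest_index: for each target cell, the index of the source point
                   selected as its nearest neighbor.
    n_source:      total number of source points.
    """
    counts = Counter(nearest_index)
    lost = n_source - len(counts)                       # never selected
    replicated = sum(1 for c in counts.values() if c > 1)
    return lost / n_source, replicated / n_source
```

For instance, with five source points and four target cells selecting indices `[0, 0, 1, 3]`, two source points are never used (40% loss) and one is used twice (20% replication).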
Section 4 presents the results of regridding the AARI 1972 data to the 25.0 km EASE-Grid. However, the most serious fault with the nearest neighbor method remains the data loss of approximately 11% when regridding to this grid. In order to eliminate this data loss, regridding AARI data to the 12.5 km EASE-Grid is explored in Section 5.1.
The area min/max concept is based on overlaying the SIGRID grid cells onto the EASE-Grid. For each EASE grid cell, the SIGRID cells which have some area overlapping it are determined. From the overlapping SIGRID values, two EASE images are created: one with the maximum of the values lying in the EASE cells and one with the minimum of the values lying in the EASE cells.
For example, in Figure 8 the EASE grid cell with overlapping SIGRID values of 60%, 80% and 100% sea ice concentration would have a value of 60% in the minimum image and 100% in the maximum image.
As with the nearest neighbor approach, the disparate grid spacing of the two grids complicates the regridding. Figure 9 shows the amount of area the grid cells cover as a function of latitude. For the equal area EASE-Grid this is constant at 625 km². However, the SIGRID again shows variation as latitude increases. At the beginning of latitude regions the SIGRID cells are larger than EASE cells, thus, only a few overlap each EASE cell. However, within each region, as the latitude increases, the SIGRID cell sizes shrink and EASE cells are covered by more SIGRID cells.
Table 8 summarizes for the entire EASE-Grid how many SIGRID cells overlap each EASE cell. The majority of EASE grid cells have four overlapping SIGRID cells. The next most numerous are six and nine overlapping cells. Figure 10 shows how the number of overlapping SIGRID cells are distributed on the EASE-Grid. Within each latitude region, the number of overlaps is smaller at the beginning and increases at higher latitudes.
Figure 11 shows a closeup of the EASE subset region which will be used for distributing the AARI data. The majority of data in the AARI digitized sea ice charts cover the East Siberian, Laptev, Kara and Barents Seas. For each of these, the number of SIGRID cells overlapping the EASE cells is different. For example, the Laptev Sea near the coast has a high number of overlapping cells (in the range of 8 to 12) while the northern Kara Sea has fewer overlaps (in the 1 to 4 range).
The area min/max approach is more complex than the nearest neighbor because a decision must be made on the minimum and maximum values. From the previous example of four values of ice concentration overlying the EASE cell, the approach seems straightforward. However, complications arise when considering non-ice SIGRID values: grid points that are digitized with values for land or unknown and points that are not digitized. A major problem is deciding if the EASE pixel should be assigned a value for an ice parameter if it is covered primarily by one or more of these non-ice values.
This document contains results from area min/max regridding in which the ice parameter is always assigned to the pixel no matter how little area the SIGRID cell may cover. For example, if an EASE cell is covered over 90% of its area by three SIGRID cells with the value of unknown and 10% of its area by one cell with 100% ice concentration, the EASE grid cell is assigned the value of 100% ice concentration. The results of the application of this simple approach to the 1972 AARI data are presented in Section 4.2 and alternatives to this approach are discussed in Section 5.2.
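The simple selection rule described above, in which any ice value takes precedence over non-ice values regardless of area, can be sketched as follows. The names are hypothetical, ice values are treated as plain percentages for illustration, and the fallback used when no ice value is present (land preferred over unknown) is an assumption that the report does not specify. Finding which SIGRID cells overlap an EASE cell requires the grid geometry and is not shown.

```python
LAND = "land"
UNKNOWN = "unknown"


def area_min_max(overlapping_values):
    """Return (min, max) for one EASE cell from overlapping SIGRID values.

    overlapping_values: ints for ice concentration (percent), or the
    LAND / UNKNOWN markers for non-ice cells.
    """
    ice = [v for v in overlapping_values if isinstance(v, int)]
    if ice:
        # Ice always wins, however little of the cell it covers.
        return min(ice), max(ice)
    if not overlapping_values:
        return None, None          # no digitized SIGRID cell overlaps
    # Assumed fallback, not taken from the report.
    fallback = LAND if LAND in overlapping_values else UNKNOWN
    return fallback, fallback
```

With the values from Figure 8, `area_min_max([60, 80, 100])` gives `(60, 100)`. For the case described above, `area_min_max([UNKNOWN, UNKNOWN, UNKNOWN, 100])` gives `(100, 100)` even though the ice value covers only a small part of the cell.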
This section presents the results of regridding AARI files for 1972 in the SIGRID format to the EASE-Grid northern hemisphere projection at 25.0 km resolution by nearest neighbor resampling. The average time to regrid an AARI file to EASE-Grid using an IDL (Interactive Data Language) program was 67 seconds.
As discussed in Section 3.1, the major drawback with the nearest neighbor method as applied to SIGRID is that approximately 11% of the SIGRID points do not transfer to the 25.0 km EASE-Grid. Additionally, some of the SIGRID values transfer to more than one EASE grid cell. The relative grid spacing between AARI and EASE affects these processes. Figure 12 compares the total number of SIGRID points in the AARI files to the number in the EASE-Grid images created using nearest neighbor regridding. On average, the EASE images have only 90% of the original number of grid points. Both data loss and replication occur; the net result of these two processes is 10% fewer grid points in the EASE-Grid.
Similarly, Figure 13 compares the number of EASE grid points with total ice concentration values to the original number of such SIGRID points. Again, there is a significant data loss. The amount of loss increases from 10% at the beginning of the year to over 20% at the end of the year. This is linked to a change in where the grid points are located (see the regridding examples in Section 4.1c for changes in spatial coverage). At the beginning of the year (for example, 720411), the grid points cover almost all the western Arctic. In addition, the early files cover the eastern coast of Russia. However, near the end of the year (e.g., 720915) the grid points are not located at the high latitude, polar portion of the western Arctic; instead, they cover a lower latitude zone of the northern Russian coast. The variation in data loss through the year is a result of such changes in coverage. As discussed in Section 3.1, the relative spacing of AARI and EASE grid points changes with latitude. The region off the Russian coast in the Laptev and East Siberian Seas is below 75°. Figure 7 shows that this is a region where there are more SIGRID points than EASE points; consequently, nearest neighbor regridding results in high rates of data loss. In the latter part of the year, when fewer grid points are digitized in other areas, this data loss is more prominent.
Figure 14 shows a similar trend in data transfer of the partial concentration and stage of development of the 1st thickest ice. However, Figure 15, a plot illustrating the data transfer of the form of the 1st thickest ice, is different. The first ten AARI files are regridded to EASE with a net loss of 20% of the data. The loss is constant through these files because the grid points are consistently located on the northern Russian coast. However, when the number of grid points with the ice form of the 1st thickest ice becomes small and restricted to a higher latitude area, the islands of North Land, the effect of data replication becomes apparent (i.e., over 100% of the data regridded to EASE). Figure 15 shows that in August and September, there are few SIGRID points digitized with the form of the 1st thickest ice (see Table 9). The reason is simple: AARI records only the ice form of fast ice and, in the warm summer months, the ice retreats and the land fast ice melts. Table 9 shows that on 720804 and 720907 EASE-Grid has more ice form grid points than AARI. Thus, for these files, the net effect of the nearest neighbor regridding on the AARI data is to increase the number of grid points in EASE. In contrast to the data loss, this is because the SIGRID grid points are located in an area where the EASE grid spacing is less than the SIGRID spacing. Thus, there are more EASE grid points than SIGRID points and more than one of the EASE grid cells is filled with the value from a single SIGRID point.
For the 2nd and 3rd thickest ice types, similar trends in data transfer as those of the 1st thickest ice were observed.
The images presented in this section show the spatial coverage of each AARI parameter. The images are in the EASE-Grid projection. The colors represent the number of files (regridded by nearest neighbor) that contain information for that grid cell. Figure 16 shows the total ice concentration. Figure 17 shows the concentration, stage, and form of the 1st thickest ice. Figure 18 shows the concentration, stage and form of the 2nd thickest ice. Figure 19 shows the concentration, stage and form of the 3rd thickest ice.
The following images are examples of nearest neighbor regridding to the 25.0 km EASE-Grid.
This example shows the most complete coverage of the 1972 files. Figure 20a is the EASE-Grid image of total ice concentration for this date. The data cover the entire western Arctic and extend south down the western coast of Russia. AARI digitized almost all the grid points as 100% ice cover. Figure 20b shows the partial concentration of the 1st thickest ice. Note that the pixels with 0% total ice concentration (ice free) in the Barents Sea do not have additional ice information in this layer of AARI data. Figure 20c shows the stage of development of the 1st thickest ice. Figure 20d shows where AARI digitized the ice form of the 1st thickest ice. As expected, since AARI seems to only record fast ice, these pixels are along the coast lines. Figure 20e shows the stage of development of the 2nd thickest ice. Note that there is no multi-year ice in the image. The thicker multi-year ice is described in the images of the 1st thickest ice.
This mid-summer image, Figure 21, shows a decrease in ice concentrations and an increase in the amount of ice free area in the Barents, Kara, and Chukchi Seas. Also note the absence of higher latitude grid points in comparison to Figure 20a.
This image, Figure 22, shows the minimum ice extent in the AARI files. Ice free areas have developed in the Laptev and East Siberian Seas. The ice extent is near the average for these areas. In the Kara Sea, the data shows a slightly less than average extent of ice.
In this image of total ice concentration, Figure 23, only a small portion of the grid points digitized in previous files were recorded. As mentioned in Section 2.2b, AARI continued to digitize unknown values through the eastern Arctic. The image shows complete ice cover of areas that were ice free on August 15 (see Figure 22).
All the EASE-Grid images show that grid points over land were commonly recorded as unknown. In addition, they confirm the assertion in Section 2.2b that approximately 50% of the content of the AARI files is information on sea ice. Also seen is that sometimes land values are present in the ocean and, conversely, sea ice concentrations are located inland.
This section presents the results of regridding AARI 1972 data to the 25 km EASE-Grid. First, the effect of the transfer on data content is discussed. Secondly, examples of regridding are presented. The average time to regrid an AARI file to EASE-Grid was 864 seconds.
Area min/max uses all the SIGRID values to create two EASE-Grid images. Because the output at each EASE pixel is dependent on making a decision on the minimum and maximum values of different SIGRID points, it is difficult to summarize how well the data transfers from AARI to EASE. However, although all the SIGRID points are considered during the regridding, all are not necessarily represented in the two resulting images. Figure 24 presents a situation in which a SIGRID point would not be put in either the min or max EASE-Grid image. Again, since the output at each EASE grid point is dependent on the geometry of the two overlapping grids and the values at each point, it is difficult to estimate the number of grid points that do not transfer from AARI to EASE.
As mentioned in Section 3.2, the implementation of area min/max used to generate the examples in this document uses the most straightforward method for selecting the min and max values. Simply stated, the maximum ice value is put in the max image and the minimum ice value is put in the min image. What are the problems with this implementation?
1) The SIGRID cell with the maximum or minimum value may only cover a small portion of the EASE grid cell.
2) Sea ice values are chosen preferentially over land or unknown values, even though these non-ice values may cover a greater proportion of the EASE cell. This leads to the ice parameters spreading out or creeping over the EASE image.
3) Although represented in the same pixel (location) in EASE-Grid, the minimum value and maximum value come from different SIGRID grid points. This itself is not a problem, just a product of how the regridding scheme is defined.
3a) However, one problem comes when comparing one EASE max image to another EASE max image. For two different files, the values may come from different SIGRID points. In comparison to the consistency of the nearest neighbor approach, this makes it difficult for a user to work back from the EASE-Grid data to the original SIGRID.
3b) The major problem with this inconsistency in not using the same SIGRID point to assign the values to EASE-Grid comes when looking at the ten different layers of the AARI data. For example, consider an EASE grid cell with three overlapping SIGRID cells that have the values listed in Table 10. In the resulting EASE grid max image, the values in the ten AARI layers may each come from a different one of the three overlapping grid points. The result is an EASE grid cell with values different from any one of the three and inconsistent with the format of SIGRID (i.e., the partial concentrations add up to more than the total concentration).
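Problem 3b can be reproduced with a few numbers. The sketch below takes the maximum layer by layer across overlapping cells, as the simple max image does; the function name is hypothetical, only three of the ten layers are shown for brevity, and the values are invented for illustration rather than copied from Table 10.

```python
def max_image_layers(cells):
    """Per-layer maximum across overlapping cells; None marks an absent layer."""
    n_layers = len(cells[0])
    out = []
    for k in range(n_layers):
        present = [c[k] for c in cells if c[k] is not None]
        out.append(max(present) if present else None)
    return out


# Layers: (total, 1st partial, 2nd partial) concentration in percent.
# Hypothetical values, each cell internally consistent.
cell_a = (100, 100, None)
cell_b = (80, 40, 40)
cell_c = (60, 30, 30)

combined = max_image_layers([cell_a, cell_b, cell_c])
# The total and 1st partial come from cell_a, the 2nd partial from cell_b,
# so the partial concentrations sum to more than the total.
```

Each input cell satisfies the SIGRID rule that partial concentrations sum to the total, yet the combined cell does not, because its layers are drawn from different SIGRID points.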
The following images are examples of area min/max regridding.
Figures 25a and b show the min and max images of total ice concentration for the AARI files from the date April 4, 1972. In comparison to the nearest neighbor image for this date (Figure 20a) there are more pixels with values for ice concentration. For example, note the increase in the number of pixels along the "inland" side of the Russian coast that have 100% ice concentration. Also note, in the min and max images, the shrinking of the region of unknown values in the Kara Sea that is present in the nearest neighbor image.
On the whole, the EASE-Grid min and max images each have 10,402 pixels with total ice concentration values as opposed to 9,421 in the nearest neighbor image. This is due to EASE grid cells along the edge of the ice covered region being filled with an ice value if even a tiny portion of the cell is ice covered.
Figures 26a and b show the min and max images for the ice form of the 1st thickest ice for 720411. They are identical since AARI only digitized one code for ice form, fast ice. However, in comparison to the nearest neighbor EASE image, the area min/max images have many more fast ice pixels. The original SIGRID file had 1387 grid points with fast ice. The nearest neighbor image contains 1099, a loss of about 21%. The area min/max images, in contrast, have 1697 fast ice grid cells, a 22% increase. The nearest neighbor image lost SIGRID points within the area of fast ice. The points not transferred were within the fast ice region where the SIGRID points were much more closely spaced than the EASE points. Thus, although these points are not transferred to EASE, the resulting image accurately represents the SIGRID data since the area is homogeneous. The area min/max images, in contrast, add grid points along the edge of the fast ice region, essentially expanding the coverage of fast ice. This is a result of the simple implementation of having ice values take precedence over non-ice values (as mentioned in Section 3.2). A better implementation, which considers how much of the EASE cell the SIGRID values cover, is presented in Section 5.2.
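The percentages for the fast ice layer follow directly from the grid point counts given above. The short calculation below reproduces them; the function name is only illustrative.

```python
def percent_change(original, regridded):
    """Signed change in grid point count, as a percentage of the original."""
    return 100.0 * (regridded - original) / original


sigrid_fast_ice = 1387   # fast ice points in the original SIGRID file
nn_change = percent_change(sigrid_fast_ice, 1099)    # nearest neighbor
amm_change = percent_change(sigrid_fast_ice, 1697)   # area min/max
```

Here `nn_change` is a loss of roughly one fifth and `amm_change` a gain of roughly 22%, so the two methods err in opposite directions by a similar amount for this layer.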
Figures 27a and b show the min and max images of total ice concentration for 720722. The min and max images are quite different. For example, the strip of 100% ice concentration (red pixels) in the Laptev Sea is larger in the max image as opposed to the min image. Furthermore, there is much more ice in the Kara Sea in the max image.
Based on observations of the results of regridding the 1972 AARI data to EASE-Grid, this section recommends improvements to the regridding schemes.
Because of the data loss problem in regridding AARI data to the 25.0 km EASE-Grid, the procedure was tested with the 12.5 km EASE-Grid. Since the data loss occurs when the SIGRID cells are more closely spaced than EASE, the higher resolution grid with half the distance between the grid points was expected to eliminate the data loss.
5.1a The 12.5 km Resolution Northern Hemisphere Projection of EASE-Grid
The EASE-Grid projections were developed under the SSM/I Pathfinder program for use with SSM/I data. Three equal area projections were defined: a global, a northern hemisphere and a southern hemisphere. The EASE-Grid for the northern hemisphere is a Lambert Equal-Area Projection in north polar aspect. It is based on a spherical model of the Earth with radius Re = 6371.228 km. The important aspects of this projection are its equal area, the azimuthal property showing true direction from the center of the projection, and its scale invariance: the scale at a given distance from the center varies less from the scale at the center than in any of the other major azimuthal projections. For use with data from the SSM/I instrument, two grids with different scales were defined: a 25.0 km resolution for the lower SSM/I channels (19, 22, and 37 GHz) and a 12.5 km resolution for the higher channel (85 GHz). Table 11 shows the change in scale factors and grid cell dimensions with latitude for the 12.5 km EASE-Grid.
The nominal cell dimension of the 12.5 km grid is 12.5338 km. In order to remain equal area, the latitude spacing increases and the longitude spacing decreases from the north pole to the equator to preserve a nominal cell area of 157.0952 km². The projection is shown in Figure 28 with coast lines and a graticule of 15° latitude and 30° longitude overlaid (black pixels). In this projection, the corner pixels are in the southern hemisphere (negative latitude) and are shown in grey. These pixels are not used to represent data and are considered blank or non-valid pixels.
The EASE-Grid projection was established for the full hemisphere; however, the AARI data contain sea ice information over the polar regions only. Thus, a majority of a full EASE-Grid image resulting from regridding AARI data would show land and oceans at lower latitudes which are never covered by sea ice. In order to eliminate the need to store the full image when it is only partially filled with data, a subset of the EASE-Grid (see Figure 29) will be provided to the user. This subset covers the area in the northern hemisphere for which sea ice is likely to appear. The subset has the corner points at 29.8048° latitude and 135° W, 135° E, 45° W, and 45° E longitude. These coordinates match the outside corners of EASE grid cells at the column,row coordinates of (360,360), (1080,360), (360,1080), and (1080,1080) respectively. The subset was selected to cover the same area as that defined for the "Pathfinder AVHRR (Advanced Very High Resolution Radiometer) Polar Products" (4). The subsetted image has dimensions of 721 columns by 721 rows.
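The relationship between the projection and the subset corners quoted above can be checked with a short calculation. The following Python sketch is illustrative only and is not the software used for this report. It assumes the usual spherical Lambert azimuthal equal-area forward equations in north polar aspect, a full 12.5 km grid of 1441 by 1441 cells with the pole at column,row (720,720), integer coordinates at cell centers, and rows increasing toward 0° longitude. The function name and the origin value are assumptions; the Earth radius and nominal cell dimension are the values given in the text.

```python
import math

R_KM = 6371.228      # spherical Earth radius given in the text
CELL_KM = 12.5338    # nominal 12.5 km cell dimension given in the text
ORIGIN = 720.0       # ASSUMPTION: pole at the center of a 1441 x 1441 grid


def latlon_to_ease_north(lat_deg, lon_deg, cell_km=CELL_KM, origin=ORIGIN):
    """Map latitude/longitude in degrees to fractional (column, row)."""
    phi = math.radians(lat_deg)
    lam = math.radians(lon_deg)
    # Lambert azimuthal equal-area distance from the pole, in grid cells
    rho = 2.0 * R_KM / cell_km * math.sin(math.pi / 4.0 - phi / 2.0)
    col = origin + rho * math.sin(lam)
    row = origin + rho * math.cos(lam)   # ASSUMPTION: row axis orientation
    return col, row


# Subset corner points quoted in the text (west longitudes are negative)
corners = {lon: latlon_to_ease_north(29.8048, lon) for lon in (-135, 135, -45, 45)}
for lon, (col, row) in corners.items():
    print(f"lon {lon:5d}: col {col:8.2f}, row {row:8.2f}")
```

Under these assumptions each corner coordinate evaluates to approximately 359.5 or 1080.5, that is, the outside edges of cells 360 and 1080, which is consistent with the 721 by 721 subset described above.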
The grid point spacing of the 12.5 km EASE-Grid is half that of the 25.0 km EASE-Grid. As a result, the amount of data loss in regridding (which occurs when SIGRID grid points are more closely spaced than EASE grid points) was expected to decrease. Table 12 shows that only 66 grid points are lost in regridding from AARI to the 12.5 km EASE-Grid. Compare this to the loss of 37,212 grid points (see Table 7) when regridding to the 25.0 km EASE-Grid.
As expected, Table 12 also shows that regridding to the 12.5 km EASE-Grid increases data replication. In fact, almost all the AARI grid points are regridded to more than one EASE cell. This is the consequence of regridding to a finer resolution grid, and users should be aware of it. However, data replication will be a problem when going to an equal area grid of any resolution since the AARI data in SIGRID format have such a variable grid spacing (see Figure 7). Only a projection that exactly replicated the spacing of the SIGRID format would be able to eliminate both data loss and data replication. As a result, the best approach is to regrid to an EASE projection that, although it increases the amount of replication, nearly eliminates the data loss. Thus, regridding the AARI data to the 12.5 km EASE-Grid is an improvement over regridding to the 25.0 km grid.
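The trade-off between data loss and data replication can be illustrated with a toy example. The Python sketch below is not the regridding code used for this report. It places a few hypothetical source points with variable spacing along a line, assigns the nearest source point to each target point of a coarse grid and of a grid with half the spacing, and counts the source points that are never selected (lost) or selected more than once (replicated). All names and coordinates are invented for illustration, and the sketch ignores any maximum search distance that a real implementation might apply.

```python
from collections import Counter


def nearest_neighbor_map(source_pts, target_pts):
    """For each target point, return the index of the nearest source point."""
    mapping = []
    for tx, ty in target_pts:
        best = min(
            range(len(source_pts)),
            key=lambda i: (source_pts[i][0] - tx) ** 2 + (source_pts[i][1] - ty) ** 2,
        )
        mapping.append(best)
    return mapping


def loss_and_replication(n_source, mapping):
    """Return (indices never used, indices used more than once)."""
    uses = Counter(mapping)
    lost = [i for i in range(n_source) if uses[i] == 0]
    replicated = [i for i in range(n_source) if uses[i] > 1]
    return lost, replicated


# Hypothetical source points: closely spaced on the left, sparse on the right
source = [(0.0, 0.0), (0.4, 0.0), (0.8, 0.0), (2.0, 0.0), (3.9, 0.0)]
coarse = [(float(x), 0.0) for x in range(5)]   # target spacing 1.0
fine = [(x / 2.0, 0.0) for x in range(9)]      # target spacing 0.5

for name, target in (("coarse", coarse), ("fine", fine)):
    lost, replicated = loss_and_replication(len(source), nearest_neighbor_map(source, target))
    print(f"{name}: lost {lost}, replicated {replicated}")
```

In this toy case the coarse grid loses one of the closely spaced source points, while the finer grid loses none but replicates more of the sparse ones, which mirrors the behavior described for the 25.0 km and 12.5 km grids.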
The following images are examples of AARI files regridded to the 12.5 km EASE.
Figure 30 shows the total ice concentration for file 720411 (a file from April, 1972). Figure 31 shows the total ice concentration for file 720722 (a file from July, 1972). Figure 32 shows the total ice concentration for file 720915 (a file from September, 1972).
Because the resolution of the 12.5 km EASE is double that of the 25.0 km EASE grid, there are four times as many grid points, and there is an approximately four-fold increase in the processing time to regrid from SIGRID to EASE-Grid. The average processing time is 240 seconds. In addition, there is a four-fold increase in the size of the EASE-Grid images.
Several problems with the simple approach to area min/max regridding were identified in Section 4.2a. The first two of these, selecting a min or max value that only covers a small portion of the EASE cell and selecting ice values preferentially over non-ice values, may be solved by including a consideration of the amount of the EASE grid cell each value covers when deciding on the min and max values. For example, if a value from a SIGRID cell is greater than all others but it covers less than 10% of the EASE grid cell, it is not selected as the maximum value. Instead, the next highest value covering 10% or more of the grid cell should be selected. Similarly, to solve the second problem, if the EASE grid cell is covered 50% or more by non-ice values, the EASE cell should not be filled with the minimum or maximum ice value, but rather with an appropriate non-ice value. These solutions will reduce the "growth" of ice when transferring from AARI to EASE.
Another problem encountered when examining the results of the area min/max resampling was an inconsistency in grid point selection between AARI files. This inconsistency is a consequence of the definition of the regridding. The data loss problem of nearest neighbor regridding is averted by considering all the SIGRID points; however, the resulting images do not show which of those points were selected as the min and max values. Additional EASE-Grid images giving the locations of the selected min and max grid points could be generated, but these would have to be made for each AARI file.
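The two coverage rules described above can be sketched as follows. This Python fragment is a minimal illustration under stated assumptions, not an implementation that was tested for this report. Ice values are represented as concentrations in percent rather than SIGRID codes, non-ice values as text labels, and the "appropriate non-ice value" is assumed to be the non-ice value covering the largest share of the cell. The 10% and 50% thresholds are the ones suggested in the text; the function name and data layout are hypothetical.

```python
def select_min_max(contributions, min_cover=0.10, non_ice_cover=0.50):
    """Pick (min, max) for one EASE cell.

    contributions: list of (value, fraction_of_cell_covered, is_ice).
    """
    non_ice = [c for c in contributions if not c[2]]
    ice = [c for c in contributions if c[2]]

    # Rule 2: a cell mostly covered by non-ice values gets a non-ice value.
    non_ice_fraction = sum(frac for _, frac, _ in non_ice)
    if non_ice and non_ice_fraction >= non_ice_cover:
        value = max(non_ice, key=lambda c: c[1])[0]  # ASSUMPTION: largest share wins
        return value, value

    # Total coverage of each distinct ice value within the cell.
    cover = {}
    for value, frac, _ in ice:
        cover[value] = cover.get(value, 0.0) + frac

    # Rule 1: ignore ice values covering less than min_cover of the cell.
    eligible = [v for v, f in cover.items() if f >= min_cover]
    if not eligible:
        eligible = list(cover)  # fall back to all ice values if none qualify
    if not eligible:
        # No ice at all: use the non-ice value if there is one.
        if non_ice:
            value = max(non_ice, key=lambda c: c[1])[0]
            return value, value
        return None
    return min(eligible), max(eligible)


# A 100% value covering only 5% of the cell is passed over for the maximum.
print(select_min_max([(100, 0.05, True), (80, 0.60, True), (60, 0.35, True)]))
# A cell that is 70% open water is not filled with an ice value.
print(select_min_max([(100, 0.30, True), ("open water", 0.70, False)]))
```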
The inconsistency in grid point selection among data layers for each EASE grid cell is a more serious problem. In the results presented in Section 4.2, the min and max for each data layer were determined separately. Rather than having values from a single SIGRID point, the results in the min and max images contain a combination of layers from all the grid points with area overlapping the EASE grid cell. Unfortunately, the result is not consistent between the layers. For example, for the maximum EASE result presented in Table 10, the total ice concentration (code of 92 = 100%) is not equal to the sum of the partial concentrations of the 1st, 2nd, and 3rd thickest ice types (codes of 92, 10, and 20 = 100%+10%+20% = 130%). The solution to this problem is to not determine the min and max values independently for each layer. Instead, the min and max values for each layer must come from the same SIGRID point. For example, this could be done by selecting the grid point with the min or max value of total ice concentration and, subsequently, transferring all ten AARI values at this grid point to the EASE image.
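The difference between selecting each layer independently and taking all layers from a single SIGRID point can be shown with a small example. The Python sketch below uses two hypothetical SIGRID points and only four of the ten layers, with concentrations written in percent rather than as SIGRID codes. The values are invented to reproduce the kind of inconsistency described above and are not taken from Table 10.

```python
def select_independent(points):
    """Min and max taken separately for every layer (the simple approach)."""
    layers = points[0].keys()
    lo = {k: min(p[k] for p in points) for k in layers}
    hi = {k: max(p[k] for p in points) for k in layers}
    return lo, hi


def select_consistent(points, key="total"):
    """All layers copied from the point with the min or max total concentration."""
    lo = min(points, key=lambda p: p[key])
    hi = max(points, key=lambda p: p[key])
    return dict(lo), dict(hi)


def partial_sum(layers):
    return layers["first"] + layers["second"] + layers["third"]


# Hypothetical SIGRID points overlapping one EASE cell (percent concentration)
points = [
    {"total": 100, "first": 100, "second": 0, "third": 0},
    {"total": 30, "first": 0, "second": 10, "third": 20},
]

_, hi_independent = select_independent(points)
_, hi_consistent = select_consistent(points)
print(hi_independent["total"], partial_sum(hi_independent))  # total and partial sum disagree
print(hi_consistent["total"], partial_sum(hi_consistent))    # total and partial sum agree
```

Selecting on total ice concentration is only one possible choice of key; the point of the sketch is that any single-point selection keeps the layers mutually consistent.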
This section evaluates the regridding schemes based on the results of regridding the 1972 AARI files and the improvements discussed in Section 5. The evaluation is done in three parts. First, the regridding methods are evaluated for how accurately and consistently values are transferred from AARI to EASE. Secondly, the computer resource requirements of each method are presented. Finally, the methods are evaluated on what indirect impacts they may have on a scientific analysis of the AARI data in EASE-Grid. Based on these evaluations, one method is recommended for regridding the entire AARI data set to EASE-Grid in Section 7.
The primary concerns in regridding the AARI data in the SIGRID format to the EASE-Grid projection are: precise, accurate, complete and consistent transfer of the data.
The requirement to precisely transfer data from AARI to EASE eliminated several standard methods of regridding. NSIDC Special Report 5 (5) detailed problems with interpolation and area weighted averaging for regridding the AARI data. The AARI data values are symbolic codes. These methods combine values; consequently, some of the results are not acceptable AARI codes. The two regridding schemes considered in this document for regridding AARI data, nearest neighbor and area min/max, precisely regrid AARI codes.
Accurately regridding AARI data to EASE-Grid requires the locations of the regridded values to be as close as possible to the original grid points. By definition, the "nearest neighbor" regridding accomplishes this task. The area min/max method, on the other hand, is not as accurate. This method may reject the closest value in favor of a higher or lower value that is further away.
While accurately regridding data, the nearest neighbor regridding to the 25.0 km EASE-Grid does not completely transfer all AARI grid points. Approximately 11% of the data are not represented in the image. In contrast, performing a nearest neighbor regridding to the 12.5 km EASE-Grid transfers 99.98% of the SIGRID points. The area min/max regridding does not explicitly represent all the SIGRID points in the two resulting EASE-Grid images. The method does, though, use or consider all the SIGRID values during the regridding process.
The final desirable characteristic of regridding methods is consistency. For nearest neighbor regridding, once the method has established the SIGRID points that are closest to the EASE grid points, the SIGRID points are transferred to the same locations in the EASE-Grid for each AARI file. In contrast, during area min/max regridding, a SIGRID point may transfer to the min image or max image or neither depending on its value and the values of the surrounding grid points. Thus, the same points are not always transferred to the same location in EASE-Grid.
6.2 Impacts on Computer Resource Requirements
The computer resource requirements for the various regridding methods are listed in Table 13. The requirements listed are for regridding one AARI file to EASE-Grid. These numbers are averages calculated from the results of regridding the entire 1972 AARI data set. In terms of processing time, nearest neighbor regridding to the 25.0 km EASE-Grid requires the least amount of resources, followed by nearest neighbor regridding to the 12.5 km EASE and, finally, area min/max regridding. In terms of storage requirements for the results of the regridding, nearest neighbor resampling results in EASE images that take up 1.24 Megabytes of space. Area min/max requirements are double this since two output images are created for each of the ten AARI data layers. Nearest neighbor regridding to the higher resolution 12.5 km EASE-Grid requires four times the storage of the 25 km results.
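The storage figures quoted above can be reproduced with simple arithmetic under two assumptions that are not stated in this section: that each pixel is stored as one byte, and that the 25.0 km subset has 361 columns by 361 rows (the 25.0 km counterpart of the 721 by 721 subset of the 12.5 km grid). The following Python sketch is only a consistency check on those assumptions; the function name is hypothetical.

```python
BYTES_PER_PIXEL = 1   # ASSUMPTION: one byte per pixel
LAYERS = 10           # ten AARI data layers, as stated in the text
MIB = 1024 * 1024


def image_set_mib(columns, rows, images_per_layer=1):
    """Storage in megabytes (2**20 bytes) for all layers of one AARI file."""
    return columns * rows * LAYERS * images_per_layer * BYTES_PER_PIXEL / MIB


nn_25 = image_set_mib(361, 361)          # ASSUMPTION: 361 x 361 subset at 25.0 km
minmax_25 = image_set_mib(361, 361, 2)   # separate min and max images per layer
nn_125 = image_set_mib(721, 721)         # 721 x 721 subset at 12.5 km

print(f"nearest neighbor, 25.0 km: {nn_25:.2f} MB")
print(f"area min/max, 25.0 km:     {minmax_25:.2f} MB")
print(f"nearest neighbor, 12.5 km: {nn_125:.2f} MB")
```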
Computer storage requirements are lessened by using data compression. For the 1972 subset of AARI data in EASE-Grid, compression resulted in files that were, on average, 0.5% of the original volume. Using this compression ratio and generous estimates on the total number of AARI files, it is projected that the volume of the entire data set (using the most voluminous results from nearest neighbor regridding to the 12.5 km EASE-Grid) will compress to approximately 30 Megabytes. Thus, the requirements for computer storage of the data set in compressed form are minimal.
This section presents some impacts of the regridding on subsequent analysis of the AARI data in EASE-Grid. In other words, how will the results of analyzing the EASE-Grid images differ from a direct analysis of the AARI data in the SIGRID format? This section only mentions the possible impacts of regridding, but does not assess the magnitude of the effects. However, after a regridding method has been selected, additional work to quantify these impacts is recommended (see Section 8).
For a simple example of the effect of regridding on analysis, consider a researcher who is interested in ice concentration at a specific latitude and longitude. This user may find the closest point in the EASE image. However, the EASE-Grid point locations are different from the SIGRID points. A user may mistakenly think the EASE-Grid points are where the data values were actually gathered. A similar question is how much the location of the ice edge in EASE differs from one derived from the SIGRID data. With nearest neighbor regridding, it is possible to provide a data file that documents, for each EASE grid point, the SIGRID point (and thus the location) from which the value was transferred. Since the area min/max regridding selects the min and max values from many possible SIGRID grid point locations, it is not possible to provide a single file that gives the original location of the data.
Another impact of regridding is on calculations of the areal extent of ice cover in the EASE-Grid images. This is a simple calculation in EASE: 1) count up the number of ice covered pixels and 2) multiply the total by the amount of area covered by each pixel (a constant). However, how does this value compare to a calculation from the AARI data in SIGRID format, in which the grid cell sizes are variable? This brings up the more basic issue of how much area a grid cell in SIGRID covers as opposed to its regridded equivalent in EASE-Grid. A single SIGRID point may transfer to more than one EASE grid cell. What is the difference in the amount of area covered? For nearest neighbor resampling, the change in areal coverage can be easily computed. Furthermore, since the regridding method is constant, the calculation applies to all AARI files regridded by the method. For the more complex area min/max regridding, an estimate may be made for a single regridded file, but this would not be representative for other AARI files.
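The two-step extent calculation described above can be written in a few lines. The Python sketch below assumes the 12.5 km grid (cell area 157.0952 km², as given in Section 5.1a), values expressed as ice concentration in percent, missing data stored as `None`, and any concentration above zero counted as ice covered. These conventions are assumptions made for illustration and are not the encoding of the EASE-Grid images.

```python
CELL_AREA_KM2 = 157.0952   # nominal cell area of the 12.5 km EASE-Grid


def ice_extent_km2(grid, is_ice=lambda v: v is not None and v > 0):
    """Count ice covered pixels and multiply by the constant cell area."""
    n_ice = sum(1 for row in grid for v in row if is_ice(v))
    return n_ice * CELL_AREA_KM2


# Hypothetical 3 x 4 image of total ice concentration in percent
image = [
    [0, 0, 30, 100],
    [0, None, 80, 100],
    [0, 0, 0, 60],
]
print(f"{ice_extent_km2(image):.3f} km2")
```

A threshold other than zero can be supplied through the `is_ice` argument, for example `lambda v: v is not None and v >= 15`.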
An NSIDC working group met to consider the regridding methods, the results for the 1972 subset of AARI data, and the evaluations contained in this document. As a result, nearest neighbor regridding to the 12.5 km EASE-Grid is recommended as the best technique for transferring the entire AARI data set to EASE-Grid. Based on the evaluations in Section 6, this method was the best at performing accurate, precise, complete and consistent data transfer from SIGRID to EASE-Grid. The resulting images are easily stored, retrieved and interpreted and satisfy the uses of the gridded product, including visualization, ease of use and ease of application to climatological analysis, compositing, and incorporation into climate or sea ice models.
In order to address the issues raised in Section 6.3, further work is recommended. Additional information should be gathered on the selected regridding method (nearest neighbor resampling to the 12.5 km EASE-Grid). The additional information should include:
- For each EASE grid cell, the SIGRID cell from which the data were transferred, that is, the latitude and longitude of the SIGRID point. This information is established during the definition of the regridding, that is, the selection of the "nearest neighbor" SIGRID points to the EASE grid points. Thus, a file can be provided to the user with the "location" of the SIGRID nearest neighbor for each EASE grid point.
- The difference in location between the EASE grid centers and their "nearest neighbor" SIGRID cells. Again, this is easily obtained during the definition of the regridding. Thus, a file can be provided to the user with this "distance" for each EASE grid point.
- Information quantifying the change in areal coverage of each SIGRID cell when regridding to EASE-Grid. This is readily computed by:
  1. calculating the area of a SIGRID cell;
  2. counting the number of EASE grid cells to which the SIGRID cell transfers by nearest neighbor regridding;
  3. multiplying the number of cells by the area that each EASE cell covers (a constant = 157.0952 km²).
- Information describing the overlap of SIGRID cells on EASE-Grid. That is, for each EASE cell, tell the user which SIGRID cells overlap it and how much area each one overlaps.
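The distance and areal coverage quantities in the list above could be computed along the following lines. This Python sketch is a rough illustration, not a specification of the planned products. It assumes the spherical Earth model of EASE-Grid (radius 6371.228 km), measures the distance between an EASE grid center and its SIGRID nearest neighbor as a great-circle distance, and approximates a SIGRID cell as a latitude/longitude box, which is an assumption about SIGRID cell geometry. All function names and the sample cell are hypothetical.

```python
import math

R_KM = 6371.228                 # spherical Earth radius used by EASE-Grid
EASE_CELL_AREA_KM2 = 157.0952   # constant cell area of the 12.5 km EASE-Grid


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * R_KM * math.asin(math.sqrt(a))


def latlon_box_area_km2(lat_south, lat_north, dlon_deg):
    """Area of a latitude/longitude box on the sphere (step 1).

    ASSUMPTION: a SIGRID cell is approximated by such a box.
    """
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return R_KM ** 2 * math.radians(dlon_deg) * band


def area_change_km2(sigrid_area_km2, n_ease_cells):
    """EASE area assigned to a SIGRID cell minus its own area (steps 2 and 3)."""
    return n_ease_cells * EASE_CELL_AREA_KM2 - sigrid_area_km2


# Hypothetical SIGRID cell: 0.25 degrees of latitude by 1 degree of longitude
cell_area = latlon_box_area_km2(75.0, 75.25, 1.0)
print(f"SIGRID cell area: {cell_area:.1f} km2")
print(f"area change if copied to 6 EASE cells: {area_change_km2(cell_area, 6):.1f} km2")
print(f"offset of an EASE center: {great_circle_km(75.1, 60.0, 75.125, 60.5):.2f} km")
```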
- Thompson, T. 1981. Proposed Format for Gridded Sea Ice Information (SIGRID), Report prepared for the World Climate Program, World Meteorological Organization.
- Brodzik, M.J. 1995. README file for NOAA/NASA Pathfinder Program SSM/I Level 3 Brightness Temperatures Equal-Area SSM/I Earth Grid (EASE-Grid) CD, National Snow and Ice Data Center, Boulder, Colorado.
- Brodzik, M.J. 1995. All About EASE-Grid, National Snow and Ice Data Center, Boulder, Colorado.
- Brodzik, M.J. 1995. Summary of NOAA/NASA Polar Pathfinder Grid Relationships, National Snow and Ice Data Center, Boulder, Colorado.
- Kokaly, R.F. 1996. Methods for Regridding AARI Data in the SIGRID Format to the EASE-Grid Projection, NSIDC Special Report No. 5, National Snow and Ice Data Center, Boulder, Colorado.
- Bushuev, A. V., R. G. Barry, V. M. Smolyanitsky, and V. J. Troisi. 1994. Format to Provide Sea Ice Data for the World Climate Program (SIGRID-2). Unpublished report to the World Meteorological Organization Commission for Marine Meteorology.