Special Reports
Computer rendition of Terra satellite.
A screen shot of the ECS Distribution GUI.
End users viewing and analyzing the data.
Download the PDF version of this document (781 KB)
Maurer, J. 2004. An Introduction to the EOSDIS Core System (ECS) at NSIDC. NSIDC Special Report 12. Boulder, CO, USA: National Snow and Ice Data Center. Digital media.
The National Snow and Ice Data Center (NSIDC) and other NASA Distributed Active Archive Centers (DAACs) use ECS to ingest, archive, and distribute data. ECS stands for the EOSDIS Core System, and EOSDIS stands for the Earth Observing System Data and Information System. Raytheon develops and maintains this multi-million-dollar system for NASA using many of today's most advanced technologies.
The Earth Observing System (EOS) data sets that NSIDC collects come from the AMSR-E, GLAS, and MODIS instruments. NSIDC also collects many data sets that are not ingested, archived, or distributed through ECS. Examples include data from the AVHRR, SMMR, and SSM/I instruments. Procedures for handling non-ECS data sets are not covered in this document.
ECS is composed of multiple servers and commercial off-the-shelf (COTS) software products that run on several different machines networked together. The servers are individual C++ programs that handle specific tasks.
These machines, servers, and COTS work together to accomplish three main tasks related to ECS operations: ingest, archive, and distribution. Ingest refers to the acquisition of data from our external data providers. Archive refers to the transfer of ingested data onto a permanent storage device. Distribution refers to the transfer of archived data to users who request them. Each of these tasks is explained in greater detail below.
For a great introductory reference to NASA's Earth Science Enterprise, which includes EOSDIS and the DAACs, click here.
Before moving any further, let's first take a quick look at what type of data are archived via ECS.

Global data from multiple sensors on Terra.
Image by R.B. Husar, Washington U.
The data that ECS ingests, archives, and distributes are satellite remote sensing data products. Remote sensing involves obtaining information about an object without actually coming into contact with it. Photographs are a good example of remote sensing. Sensors obtain information not only in the visible portion of light, as with cameras, but they can also measure other portions of the electromagnetic spectrum (e.g., ultraviolet, infrared, and microwave radiation). These sensors may be housed in instruments used on the ground ("in situ"), on aircraft, or on satellites.
Image from NASA's Earth Observatory web site.
Satellite instruments normally measure radiation at discrete wavelengths of the electromagnetic spectrum called bands or channels. An everyday camera, for comparison, measures all of the light within the entire visible spectrum (400 nm to 700 nm wavelength) in a single band. The MODIS instrument, on the other hand, measures electromagnetic radiation at 36 individual bands between 400 nm and 14,500 nm, which spans from visible light to thermal infrared radiation. These bands range in width from 10 to 500 nm. Scientists use these bands to quantitatively assess properties of the Earth's land, oceans, and atmosphere that contribute to weather prediction, monitoring of natural disasters, global climate change assessment, and beyond.
Just as a camera views a certain amount of space through its lens (for example, you may have to back up in order to fit a person entirely in the field of view), satellite remote sensing instruments also have a limited field of view. A single MODIS data file, for example, covers a width of 2,300 km. By comparison, the Earth's diameter is 12,756 km. This gives MODIS the capability to view every part of the planet every 1-2 days.
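The 1-2 day figure can be checked with rough arithmetic. The sketch below uses the swath width and Earth diameter quoted above. The orbit rate (about 14.5 orbits per day, with one daytime pass over the equator per orbit) is an assumption that this document does not state, and the function names are illustrative only. Swath overlap and the convergence of orbits toward the poles are ignored.

```python
import math

EARTH_DIAMETER_KM = 12_756  # from the text
SWATH_WIDTH_KM = 2_300      # from the text
ORBITS_PER_DAY = 14.5       # ASSUMPTION: not stated in the text


def swaths_to_cover_equator(diameter_km, swath_km):
    """Number of side-by-side swaths needed to span the equator once."""
    circumference_km = math.pi * diameter_km
    return math.ceil(circumference_km / swath_km)


def days_to_cover_equator(diameter_km, swath_km, orbits_per_day):
    """Rough number of days needed, assuming one usable swath per orbit."""
    return swaths_to_cover_equator(diameter_km, swath_km) / orbits_per_day


if __name__ == "__main__":
    print(swaths_to_cover_equator(EARTH_DIAMETER_KM, SWATH_WIDTH_KM))
    print(days_to_cover_equator(EARTH_DIAMETER_KM, SWATH_WIDTH_KM, ORBITS_PER_DAY))
```

Under these assumptions the equator needs 18 swaths, which takes a little more than one day, in line with the 1-2 day figure above.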
Also, there is a limit to how small an area a particular sensor can view. For example, if you take a photograph of a person 1 km away, you cannot see the logo on his or her shirt, or even the color of his or her eyes. Similarly, remote sensing instruments have a specific resolution, which is the measure of the smallest object that one can "resolve," or view. In terms of a remotely-sensed image, this resolution is also often referred to as its pixel size. Resolution and pixel size thus describe the smallest area on the Earth's surface that a remote sensing instrument can view. MODIS views objects as small as 250 m in certain bands, while other MODIS bands have pixel sizes of 500 m or 1 km. For comparison, the ASTER satellite remote sensing instrument (from which NSIDC does not collect data) can resolve objects as small as 15 m. Resolution of an instrument depends on many factors, including its altitude in space, the wavelength that it is measuring, its method of collecting data, and its design.
Lastly, remote sensing data are often viewed as digital images, which involve the same concepts as computer screens or televisions, where three photon beams (red, green, and blue) create all of the colors that we see. Any color can be generated by adding different relative amounts of these three primary colors (referred to as "RGB," for red, green, and blue).
The human eye cannot see beyond the visible portion of the electromagnetic spectrum; however, any combination of bands that measure radiance in ultraviolet, infrared, or microwave wavelengths can be assigned to the RGB bands to produce a color image. These images are referred to as false-color composites since they combine bands that are not in the visible portion of the spectrum.
True-color composites are created by displaying three bands that measure light in the red, green, and blue portions of the electromagnetic spectrum. A color photograph is a good example.
A simple greyscale image is generated by viewing one band at a time. Bright shades of grey correspond to places where the radiation is high in a given band, while dark shades of grey correspond to places where the radiation is low.
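The composite and greyscale ideas described above can be illustrated with a minimal sketch. It treats each band as a small list of rows of numbers and uses a simple linear stretch to the 0-255 range. The function names and the choice of stretch are assumptions made for illustration, not a description of how any NSIDC or ECS software builds images.

```python
def stretch_to_byte(band):
    """Linearly rescale a band (list of rows of numbers) to integers 0..255."""
    values = [v for row in band for v in row]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero for a constant band
    return [[round(255 * (v - lo) / span) for v in row] for row in band]


def composite(red_band, green_band, blue_band):
    """Combine three equally sized bands into rows of (R, G, B) pixels.

    The result is a true-color composite if the inputs measure red, green,
    and blue light, and a false-color composite otherwise.
    """
    r, g, b = (stretch_to_byte(x) for x in (red_band, green_band, blue_band))
    return [list(zip(rr, gr, br)) for rr, gr, br in zip(r, g, b)]


def greyscale(band):
    """Display a single band by giving each pixel equal R, G, and B values."""
    return composite(band, band, band)


if __name__ == "__main__":
    infrared = [[0, 1], [3, 3]]  # made-up radiance values
    print(greyscale(infrared))
```

In the greyscale case, high radiance maps to bright pixels and low radiance to dark pixels, as described above.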
Many data products are the result of scientific analyses of the original remotely-sensed data. Snow extent and sea ice concentrations are examples. The images that result from these data usually include a legend that explains what the colors represent in the image. Below on the left is an image processed at NSIDC that shows sea ice concentrations derived from SSM/I data over the South Pole in June, 2002. (For a full resolution image, click here.) The color bar on the right-hand side of this image tells you what percentage of sea ice each color represents in the image. Below on the right is an image showing global sea surface temperatures derived from AMSR-E data:
A great place to learn about remote sensing and view images is NASA's Earth Observatory web site. They post a new image and brief explanation every day. You can also view an archive of their images. View this Earth Observatory page, too, for a great introduction to remote sensing principles.
Now that you have learned where remotely sensed data come from, let's discuss the three main components of ECS: ingest, archive, and distribution.

Antenna at a ground receiving station.
Image by the Alaska SAR Facility (ASF).
NSIDC's EOS data come from the Aqua, ICESat, and Terra satellites. Data are transmitted from satellites to ground receiving stations around the globe, which, in turn, transmit the data to a central location: the Earth Observing System (EOS) Data and Operations System (EDOS) at the Goddard Space Flight Center in Greenbelt, Maryland. The raw data that EDOS collects are referred to as Level-0 data. EDOS transmits Level-0 data via ECS to the various DAACs.
At this point, science computing facilities (SCFs) are required to process the raw Level-0 data into products that are ultimately distributed to users. SCFs correct for various systematic errors introduced by the satellite before the raw data are distributed. These errors are corrected using precisely recorded position and attitude data from the satellite during the time of data acquisition (ephemeris data), or calibrated against other known measurements (ancillary data). After these errors are corrected and the data are referenced to time and geographic location, the data are considered Level-1. Time- and georeferencing involve recording the time of data acquisition and the latitude and longitude coverage in a metadata file to be distributed with the data. Level-1 data is the lowest level of data that is distributed to most users.
Higher-level data products are processed from the Level-1 data. Level-2 data are produced by applying scientific algorithms that calculate one or more geophysical parameters from the Level-1 data. Examples include snow cover, sea ice extent, sea surface temperature, land cover type, vegetation indices, and aerosol and ozone distribution. Level-2 data gridded to a uniform map projection are called Level-3 data. Lastly, Level-4 data are model outputs or results from scientific analyses derived from multiple measurements of lower-level data, for example, climate change analyses. Higher levels of processing add value and information to the raw data collected by the satellite. Some users, however, do not need derived geophysical products for their intended application or may prefer to implement their own processing procedures based on their own scientific algorithms, map projection schemes, and analyses, and will therefore order the Level-1 data for these purposes.
NSIDC archives Level-0 EOS data received directly from EDOS and Level-1 through Level-3 EOS data products received from external SCFs.
NSIDC receives all levels of AMSR-E data. The Level-0 AMSR-E data come to us from EDOS. Level-1 AMSR-E data are processed at the Japan Aerospace Exploration Agency (JAXA) in Japan, sent to NASA's Jet Propulsion Laboratory (JPL) Physical Oceanography DAAC (PO.DAAC) in Pasadena, California, and then sent to NSIDC. The higher-level AMSR-E data products (Level-2 and above) come to us from the AMSR-E Science Investigator-led Processing Systems (SIPS) at the Global Hydrology and Climate Center (GHCC) in Huntsville, Alabama.
We also receive all levels of ICESat/GLAS data. Again, the Level-0 data come from EDOS. Level-1 data and above come from the ICESat SIPS at the Goddard Space Flight Center.
NSIDC only receives MODIS Level-2 through Level-4 data, which come from the MODIS Data Processing System (MODAPS) SCF at the Goddard Space Flight Center in Greenbelt, Maryland.
NSIDC uses ECS to ingest all of the data listed above from EDOS, the AMSR-E SIPS, the ICESat SIPS, and MODAPS via File Transfer Protocol (FTP). Each data file has an associated metadata file that stores information such as time of acquisition, size, geographic coordinates, and other information that is important for a user to know.
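To make the role of the metadata file concrete, here is a minimal sketch of a metadata record holding the kinds of fields mentioned above. The class name, field names, and sample values are hypothetical. The actual ECS metadata format is not described in this document and is not reproduced here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class GranuleMetadata:
    """Hypothetical metadata record; field names are illustrative only."""

    file_name: str
    acquired: datetime  # time of acquisition
    size_bytes: int
    south: float        # bounding box, in degrees
    north: float
    west: float
    east: float

    def covers(self, lat, lon):
        """True if the point lies inside the bounding box.

        Assumes the box does not cross the 180-degree meridian.
        """
        return (self.south <= lat <= self.north
                and self.west <= lon <= self.east)


if __name__ == "__main__":
    record = GranuleMetadata(
        file_name="example_granule.hdf",  # made-up name
        acquired=datetime(2004, 1, 15, 12, 0),
        size_bytes=50_000_000,
        south=35.0, north=45.0, west=-110.0, east=-100.0,
    )
    print(record.covers(40.0, -105.3))
```

A record like this is what lets a search tool answer questions such as "which files cover this location?" without opening the data files themselves.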

StorageTek Powderhorn® 9310 tape library
Image by StorageTek.
The next step after ingesting data is to archive it. Archiving is the process of writing data to media for long-term storage. EOS data at NSIDC are archived on a StorageTek Powderhorn® tape library (pictured above). This automated tape library can hold up to 5,000 StorageTek 9940 tapes, each of which has a storage capacity of roughly 120 GB of data. Advanced Digital Information Corporation's (ADIC) AMASS (Archival Management And Storage System) software maintains a database of files and the location of tapes where the files reside. Tapes are stored within the Powderhorn in slots on wall panels. The Powderhorn can then read from or write to any tape in its inventory by moving the tape from its home slot into one of several tape drives via a robotic arm. Each tape has a unique bar code on it that the robotic arm can optically sense, so it knows what tape it handles at any given time. Communication between AMASS and the Powderhorn is handled via StorageTek's Automated Cartridge System Library Software (ACSLS).
Tapes in the Powderhorn are referred to as volumes. Collections of tapes called volume groups are defined to store specific kinds of data. The Powderhorn is organized into various volume groups, each consisting of several volumes of a specific data type.
Backup copies of several ECS data sets collected at NSIDC are created and stored at a secure, climate-controlled offsite storage facility to protect against unexpected data losses and ensure their long-term archival. These backup tapes are also periodically refreshed as a further safeguard against aging and potential corruption of media. For other data sets, NSIDC has an agreement with one of our external data providers for them to recover our data in the case of loss or corruption.

Image by Mack Trucks, Inc.
Despite the above picture, NSIDC does not actually use Mack trucks to distribute ECS data! ECS can distribute data either electronically or on one of the following media options: CD-ROM, DVD, 8-mm tape, or DLT (Digital Linear Tape). Electronic distribution of data is via FTP pull or FTP push. FTP pull is when the data are staged locally to a machine at NSIDC, and the user initiates an FTP session to retrieve (or "pull") the data to his or her own computer. An FTP push is when the data are automatically transferred (or "pushed") to a user-specified computer and directory path. The FTP push method requires that the user own a computer with a dedicated internet IP address or host name where ECS can push the data. This option is not typical for most home personal computers. In the case of an FTP pull request, ECS sends the user an automatic e-mail specifying the directory path needed to log into an NSIDC server and collect the data. These data are staged for three days, then automatically deleted to preserve space. All ECS data from NSIDC are currently distributed free of charge.
Orders for data on hard media are created via the ECS Production Distribution System (PDS). Originally developed by the Land Processes DAAC (LP DAAC) in Sioux Falls, South Dakota, this software communicates between ECS and various CD-ROM and DVD burners, 8-mm tape drives, and Digital Linear Tape (DLT) drives. After NSIDC Operations Staff successfully generate an order using this software, the media and printouts are then handed over to User Services to mail to the user. A CD-ROM holds up to 640 MB, a DVD holds 4.7 GB, an 8-mm tape holds ~5 GB of uncompressed data, and a DLT tape holds up to 40 GB of uncompressed data using "DLT 8000" tape drives, which NSIDC uses for PDS.
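The capacities quoted above make it easy to estimate how many pieces of media a given order would need. The sketch below is a rough calculation only: it assumes decimal units (1 GB = 1,000 MB), treats the "~5 GB" 8-mm figure as exactly 5 GB, and assumes an order can be split freely across media. The function and dictionary names are illustrative and are not part of PDS.

```python
import math

# Capacities quoted in the text, in megabytes (decimal units assumed).
MEDIA_CAPACITY_MB = {
    "CD-ROM": 640,
    "DVD": 4_700,
    "8-mm tape": 5_000,  # the text gives "~5 GB"
    "DLT": 40_000,
}


def media_needed(order_size_mb, medium):
    """Number of pieces of the given medium needed to hold an order."""
    return math.ceil(order_size_mb / MEDIA_CAPACITY_MB[medium])


if __name__ == "__main__":
    for medium in MEDIA_CAPACITY_MB:
        print(medium, media_needed(10_000, medium))
```

For a hypothetical 10 GB order, this gives 16 CD-ROMs, 3 DVDs, 2 8-mm tapes, or a single DLT tape.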
Data are written to CD-ROMs and DVDs via a Rimage PerfectImage® AutoStar CD/DVD producer. This hardware, controlled by Rimage software on an IBM personal computer, is composed of several writable CD and DVD drives, a rotating tray to place blank media in, a robotic arm that places blank media from these trays into the proper drives, and a label writer that burns text and icons onto the face of the media once data are written to it. Below is a picture of this device:

Rimage PerfectImage® AutoStar CD/DVD producer
Image by Rimage Corporation.
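Users initiate orders for ECS data from NSIDC through one of the following means, each described in the paragraphs below: the EOS Data Gateway (EDG), the Search 'N Order Web Interface (SNOWI), subscriptions, the Data Pool, and the Machine-To-Machine Gateway (MTMGW).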
The Earth Observing System (EOS) Data Gateway (EDG) is the main ECS search-and-order web site. The EDG allows you to search for all publicly available ECS data by data type, location, time period, and by various parameters within the data type(s) selected (for example, percent cloud cover). Low-resolution quick-look ("browse") images are available for many data types that give you a sense of what certain data files, or granules, contain before you decide to order them. Spatial coverage maps are produced in the EDG to show where one or more files are located on the Earth.
Spatial coverage map for four AMSR-E Level-2A
granules as displayed on the EDG.
This map can be rotated online to view different parts of the Earth.
Users also have the option to subset certain data sets via the EDG using the HEW Subsetting Appliance (HSA), developed by the University of Alabama in Huntsville. Data can be subsetted both geographically (i.e., using a specified latitude and longitude bounding box) and by desired parameters within the data. An advantage of subsetting is that it reduces the size of the data being distributed, thereby reducing the user's FTP transfer time and necessary storage space. Another advantage, of course, is that it allows you to receive data solely for the location you are interested in. For example, a single MODIS Level-3 file covers an area of 2,300 km by 2,300 km (about 5.3 million square kilometers), but you may only be interested in getting data for a smaller 25 km by 25 km region within that file (e.g., the city of Providence). When ordering files on the EDG, subsetting will either be listed as available or unavailable, depending on the data type, and is only available via FTP pull or FTP push.
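The benefit of geographic subsetting can be sketched in a few lines. The example below filters a list of (latitude, longitude, value) samples with a bounding box and computes what fraction of a 2,300 km by 2,300 km file a 25 km by 25 km region represents. The function names and sample values are hypothetical, and this is not how HSA itself is implemented.

```python
def subset_by_bbox(samples, south, north, west, east):
    """Keep only the (lat, lon, value) samples inside the bounding box."""
    return [
        (lat, lon, value)
        for lat, lon, value in samples
        if south <= lat <= north and west <= lon <= east
    ]


def area_fraction(subset_side_km, full_side_km):
    """Fraction of a square file's area covered by a square subset."""
    return (subset_side_km / full_side_km) ** 2


if __name__ == "__main__":
    samples = [(41.8, -71.4, 1.0), (10.0, 20.0, 2.0)]  # made-up values
    print(subset_by_bbox(samples, 41.0, 42.0, -72.0, -71.0))
    print(area_fraction(25, 2_300))
```

The 25 km region is only about 0.01 percent of the full file's area, which is why subsetting can cut transfer time and storage so sharply.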
Due to the number of options and data sets possible with the EDG, some users may prefer to use NSIDC's less complicated Search 'N Order Web Interface (SNOWI). This simpler and perhaps more intuitive web site has similar capabilities to the EDG, except that it does not provide quick-look images, spatial coverage maps, or subsetting—and it only provides data via FTP pull or FTP push.
Rather than place orders via the EDG or SNOWI, users may also contact NSIDC User Services to set up subscriptions to have specific data automatically sent to or staged for them upon ingest at NSIDC. This distribution method is ideal for those users who wish to receive new data on a continual basis. Note that this option is only available for new data as they are ingested and not for data already archived at NSIDC. Though you can select a custom location, subscriptions currently do not allow subsetting.
Users may also directly download data via the NSIDC Data Pool, a large FTP server that holds current and popular data sets. The Data Pool is continually updated with new data as they are ingested at NSIDC; data older than a specified age are continually purged. The Data Pool can therefore be thought of as a rolling archive. Users can browse the contents of the Data Pool either using the web site or by initiating an anonymous FTP session to ftp://n0dps01u.ecs.nasa.gov. These features allow users to directly download data rather than ordering data through the EDG or SNOWI and waiting for their orders to be staged from our automated tape library. Subsetting of certain data types in the Data Pool is handled using the HDF-EOS to GeoTIFF converter (HEG), which also allows various data format and map projection conversions.
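The rolling-archive behavior can be illustrated with a minimal sketch: new granules are added as they are ingested, and anything older than a chosen age is dropped. The function name, the granule names, and the 30-day retention period are assumptions for illustration. The actual retention periods used by the Data Pool are not given in this document.

```python
from datetime import datetime, timedelta


def purge_old(pool, now, max_age):
    """Return a copy of the pool without granules older than max_age.

    `pool` maps a granule name to the time it was ingested.
    """
    return {
        name: ingested
        for name, ingested in pool.items()
        if now - ingested <= max_age
    }


if __name__ == "__main__":
    now = datetime(2004, 6, 30)
    pool = {
        "granule_a": datetime(2004, 6, 25),  # recent, kept
        "granule_b": datetime(2004, 4, 1),   # older than 30 days, purged
    }
    print(purge_old(pool, now, timedelta(days=30)))
```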
The Data Pool allows you to bookmark a particular search to find out what new granules that meet your search criteria have been ingested into the Data Pool since your last visit. Also, the Data Pool FTP site is structured in such a way that users can automate their own processes to download specific data, which are organized into predictable subdirectories by instrument, data type, and date.
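Because the subdirectories are predictable, a short script can build the path for a given instrument, data type, and date and then list the files there. The sketch below uses Python's standard `ftplib`. The exact directory layout (`/instrument/data_type/YYYY.MM.DD`) and the names `data_pool_dir` and `list_files` are assumptions made for illustration; check the actual Data Pool structure before relying on them.

```python
from datetime import date
from ftplib import FTP


def data_pool_dir(instrument, data_type, day):
    """Build a directory path. The layout shown here is an ASSUMPTION."""
    return f"/{instrument}/{data_type}/{day:%Y.%m.%d}"


def list_files(host, directory):
    """List file names in a directory over anonymous FTP."""
    with FTP(host) as ftp:
        ftp.login()  # no arguments means an anonymous login
        return ftp.nlst(directory)


if __name__ == "__main__":
    # Placeholder instrument and data type names, for illustration only.
    print(data_pool_dir("MODIS", "SNOW_COVER", date(2004, 3, 5)))
```

A scheduled job could call `list_files` with the Data Pool host name and the path for the current date, then download any files it has not already retrieved.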
Lastly, users who prefer to run Unix command-line utilities for searching and ordering data may contact NSIDC User Services to set up a Machine-To-Machine Gateway (MTMGW) account. This method of distribution provides access to all of NSIDC's publicly available ECS data without forcing the user to click through a number of screens and options using the EDG, SNOWI, or Data Pool web sites. The disadvantages of using MTMGW, however, are that it does not provide subsetting, and the user must learn the syntax for running command-line search-and-order utilities. Users interested in this method of distribution are given a user account on an ECS computer at NSIDC from which they can run scripts for searching and ordering data via secure shell (ssh). Instructions and sample scripts are provided. In order to use MTMGW, the user must first accept an Operations Agreement and sign a form outlining the acceptable use of computer systems at NSIDC. Please contact NSIDC User Services to obtain the necessary documents if this method of distribution would be of interest to you.
Distribution is the key component of ECS. Ingest and archival of data have little purpose if there are no people who obtain and use these data. ECS can thus be likened to a library, and NSIDC Operators and User Services Representatives to its librarians. NSIDC plays an important role in "handing off" data that start with the satellite and end with students, scientists, and organizations who put those data to use.
Please see the EOSDIS Acronyms list for a general list of acronyms. The following acronyms are used in this document:
ACSLS - Automated Cartridge System Library Software
ADIC - Advanced Digital Information Corporation
a.k.a. - also known as
AMASS - Archival Management And Storage System
AMSR-E - Advanced Microwave Scanning Radiometer - Earth Observing System
ASTER - Advanced Spaceborne Thermal Emission and Reflection Radiometer
AVHRR - Advanced Very High Resolution Radiometer
COTS - commercial off-the-shelf
DAAC - Distributed Active Archive Center
DLT - Digital Linear Tape
ECS - EOSDIS Core System
EDG - EOS Data Gateway
EDOS - EOS Data and Operations System
EOS - Earth Observing System
EOSDIS - Earth Observing System Data and Information System
ETM+ - Enhanced Thematic Mapper Plus
FTP - File Transfer Protocol
GHCC - Global Hydrology and Climate Center
GLAS - Geoscience Laser Altimeter System
GUI - Graphical User Interface
HEG - HDF-EOS to GeoTIFF converter
HSA - HEW Subsetting Appliance
ICESat - Ice, Cloud, and land Elevation Satellite
IP - Internet Protocol
JAXA - Japan Aerospace Exploration Agency
JPL - Jet Propulsion Laboratory
LP DAAC - Land Processes DAAC
MODAPS - MODIS Data Processing System
MODIS - Moderate Resolution Imaging Spectroradiometer
MTMGW - Machine-To-Machine Gateway
NASA - National Aeronautics and Space Administration
NSIDC - National Snow and Ice Data Center
PDS - Production Distribution System
PO.DAAC - Physical Oceanography DAAC
RGB - red, green, blue
SCF - Science Computing Facility
SIPS - Science Investigator-led Processing Systems
SMMR - Scanning Multichannel Microwave Radiometer
SNOWI - Search 'N Order Web Interface
ssh - secure shell
SSM/I - Special Sensor Microwave/Imager
V0 - Version 0
Please see the Earth Observatory Glossary for a general list of earth science and remote sensing terms. The following terms are used in this document:
ancillary data - measurements from other sources or sensors used to calibrate remote sensing data.
archive - the transfer of ingested data onto a permanent storage device.
band - a.k.a. "channel"; a discrete range of wavelengths of the electromagnetic spectrum that a satellite remote sensing instrument measures.
browse image - a.k.a. "quick-look image"; a low-resolution image that gives the user a sense of what a data file contains.
channel - a.k.a. "band"; a discrete range of wavelengths of the electromagnetic spectrum that a satellite remote sensing instrument measures.
distribution - the transfer of archived data to users who request them.
electromagnetic spectrum - the entire array of electromagnetic radiation (e.g., ultraviolet, visible, infrared, and microwave).
ephemeris data - position and attitude data from the satellite during the time of remote sensing data acquisition.
false-color composite - an image that combines three bands that are not all in the visible portion of the spectrum.
FTP pull - an FTP session in which the user retrieves (or "pulls") data to his or her own computer.
FTP push - an FTP session in which data are automatically transferred (or "pushed") to a user-specified computer and directory path.
granule - the smallest data unit inventoried via ECS and distributed to users; typically, a granule is a single data file, though some granules may include multiple files.
ingest - the acquisition of data from external data providers.
Level-0 data - unprocessed remote sensing data.
Level-1 data - calibrated remote sensing data that have been referenced to time of acquisition and geographic location.
Level-2 data - geophysical data derived from Level-1 data.
Level-3 data - Level-2 data gridded to a uniform map projection.
Level-4 data - model outputs or results from scientific analyses derived from multiple measurements of lower-level remote sensing data.
metadata - information about remote sensing data that may include such things as time of acquisition, size, geographic coordinates, quality assessment, and other information that is important for a user of the data to know.
pixel size - a.k.a. "resolution"; the measure of the smallest object that a remote sensing instrument can "resolve," or view. This is captured in a remote sensing image as one picture element, or "pixel."
quick-look image - a.k.a. "browse image"; a low-resolution image that gives the user a sense of what a data file contains.
remote sensing - obtaining information about an object without actually coming into contact with it.
resolution - a.k.a. "pixel size"; the measure of the smallest object that a remote sensing instrument can "resolve," or view. This is captured in a remote sensing image as one picture element, or "pixel."
rolling archive - an archive that is continually updated with new data as they are ingested while data older than a specified age are continually purged.
servers - individual C++ programs that handle specific ECS tasks.
subset - to reduce a data file's contents to a specific geographic location and/or desired parameters within the data.
true-color composite - an image that combines three bands that measure light in the red, green, and blue portions of the electromagnetic spectrum.
volume - a tape of data in the archive.
volume group - a collection of tapes (or "volumes") in the archive that store a specific kind of data.