Introduction
At the request of field researchers, investigators, GIS and image specialists, and data managers, we prepared the following data management practices that data collectors and providers should follow to improve the usability of their data sets. We assembled what we feel are the most important practices that researchers should implement to make their data sets ready to share with other researchers and to submit to the ORNL DAAC for archiving and distribution.
This guidance is provided for those who perform environmental measurements, compile data from various sources, prepare GIS coverages, and compile remote sensing images for environmental applications, although many of the practices may be useful for other data collection and archiving activities. These practices could be performed at any time during the preparation of the data set, but we suggest that researchers plan for them before measurements are taken and implement them during measurements.
During Data Preparation
The ORNL DAAC offers the following Best Practices that investigators should perform to improve the usability of their data.
- Define the Contents of Your Data Files
- Use Consistent Data Organization
- Use Consistent File Structure and Stable Formats For Tabular and Image Data
- Assign Descriptive File Names
- Perform Basic Quality Assurance
- Assign Descriptive Data Set Titles
- Provide Documentation
A version of these Best Practices was published in Cook et al. (2001).
To archive and distribute data sets, we need metadata, which is information about the data we distribute. Metadata is used both to describe the data so that others can understand what it represents and to find data of interest. Metadata can be in the form of a document or a specially formatted list of the parameters, keywords, spatial and temporal extent, investigators, and other information about the data set. Please contact the DAAC by e-mail for assistance in preparing and formatting metadata.
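As a minimal sketch of what a "specially formatted list" could look like, the Python below checks that a metadata record carries the items named above and writes it out as plain "field: value" lines. The field names, the sample values, and the `format_metadata` helper are all hypothetical illustrations, not a format prescribed by the ORNL DAAC.

```python
# Hypothetical field names for illustration only; not an ORNL DAAC schema.
REQUIRED_FIELDS = [
    "title",
    "investigators",
    "parameters",
    "keywords",
    "spatial_extent",
    "temporal_extent",
]


def format_metadata(record):
    """Return the record as 'field: value' lines, one per required field.

    Raises ValueError if any required field is missing or empty.
    """
    missing = [name for name in REQUIRED_FIELDS if not record.get(name)]
    if missing:
        raise ValueError("missing metadata fields: " + ", ".join(missing))

    lines = []
    for name in REQUIRED_FIELDS:
        value = record[name]
        # Join multi-valued fields (e.g. several keywords) with semicolons.
        if isinstance(value, (list, tuple)):
            value = "; ".join(str(item) for item in value)
        lines.append(f"{name}: {value}")
    return "\n".join(lines)


# Made-up sample values, used only to exercise the helper.
example = {
    "title": "Example soil respiration measurements",
    "investigators": ["A. Researcher", "B. Collaborator"],
    "parameters": ["soil CO2 flux", "soil temperature"],
    "keywords": ["carbon cycle", "soil"],
    "spatial_extent": "hypothetical bounding box: 35.0N to 36.0N, 84.0W to 83.0W",
    "temporal_extent": "hypothetical range: 1999-01-01 to 1999-12-31",
}

print(format_metadata(example))
```

Checking for missing fields before writing anything is one way to catch incomplete documentation early, while the investigator still remembers the details.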
The metadata accompanying your data should be written for a user 20 years into the future -- what does that person need to know to use your data properly? Prepare the metadata for a user who is unfamiliar with your project, methods, or observations.
A small amount of time invested in documenting your data will save money in the future. Data producers and users cannot afford to be without documented data. The initial expense of documenting data is clearly outweighed by the potential costs of duplicated or redundant data generation.
See the Best Practices section on Provide Documentation for a description of the metadata that should be in the data set documentation.
Submitting Data to the ORNL DAAC
All of the holdings at the ORNL DAAC are organized into what are called "data sets," a term used loosely to include all data archived at the DAAC. A data set includes all the information associated with a single research effort (typically the same investigator(s), same methods, possibly several sites or years). Data set components consist of the data and metadata stored in digital files. These files contain tabular data, spatial data, or companion information.
Tabular data sets present and store your research results in list or spreadsheet style. These data will be stored in our archives as American Standard Code for Information Interchange (ASCII) text in your original column/row arrangement. Using ASCII file formats will ensure that your data are readable in the future.
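The following Python sketch shows one way to write a small table as ASCII text while keeping the original column/row arrangement. It assumes a comma-delimited layout; the `write_ascii_table` helper, the column names, and the sample values are hypothetical and are not taken from any DAAC data set.

```python
import csv
import io


def write_ascii_table(rows, header, stream):
    """Write the header and rows to stream as comma-delimited ASCII text.

    Raises UnicodeEncodeError if any cell contains a non-ASCII character.
    """
    writer = csv.writer(stream, lineterminator="\n")
    for row in [header] + list(rows):
        cells = [str(cell) for cell in row]
        for cell in cells:
            # Encoding to ASCII fails loudly on characters outside ASCII.
            cell.encode("ascii")
        writer.writerow(cells)


# Made-up column names and values, used only to exercise the helper.
header = ["site", "date", "discharge_m3_per_s"]
rows = [
    ["SITE_A", "1990-01-15", 12.5],
    ["SITE_A", "1990-02-15", 14.0],
]

buffer = io.StringIO()
write_ascii_table(rows, header, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, the same helper can be passed a file opened with `open(path, "w", encoding="ascii", newline="")`.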
- Example of Tabular Data: River Discharge (RivDIS)
- Example of Image Data: 1-Degree Land Cover Data Set, Southern Africa Subset
- Examples of Companion Information:
If you are ready to start preparing data sets to archive, please contact the DAAC for assistance.
Courtesy of American Scientist (Vol. 886, p. 525)