Data Management Overview
Welcome to the data management pages for data providers to the ORNL Distributed Active Archive Center (DAAC). These pages provide an overview of data management planning and preparation and offer practical methods to successfully share and archive your data.
- Plan – write a short data management plan while preparing your research proposal,
- Manage – assign logical, descriptive file names, define the contents of your data files, and use consistent data values when preparing your data,
- Archive – create metadata and documentation while finalizing your data to enhance search visibility and usability, and
- DAAC Curation – submit your data to the DAAC for active archival and use by the scientific community.
Benefits of Good Data Management Practices
Why should you worry about good data management practices? Here are some short- and long-term benefits:
- Short-term
  - Spend less time doing data management and more time doing research
  - Prepare and use your own data more easily
  - Collaborators can readily understand and use data files
- Long-term (data publication)
  - Scientists outside your project can find, understand, and use your data to address broad questions
  - You get credit for archived data products and their use in other papers
  - Sponsors protect their investment
View a webinar on the benefits of good data management practices.
Data Management Best Practices
The ORNL DAAC has developed data management best practices for preparing environmental datasets for sharing and archival.
Click on a best practice for more information:
- Define the contents of your data files
- Assign descriptive dataset titles
- Assign descriptive file names
- Use consistent data organization
- Use stable file formats
- Preserve information
- Protect your data
- Provide documentation and metadata
- Perform basic quality assurance
Data management best practices
Follow these practices to improve your dataset's accessibility and usability. They can be applied at any time during dataset preparation, but are most useful when considered during project planning and implemented during data collection. They need not be completed sequentially.
Assign descriptive file names
Names should contain only numbers, letters, dashes, and underscores. Ideally, names may contain the project acronym, study title, location, investigator, year(s) of study, data type, version number, and file type.
Jump to Assign descriptive file names for more information.
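The naming rule above can be checked automatically. The following is a minimal sketch, not an ORNL DAAC tool: the function names, the component order, and the sample values are hypothetical, and it assumes that a single period before the file extension is acceptable.

```python
import re

# Assumption: a base name of letters, digits, dashes, and underscores,
# followed by one period and an alphanumeric extension.
VALID_NAME = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9]+")


def is_safe_name(filename: str) -> bool:
    """Return True if the file name uses only the recommended characters."""
    return VALID_NAME.fullmatch(filename) is not None


def build_file_name(project, location, years, data_type, version, ext):
    """Join descriptive components with underscores (hypothetical ordering)."""
    name = "_".join([project, location, years, data_type, f"v{version}"]) + "." + ext
    if not is_safe_name(name):
        raise ValueError(f"file name contains disallowed characters: {name!r}")
    return name


if __name__ == "__main__":
    print(build_file_name("proj", "site1", "2019-2021", "soil-temp", 1, "csv"))
    print(is_safe_name("my data (final).csv"))
```

A check like this could run before files are packaged for submission, so that spaces and special characters are caught early.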
Use consistent data organization
One organizational style is multiple rows, each with comma-separated values. Another style is individual columns for each value. Be sure to provide a definition for all coded values or abbreviations.
Jump to Use consistent data organization for more information.
Define the contents of your data files
Provide names, units of measure, formats, and definitions of coded values. Be consistent.
Jump to Define the contents of your data files for more information.
Preserve information
Save the raw data file, with no transformations or analyses, as "read-only". Use a scripting language to process the data and write the results to a separate file.
Jump to Preserve information for more information.
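One way to follow this practice is sketched below. It is illustrative only: the column names `temp_c` and `temp_k`, the unit conversion, and both function names are hypothetical. The raw file has its write permission removed, and the script writes derived values to a separate file.

```python
import csv
import os
import stat


def make_read_only(path):
    """Remove write permission so the raw file is not edited by accident."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def process_raw(raw_path, out_path):
    """Read the raw file and write a processed copy; the raw file is only read."""
    with open(raw_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        # Hypothetical transformation: add a Kelvin column next to Celsius.
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames + ["temp_k"])
        writer.writeheader()
        for row in reader:
            row["temp_k"] = f"{float(row['temp_c']) + 273.15:.2f}"
            writer.writerow(row)
```

Because the transformation lives in a script, it can be rerun or corrected later without touching the original measurements.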
Use stable file formats
Text-based comma-separated values (CSV) files are ideal. Avoid proprietary formats that may not be readable in the future.
Jump to Use stable file formats for more information.
Perform basic quality assurance
Check that there are no missing values for key parameters. Scan and/or plot for impossible and anomalous values. Perform and review statistical summaries.
Jump to Perform basic quality assurance for more information.
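The three checks above (missing values, impossible values, statistical summaries) can be sketched in a few lines. This is an illustration under stated assumptions: the function name, the missing-value codes, and the valid range are hypothetical and should be replaced with values appropriate to your data.

```python
import csv
import statistics


def basic_qa(lines, column, valid_min, valid_max, missing_codes=("", "NA", "-9999")):
    """Scan one numeric column of CSV text for missing and suspect values.

    `lines` is any iterable of CSV lines, such as an open text file.
    Returned line numbers count the header as line 1.
    """
    missing, suspect, values = [], [], []
    for line_no, row in enumerate(csv.DictReader(lines), start=2):
        raw = (row.get(column) or "").strip()
        if raw in missing_codes:
            missing.append(line_no)
            continue
        try:
            value = float(raw)
        except ValueError:
            suspect.append(line_no)  # non-numeric entry in a numeric column
            continue
        if not valid_min <= value <= valid_max:
            suspect.append(line_no)  # outside the plausible range
        values.append(value)  # kept in the summary so anomalies show in min/max
    summary = None
    if values:
        summary = {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
    return {"missing": missing, "suspect": suspect, "summary": summary}
```

For a file on disk, a call might look like `basic_qa(open("data.csv", newline=""), "temp_c", -60, 60)`, where the file and column names are placeholders. Reviewing the summary alongside a plot remains useful, since a value can be within range and still be anomalous.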
Provide documentation
Consider what a future investigator needs to know in order to obtain and use your data.
Jump to Provide documentation for more information.
Protect your data
Ensure that file transfers are done without error by comparing checksums before and after transfers. Create and test back-up copies often to prevent the disaster of lost data.
Jump to Protect your data for more information.
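Comparing checksums before and after a transfer can be done with Python's standard `hashlib` module. The sketch below is one possible approach: the function names are hypothetical, and SHA-256 is an assumed choice of algorithm, since the text does not name one.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(source_path, copy_path):
    """Return True if the source file and its copy have identical checksums."""
    return file_checksum(source_path) == file_checksum(copy_path)
```

Recording the checksum of each file before transfer also lets the recipient verify the copy without access to the original.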
Assign descriptive dataset titles
A descriptive title should briefly describe your dataset. It helps researchers search for your dataset and identify it as pertinent and useful for future research.
Jump to Assign descriptive dataset titles for more information.
Models are a significant aspect of environmental science. Model products contain the methodological detail of numerical modeling studies. These best practices and our recommendations for model archival should be followed for the archival of model products.
More details of our Data Management Best Practices can be found in Best Practices for Preparing Environmental Data Sets to Share and Archive, published by ORNL DAAC in 2010, and Environmental Data Management Best Practices Part 1 Tabular Data and Part 2 Geospatial Data, from the NASA Earthdata webinar series and presented by ORNL DAAC staff (hosted on YouTube).