Updated May 2, 2018
General format requirements:
- Use ".csv" as the file extension
- Preferred: file names in snake case, i.e., lowercase with underscores: "example_file_name.csv"
- Single header row with descriptive column names
- Include one column that can be used as the unique identifier. The data type of that column can be either text or number.
- Do not include URLs in the data file.
- Columns delimited by commas
- No empty lines or rows
- Each row should have the same number of columns.
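
The structural requirements above can be checked automatically before a file is submitted. The following is a minimal sketch using only the Python standard library; the function name `check_csv_structure` and its `id_column` parameter are hypothetical names chosen for illustration and are not part of these guidelines. It assumes the whole file has already been read into a string.

```python
import csv
import io


def check_csv_structure(text, id_column):
    """Return a list of problems found in CSV text (empty list if none found)."""
    problems = []
    reader = csv.reader(io.StringIO(text))

    # The first row is treated as the single header row.
    header = next(reader, None)
    if not header:
        return ["missing header row"]

    if id_column in header:
        id_index = header.index(id_column)
    else:
        id_index = None
        problems.append(f"identifier column {id_column!r} not in header")

    seen_ids = set()
    # Data rows are numbered from 2 because the header is row 1.
    for row_number, row in enumerate(reader, start=2):
        # Blank lines and rows made only of delimiters count as empty.
        if not any(field.strip() for field in row):
            problems.append(f"row {row_number}: empty row")
            continue
        # Every row should have as many fields as the header.
        if len(row) != len(header):
            problems.append(
                f"row {row_number}: {len(row)} columns, expected {len(header)}"
            )
            continue
        # Identifier values must not repeat.
        if id_index is not None:
            value = row[id_index]
            if value in seen_ids:
                problems.append(f"row {row_number}: duplicate identifier {value!r}")
            seen_ids.add(value)
    return problems
```

For example, `check_csv_structure("site_id,depth_cm\nA1,5\nA2,10\n", "site_id")` returns an empty list, while a file containing a blank line, a row with an extra field, or a repeated identifier yields one message per problem.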
For spatial files:
- Latitude and longitude in separate columns with names: "latitude" and "longitude"
- Latitude and longitude in decimal degrees
- If coordinates are not in the WGS84 (EPSG:4326) CRS, use "CoordX" and "CoordY" as the column names
- If coordinates are not in WGS84 (EPSG:4326) CRS, an optional *.prj file (same as the Shapefile format) can be used to explicitly specify the CRS.
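
For files that do use WGS84, the "latitude" and "longitude" columns can be sanity-checked as decimal degrees. This is a rough sketch rather than a required procedure; the function name `check_coordinates` is hypothetical, and treating -9999 as a missing coordinate is an assumption carried over from the missing-data convention for numeric columns.

```python
import csv
import io

# Assumption: -9999 marks a missing coordinate, as for other numeric columns.
MISSING = -9999.0


def check_coordinates(text):
    """Return problems found in the latitude/longitude columns of CSV text."""
    problems = []
    # Valid decimal-degree ranges: latitude +/-90, longitude +/-180.
    limits = {"latitude": 90.0, "longitude": 180.0}
    rows = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(rows, start=2):
        for column, limit in limits.items():
            raw = row.get(column)
            if raw is None:
                problems.append(f"row {row_number}: no {column} value")
                continue
            try:
                value = float(raw)
            except ValueError:
                # Degrees-minutes-seconds strings and similar end up here.
                problems.append(
                    f"row {row_number}: {column} {raw!r} is not decimal degrees"
                )
                continue
            if value == MISSING:
                continue
            if not -limit <= value <= limit:
                problems.append(
                    f"row {row_number}: {column} {value} is outside +/-{limit}"
                )
    return problems
```

A value such as `37 52 17 N` is reported because it cannot be parsed as a decimal number, and a latitude of `95.0` is reported as out of range.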
Data content and formatting:
- Column names in snake case, i.e., lowercase with underscores instead of spaces: "example_column_name"
- Missing data for numeric columns = -9999, or -9999 followed by decimal 9s to match the column's level of precision (e.g. -9999.99)
- Missing data for text columns = "NA"
- If a text field includes commas or quotes, save the file with the option that places quotes around all text fields.
- Dates and times in UTC, using the 24-hour clock. Do not use local time zones unless UTC is also provided.
- Dates in YYYY-MM-DD format
- Times in hh:mm:ss (or hh:mm:ss+nn if time zone needs to be included) format
- Named sites and locations should have an associated geographic location
- Text and numeric data should not be mixed in the same column (see Best Practices for Preparing Environmental Data Sets to Share and Archive).
This is an example of mixed text and numeric data:
|4 to 5||25|
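
Several of the content rules above can be tested cell by cell. The sketch below is illustrative only: the helper names `is_iso_date`, `is_24h_time`, and `classify_column` are hypothetical, the time check covers only the plain hh:mm:ss form (not the hh:mm:ss+nn variant), and a cell counts as numeric if Python's `float()` can parse it.

```python
import re
from datetime import datetime


def is_iso_date(value):
    """True if value is a real calendar date written as YYYY-MM-DD."""
    # The regex enforces zero padding; strptime rejects impossible dates.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def is_24h_time(value):
    """True if value is a 24-hour time written as hh:mm:ss."""
    if not re.fullmatch(r"\d{2}:\d{2}:\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%H:%M:%S")
    except ValueError:
        return False
    return True


def classify_column(values):
    """Return "numeric", "text", "mixed", or "empty" for a list of cell strings."""
    kinds = set()
    for value in values:
        try:
            float(value)
            kinds.add("numeric")
        except ValueError:
            kinds.add("text")
    if not kinds:
        return "empty"
    if len(kinds) > 1:
        return "mixed"
    return kinds.pop()
```

With these helpers, `classify_column(["4 to 5", "25"])` returns `"mixed"`, matching the example above, while a numeric column that uses -9999 for missing data is still classified as `"numeric"`. A numeric column that uses "NA" for missing data would be reported as `"mixed"`, which is consistent with reserving "NA" for text columns.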