Updated May 2, 2018
General format requirements:
- Use ".csv" as the file extension
- Preferred: file names in snake case, i.e., lowercase with underscores: "example_file_name.csv"
- Single header row with descriptive column names
- Include one column that can be used as the unique identifier. The data type of that column can be either text or number.
- Do not include URLs in the data file.
- Columns delimited by commas
- No empty lines or rows
- Each row should have the same number of columns.
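
Several of the requirements above can be checked automatically before a file is submitted. The following is a minimal sketch in Python using only the standard library. The function name `check_csv_structure`, the sample data, and the column name `site_id` are hypothetical and chosen for illustration only.

```python
import csv
import io


def check_csv_structure(text, id_column):
    """Return a list of problems found in CSV text (empty list if none)."""
    problems = []
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if not header:
        return ["missing header row"]
    if id_column not in header:
        problems.append(f"identifier column {id_column!r} not in header")
    id_index = header.index(id_column) if id_column in header else None
    seen_ids = set()
    for row in reader:
        line = reader.line_num  # physical line number of the row just read
        # A blank line is returned by csv.reader as an empty list.
        if not row or all(cell.strip() == "" for cell in row):
            problems.append(f"line {line}: empty row")
            continue
        # Every row should have as many columns as the header.
        if len(row) != len(header):
            problems.append(
                f"line {line}: {len(row)} columns, expected {len(header)}"
            )
            continue
        # The identifier column should not repeat a value.
        if id_index is not None:
            value = row[id_index]
            if value in seen_ids:
                problems.append(f"line {line}: duplicate identifier {value!r}")
            seen_ids.add(value)
    return problems


# Hypothetical sample with an empty line, a short row, and a repeated identifier.
sample = "site_id,site_name,depth_m\n1,north_pond,4\n\n2,south_pond\n1,east_pond,6\n"
for problem in check_csv_structure(sample, "site_id"):
    print(problem)
```

The sketch does not check the file extension, the file name, or the presence of URLs; those checks would need to be added separately.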
For spatial files:
- Latitude and longitude in separate columns with names: "latitude" and "longitude"
- Latitude and longitude in decimal degrees
- If coordinates are not in the WGS84 (EPSG:4326) CRS, use "CoordX" and "CoordY" as the column names
- If coordinates are not in WGS84 (EPSG:4326) CRS, an optional *.prj file (same as the Shapefile format) can be used to explicitly specify the CRS.
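
As a rough illustration of the decimal-degree requirement, the sketch below checks that a latitude and longitude pair parses as numbers within the valid ranges. The function name `valid_decimal_degrees` and the sample values are hypothetical. A passing range check does not prove that the coordinates are in WGS84; it only catches values such as degrees-minutes-seconds strings or projected coordinates that fall outside the ranges.

```python
def valid_decimal_degrees(latitude, longitude):
    """True if both values parse as numbers within decimal-degree ranges."""
    try:
        lat, lon = float(latitude), float(longitude)
    except (TypeError, ValueError):
        # Not numeric, e.g. a degrees-minutes-seconds string.
        return False
    # NaN fails both comparisons, so it is rejected as well.
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


print(valid_decimal_degrees("44.5646", "-123.2620"))       # decimal degrees
print(valid_decimal_degrees("44 33 52 N", "123 15 43 W"))  # not decimal degrees
print(valid_decimal_degrees("4934567.0", "478123.0"))      # likely projected
```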
Data content and formatting:
- Column names in snake case, i.e., lowercase with underscores instead of spaces: "example_column_name"
- Missing data for numeric columns = -9999, with additional decimal 9s where needed to match the precision of the column
- Missing data for text columns = "NA"
- If a text field includes commas or quotes, save the file with the option of placing quotes around all text fields.
- Dates and times in UTC, using the 24-hour clock. No local time zones unless UTC is also provided.
- Dates in YYYY-MM-DD format
- Times in hh:mm:ss (or hh:mm:ss+nn if time zone needs to be included) format
- Named sites and locations should have an associated geographic location
- Text and numeric data should not be mixed in the same column (see Best Practices for Data Management).
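
Several of the formatting rules above (quoted text fields, missing-value codes, UTC dates and times) can be applied when the file is written. The following is a minimal sketch using Python's standard `csv` module. The column names, the sample rows, and the UTC-7 local offset are assumptions made for illustration.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Hypothetical local observation time (UTC-7), converted to UTC before writing.
local = datetime(2018, 5, 2, 14, 30, 0, tzinfo=timezone(timedelta(hours=-7)))
utc = local.astimezone(timezone.utc)

rows = [
    ["site_id", "site_name", "sample_date", "sample_time", "depth_m"],
    # Text containing a comma and quotes; date as YYYY-MM-DD, time as hh:mm:ss.
    [1, 'Pond "A", north', utc.strftime("%Y-%m-%d"), utc.strftime("%H:%M:%S"), 4.5],
    # Missing text is written as "NA", missing numeric data as -9999.
    [2, "NA", "2018-05-03", "08:15:00", -9999],
]

buffer = io.StringIO()
# QUOTE_NONNUMERIC places quotes around every non-numeric field.
writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

With this quoting option, embedded quotes are doubled inside the quoted field, which is the convention most CSV readers expect.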
This is an example of mixed text and numeric data:
estimated_depth | shrub_cover |
---|---|
4 | 30 |
6 | > 75 |
4 to 5 | 25 |
5 | 65 |
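
Columns like these can be detected automatically. The sketch below, in Python, flags any column that holds both numeric and non-numeric values. The function names `is_number` and `find_mixed_columns` are hypothetical, and the sample reproduces the table above as comma-delimited text. Note that `float()` also accepts strings such as "nan" and "inf", so this is only an approximate test of what counts as numeric.

```python
import csv
import io


def is_number(value):
    """True if the string parses as a float."""
    try:
        float(value)
    except ValueError:
        return False
    return True


def find_mixed_columns(text):
    """Return names of columns holding both numeric and non-numeric values."""
    reader = csv.DictReader(io.StringIO(text))
    kinds = {name: set() for name in (reader.fieldnames or [])}
    for row in reader:
        for name in kinds:
            value = (row.get(name) or "").strip()
            if value == "":
                continue  # empty cells are ignored here
            kinds[name].add("number" if is_number(value) else "text")
    return [name for name, seen in kinds.items() if len(seen) > 1]


sample = "estimated_depth,shrub_cover\n4,30\n6,> 75\n4 to 5,25\n5,65\n"
print(find_mixed_columns(sample))
```

Both columns are flagged: "4 to 5" and "> 75" are text values in otherwise numeric columns. Because "NA" is treated as text, a numeric column that uses "NA" instead of -9999 for missing data is flagged as well.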