Documentation Revision Date: 2025-03-04
Dataset Version: 1
Summary
This dataset holds 904,472 audio recordings in waveform audio file (WAV) format, one compressed zip archive holding 5,753 photographs in JPEG format, and one file in comma-separated values (CSV) format.
Citation
Clark, M., L. Salas, R. Snyder, A. Lee, A.A. Turner, C. Seymour, J. Measey, A. Ferraz, F.D. Schneider, and F.O. Adegbola. 2025. BioSCape: BioSoundSCape Acoustic Recordings, South Africa, 2023. ORNL DAAC, Oak Ridge, Tennessee, USA. https://doi.org/10.3334/ORNLDAAC/2372
Table of Contents
- Dataset Overview
- Data Characteristics
- Application and Derivation
- Quality Assessment
- Data Acquisition, Materials, and Methods
- Data Access
- References
Dataset Overview
This dataset holds in situ sound recordings from sites in the Greater Cape Floristic Region (GCFR), South Africa, from June to December 2023. The recordings were collected as part of the Biodiversity Survey of the Cape (BioSCape) project, a multi-agency, NASA-led research project that integrates airborne imaging spectroscopy and lidar with a suite of measurements of biodiversity. BioSoundSCape is a BioSCape subproject seeking to relate ground-based measures of bioacoustic diversity to remote imagery. AudioMoth recorders were deployed at sites for 4 to 10 days of data collection (median = 7) and programmed to record 1 min of every 10, thus providing temporal sampling through day and night. Each recording was saved in waveform audio file (WAV) format with 16-bit digitization depth and a 48 kHz sampling rate. The recordings contain a wide range of environmental sounds such as biophony (e.g., birds, frogs, insects), anthropophony (e.g., automobiles, airplanes), and geophony (e.g., wind, rain). Sampling locations were stratified with respect to elevation, broad land use/land cover types, and time since wildfire disturbance. Most sites were within protected fynbos and Afromontane forest ecosystems. There were 538 sites in the wet season and 543 sites in the dry season, with most sites co-located between seasons. All sites were located within AVIRIS-NG hyperspectral acquisitions and 61% of sites were in LVIS lidar acquisitions. The dataset includes site information in tabular form and photographs of field sites.
Project: Biodiversity Survey of the Cape (BioSCape)
The Biodiversity Survey of the Cape (BioSCape) is an international collaboration between South Africa and the United States to study biodiversity in South Africa’s Greater Cape Floristic Region (GCFR). The GCFR was selected because it contains two exceptional hotspots of both terrestrial and aquatic biodiversity. The GCFR is listed among the World’s 200 Significant Ecoregions. BioSCape is an integrated field and airborne campaign occurring in 2023. The campaign will collect UV/visible to short wavelength infrared (UVSWIR) and thermal imaging spectroscopy and laser altimetry (lidar) data over terrestrial and aquatic targets using four airborne instruments: Airborne Visible InfraRed Imaging Spectrometer - Next Generation (AVIRIS-NG), Portable Remote Imaging SpectroMeter (PRISM), Land, Vegetation, and Ice Sensor (LVIS), and Hyperspectral Thermal Emission Spectrometer (HyTES). The anticipated airborne data set is unique in its size and scope and unprecedented in its instrument combination and level of detail. These airborne data will be accompanied by a range of biodiversity-related field observations. BioSCape’s primary objective is to understand the structure, function, and composition of the region’s ecosystems, and to learn about how and why they are changing in time and space.
Related Publications:
Cardoso, A.W., E. Hestir, J. Slingsby, C. Forbes, G. Moncrieff, W. Turner, A. Skowno, J. Nesslage, P. Brodrick, K. Gaddis, and A. Wilson. 2024. The Biodiversity Survey of the Cape (BioSCape): towards inclusive international biodiversity science. Preprint. https://doi.org/10.5281/zenodo.11126718
Turner, A.A., M.L. Clark, L. Salas, C. Seymour, R.L. Snyder, A.T.K. Lee, A. Ferraz, F. Schneider, J. Measey, J. Huisamen, D. Cloete, S.D. Hofmeyr, C. Hagen, D.F. Leland, W. Schackwitz, F. Adegbola, E. Hahndiek, G.S. Joseph, J. Van Rooi, M. Fuchs, S. Thomas, S. Madlala, J. Spiby, and P. Taljaard. 2024. BioSoundSCape: a bioacoustic dataset for the Fynbos biome. (in preparation).
Acknowledgements:
The authors thank the 31 volunteers and CapeNature Protected Area rangers who deployed and retrieved AudioMoths over two field campaigns. CapeNature and SANParks provided access to their protected areas.
The BioSCape Project is a multi-agency, NASA-led research project that integrates airborne imaging spectroscopy (AVIRIS-NG) and lidar (LVIS) with a suite of surface measures of biodiversity. Funding support for BioSoundSCape was provided by NASA grants 80NSSC22K0830 and 80NSSC23K1459. BioSCape was supported by NASA grant 80NSSC21K0086, the South African government (NRF/SAEON), and the United Nations Educational, Scientific and Cultural Organization (UNESCO).
Data Characteristics
Spatial Coverage: Greater Cape Floristic Region (GCFR), South Africa
Spatial Resolution: Point - sounds within 50 m of audio recorder locations
Temporal Coverage: 2023-06-05 to 2023-12-16
Temporal Resolution: Recordings of 1-min duration taken every 10 min for 4-10 day periods (median = 7 days)
Site Boundaries: Latitude and longitude are given in decimal degrees.
Site | Westernmost Longitude | Easternmost Longitude | Northernmost Latitude | Southernmost Latitude |
---|---|---|---|---|
Greater Cape Floristic Region, South Africa | 18.0123 | 23.9154 | -31.3666 | -34.8155 |
Data File Information
This dataset holds 904,472 audio recordings in waveform audio file (WAV) format, one compressed zip archive holding 5,753 photographs in JPEG format, and one file in comma-separated values (CSV) format.
The file naming convention for the audio recordings is <device>_<begin_date>_<rec_time>.WAV, where
- <device> = 8-character device number (e.g., s2lam001)
- <begin_date> = date when recordings began at site in YYMMDD
- <rec_time> = start date and time for 1-minute recording in YYYY-MM-DD_hh-mm
The combination of <device> and <begin_date> uniquely identifies each survey site, since each device can only be at one location on a specific date.
Example file name: s2lam001_230713_2023-07-13_12-30.WAV (sampling with device s2lam001, which began on 2023-07-13 at location s2lam001_230713; this specific recording started at 12:30 pm on 2023-07-13).
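The naming convention above can be parsed programmatically to recover the site identifier and the recording start time. The following is a minimal sketch using only the Python standard library; the names `Recording` and `parse_recording_name` are hypothetical helpers introduced for illustration, and the returned timestamps are naive because this guide does not state a time zone for the file names.

```python
import re
from datetime import datetime
from typing import NamedTuple


class Recording(NamedTuple):
    device: str           # 8-character device number, e.g. "s2lam001"
    site_id: str          # <device>_<begin_date>, identifies the survey site
    begin_date: datetime  # date when recordings began at the site
    start: datetime       # start of this 1-minute recording (naive, no time zone)


# Pattern follows <device>_<begin_date>_<rec_time>.WAV as documented above.
_NAME_RE = re.compile(
    r"^(?P<device>[A-Za-z0-9]{8})_(?P<begin>\d{6})_"
    r"(?P<rec>\d{4}-\d{2}-\d{2}_\d{2}-\d{2})\.WAV$",
    re.IGNORECASE,
)


def parse_recording_name(name: str) -> Recording:
    """Split a recording file name into its documented components."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    device = match.group("device")
    begin = match.group("begin")
    return Recording(
        device=device,
        site_id=f"{device}_{begin}",
        begin_date=datetime.strptime(begin, "%y%m%d"),
        start=datetime.strptime(match.group("rec"), "%Y-%m-%d_%H-%M"),
    )


rec = parse_recording_name("s2lam001_230713_2023-07-13_12-30.WAV")
print(rec.site_id, rec.start.isoformat())  # s2lam001_230713 2023-07-13T12:30:00
```

The `site_id` value has the same form as the SiteID column of BioSCape_acoustic_sites.csv, so it can serve as a join key between recordings and site attributes.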
The file BioSCape_acoustic_sites.csv holds information about each site (Table 1). Missing data are coded as "NA" or "-9999" for text and numeric fields, respectively.
The compressed zip archive BioSCape_acoustic_sites_photos.zip holds 5,753 photographs taken at study sites. Images include the mounted recorder, views from the recorder location toward the north, east, south, and west cardinal directions, and upward views for sites with forest canopy.
Within the archive, the file naming convention for these photographs is <device>_<begin_date>_<photo>.jpg (e.g., s2lam001_230713_west.jpg), where
- <photo> = "mounted" (an image of the recorder), the viewing direction "north", "east", "south", or "west", or "upward".
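Photograph names can be grouped by site in the same way. The sketch below is illustrative only: `parse_photo_name` and `views_by_site` are hypothetical helpers, the second example name is constructed from the convention rather than taken from the archive, and the internal folder layout of the zip archive is not described in this guide, so the code looks only at the final path component.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Allowed values of <photo> as documented above.
PHOTO_VIEWS = ("mounted", "north", "east", "south", "west", "upward")


def parse_photo_name(name):
    """Return (site_id, view) for a name like s2lam001_230713_west.jpg."""
    path = PurePosixPath(name)
    parts = path.stem.split("_")
    if path.suffix.lower() != ".jpg" or len(parts) != 3:
        raise ValueError(f"unexpected photo name: {name!r}")
    device, begin_date, view = parts
    if view not in PHOTO_VIEWS:
        raise ValueError(f"unexpected photo view: {view!r}")
    return f"{device}_{begin_date}", view


def views_by_site(names):
    """Group the available photo views by site identifier."""
    grouped = defaultdict(list)
    for name in names:
        site_id, view = parse_photo_name(name)
        grouped[site_id].append(view)
    return dict(grouped)


# With the real archive, names could come from
# zipfile.ZipFile("BioSCape_acoustic_sites_photos.zip").namelist(),
# after skipping any directory entries (names ending in "/").
example = ["s2lam001_230713_west.jpg", "s2lam001_230713_mounted.jpg"]
print(views_by_site(example))  # {'s2lam001_230713': ['west', 'mounted']}
```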
Table 1. Variables in BioSCape_acoustic_sites.csv.
Variable | Description |
---|---|
SiteID | Unique site ID, given as s2lamXXX_YYMMDD where s2lamXXX is the AudioMoth unit ID, and YYMMDD is the date of deployment. Most locations were sampled in both the wet and dry season of 2023. These sites have unique SiteIDs. Paired samples are tracked with the "PairID" field. |
AVIRIS | "Y" = located in a BioSCape AVIRIS-NG flight box; otherwise "N" (not in flight box) |
LVIS | "Y" = located in a BioSCape LVIS flight box; otherwise "N" (not in flight box) |
WetlandType | Wetland class: "Channelled valley-bottom", "Depression", "Estuary", "Floodplain", "River", "Seep", or "Wetland Flat". Classes were obtained from the NBA2018_National_Wetland_map5 GIS layer. |
LandCoverClass | Land cover class: "Forest-Woodland", "Grassland", "Shrubland", "Shrubland-Fynbos", "Shrubland-Karoo", "Shrubland-Karoo-Nama", "Bare", "Wetland", "Water", "Agriculture", "Urban-Residential", "Roads & Rail", or "Excluded" (planted forests, mines, and landfills). Classes were derived from the 2020 South African National Land Cover (SANLC) GIS layer. |
ElevationClass | Site elevation: "Low 0-500 m", "Medium 500-1000 m", or "High >1000 m" derived from SRTM DEM layer. |
FireClass | Fire history: "No data or No Fire", "1-to-6 years", "6-to-12 years", "12-to-17 years", "17-to-25 years", or "25+ years". Classes were derived from the CapeNature "veld age" GIS layer; these data are only available for sites in CapeNature reserves. |
FieldWetlandType | Wetland type from field: "Not near (100 m) any water", "Within 50 m of a river or stream", "In or near (25 m) wetland", or "In or near (10 m) seep". This field was only available for the dry season campaign and was not always entered. Entered by rangers/volunteers at the site. |
FieldVeldAge | An estimate of the age of the site, mainly for fynbos sites: "Young (burned, <6yrs)", "Intermediate (6-17 years)", and "Old (>17 yrs)". This field was only available for the dry season campaign and was not always entered. Entered by rangers/volunteers at the site. |
FieldAliensWithin20m | An estimate of alien plant species infestation: "Rare to Very Scattered", "Scattered to Medium", "Dense to Closed", or "None". This field was only available for the dry season campaign and was not always entered. Entered by rangers/volunteers at the site. |
Latitude | Latitude in decimal degrees, datum WGS 84 |
Longitude | Longitude in decimal degrees, datum WGS 84 |
PairID | Unique ID that indicates which wet and dry season sites belong to the same pair |
RecordingNum | Number of audio recordings available for the site |
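When reading the site table, the documented missing-data codes ("NA" for text fields, "-9999" for numeric fields) should be converted before analysis, and the PairID field can then be used to match wet and dry season deployments. The following is a rough sketch using the standard `csv` module. The sample rows, including the "P001" PairID format, are invented for illustration and are not records from the dataset; `clean_row` and `load_sites` are hypothetical helper names.

```python
import csv
import io
from collections import defaultdict

# Missing-data codes documented for BioSCape_acoustic_sites.csv.
MISSING_CODES = {"NA", "-9999"}
# Columns from Table 1 that this sketch converts to numbers.
NUMERIC_FIELDS = {"Latitude", "Longitude", "RecordingNum"}

# Hypothetical rows for illustration only; not real records from the dataset.
SAMPLE_CSV = """SiteID,Latitude,Longitude,PairID,RecordingNum,FieldVeldAge
s2lam001_230713,-33.9,18.5,P001,1008,NA
s2lam001_231102,-33.9,18.5,P001,-9999,Old (>17 yrs)
s2lam002_230713,-34.0,18.6,NA,990,NA
"""


def clean_row(row):
    """Map documented missing codes to None and convert numeric fields."""
    cleaned = {}
    for key, value in row.items():
        value = value.strip()
        if value in MISSING_CODES:
            cleaned[key] = None
        elif key in NUMERIC_FIELDS:
            cleaned[key] = float(value)
        else:
            cleaned[key] = value
    return cleaned


def load_sites(handle):
    """Read site rows from an open text handle."""
    return [clean_row(row) for row in csv.DictReader(handle)]


rows = load_sites(io.StringIO(SAMPLE_CSV))

# Group SiteIDs by PairID to recover wet/dry season pairs.
pairs = defaultdict(list)
for row in rows:
    if row["PairID"] is not None:
        pairs[row["PairID"]].append(row["SiteID"])

print(dict(pairs))  # {'P001': ['s2lam001_230713', 's2lam001_231102']}
```

For the real file, `load_sites(open("BioSCape_acoustic_sites.csv", newline=""))` would replace the in-memory sample. If numeric missing values are written in another form (for example "-9999.0"), the comparison would need to be adjusted.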
Application and Derivation
Passive acoustic monitoring of the environment can provide information on overall ecosystem status and change as well as on sound-producing wildlife, including birds, amphibians, insects, and mammals (Balantic and Donovan, 2020; Gibb et al., 2019). Bioacoustic analysis allows automatic detection of bird presence with greater sampling in time and space than with traditional bird observations (Campos-Cerqueira and Aide, 2016; Furnas and Callas, 2015), removes the influence of human presence on animal vocalization during sampling, and reduces individual observer bias. Autonomous recording units and bioacoustic analysis have been used to monitor bird diversity and the broad soundscape components of anthropophony (e.g., cars, airplanes), geophony (e.g., wind, rain), and biophony (e.g., birds, insects, mammals) at landscape scales (Clark et al., 2023; Quinn et al., 2024; Snyder et al., 2022). Field surveys may be combined with remotely sensed data on land cover and habitat structure to study the distribution of species (e.g., Burns et al., 2020).
Quality Assessment
Invalid recordings were removed from this collection.
Data Acquisition, Materials, and Methods
The Greater Cape Floristic Region (GCFR) is a distinctive and biologically diverse area in southwestern South Africa that includes the Fynbos biome, Succulent Karoo biome, and patches of AfroTemperate Forest in the east and in fire refuges (MacPherson et al., 2019; Figure 2). It is a biodiversity hotspot experiencing significant anthropogenic and climate-driven challenges. The Biodiversity Survey of the Cape (BioSCape) project is testing the potential for using remotely sensed data for measuring biological diversity. In situ data collection was coordinated with overflights of four airborne instruments acquiring hyperspectral imagery in the UVSWIR and thermal ranges as well as laser altimetry (lidar).
BioSoundSCape is a BioSCape sub-project seeking to relate ground-based measures of bioacoustic diversity to the spectral and structural information from remotely sensed data. AudioMoth devices (Hill et al., 2019) versions 1.1.0 and 1.2.0 were deployed for environmental sound recording (Figure 1). They typically record sounds from within a 50-m radius, but this distance can vary with topography, geophony, and vegetation structure (Somervuo et al., 2023).
Site selection and AudioMoth deployment
BioSoundSCape team members, volunteers, and Protected Area rangers placed 2 to 15 AudioMoths at clusters of sample sites in 71 winter (wet season, June-August 2023) and 64 spring (dry season, October-December 2023) deployments. Sound data were collected at 569 unique sample sites: 539 sites in the wet season and 535 sites in the dry season, with 505 of these sites co-located as wet-dry season pairs.
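The site counts in this paragraph are related by simple inclusion-exclusion: a co-located location appears once in each season, so it is subtracted once when counting unique locations. The short check below is a sketch under that reading of the text; the variable names are illustrative.

```python
# Counts as stated in the paragraph above.
wet_sites = 539          # sites sampled in the wet season
dry_sites = 535          # sites sampled in the dry season
paired_locations = 505   # locations sampled in both seasons

# Each paired location is counted in both seasonal totals, so subtract it once.
unique_locations = wet_sites + dry_sites - paired_locations

wet_only = wet_sites - paired_locations  # sampled in the wet season only
dry_only = dry_sites - paired_locations  # sampled in the dry season only

print(unique_locations)     # 569
print(wet_only, dry_only)   # 34 30
```

Because wet and dry season deployments at the same location receive different SiteIDs, the number of seasonal deployments (wet plus dry) is larger than the number of unique locations.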
Most sampling locations were co-located with existing vegetation or bird survey plots previously established by the BioSCape team. The remaining locations were selected using randomized stratified sampling within flight boxes (footprints) of AVIRIS-NG imagery. Stratification was based on land cover classes derived from the 2020 South African National Land Cover (SANLC) GIS layer: Forest-Woodland, Grassland, Shrubland, Shrubland-Fynbos, Shrubland-Karoo, Shrubland-Karoo-Nama, Bare, Wetland, Water, Agriculture, Urban-Residential, Roads & Rail, and Excluded (planted forests, mines and landfills). All sites were located within BioSCape AVIRIS-NG hyperspectral acquisitions, and 61% were in LVIS acquisitions (Figure 3).
The site data include location coordinates, date, time, and attribute information about the deployment point: approximate height of AudioMoth, estimated age of vegetation (time since last fire), presence of invasive alien plants, proximity to a wetland (Table 1), and photographs of the deployment site in all four cardinal directions as well as a photograph of the mounted AudioMoth. Depending on vegetation at the site, AudioMoths were mounted on large tree or shrub branches, bundles of small shrub stems, or rock crevices (Figure 1). After 4 to 10 days of data collection (median = 7), teams retrieved devices, downloaded the audio data, and prepared the AudioMoths for subsequent deployments.
AudioMoths were programmed to record one minute of audio at the start of every ten-minute interval. This sampling strategy balanced the need for comprehensive acoustic data against memory, power, and battery life constraints, while still capturing a representative sample of the acoustic environment throughout the study period. Each recording was saved in waveform audio file (WAV) format with 16-bit digitization depth and a 48 kHz sampling rate. The recordings contain a wide range of environmental sounds such as biophony (e.g., birds, frogs, insects), anthropophony (e.g., automobiles, airplanes), and geophony (e.g., wind, rain).
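The recording schedule and audio settings described above imply an expected number of files and an approximate data volume per deployment, which can be useful for checking the completeness of a download. The sketch below makes several assumptions that are not stated in this guide: single-channel (mono) audio, uncompressed PCM samples, complete 24-hour days, and no allowance for WAV header bytes. The function names are hypothetical.

```python
RECORD_SECONDS = 60      # one minute recorded per cycle
CYCLE_MINUTES = 10       # one recording every ten minutes
SAMPLE_RATE_HZ = 48_000  # 48 kHz sampling rate
BYTES_PER_SAMPLE = 2     # 16-bit digitization depth
CHANNELS = 1             # assumption: mono audio (not stated in this guide)

RECORDINGS_PER_DAY = 24 * 60 // CYCLE_MINUTES


def expected_recordings(days):
    """Number of recordings for a deployment of complete days."""
    return days * RECORDINGS_PER_DAY


def audio_bytes_per_recording():
    """Uncompressed PCM payload of one recording, excluding the WAV header."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * RECORD_SECONDS


print(RECORDINGS_PER_DAY)           # 144
print(expected_recordings(7))       # 1008 for the median 7-day deployment
print(audio_bytes_per_recording())  # 5760000 bytes, about 5.76 MB
```

Actual counts per site are given in the RecordingNum field and may differ from these estimates because deployments did not necessarily span whole days and invalid recordings were removed.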
Figure 2. The distribution of the Fynbos, Forest, and Succulent Karoo Biomes within the Greater Cape Floristic Region (GCFR), South Africa. Data from the South Africa National Vegetation Map.
Figure 3. Location of sampling points relative to AVIRIS-NG and LVIS flight boxes and vegetation types.
Data Access
These data are available through the Oak Ridge National Laboratory (ORNL) Distributed Active Archive Center (DAAC).
BioSCape: BioSoundSCape Acoustic Recordings, South Africa, 2023
Contact for Data Center Access Information:
- E-mail: uso@daac.ornl.gov
- Telephone: +1 (865) 241-3952
References
Balantic, C.M., and T.M. Donovan. 2020. Statistical learning mitigation of false positives from template-detected data in automated acoustic wildlife monitoring. Bioacoustics 29:296–321. https://doi.org/10.1080/09524622.2019.1605309
Burns, P., M. Clark, L. Salas, S. Hancock, D. Leland, P. Jantz, R. Dubayah, and S.J. Goetz. 2020. Incorporating canopy structure from simulated GEDI lidar into bird species distribution models. Environmental Research Letters 15:095002. https://doi.org/10.1088/1748-9326/ab80ee
Campos-Cerqueira, M., and T.M. Aide. 2016. Improving distribution data of threatened species by combining acoustic monitoring and occupancy modelling. Methods in Ecology and Evolution 7:1340–1348. https://doi.org/10.1111/2041-210X.12599
Cardoso, A.W., E. Hestir, J. Slingsby, C. Forbes, G. Moncrieff, W. Turner, A. Skowno, J. Nesslage, P. Brodrick, K. Gaddis, and A. Wilson. 2024. The Biodiversity Survey of the Cape (BioSCape): towards inclusive international biodiversity science. Preprint. https://doi.org/10.5281/zenodo.11126718
Clark, M.L., L. Salas, S. Baligar, C.A. Quinn, R.L. Snyder, D. Leland, W. Schackwitz, S.J. Goetz, and S. Newsam. 2023. The effect of soundscape composition on bird vocalization classification in a citizen science biodiversity monitoring project. Ecological Informatics 75:102065. https://doi.org/10.1016/j.ecoinf.2023.102065
Furnas, B.J., and R.L. Callas. 2015. Using automated recorders and occupancy models to monitor common forest birds across a large geographic region. The Journal of Wildlife Management 79:325–337. https://doi.org/10.1002/jwmg.821
Gibb, R., E. Browning, P. Glover-Kapfer, and K.E. Jones. 2019. Emerging opportunities and challenges for passive acoustics in ecological assessment and monitoring. Methods in Ecology and Evolution 10:169–185. https://doi.org/10.1111/2041-210X.13101
Hill, A.P., P. Prince, J.L. Snaddon, C.P. Doncaster, and A. Rogers. 2019. AudioMoth: a low-cost acoustic device for monitoring biodiversity and the environment. HardwareX 6:e00073. https://doi.org/10.1016/j.ohx.2019.E00073
MacPherson, A.J., L. Gillson, and M.T. Hoffman. 2019. Between- and within-biome resistance and resilience at the fynbos-forest ecotone, South Africa. The Holocene 29:1801–1816. https://doi.org/10.1177/0959683619862046
Quinn, C.A., P. Burns, P. Jantz, L. Salas, S.J. Goetz, and M.L. Clark. 2024. Soundscape mapping: understanding regional spatial and temporal patterns of soundscapes incorporating remotely-sensed predictors and wildfire disturbance. Environmental Research: Ecology 3:025002. https://doi.org/10.1088/2752-664X/ad4bec
Somervuo, P., P. Lauha, and T. Lokki. 2023. Effects of landscape and distance in automatic audio based bird species identification. The Journal of the Acoustical Society of America 154:245–254. https://doi.org/10.1121/10.0020153
Turner, A.A., M.L. Clark, L. Salas, C. Seymour, R.L. Snyder, A.T.K. Lee, A. Ferraz, F. Schneider, J. Measey, J. Huisamen, D. Cloete, S.D. Hofmeyr, C. Hagen, D.F. Leland, W. Schackwitz, F. Adegbola, E. Hahndiek, G.S. Joseph, J. Van Rooi, M. Fuchs, S. Thomas, S. Madlala, J. Spiby, and P. Taljaard. 2024. BioSoundSCape: a bioacoustic dataset for the Fynbos biome. (in preparation).
Snyder, R., M. Clark, L. Salas, W. Schackwitz, D. Leland, T. Stephens, T. Erickson, T. Tuffli, M. Tuffli, and K. Clas. 2022. The Soundscapes to Landscapes Project: development of a bioacoustics-based monitoring workflow with multiple citizen scientist contributions. Citizen Science: Theory and Practice 7:24. https://doi.org/10.5334/cstp.391