Supported Datasets and Annotations
This table is a guide to help users select appropriate datasets. For brevity, the list of annotations omits some metadata and documents only each dataset's primary annotations. For comprehensive details and API documentation for each dataset, please consult the dataset loaders section of the documentation.
“Downloadable” possible values:
✅ Freely downloadable
📺 YouTube links only
❌ Not available
Task Codes (more information at the bottom of the page):
Please note that you can click on each tag to access more information about that specific use case.
Annotation Types
The table above lists annotation types as a guide for choosing appropriate datasets. Here we provide a rough guide to the types in this table, but we strongly recommend reading the dataset-specific documentation to ensure the data is as you expect. To see how these annotation types are implemented in Soundata, see Annotations.
Events
Sound events with a start time, end time, label, and confidence. Events are used to represent annotations for:
Sound Event Detection (SED) - strong labels
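As a rough illustration of the four attributes above, the following is a minimal, self-contained sketch of a strongly labeled sound event. The `SoundEvent` class and the `active_labels` helper are hypothetical names used only for this example and are not Soundata's own API. The sketch also assumes times are in seconds and confidence lies in [0, 1]. Please check the Annotations documentation for the actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SoundEvent:
    """Hypothetical container for one strongly labeled sound event."""

    start: float  # start time, assumed to be in seconds
    end: float  # end time, assumed to be in seconds
    label: str  # event class, e.g. "dog_bark"
    confidence: float = 1.0  # assumed to lie in [0, 1]

    def __post_init__(self) -> None:
        # Reject intervals that are negative or run backwards.
        if self.start < 0 or self.end < self.start:
            raise ValueError("expected 0 <= start <= end")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("expected confidence in [0, 1]")


def active_labels(events: List[SoundEvent], t: float) -> List[str]:
    """Return the sorted labels of all events active at time t."""
    return sorted({e.label for e in events if e.start <= t < e.end})


events = [
    SoundEvent(start=0.0, end=2.5, label="dog_bark", confidence=0.9),
    SoundEvent(start=1.0, end=4.0, label="siren"),
]
```

Because each event carries its own onset and offset, overlapping events are represented naturally: querying `active_labels(events, 1.5)` in this sketch returns both labels.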
Spatial Events
Spatial events represent annotations used for various applications, including spatial event detection and tracking. Similar to Sound Events, Spatial Events include essential attributes such as start time, end time, label, and confidence to characterize and annotate spatial phenomena. This can be extended to include additional attributes specific to the application, such as geographical coordinates (latitude, longitude), altitude, direction (azimuth and elevation), and distance from reference points.
Spatial events are used to represent annotations for:
Sound Event Detection (SED) + Sound Event Localization (SEL)
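To make the extra spatial attributes concrete, here is a minimal sketch of a spatial event that adds direction and distance to the basic event fields. The `SpatialEvent` class and its field names are hypothetical and are not Soundata's own API. The angle convention (azimuth measured counter-clockwise from the x-axis, elevation measured up from the horizontal plane, both in degrees) is also an assumption. Individual datasets may use different conventions, so please check the dataset-specific documentation.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SpatialEvent:
    """Hypothetical container for one sound event with a source position."""

    start: float  # start time, assumed to be in seconds
    end: float  # end time, assumed to be in seconds
    label: str
    confidence: float = 1.0
    azimuth_deg: float = 0.0  # assumed counter-clockwise from the x-axis
    elevation_deg: float = 0.0  # assumed upward from the horizontal plane
    distance_m: Optional[float] = None  # None when only direction is known

    def to_cartesian(self) -> Tuple[float, float, float]:
        """Convert direction (and distance, if known) to (x, y, z).

        When the distance is unknown, a unit-length direction vector
        is returned instead of a position.
        """
        r = 1.0 if self.distance_m is None else self.distance_m
        az = math.radians(self.azimuth_deg)
        el = math.radians(self.elevation_deg)
        x = r * math.cos(el) * math.cos(az)
        y = r * math.cos(el) * math.sin(az)
        z = r * math.sin(el)
        return (x, y, z)


event = SpatialEvent(
    start=0.5, end=3.0, label="speech",
    azimuth_deg=90.0, elevation_deg=0.0, distance_m=2.0,
)
x, y, z = event.to_cartesian()
```

Under the assumed convention, this example source sits 2 m along the positive y-axis, up to floating-point rounding.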