Supported Datasets and Annotations

The table below is a guide for selecting an appropriate dataset. For brevity, it lists only each dataset’s primary annotations and omits some metadata.

Possible values in the “Downloadable?” column:

  • ✅ : Freely downloadable

  • 🔑 : Available upon request

  • 📺 : YouTube links only

  • ❌ : Not available

The API documentation for each of the datasets below is in Initialize a dataset.

=====================================  ========================  ==============  ======  =====  =============
Dataset                                Downloadable?             Annotations     Clips   Hours  License
=====================================  ========================  ==============  ======  =====  =============
3D-MARCo                               audio: ✅, annotations: ✅  Tags            26      0.3    CC BY-NC 3.0
DCASE23-task2                          audio: ✅, annotations: ✅  Tags            174     21     CC BY 4.0
DCASE23-Task4B                         audio: ✅, annotations: ✅  Events          49      3.16   CC BY-NC 3.0
DCASE23-Task6a                         audio: ✅, annotations: ✅  Tags            6974    43.2   CC BY 4.0
DCASE23-Task6b                         audio: ✅, annotations: ✅  Tags            6974    43.2   CC BY 4.0
DCASE-bioacoustic                      audio: ✅, annotations: ✅  Events          174     21     CC BY 4.0
DCASE-birdVox20k                       audio: ✅, annotations: ✅  Tags            20,000  55.5   CC BY 4.0
EigenScape (HOA 25 ch)                 audio: ✅, annotations: ✅  Tags            64      10.7   CC BY 4.0
EigenScape Raw (32 ch)                 audio: ✅, annotations: ✅  Tags            64      10.7   CC BY 4.0
ESC-50                                 audio: ✅, annotations: ✅  Tags            2000    2.8    CC BY-NC 3.0
Freefield1010                          audio: ✅, annotations: ✅  Tags            7690    21.3   CC BY 4.0
FSD50K                                 audio: ✅, annotations: ✅  Tags            51197   108.3  CC BY 4.0
FSDnoisy18K                            audio: ✅, annotations: ✅  Tags            18532   42.5   CC BY 4.0
SINGA:PURA                             audio: ✅, annotations: ✅  Events          6547    18.2   CC BY-SA 4.0
STARSS 2022                            audio: ✅, annotations: ✅  Spatial Events  121     5      MIT
TAU NIGENS SSE 2020                    audio: ✅, annotations: ✅  Spatial Events  800     15     CC BY-NC 4.0
TAU NIGENS SSE 2021                    audio: ✅, annotations: ✅  Spatial Events  800     15     CC BY-NC 4.0
TUT Sound Events 2017                  audio: ✅, annotations: ✅  Events          32      2.02   Custom
TAU SSE 2019                           audio: ✅, annotations: ✅  Spatial Events  500     8.3    Custom
TAU Urban Acoustic Scenes 2019         audio: ✅, annotations: ✅  Tags            22800   63.3   Custom
TAU Urban Acoustic Scenes 2020 Mobile  audio: ✅, annotations: ✅  Tags            34915   97     Custom
TAU Urban Acoustic Scenes 2022 Mobile  audio: ✅, annotations: ✅  Tags            349150  97     :tau2022:`\ `
URBAN-SED                              audio: ✅, annotations: ✅  Events          10000   27.8   CC BY 4.0
UrbanSound8K                           audio: ✅, annotations: ✅  Tags            8732    8.75   CC BY-NC 4.0
Warblrb10k                             audio: ✅, annotations: ✅  Tags            10,000  28     CC BY 4.0
=====================================  ========================  ==============  ======  =====  =============
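
As a rough illustration of how the table can be used to shortlist datasets, the sketch below copies a few rows into plain Python records, filters them by annotation type and size, and derives the mean clip duration implied by the Clips and Hours columns. The `DatasetRow` class and the `shortlist` helper are hypothetical names introduced for this example only; they are not part of Soundata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRow:
    """One row of the table above (hypothetical helper, not a Soundata class)."""
    name: str
    annotations: str
    clips: int
    hours: float

    @property
    def mean_clip_seconds(self) -> float:
        # Average clip length implied by the Hours and Clips columns.
        return self.hours * 3600 / self.clips


# A few rows copied from the table above.
ROWS = [
    DatasetRow("ESC-50", "Tags", 2000, 2.8),
    DatasetRow("URBAN-SED", "Events", 10000, 27.8),
    DatasetRow("UrbanSound8K", "Tags", 8732, 8.75),
    DatasetRow("TAU SSE 2019", "Spatial Events", 500, 8.3),
]


def shortlist(rows, annotations, min_hours=0.0):
    """Return rows with the given annotation type and at least min_hours of audio."""
    return [r for r in rows if r.annotations == annotations and r.hours >= min_hours]


if __name__ == "__main__":
    for row in shortlist(ROWS, "Tags", min_hours=2.0):
        print(f"{row.name}: ~{row.mean_clip_seconds:.1f} s per clip")
```

The mean duration is only an average over the whole dataset; clips in some datasets vary in length, so the dataset-specific documentation remains the authoritative source.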

Annotation Types

The table above lists each dataset’s annotation type as a guide for choosing an appropriate dataset. This section gives a rough guide to those types, but we strongly recommend reading the dataset-specific documentation to ensure the data is what you expect. To see how these annotation types are implemented in Soundata, see Annotations.

Tags

One or more string labels with corresponding confidence values. Tags do not have start or end times, and span the full duration of the clip. Tags are used to represent annotations for:

  • Acoustic Scene Classification (ASC)

  • Sound Event Classification (SEC)

  • Sound Event Detection (SED) - weak labels

When every Tags annotation in a dataset contains exactly one label, it is typically a multi-class problem. When Tags annotations contain varying numbers of labels (including 0), it is typically a multi-label problem.
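
The rule of thumb above can be sketched in a few lines of Python. The `ClipTags` class and `problem_type` function are illustrative stand-ins written for this example, assuming only what the description states (string labels with one confidence value each); they are not Soundata's own Tags implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClipTags:
    """Clip-level labels with one confidence per label (illustrative only)."""
    labels: List[str]
    confidence: List[float]

    def __post_init__(self):
        # Each label needs exactly one confidence value.
        if len(self.labels) != len(self.confidence):
            raise ValueError("labels and confidence must have the same length")


def problem_type(annotations: List[ClipTags]) -> str:
    """Guess the task type from how many labels each clip carries."""
    if not annotations:
        raise ValueError("need at least one annotation to decide")
    # Exactly one label on every clip suggests multi-class classification.
    if all(len(a.labels) == 1 for a in annotations):
        return "multi-class"
    # Varying label counts (including zero) suggest a multi-label problem.
    return "multi-label"


if __name__ == "__main__":
    scenes = [ClipTags(["park"], [1.0]), ClipTags(["metro"], [1.0])]
    mixed = [ClipTags(["dog_bark", "siren"], [1.0, 0.8]), ClipTags([], [])]
    print(problem_type(scenes))
    print(problem_type(mixed))
```

This is a heuristic, as the word "typically" in the text suggests: a multi-label dataset whose clips all happen to carry one label would be reported as multi-class.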

Events

Sound events with a start time, end time, label, and confidence. Events are used to represent annotations for:

  • Sound Event Detection (SED) - strong labels
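
To make the difference between strong and weak labels concrete, the sketch below models an event with the four fields listed above and collapses a clip's events into clip-level labels by discarding the timing. `SoundEvent` and `to_weak_labels` are hypothetical names used only for illustration, and times are assumed to be in seconds; this is not how Soundata itself stores Events.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SoundEvent:
    """A single strongly labeled event (illustrative only)."""
    start: float       # onset, assumed to be in seconds
    end: float         # offset, assumed to be in seconds
    label: str
    confidence: float

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("end time must not precede start time")


def to_weak_labels(events: List[SoundEvent]) -> List[str]:
    """Drop the timing and keep each distinct label once, sorted by name."""
    return sorted({e.label for e in events})


if __name__ == "__main__":
    clip_events = [
        SoundEvent(0.5, 2.0, "dog_bark", 1.0),
        SoundEvent(1.2, 4.0, "siren", 0.9),
        SoundEvent(6.0, 7.5, "dog_bark", 1.0),
    ]
    print(to_weak_labels(clip_events))
```

The conversion only works in one direction: clip-level labels cannot be turned back into events, because the start and end times are lost.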