Identifier: SLR99

Summary: A human nonverbal vocalization dataset by Deeply Inc.

Category: Audio

License: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

Download: VocalCharacterizer.tar.gz [45M] (Vocal characterizer dataset)   Mirrors: [US]

About this resource:

  • Volume, public subset (full set): ~0.6 (~57) hours, ~800 (~70,000) utterances, ~500 (~1,500) speakers
  • Format: 16 kHz, 16-bit, mono
  • Device: Android phones
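
As a quick sanity check of the stated format, the sketch below tests whether a file is 16 kHz, 16-bit, mono using Python's standard `wave` module. It assumes the audio is stored as WAV files, which this page does not state; `matches_expected_format` and the constant names are hypothetical, chosen only for illustration.

```python
import wave

# Expected properties, taken from the "Format" line above.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1            # mono


def matches_expected_format(path: str) -> bool:
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono.

    Assumption: the file is an uncompressed PCM WAV readable by `wave`.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
```

Calling `matches_expected_format("example.wav")` (the path is a placeholder) returns True only when all three properties match.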

The Vocal Characterizer Dataset is a human nonverbal vocalization dataset crowdsourced from the general public in South Korea. It also includes metadata such as age, sex, noise level, and utterance quality.
It covers 16 classes of human nonverbal sounds: 'teeth-chattering', 'teeth-grinding', 'tongue-clicking', 'nose-blowing', 'coughing', 'yawning', 'throat-clearing', 'sighing', 'lip-popping', 'lip-smacking', 'panting', 'crying', 'laughing', 'sneezing', 'moaning', and 'screaming'.

The dataset is a subset (approximately 1%) of a much larger dataset that was recorded under the same environment as this public dataset.

Example metadata entry:
{"label": 0, "speakerID": "87LX", "age": 19, "sex": 0, "location": 0, "quality": 0, "noise": 0}

label   : {0: 'teeth-chattering', 1: 'teeth-grinding', 2: 'tongue-clicking', 3: 'nose-blowing', 
           4: 'coughing', 5: 'yawning', 6: 'throat-clearing', 7: 'sighing', 8: 'lip-popping', 
           9: 'lip-smacking', 10: 'panting', 11: 'crying', 12: 'laughing', 13: 'sneezing', 14: 'moaning', 15: 'screaming'}
sex     : {0: 'Female', 1: 'Male'}
location: {0: 'indoor', 1: 'outdoor'}
quality : {0: 'High', 1: 'Low'}
noise   : {0: 'Noiseless', 1: 'Noisy'}
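
The legend above can be applied mechanically. The following is a minimal Python sketch, not part of the dataset's own tooling, that decodes the sample metadata entry shown earlier into readable values. The function name `decode_entry` and the table names are illustrative assumptions, and the entry is assumed to be already parsed into a dict.

```python
# Code-to-name tables copied from the legend above.
LABELS = {
    0: 'teeth-chattering', 1: 'teeth-grinding', 2: 'tongue-clicking',
    3: 'nose-blowing', 4: 'coughing', 5: 'yawning', 6: 'throat-clearing',
    7: 'sighing', 8: 'lip-popping', 9: 'lip-smacking', 10: 'panting',
    11: 'crying', 12: 'laughing', 13: 'sneezing', 14: 'moaning',
    15: 'screaming',
}
SEX = {0: 'Female', 1: 'Male'}
LOCATION = {0: 'indoor', 1: 'outdoor'}
QUALITY = {0: 'High', 1: 'Low'}
NOISE = {0: 'Noiseless', 1: 'Noisy'}


def decode_entry(entry: dict) -> dict:
    """Replace integer codes in one metadata entry with their names.

    `speakerID` and `age` are passed through unchanged. An unknown code
    raises KeyError rather than being silently ignored.
    """
    return {
        "label": LABELS[entry["label"]],
        "speakerID": entry["speakerID"],
        "age": entry["age"],
        "sex": SEX[entry["sex"]],
        "location": LOCATION[entry["location"]],
        "quality": QUALITY[entry["quality"]],
        "noise": NOISE[entry["noise"]],
    }


# The sample entry from this page.
sample = {"label": 0, "speakerID": "87LX", "age": 19, "sex": 0,
          "location": 0, "quality": 0, "noise": 0}
decoded = decode_entry(sample)
print(decoded)
```

For the sample entry, this yields the label 'teeth-chattering' from a 19-year-old female speaker, recorded indoors, with high quality and no noise.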

You can cite the data as follows:
  title={{Deeply vocal characterizer dataset}},
  author={Deeply Inc.},

