openslr.org

Open Speech and Language Resources

LibriTTS-R

Identifier: SLR141

Summary: A sound-quality-improved version of the LibriTTS corpus, a large-scale corpus of English speech designed for TTS use

Category: Speech

License: CC BY 4.0

Downloads (use a mirror closer to you):
doc.tar.gz (Documents of LibriTTS-R) Mirrors: [EU] [EU] [CN]
dev_clean.tar.gz (Development set, clean speech) Mirrors: [EU] [EU] [CN]
dev_other.tar.gz (Development set, more challenging speech) Mirrors: [EU] [EU] [CN]
test_clean.tar.gz (Test set, "clean" speech) Mirrors: [EU] [EU] [CN]
test_other.tar.gz (Test set, "other" speech) Mirrors: [EU] [EU] [CN]
train_clean_100.tar.gz (Training set derived from the original materials of the train-clean-100 subset of LibriSpeech) Mirrors: [EU] [EU] [CN]
train_clean_360.tar.gz (Training set derived from the original materials of the train-clean-360 subset of LibriSpeech) Mirrors: [EU] [EU] [CN]
train_other_500.tar.gz (Training set derived from the original materials of the train-other-500 subset of LibriSpeech) Mirrors: [EU] [EU] [CN]
libritts_r_failed_speech_restoration_examples.tar.gz (Lists of files where speech restoration failed) Mirrors: [EU] [EU] [CN]
md5sum.txt [509 bytes] (Checksums of the individual files) Mirrors: [EU] [EU] [CN]
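
Once the archives are downloaded, they can be checked against md5sum.txt. The following is a minimal sketch, not an official tool: it assumes md5sum.txt uses the conventional `md5sum` layout (one `<hex digest>  <file name>` pair per line), which this page does not confirm, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sum_lines(lines):
    """Parse '<hex digest>  <file name>' lines (assumed format) into a dict."""
    expected = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        digest, name = line.split(maxsplit=1)
        # md5sum marks binary-mode entries with a leading '*'; drop it.
        expected[name.lstrip("*")] = digest.lower()
    return expected


def verify_downloads(md5sum_path, download_dir):
    """Map each listed file to True (match), False (mismatch), or None (absent)."""
    expected = parse_md5sum_lines(Path(md5sum_path).read_text().splitlines())
    results = {}
    for name, digest in expected.items():
        candidate = Path(download_dir) / name
        if not candidate.is_file():
            results[name] = None
        else:
            results[name] = md5_of_file(candidate) == digest
    return results
```

Calling `verify_downloads("md5sum.txt", ".")` in the download directory would return a dictionary keyed by archive name; any entry that is `False` suggests a corrupted or incomplete download that should be fetched again, ideally from a different mirror.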

About this resource:

LibriTTS-R [1] is a sound-quality-improved version of the LibriTTS corpus (http://www.openslr.org/60/), a multi-speaker corpus of approximately 585 hours of read English speech at a 24 kHz sampling rate that was published in 2019. The constituent samples of LibriTTS-R are identical to those of LibriTTS; only the sound quality has been improved. The improvement was obtained with Miipher, a speech restoration model proposed by Koizumi et al. [2].
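
As a quick sanity check after extraction, the stated 24 kHz sampling rate can be confirmed per file. The sketch below uses only Python's standard `wave` module and assumes the audio is stored as uncompressed PCM WAV files, which this page does not state; the helper names are hypothetical.

```python
import wave

EXPECTED_RATE_HZ = 24_000  # sampling rate stated in the corpus description


def wav_sample_rate(path):
    """Return the sampling rate (Hz) recorded in a PCM WAV file header."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getframerate()


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def has_expected_rate(path, expected_hz=EXPECTED_RATE_HZ):
    """Return True if the file's header reports the expected sampling rate."""
    return wav_sample_rate(path) == expected_hz
```

Summing `wav_duration_seconds` over every extracted file gives a rough way to compare a local copy against the approximate 585-hour total.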

For more information, refer to the paper [1]. If you use the LibriTTS-R corpus in your work, please cite the dataset paper [1] where it was introduced.

Ground-truth audio samples and TTS-generated samples are available on the demo page: https://google.github.io/df-conformer/librittsr/

[1] Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, and Ankur Bapna, "LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus," arXiv, 2023.
[2] Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, and Michiel Bacchiani, "Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations," arXiv, 2023.