
High quality TTS data for four South African languages (af, st, tn, xh)
This data set contains high-quality, multi-speaker transcribed audio data for text-to-speech (TTS) in four languages of South Africa: Afrikaans, Sesotho, Setswana and isiXhosa. The data set consists of WAV files and a TSV file transcribing the audio. In each folder, the file line_index.tsv lists, for each recording, a FileID (which in turn contains the UserID) and the Transcription of the audio in that file. The data set has undergone some quality checks, but it may still contain errors. It was collected as a collaboration between North-West University and Google. See the LICENSE.txt file for license information. Copyright 2017 Google, Inc.
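As a rough illustration of the line_index.tsv layout described above, the following Python sketch reads FileID and Transcription pairs and derives a UserID from each FileID. Several details are assumptions that the record does not confirm: that the FileID is the first tab-separated field and the Transcription the last, and that the FileID is underscore-separated with the UserID as its second component. The names Utterance and parse_line_index and the sample rows are hypothetical placeholders, not real corpus entries.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Utterance(NamedTuple):
    file_id: str
    user_id: str
    transcription: str


def parse_line_index(handle: TextIO) -> Iterator[Utterance]:
    """Yield one Utterance per non-empty row of a line_index.tsv-style file."""
    # QUOTE_NONE keeps quotation marks in transcriptions as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Drop empty fields so a doubled tab between columns is tolerated.
        fields = [field.strip() for field in row if field.strip()]
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        file_id, transcription = fields[0], fields[-1]
        # ASSUMPTION: FileID looks like <lang>_<user>_<utterance>.
        parts = file_id.split("_")
        user_id = parts[1] if len(parts) >= 3 else ""
        yield Utterance(file_id, user_id, transcription)


# Hypothetical sample rows, used only to exercise the parser.
sample = (
    "xx_0001_0000000001\tplaceholder transcription one\n"
    "xx_0002_0000000002\tplaceholder transcription two\n"
)
rows = list(parse_line_index(io.StringIO(sample)))
```

For a real folder, the same function could be given a file opened with open(path, encoding="utf-8", newline=""). If the actual FileID layout differs, only the UserID extraction step needs to change.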
Daniel Povey
dpovey@gmail.com
Google; North-West University
Attribution-ShareAlike 4.0 International (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
Afrikaans; isiXhosa; Setswana; Sesotho
Google; North-West University
TTS
https://hdl.handle.net/20.500.12185/527
Speech
Multilingual Speech Corpus
Audio files
1
3.31GB
TSV; WAV
OpenSLR (Open Speech and Language Resources)
Resource Catalogue
Resource Index
afr; xho; sot; tsn
2020-01-14T09:53:43Z
2020-01-14T09:53:43Z
2017


This item appears in the following Collection(s)

  • Resource Catalogue [350]
    A collection of language resources available for download from the RMA of SADiLaR. The collection mostly consists of resources developed with funding from the Department of Arts and Culture.
  • Resource Index [412]
    A collection of language resource metadata, mostly collected during the NHN-funded technology audit of 2009 and the SADiLaR technology audit of 2018. Not all resources in this collection are available for download.
