Search
Now showing items 1-10 of 227
Lwazi isiNdebele TTS corpus
(Meraka Institute, CSIR, 2013-03-27) ~ - Resource Catalogue
Orthographic and phonemically aligned transcriptions
NCHLT Tshivenda Speech Corpus
(Meraka Institute, CSIR; North-West University, 2014-07-08) ~ - Resource Catalogue
Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
African Speech Technology Coloured-Afrikaans Speech Corpus
(North-West University; Stellenbosch University; University of Transkei; University of Free State (Qwa-Qwa campus); Rhodes University; University of KwaZulu-Natal; University of Western Cape, 2014-12-11) ~ - Resource Catalogue
African Speech Technology speech and transcription data for the Coloured-Afrikaans database. The "speech" directory contains Afrikaans speech as spoken ...
NCHLT Tshivenda Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
AuCoPro Splitting Dataset
(North-West University; Centre for Text Technology (CTexT); Tilburg Centre for Cognition and Communication, 2015-01-07) ~ - Resource Catalogue
The AuCoPro Splitting dataset contains compounds annotated with their compound boundaries and linking morphemes for Afrikaans and Dutch.
Setswana multi-speaker TTS corpus
(MuST, NWU, 2018-02-28) ~ - Resource Index
The aim of this corpus was to investigate the implementation of a high-quality TTS system using multiple voices recorded using a low-cost process (i.e. ...
Fannie Sebolela Oral Corpus
(Department of African Languages - University of Pretoria, 2015-01-27) ~ - Resource Index
Tape recordings and transcriptions of 13 mother tongue speakers. Transcribed orthographically.
Lwazi isiZulu TTS corpus
(Meraka Institute, CSIR, 2013-03-27) ~ - Resource Catalogue
Orthographic and phonemically aligned transcriptions
NCHLT Sepedi Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Lemmatised, part of speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
African Speech Technology isiZulu Text Corpus
(North-West University; Stellenbosch University; University of Transkei; University of Free State (Qwa-Qwa campus); Rhodes University; University of KwaZulu-Natal; University of Western Cape, 2015-01-07) ~ - Resource Catalogue
Monolingual text corpus developed during the African Speech Technology project.