Now showing items 81-90 of 227
African Speech Technology English Text Corpus
(North-West University; Stellenbosch University; University of Transkei; University of Free State (Qwa-Qwa campus); Rhodes University; University of KwaZulu-Natal; University of Western Cape, 2015-01-07) ~ - Resource Catalogue
Monolingual text corpus developed during the African Speech Technology project.
Xitsonga Custom Dictionary for Government Domain
(North-West University; Centre for Text Technology (CTexT), 2013-02-22) ~ - Resource Catalogue
Word list developed as a custom dictionary for use in the spelling checkers as part of the spelling checker project for the Department of Arts and ...
IsiXhosa multi-speaker TTS corpus
(MuST, NWU, 2018-02-27) ~ - Resource Index
The aim of this corpus was to investigate the implementation of a high-quality TTS system using multiple voices recorded using a low-cost process (i.e. ...
NCHLT Speech II Corpus
(Meraka Institute, CSIR, 2016-05-09) ~ - Resource Catalogue
The speech corpus, generated from aligned audio samples from the National Parliament using Hansard transcriptions, is provided in terms of audio and ...
Lwazi Setswana Pronunciation Dictionary
(Meraka Institute, CSIR, 2013-04-01) ~ - Resource Catalogue
General phonemic pronunciations for frequently occurring words in SA languages. Dictionaries were developed to be practically usable for speech technology ...
Multilingual Information Communication Technology Terminology List
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-03-03) ~ - Resource Index
132 English source terms with their equivalents in the ten other official South African languages. Originally initiated by the Department of Communications, ...
Setswana Genre Classification Corpus
(Trifonius, 2013-06-19) ~ - Resource Catalogue
Contains training and testing data for Genre Classification for Setswana.
South African Broadcast News (SABN) Corpus
(Stellenbosch University; CSIR, 2018-02-27) ~ - Resource Index
The corpus consists of approximately 20 hours of audio recordings from one of the country's main radio news channels, SAFM. Bulletins ...
Lwazi III English TTS Corpus
(Meraka Institute, CSIR, 2016-06-17) ~ - Resource Catalogue
Complete audio recordings with orthographic transcriptions. TTS corpus for standard SA dialect. This corpus was created to enable the building of a TTS voice.
Tagger Parameter file for RF-Tagger (Schmid and Laws 2005)
(Institute for Information Science and Natural Language Processing, University of Hildesheim, Germany, 2018-02-21) ~ - Resource Index
The tagger parameter file is trained on an excerpt of the Pretoria Sepedi Corpus (D. Prinsloo, University of Pretoria): Here, about 5000 tokens were ...