Lwazi III English TTS Corpus
(Meraka Institute, CSIR, 2016-06-17) ~ - Resource Catalogue
Complete audio recordings with orthographic transcriptions. TTS corpus for standard SA dialect. This corpus was created to enable the building of a TTS voice.
NCHLT English Speech Corpus
(Meraka Institute, CSIR; North-West University, 2014-07-08) ~ - Resource Catalogue
Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
NCHLT Speech II Corpus
(Meraka Institute, CSIR, 2016-05-09) ~ - Resource Catalogue
The speech corpus, generated from aligned audio samples from the National Parliament using Hansard transcriptions, is provided in terms of audio and ...
African Speech Technology English-English Speech Corpus
(North-West University; Stellenbosch University; University of Transkei; University of Free State (Qwa-Qwa campus); Rhodes University; University of KwaZulu-Natal; University of Western Cape, 2014-12-11) ~ - Resource Catalogue
African Speech Technology speech and transcription data for the English-English database. The "speech" directory contains English speech as spoken by ...
Lwazi Telephony Platform
(Meraka Institute, CSIR, 2013-07-15) ~ - Resource Catalogue
Lwazi is a robust telephony platform aiming to facilitate speedy development of experimental applications without sacrificing power by combining Asterisk ...
African Speech Technology Indian-English Speech Corpus
(North-West University; Stellenbosch University; University of Transkei; University of Free State (Qwa-Qwa campus); Rhodes University; University of KwaZulu-Natal; University of Western Cape, 2014-12-11) ~ - Resource Catalogue
African Speech Technology speech and transcription data for the Indian-English database. The "speech" directory contains English speech as spoken by ...
NCHLT English Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2016-09-09) ~ - Resource Catalogue
Collection consisting of a clean corpus, lexicon, frequency list and named-entity lists developed during the NCHLT Text project.
NCHLT Optical Character Recognition for South African Languages
(North-West University; Centre for Text Technology (CTexT), 2017-02-23) ~ - Resource Catalogue
An OCR system is an application that enables one to convert scanned paper documents into editable and searchable texts. The engine analyses the structure ...
NCHLT South African Language Identifier
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
A graphical user interface and command line tool to automatically classify a document, paragraph, sentence or phrase as one of the eleven official South ...
NCHLT-inlang Pronunciation Dictionaries
(Meraka Institute, CSIR; North-West University, 2014-07-04) ~ - Resource Catalogue
Broad phonemic transcriptions for 15,000 generic words in each of 11 languages. Each dictionary has an associated rule set for generating pronunciations ...