Lwazi II Proper Name Call Routing Telephone Corpus
(Meraka Institute, CSIR; North-West University, 2015-11-20) ~ - Resource Catalogue
Short prompts of proper names and language names collected via the telephone network.
NCHLT-inlang Pronunciation Dictionaries
(Meraka Institute, CSIR; North-West University, 2014-07-04) ~ - Resource Catalogue
Broad phonemic transcriptions for 15,000 generic words in each of 11 languages. Each dictionary has an associated rule set for generating pronunciations ...
Lwazi English ASR corpus
(Meraka Institute, CSIR, 2013-04-02) ~ - Resource Catalogue
Complete audio recordings and orthographic transcriptions used for Lwazi speech recognition systems.
Autshumato English-isiZulu Translation Memory
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Translation memory from English (EN-GB) to isiZulu in the government domain, for use in the Autshumato ITE application.
Autshumato English-Afrikaans Translation Memory
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Translation memory from English (EN-GB) to Afrikaans in the government domain, for use in the Autshumato ITE application.
Autshumato English-Sesotho sa Leboa Translation Memory
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Translation memory from English (EN-GB) to Sesotho sa Leboa in the government domain, for use in the Autshumato ITE application.
South African Multilingual Proper Names (Multipron) Corpus
(Molo Afrika Speech Technologies, 2013-10-03) ~ - Resource Catalogue
Audio, orthographic transcriptions and auditorily verified broad phonemic transcriptions of proper names in four languages, produced by speakers of the same four languages.
Autshumato English-Xitsonga Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-12-11) ~ - Resource Catalogue
Aligned English-Xitsonga parallel corpus. The data is given as two separate UTF-8 text files, with each segment on a new line.
NCHLT English Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2016-09-09) ~ - Resource Catalogue
Collection consisting of a clean corpus, lexicon, frequency list and named-entity lists developed during the NCHLT Text project.
Lwazi English Pronunciation Dictionary
(Meraka Institute, CSIR, 2013-04-01) ~ - Resource Catalogue
General phonemic pronunciations for frequently occurring words in SA languages. Dictionaries were developed to be practically usable for speech technology ...