NCHLT Tshivenda Named Entity Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
Lwazi Tshivenda TTS corpus
(Meraka Institute, CSIR, 2013-03-27) ~ - Resource Catalogue
Orthographic and phonemically aligned transcriptions
Autshumato English-Sesotho sa Leboa Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Parallel corpora aligned on sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from ...
DSAE Print Citations Database
(Dictionary Unit for South African English; Rhodes University, 2018-02-05) ~ - Resource Index
c. 300 000 index cards collected from 1969 onwards to document English words with a unique meaning/usage in South Africa, as research process for A ...
Sepedi Speech Corpora
(University of Limpopo (Turfloop Campus), 2015-01-27) ~ - Resource Index
A corpus of Sesotho sa Leboa telephone speech data collected from mother tongue speakers of the standard version of Sesotho sa Leboa for the purpose ...
Lwazi Siswati Pronunciation Dictionary
(Meraka Institute, CSIR, 2013-04-01) ~ - Resource Catalogue
General phonemic pronunciations for frequently occurring words in SA languages. Dictionaries were developed to be practically usable for speech technology ...
Sesotho multi-speaker TTS corpus
(MuST, NWU, 2018-02-28) ~ - Resource Index
The aim of this corpus was to investigate the implementation of a high-quality TTS system using multiple voices recorded using a low-cost process (i.e. ...
SAE Pronunciation Dictionary
(Stellenbosch University, 2015-01-27) ~ - Resource Index
Pronunciation dictionary compiled from newspaper text and radio news transcriptions. Dictionary to be used for the development of a large vocabulary ...
NCHLT Xitsonga Speech Corpus
(Meraka Institute, CSIR; North-West University, 2014-07-08) ~ - Resource Catalogue
Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
Lwazi Siswati TTS corpus
(Meraka Institute, CSIR, 2013-03-27) ~ - Resource Catalogue
Orthographic and phonemically aligned transcriptions