Now showing items 161-170 of 238
NCHLT isiNdebele Auxiliary Speech Corpus
(CSIR Meraka Institute; North-West University, 2019-06-01) ~ - Resource Catalogue
The corpus contains orthographically transcribed broadband speech in each of South Africa's eleven official languages. Transcriptions are provided in ...
Lwazi English TTS corpus
(Meraka Institute, CSIR, 2013-03-27) ~ - Resource Catalogue
Orthographic and phonemically aligned transcriptions
Autshumato Sesotho sa Leboa-English Translation Memory
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Translation memory from Sesotho sa Leboa to English (EN-GB), in the government domain for use in the Autshumato ITE application.
African Speech Technology Coloured-English Speech Corpus
(North-West University; Stellenbosch University; University of Transkei; University of Free State (Qwa-Qwa campus); Rhodes University; University of KwaZulu-Natal; University of Western Cape, 2014-12-11) ~ - Resource Catalogue
African Speech Technology speech and transcription data for the Coloured English database. The "speech" directory contains English speech as spoken by ...
NCHLT English Auxiliary Speech Corpus
(CSIR Meraka Institute; North-West University, 2019-06-01) ~ - Resource Catalogue
The corpus contains orthographically transcribed broadband speech in each of South Africa's eleven official languages. Transcriptions are provided in ...
Autshumato isiZulu-English Translation Memory
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Translation memory from isiZulu to English (EN-GB), in the government domain for use in the Autshumato ITE application.
Autshumato English-Afrikaans Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Parallel corpora aligned on sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from ...
NCHLT Sesotho Named Entity Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
Lwazi Afrikaans TTS corpus
(Meraka Institute, CSIR, 2013-03-27) ~ - Resource Catalogue
Orthographic and phonemically aligned transcriptions
NCHLT isiZulu Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...