Lwazi Xitsonga ASR corpus
(Meraka Institute, CSIR, 2013-04-02) - Resource Catalogue
Complete audio recordings and orthographic transcriptions used for Lwazi speech recognition systems.
NCHLT Tshivenda Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) - Resource Catalogue
Lemmatised, part-of-speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
Sesotho sa Leboa Genre Classification Corpus
(Trifonius, 2013-06-19) - Resource Catalogue
Contains training and testing data for genre classification for Sesotho sa Leboa.
isiNdebele Custom Dictionary for Government Domain
(North-West University; Centre for Text Technology (CTexT), 2013-02-22) - Resource Catalogue
Word list developed as a custom dictionary for use in the spelling checkers as part of the spelling checker project for the Department of Arts and ...
isiNdebele Genre Classification Corpus
(Trifonius, 2013-06-19) - Resource Catalogue
Contains training and testing data for genre classification for isiNdebele.
Setswana Test suite and Treebank
(North-West University, 2018-03-27) - Resource Catalogue
The main aim of the PhD study "A computational syntactic analysis of Setswana" (AS Berg, May 2018) is the computational syntactic analysis of the Setswana ...
Afribooms Afrikaans Dependency Treebank
(North-West University; Centre for Text Technology (CTexT); Katholieke Universiteit Leuven (Belgium), 2015-02-10) - Resource Catalogue
This is the annotated corpus developed for Afrikaans for the Afribooms project. The corpus includes annotations for lemma, part-of-speech (POS) and ...
Lagos-NWU Yoruba Speech Corpus
(North-West University; Centre for Text Technology (CTexT); University of Lagos (Nigeria), 2015-02-06) - Resource Catalogue
This speech corpus, consisting of 16 female and 17 male speakers, was recorded in Lagos, Nigeria, for the purpose of speech recognition research. ...
Autshumato Xitsonga Monolingual Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-12-12) - Resource Catalogue
Xitsonga monolingual corpus as a deliverable of the Autshumato project. The data is given as a UTF-8 text file, with each sentence on a new line.
NOTE: ...
Autshumato Setswana Monolingual Corpora
(North-West University; Centre for Text Technology (CTexT), 2016-10-28) - Resource Catalogue
Setswana monolingual corpus as a deliverable of the Autshumato project. The data is given as a UTF-8 text file, with each sentence on a new line.
NOTE: ...