Search
Now showing items 1-10 of 17
Autshumato Xitsonga Frequency Word List
(North-West University; Centre for Text Technology (CTexT), 2014-12-12) ~ - Resource Catalogue
A list of the most frequent Xitsonga words as deliverable of the Autshumato project.
NCHLT-inlang Pronunciation Dictionaries
(Meraka Institute, CSIR; North-West University, 2014-07-04) ~ - Resource Catalogue
Broad phonemic transcriptions for 15,000 generic words in each of 11 languages. Each dictionary has an associated rule set for generating pronunciations ...
Autshumato Xitsonga Monolingual Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-12-12) ~ - Resource Catalogue
Xitsonga monolingual corpus as deliverable of the Autshumato project. The data is given as a UTF-8 text file; with each sentence on a newline.
Lwazi Xitsonga TTS corpus
(Meraka Institute, CSIR, 2013-03-27) ~ - Resource Catalogue
Orthographic and phonemically aligned transcriptions
Autshumato English-Xitsonga Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-12-11) ~ - Resource Catalogue
Aligned English-Xitsonga parallel corpus. The data is given as two seperate UTF-8 text files; with each segment on a newline.
NCHLT Xitsonga Speech Corpus
(Meraka Institute, CSIR; North-West University, 2014-07-08) ~ - Resource Catalogue
Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
Xitsonga Custom Dictionary for Government Domain
(North-West University; Centre for Text Technology (CTexT), 2013-02-22) ~ - Resource Catalogue
Word list developed as a custom dictionary for use in the spelling checkers as part of the spelling checker project for the Department of Arts and ...
NCHLT Xitsonga Named Entity Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
NCHLT Xitsonga Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Lemmatised, part of speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
Lwazi Xitsonga Pronunciation Dictionary
(Meraka Institute, CSIR, 2013-04-01) ~ - Resource Catalogue
General phonemic pronunciations for frequently occurring words in SA languages. Dictionaries were developed to be practically usable for speech technology ...