Autshumato English-Setswana Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2016-10-28) ~ - Resource Catalogue
Aligned English-Setswana parallel corpus. This set contains data that was translated by professional translators, data that was sourced as translated ...
Multilingual Natural Sciences & Technology Terminology List (Grade 4 - 6)
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-03-03) ~ - Resource Index
2756 English source terms with their equivalents in the ten other official South African languages. The list was populated from terms excerpted from ...
Denominal adjectives in Afrikaans dataset
(South African Centre for Digital Language Resources, 2020-05-15) ~ - Resource Catalogue
This dataset contains a collection of Afrikaans denominal adjectives that were extracted from the Virtual Institute for Afrikaans' corpus portal. The ...
Autshumato Text Anonymiser
(North-West University; Centre for Text Technology (CTexT), 2013-06-20) ~ - Resource Catalogue
Anonymises text by classifying and replacing sensitive information such as person names, business names, place names, monetary values, phone numbers, ...
NCHLT Setswana Morphological Decomposer
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Morphological decomposer developed during the NCHLT Text project.
CTexT Multilingual Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2015-02-03) ~ - Resource Index
Document-level aligned corpora for machine translation purposes.
Unisa South African Spoken and Signed Language Corpus
(University of South Africa, 2018-02-28) ~ - Resource Index
This resource comprises annotated transcriptions of audio and video segments of the Xhosa section of the spoken corpus project SOUTHTALK (Southern African ...
UNISA English/Zulu Parallel Corpus
(University of South Africa, 2018-02-28) ~ - Resource Index
The resource comprises sentence-aligned and tokenized parallel text in English and Zulu. The text was extracted from the following sources: an adapted ...
NCHLT Xitsonga Named Entity Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
NCHLT Xitsonga Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Lemmatised, part-of-speech tagged and morphologically analysed corpora developed during the NCHLT Text project.