Search
Now showing items 11-20 of 36
Autshumato English-Sesotho sa Leboa Translation Memory
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Translation memory from English (EN-GB) to Sesotho sa Leboa, in the government domain for use in the Autshumato ITE application.
Autshumato English-Sesotho sa Leboa Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Parallel corpora aligned on sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from ...
Sepedi Speech Corpora
(University of Limpopo (Turfloop Campus), 2015-01-27) ~ - Resource Index
A corpus of Sesotho sa Leboa telephone speech data collected from mother tongue speakers of the standard version of Sesotho sa Leboa for the purpose ...
Multilingual Information Communication Technology Terminology List
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-03-03) ~ - Resource Index
132 English source terms with their equivalents in the ten other official South African languages. Originally initiated by the Department of Communications, ...
Tagger Parameter file for RF-Tagger (Schmid and Laws 2005)
(Institute for Information Science and Natural Language Processing, University of Hildesheim, Germany, 2018-02-21) ~ - Resource Index
The tagger parameter file is trained on an excerpt of the Pretoria Sepedi Corpus (D. Prinsloo, University of Pretoria): Here, about 5000 tokens were ...
Multilingual HIV/AIDS Terminology List
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-02-15) ~ - Resource Index
586 English source terms with their equivalents in the ten other official South African languages. The list was compiled in collaboration with subject ...
NCHLT Sepedi Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
NCHLT Sepedi Speech Corpus
(Meraka Institute, CSIR; North-West University, 2014-07-08) ~ - Resource Catalogue
Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
NCHLT Sepedi Named Entity Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
Multilingual Natural Sciences & Technology Terminology List (Grade 4 - 6)
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-03-03) ~ - Resource Index
2756 English source terms with their equivalents in the ten other official South African languages. The list was populated from terms excerpted from ...