Now showing items 21-30 of 58
Autshumato English-Setswana Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2016-10-28) ~ - Resource Catalogue
Aligned English-Setswana parallel corpus. This set contains data that was translated by professional translators, data that was sourced as translated ...
Autshumato Text Anonymiser
(North-West University; Centre for Text Technology (CTexT), 2013-06-20) ~ - Resource Catalogue
Anonymises text by classifying and replacing sensitive information such as person names, business names, place names, monetary values, phone numbers, ...
NCHLT Setswana Morphological Decomposer
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Morphological decomposer developed during the NCHLT Text project.
NCHLT Setswana word2vec-Skipgram embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Static word embeddings for the Skipgram flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides real-valued vector ...
NCHLT Setswana Named Entity Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
Human Language Technology Audit 2017/18
(CSIR, 2018-08-31)
This document reports on all work conducted in the 2017/18 Audit of human language technology (HLT) resources available in South Africa project. The ...
African Wordnet: Setswana 1.0
(UNISA, 2017-06-20) ~ - Resource Catalogue
Developed using the expand model with Princeton WordNet 2.0 as basis. Each wordnet contains synsets with at least the following fields: Word form (lemma; ...
NCHLT Setswana Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
DictionaryMaker
(Meraka Institute, CSIR, 2013-07-15) ~ - Resource Catalogue
The purpose of the DictionaryMaker system is to facilitate the creation of an electronic pronunciation dictionary in a target language, as originally ...
Setswana Custom Dictionary for Government Domain
(North-West University; Centre for Text Technology (CTexT), 2013-02-22) ~ - Resource Catalogue
Word list developed as a custom dictionary for use in the spelling checkers as part of the spelling checker project for the Department of Arts and ...