Search
Now showing items 21-30 of 45
Autshumato Text Anonymiser
(North-West University; Centre for Text Technology (CTexT), 2013-06-20) ~ - Resource Catalogue
Anonymises text by classifying and replacing sensitive information such as person names, business names, place names, monetary values, phone numbers, ...
Autshumato English-isiZulu Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Parallel corpora aligned on sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from ...
Translate.org.za isiZulu - isiXhosa Corpus 2012
(Translate.org.za, 2013-06-19) ~ - Resource Catalogue
isiZulu-isiXhosa translation memory.
NCHLT isiZulu Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Lemmatised, part of speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
NCHLT isiZulu word2vec-Skipgram embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Static word embeddings for the Skipgram flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides real-valued vector ...
Human Language Technology Audit 2017/18
(CSIR, 2018-08-31)
This document reports on all work conducted in the 2017/18 Audit of human language technology (HLT) resources available in South Africa project. The ...
isiZulu Genre Classification Corpus
(Trifonius, 2013-06-19) ~ - Resource Catalogue
Contains training and testing data for Genre Classification for isiZulu.
Autshumato isiZulu-English Translation Memory
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Translation memory from IsiZulu to English (EN-GB), in the government domain for use in the Autshumato ITE application.
NCHLT isiZulu Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
Autshumato TMX Integrator
(North-West University; Centre for Text Technology (CTexT), 2013-06-20) ~ - Resource Catalogue
Utility to merge multiple translation memories over a network using Subversion