NCHLT Xitsonga Lemmatiser
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) - Resource Catalogue
Lemmatiser developed during the NCHLT Text project.
Available in the Readme.txt - Input format: Text data (encoding: UTF8 without BOM), one ...
Human Language Technology Audit 2017/18
(CSIR, 2018-08-31)
This document reports on all work conducted in the 2017/18 Audit of human language technology (HLT) resources available in South Africa project. The ...
Multilingual Arts & Culture Intermediate Phase Terminology List
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-03-03) - Resource Index
550 English source terms with their equivalents in the ten other official South African languages. The list was compiled in collaboration with subject ...
NCHLT Xitsonga RoBERTa language model
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Contextual masked language model based on the RoBERTa architecture (Liu et al., 2019). The model is trained as a masked language model and not fine-tuned ...
Autshumato Monolingual Xitsonga Corpus
(CTexT® (Centre for Text Technology, North-West University), 2022-09-30)
Monolingual corpus for Xitsonga. The data is given as a single UTF-8 text file, with each segment on a newline. The data was specifically selected and ...
Autshumato Machine Translation Web Service (MTWS)
(Centre for Text Technology; North-West University, 2018-03-01) - Resource Index
The MTWS is a unified interface through which anyone can gain access to the MT systems developed in the Autshumato project. It can provide sentence, ...
Multilingual Life Orientation Intermediate Phase Terminology List
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-03-03) - Resource Index
1628 English source terms with their equivalents in the ten other official South African languages. The terms were excerpted from life orientation ...
Multilingual Parliamentary / Political Terminology List
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-03-03) - Resource Index
502 English source terms with their equivalents in the ten other official South African languages. The project built on a 2003 initiative of the national ...
NCHLT Xitsonga word2vec-Skipgram embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Static word embeddings for the Skipgram flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides real-valued vector ...
NCHLT Text Web Services
(SADiLaR; North-West University, 2018-03-01) - Resource Index
A web service that provides access to seven core technologies in ten South African languages, including:
* Tokenisers
* Sentence separators
* ...