Now showing items 41-50 of 74
Sepedi Tokeniser
(University of South Africa, 2015-01-28) ~ - Resource Index
Pre-processing for Sesotho sa Leboa morphology, treated as a disjunctively written language (morphemes are already separated), as a precursor for morphological ...
NCHLT Setswana RoBERTa language model
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Contextual masked language model based on the RoBERTa architecture (Liu et al., 2019). The model is trained as a masked language model and not fine-tuned ...
NCHLT Setswana fastText-CBoW embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Static word and subword embeddings for the continuous bag of words (CBoW) flavour of the fastText architecture (Bojanowski et al., 2017). The embedding ...
NCHLT Setswana fastText-Skipgram embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Static word and subword embeddings for the Skipgram flavour of the fastText architecture (Bojanowski et al., 2017). The embedding provides real-valued ...
Autshumato Machine Translation Web Service (MTWS)
(Centre for Text Technology; North-West University, 2018-03-01) ~ - Resource Index
The MTWS is a unified interface through which anyone can gain access to the MT systems developed in the Autshumato project. It can provide sentence, ...
Multilingual Life Orientation Intermediate Phase Terminology List
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-03-03) ~ - Resource Index
1628 English source terms with their equivalents in the ten other official South African languages. The terms were excerpted from life orientation ...
Multilingual Parliamentary / Political Terminology List
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-03-03) ~ - Resource Index
502 English source terms with their equivalents in the ten other official South African languages. The project built on a 2003 initiative of the national ...
NCHLT Text Web Services
(SADiLaR; North-West University, 2018-03-01) ~ - Resource Index
A web service that provides access to seven core technologies in ten South African languages, including:
* Tokenisers
* Sentence separators
* ...
Open Spell (v1.0)
(Meraka Institute, CSIR; TEIR; ICSI at University of California (Berkeley), 2013-07-01) ~ - Resource Index
Open Spell is a spelling game that provides spelling exercises (in the language education domain) to teach spelling skills to schoolchildren between the ...
LID classifier
(CSIR, 2018-03-02) ~ - Resource Index
A language identification (LID) service that allows for token-level classification in the 11 official languages of South Africa.