NCHLT Afrikaans Morphological Decomposer
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) - Resource Catalogue
Morphological decomposer developed during the NCHLT Text project.
NCHLT Siswati Morphological Decomposer
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) - Resource Catalogue
Morphological decomposer developed during the NCHLT Text project.
NCHLT isiXhosa fastText-Skipgram embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Static word and subword embeddings for the Skipgram flavour of the fastText architecture (Bojanowski et al., 2017). The embedding provides real-valued ...
NCHLT isiNdebele FLAIR-forward embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Contextual word/string embeddings for the forward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector ...
NCHLT isiZulu word2vec-Skipgram embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Static word embeddings for the Skipgram flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides real-valued vector ...
NCHLT Sesotho word2vec-CBOW embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Static word embeddings for the continuous bag of words (CBoW) flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides ...
NCHLT isiXhosa GloVe embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Static word embedding model based on the Global Vectors architecture (Pennington et al., 2014). The embeddings provide real-valued vector representations ...
Human Language Technology Audit 2017/18
(CSIR, 2018-08-31)
This document reports on all work conducted in the 2017/18 Audit of human language technology (HLT) resources available in South Africa project. The ...
African Wordnet: Setswana 1.0
(UNISA, 2017-06-20) - Resource Catalogue
Developed using the expand model with Princeton WordNet 2.0 as basis. Each wordnet contains synsets with at least the following fields: Word form (lemma; ...
NCHLT Setswana Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) - Resource Catalogue
Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...