Search
Now showing items 231-240 of 345
NCHLT Sepedi word2vec-Skipgram embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Static word embeddings for the Skipgram flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides real-valued vector ...
NCHLT Sesotho Named Entity Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
NCHLT isiZulu Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
Autshumato Monolingual Xitsonga Corpus
(CTexT® (Centre for Text Technology, North-West University), 2022-09-30)
Monolingual corpus for Xitsonga. The data is given as a single UTF-8 text file, with each segment on a newline. The data was specifically selected and ...
Autshumato Machine Translation Web Service (MTWS)
(Centre for Text Technology; North-West University, 2018-03-01) ~ - Resource Index
The MTWS is a unified interface through which anyone can gain access to the MT systems developed in the Autshumato project. It can provide sentence, ...
Multilingual Life Orientation Intermediate Phase Terminology List
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-03-03) ~ - Resource Index
1628 English source terms with their equivalents in the ten other official South African languages. The terms were excerpted from life orientation ...
Multilingual Parliamentary / Political Terminology List
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-03-03) ~ - Resource Index
502 English source terms with their equivalents in the ten other official South African languages. The project built on a 2003 initiative of the national ...
NCHLT isiNdebele word2vec-CBOW embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Static word embeddings for the continuous bag of words (CBoW) flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides ...
NCHLT Xitsonga word2vec-Skipgram embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Static word embeddings for the Skipgram flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides real-valued vector ...
NCHLT Siswati FLAIR-backward embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Contextual word/string embeddings for the backward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector ...