NCHLT Tshivenda Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30)
Lemmatised, part-of-speech tagged, and morphologically analysed corpora developed during the NCHLT Text project.
USAf National Language Resources Audit 2023
(South African Centre for Digital Language Resources, 2023-10)
This report documents the findings of a comprehensive language resources audit conducted by the South African Centre for Digital Language Resources ...
Multilingual Linguistic Terminology
(UNISA, 2022-09-20)
Multilingual Linguistic Terminology Project: termbanks of linguistic terminology for South African languages. Version 1.0.
https://linguistictermino ...
Autshumato English-Tshivenḓa Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2023-12-12)
Aligned parallel corpora for the following language pair: English-Tshivenḓa. Data was crawled from various multilingual government websites, sourced ...
Autshumato Monolingual Tshivenḓa Corpus
(North-West University; Centre for Text Technology (CTexT), 2023-12-12)
Monolingual corpus for Tshivenḓa. The data is provided as a single UTF-8 text file, with one segment per line.
Morphologically annotated corpus for Tshivenḓa
(Centre for Text Technology (CTexT), 2024-01-31)
NCHLT corpus of morphologically annotated tokens in Tshivenḓa converted to the tags used during phases 1 and 2 of the SADiLaR-II project.
The data is ...
CTexTools 2
(North-West University, Centre for Text Technology (CTexT); South African Department of Arts and Culture, 2018-05-24)
CTexTools is a corpus query and manipulation tool primarily for the official South African languages. The tool supports the creation of frequency and ...
Autshumato Machine Translation Evaluation Set
(North-West University; Centre for Text Technology (CTexT); Department of Arts and Culture, South Africa, 2017-12-15)
Comparable evaluation data for use in automatic machine translation evaluations. The evaluation set consists of 500 sentences translated separately by ...
African Wordnet version 1.0
(UNISA, 2022-09-20)
Developed using the expand model, with Princeton WordNet 3.1 as the basis.
Please see https://africanwordnet.wordpress.com/ for all details on the project. ...
CTexT Alignment Interface Pro
(North-West University; Centre for Text Technology (CTexT), 2013-06-21)
Utility application for the manual alignment of source texts. The Pro version allows the segments to be edited.