Search
Woefzela
(Meraka Institute, CSIR, 2014-07-04) - Resource Catalogue
The primary purpose of the Woefzela software application is to record a list of prompts by a number of different speakers. The resultant output is then ...
NCHLT Tshivenda Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) - Resource Catalogue
Lemmatised, part-of-speech tagged, and morphologically analysed corpora developed during the NCHLT Text project.
USAf National Language Resources Audit 2023
(South African Centre for Digital Language Resources, 2023-10)
This report documents the findings of a comprehensive language resources audit conducted by the South African Centre for Digital Language Resources ...
Generic Multilingual Academic Wordlists with Definitions
(SADiLaR; ICELDA, 2022)
This multilingual generic academic wordlist has been developed to serve as a resource to students to assist with building a vocabulary and decoding ...
Multilingual Linguistic Terminology
(UNISA, 2022-09-20)
Multilingual Linguistic Terminology Project
Termbanks of linguistic terminology for South African languages
Version 1.0
https://linguistictermino ...
Autshumato English-Tshivenḓa Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2023-12-12)
Aligned parallel corpora for the following language pair: English-Tshivenḓa. Data was crawled from various multilingual government websites, sourced ...
Autshumato Monolingual Tshivenḓa Corpus
(North-West University; Centre for Text Technology (CTexT), 2023-12-12)
Monolingual corpus for Tshivenḓa. The data is given as a single UTF-8 text file, with each segment on a newline.
Morphologically annotated corpus for Tshivenḓa
(Centre for Text Technology (CTexT), 2024-01-31)
NCHLT corpus of morphologically annotated tokens in Tshivenḓa converted to the tags used during phases 1 and 2 of the SADiLaR-II project.
The data is ...
CTexTools 2
(North-West University, Centre for Text Technology (CTexT); South African Department of Arts and Culture, 2018-05-24) - Resource Catalogue
CTexTools is a corpus query and manipulation tool primarily for the official South African languages. The tool supports the creation of frequency and ...
Autshumato Machine Translation Evaluation Set
(North-West University; Centre for Text Technology (CTexT); Department of Arts and Culture, South Africa, 2017-12-15) - Resource Catalogue
Comparable evaluation data for use in automatic machine translation evaluations. The evaluation set consists of 500 sentences translated separately by ...