Now showing items 91-100 of 105
Autshumato English-Sesotho Parallel Corpora
(CTexT® (Centre for Text Technology, North-West University), 2022-09-30)
Aligned parallel corpora for the language pair English-Sesotho. The data is given as two separate UTF-8 text files, with each aligned segment on a ...
Autshumato Monolingual Sesotho Corpus
(CTexT® (Centre for Text Technology, North-West University), 2022-09-30)
Monolingual corpus for Sesotho. The data is given as a single UTF-8 text file, with each segment on a newline. The data was specifically selected and ...
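As an illustration of the layout described above (a single UTF-8 text file with one segment per line), here is a minimal Python sketch for reading such a file. The function names and the file name `sesotho_corpus.txt` are hypothetical and not part of the resource; skipping blank lines is also an assumption, not something the description states.

```python
from typing import Iterator


def read_segments(path: str) -> Iterator[str]:
    """Yield segments from a UTF-8 text file with one segment per line.

    Assumption: blank lines carry no segment and are skipped.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            segment = line.strip()
            if segment:
                yield segment


def count_segments(path: str) -> int:
    """Count the non-empty segments in the file."""
    return sum(1 for _ in read_segments(path))


if __name__ == "__main__":
    # Hypothetical file name; substitute the path of the downloaded corpus.
    print(count_segments("sesotho_corpus.txt"))
```

Reading lazily with a generator keeps memory use flat, which may matter if the corpus file is large.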
CTexTools 2
(North-West University, Centre for Text Technology (CTexT); South African Department of Arts and Culture, 2018-05-24) - Resource Catalogue
CTexTools is a corpus query and manipulation tool primarily for the official South African languages. The tool supports the creation of frequency and ...
Autshumato Machine Translation Evaluation Set
(North-West University; Centre for Text Technology (CTexT); Department of Arts and Culture, South Africa, 2017-12-15) - Resource Catalogue
Comparable evaluation data for use in automatic machine translation evaluations. The evaluation set consists of 500 sentences translated separately by ...
African Wordnet version 1.0
(UNISA, 2022-09-20)
Developed using the expand model with Princeton WordNet 3.1 as basis.
Please see https://africanwordnet.wordpress.com/ for all details on the project. ...
Sesotho function word speech data
(Centre for Text Technology, North-West University, 2019-05-28) - Resource Catalogue
The primary aim of this speech data set was to study the role of tone in the function word "ke" in the minimal pairs "ke motho" and in the function word ...
NWU TransTips 1.0
(Centre for Text Technology (CTexT), 2013-07-01) - Resource Index
TransTips is a PHP script that scans a web page for terms in the database. The translation of words contained in the database is linked ...
Sesotho vowel speech data set
(Centre for Text Technology, North-West University, 2019-05-28) - Resource Catalogue
The primary aim of this speech dataset was to collect a representative set of words in which all the Sesotho vowels are present. Some of them are ...
Sesotho tone data set
(Centre for Text Technology, North-West University, 2019-05-28) - Resource Catalogue
These recordings are of male and female speakers (11 for tasks 1 and 2; 10 for task 3) of the QwaQwa region (Eastern Free State). Ages of the speakers ...
W-NORM
(North-West University; Centre for Text Technology (CTexT), 2015-06-30) - Resource Catalogue
W-NORM is a graphical user interface (GUI), written in Perl and GTK2, for the Vowels 1.2 package. Vowels 1.2 is written in the R programming language ...