Now showing items 221-230 of 238
NCHLT Siswati Phrase Chunk Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
Autshumato Setswana Monolingual Corpora
(North-West University; Centre for Text Technology (CTexT), 2016-10-28) ~ - Resource Catalogue
Setswana monolingual corpus, a deliverable of the Autshumato project. The data is given as a UTF-8 text file, with each sentence on a new line.
Monolingual isiXhosa corpus
(North-West University; Centre for Text Technology (CTexT), 2019-11-30) ~ - Resource Catalogue
Monolingual corpus for isiXhosa. The data is given as a single UTF-8 text file, with each segment on a new line.
The dataset contains existing data ...
Woefzela
(Meraka Institute, CSIR, 2014-07-04) ~ - Resource Catalogue
The primary purpose of the Woefzela software application is to record a list of prompts by a number of different speakers. The resultant output is then ...
Lwazi isiZulu ASR corpus
(Meraka Institute, CSIR, 2013-04-02) ~ - Resource Catalogue
Complete audio recordings and orthographic transcriptions used for Lwazi speech recognition systems.
Lwazi Xitsonga ASR corpus
(Meraka Institute, CSIR, 2013-04-02) ~ - Resource Catalogue
Complete audio recordings and orthographic transcriptions used for Lwazi speech recognition systems.
NCHLT Tshivenda Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Lemmatised, part-of-speech tagged, and morphologically analysed corpora developed during the NCHLT Text project.
NCHLT isiZulu Morphological Decomposer
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Morphological decomposer developed during the NCHLT Text project.
CTexT Alignment Interface
(North-West University; Centre for Text Technology (CTexT), 2013-06-21) ~ - Resource Catalogue
Utility application for the manual alignment of source texts.
Autshumato ITE
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Integrated Translation Environment. Combines multiple translation tools into one environment.