Lwazi Sepedi TTS corpus

Daniel van Niekerk; Etienne Barnard; Marelie Davel; Aby Louw; Alta de Waal (Meraka Institute, CSIR, 2013-03-27) ~ Resource Catalogue

Orthographic and phonemically aligned transcriptions

Lwazi Xitsonga TTS corpus

Daniel van Niekerk; Etienne Barnard; Marelie Davel; Aby Louw; Alta de Waal (Meraka Institute, CSIR, 2013-03-27) ~ Resource Catalogue

Orthographic and phonemically aligned transcriptions

Autshumato English-Afrikaans Translation Memory

Cindy McKellar; Handré Groenewald (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue

Translation memory from English (EN-GB) to Afrikaans, in the government domain for use in the Autshumato ITE application.

Lwazi isiXhosa ASR corpus

Charl van Heerden; Etienne Barnard; Jaco Badenhorst; Marelie Davel (Meraka Institute, CSIR, 2013-04-02) ~ Resource Catalogue

Complete audio recordings and orthographic transcriptions used for Lwazi speech recognition systems.

Autshumato English-Sesotho sa Leboa Translation Memory

Cindy McKellar; Marissa Griesel; Handré Groenewald (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue

Translation memory from English (EN-GB) to Sesotho sa Leboa, in the government domain for use in the Autshumato ITE application.

South African Multilingual Proper Names (Multipron) Corpus

Etienne Barnard; Marelie Davel; Oluwapelumi Giwa; Nadia Barnard; Jean-Pierre Martens; Derik Thirion (Molo Afrika Speech Technologies, 2013-10-03) ~ Resource Catalogue

Audio, orthographic and auditorily verified broad phonemic transcriptions of proper names in four languages, produced by speakers of the same four languages.

Lara2

Martin Puttkammer; Martin Schlemmer (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue

Tool for annotating texts with lemma, part of speech and morphological analysis information

NCHLT isiXhosa Lemmatiser

Martin Puttkammer; Martin Schlemmer; Ruan Bekker (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue

Lemmatiser developed during the NCHLT Text project. Available in the Readme.txt - Input format: Text data (encoding: UTF8 without BOM), one ...

NCHLT Tshivenda Phrase Chunk Annotated Corpus

S.L. Tshikota; M.E. Takalani; A. Nyoni; Roald Eiselen (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue

Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...

Autshumato English-Xitsonga Parallel Corpora

Wikus Pienaar; Wildrich Fourie; Cindy McKellar (North-West University; Centre for Text Technology (CTexT), 2014-12-11) ~ Resource Catalogue

Aligned English-Xitsonga parallel corpus. The data is given as two separate UTF-8 text files, with each segment on a new line.
