Lwazi Siswati TTS corpus

Daniel van Niekerk; Etienne Barnard; Marelie Davel; Aby Louw; Alta de Waal (Meraka Institute, CSIR, 2013-03-27) ~ Resource Catalogue

Orthographic and phonemically aligned transcriptions

African Speech Technology English Text Corpus

CatchWord Language and Speech Technologies (Pty) Ltd (North-West University; Stellenbosch University; University of Transkei; University of Free State (Qwa-Qwa campus); Rhodes University; University of KwaZulu-Natal; University of Western Cape, 2015-01-07) ~ Resource Catalogue

Monolingual text corpus developed during the African Speech Technology project.

Xitsonga Custom Dictionary for Government Domain

Martin Puttkammer; Nico Oosthuizen; Wikus Pienaar (North-West University; Centre for Text Technology (CTexT), 2013-02-22) ~ Resource Catalogue

Word list developed as a custom dictionary for use in the spelling checkers as part of the spelling checker project for the Department of Arts and ...

NCHLT Speech II Corpus

Jaco Badenhorst; Febe de Wet; Neil Kleynhans; Thipe Modipa (Meraka Institute, CSIR, 2016-05-09) ~ Resource Catalogue

The speech corpus, generated from aligned audio samples from the National Parliament using Hansard transcriptions, is provided in terms of audio and ...

Lwazi Setswana Pronunciation Dictionary

Marelie Davel (Meraka Institute, CSIR, 2013-04-01) ~ Resource Catalogue

General phonemic pronunciations for frequently occurring words in SA languages. Dictionaries were developed to be practically usable for speech technology ...

Setswana Genre Classification Corpus

Gerhard van Huyssteen; D.P. Snyman (Trifonius, 2013-06-19) ~ Resource Catalogue

Contains training and testing data for Genre Classification for Setswana.

Lwazi III English TTS Corpus

Aby Louw; Georg Schlünz (Meraka Institute, CSIR, 2016-06-17) ~ Resource Catalogue

Complete audio recordings with orthographic transcriptions. TTS corpus for standard SA dialect. This corpus was created to enable the building of a TTS voice.

NCHLT Siswati Named Entity Annotated Corpus

B.B. Malangwane; M.N. Kekana; S.S. Sedibe; B.C. Ndhlovu; Roald Eiselen (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue

Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.

NCHLT Siswati Annotated Text Corpora

Martin Puttkammer; Martin Schlemmer; Ruan Bekker (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue

Lemmatised, part of speech tagged and morphologically analysed corpora developed during the NCHLT Text project.

NCHLT isiNdebele Speech Corpus

Charl van Heerden; Etienne Barnard; Jaco Badenhorst; Marelie Davel; Alta de Waal (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue

Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
