NCHLT Afrikaans Named Entity Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
Lwazi II Cross-lingual Proper Name Corpus
(Meraka Institute, CSIR; North-West University, 2015-11-20) ~ - Resource Catalogue
Prompted audio recordings of personal names in different languages, produced by 20 speakers with different language backgrounds.
Afrikaans Genre Classification Corpus
(Trifonius, 2013-06-19) ~ - Resource Catalogue
Contains training and testing data for Genre Classification for Afrikaans.
Lwazi Afrikaans Pronunciation Dictionary
(Meraka Institute, CSIR, 2013-04-01) ~ - Resource Catalogue
General phonemic pronunciations for frequently occurring words in SA languages. Dictionaries were developed to be practically usable for speech technology ...
South African Directory Enquiries (SADE) Name Corpus
(North-West University; Molo Afrika Speech Technologies; IntSyst Labs CC, 2015-09-07) ~ - Resource Catalogue
Audio and tagged orthographic transcriptions of South African names produced by first-language speakers of 4 languages: Afrikaans, English, isiZulu, ...
Autshumato Afrikaans-English Translation Memory
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Translation memory from Afrikaans to English (EN-GB), in the government domain for use in the Autshumato ITE application.
NCHLT Afrikaans Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Lemmatised, part-of-speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
Lwazi II Afrikaans TTS Corpus
(Meraka Institute, CSIR; North-West University, 2015-11-20) ~ - Resource Catalogue
Orthographic and phonemically aligned transcriptions.
AuCoPro Semantics Dataset
(North-West University; Centre for Text Technology (CTexT); CLiPS Research Center, University of Antwerp, Belgium, 2015-01-07) ~ - Resource Catalogue
The AuCoPro Semantics dataset serves for the automatic semantic analysis of compounds. It contains semantically annotated noun-noun compounds (NN) from ...
Lwazi Afrikaans TTS corpus
(Meraka Institute, CSIR, 2013-03-27) ~ - Resource Catalogue
Orthographic and phonemically aligned transcriptions.