NCHLT Setswana Phrase Chunk Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
Qfrency TTS phone mappings
(CSIR, 2018-03-02) ~ - Resource Index
TTS phone mappings between IPA, XSAMPA and our Qfrency internal format, standardised across all 11 SA languages. To be used in conjunction with the Lwazi ...
CorpusCatcher
(Translate.org.za, 2015-01-28) ~ - Resource Index
Corpus Catcher is a tool that is designed to crawl the web to retrieve data for inclusion in a corpus. It makes use of seed documents/wordlists to ...
Final year high school examination texts of South African home and first additional language subjects
(South African Centre for Digital Language Resources, 2022-11-16)
This data collection consists of reading comprehension and summary writing texts. The texts comprise the final year high school exam texts for ...
NCHLT Part of Speech Taggers
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Part of speech taggers developed during the NCHLT Text project.
Available for the following languages: Afrikaans, English, isiNdebele, isiXhosa, isiZulu, ...
Lwazi II Sotho Pronunciation Dictionaries
(Meraka Institute, CSIR; North-West University, 2015-11-20) ~ - Resource Catalogue
Pronunciation dictionaries for Sepedi, Sesotho and Setswana with and without affricates, as well as the maps that were used to split the affricates into ...
Multilingual Illustrated Dictionary with interactive games
(Centre for Text Technology (CTexT); Pharos Dictionaries, 2013-07-01) ~ - Resource Index
Multilingual Illustrated Dictionary with interactive games and pronunciation for 7 of SA's official languages.
Autshumato Setswana Monolingual Corpora
(North-West University; Centre for Text Technology (CTexT), 2016-10-28) ~ - Resource Catalogue
Setswana monolingual corpus as a deliverable of the Autshumato project. The data is given as a UTF-8 text file, with each sentence on a new line.
PSearch 1.1.
(North-West University; Centre for Text Technology (CTexT); Tilburg Centre for Cognition and Communication, 2015-01-30) ~ - Resource Index
PSearch is based on Paramsearch, a tool created by Antal van den Bosch for automatic algorithmic parameter optimisation for TiMBL and other machine ...
Intonation model for Bantu tone languages
(University of the Witwatersrand, 2015-01-27) ~ - Resource Index
Theoretical linguistic intonation model for Setswana-Sesotho and isiZulu, formulated as step-wise rules in prose.