Search
Now showing items 41-50 of 58
NCHLT Setswana Phrase Chunk Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
NCHLT Part of Speech Taggers
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Part of speech taggers developed during the NCHLT Text project.
Available for the following languages: Afrikaans, English, isiNdebele, isiXhosa, isiZulu, ...
Lwazi II Sotho Pronunciation Dictionaries
(Meraka Institute, CSIR; North-West University, 2015-11-20) ~ - Resource Catalogue
Pronunciation dictionaries for Sepedi, Sesotho and Setswana with and without affricates, as well as the maps that were used to split the affricates into ...
Autshumato Setswana Monolingual Corpora
(North-West University; Centre for Text Technology (CTexT), 2016-10-28) ~ - Resource Catalogue
Setswana monolingual corpus as a deliverable of the Autshumato project. The data is given as a UTF-8 text file; with each sentence on a new line.
Woefzela
(Meraka Institute, CSIR, 2014-07-04) ~ - Resource Catalogue
The primary purpose of the Woefzela software application is to record a list of prompts by a number of different speakers. The resultant output is then ...
USAf National Language Resources Audit 2023
(South African Centre for Digital Language Resources, 2023-10)
This report documents the findings of a comprehensive language resources audit conducted by the South African Centre for Digital Language Resources ...
Generic Multilingual Academic Wordlists with Definitions
(SADiLaR; ICELDA, 2022)
This multilingual generic academic wordlist has been developed to serve as a resource to students to assist with building a vocabulary and decoding ...
Multilingual Linguistic Terminology
(UNISA, 2022-09-20)
Multilingual Linguistic Terminology Project
Termbanks of Linguistic terminology for South African languages
Version 1.0
https://linguistictermino ...
Morphologically annotated corpus for Setswana
(Centre for Text Technology (CTexT), 2024-01-31)
NCHLT corpus of morphologically annotated tokens in Setswana converted to the tags used during phases 1 and 2 of the SADiLaR-II project.
The data is ...
CTexTools 2
(North-West University, Centre for Text Technology (CTexT); South African Department of Arts and Culture, 2018-05-24) ~ - Resource Catalogue
CTexTools is a corpus query and manipulation tool primarily for the official South African languages. The tool supports the creation of frequency and ...