Search
Now showing items 21-30 of 106
COVID-19 Multilingual Terminology
(City of Tshwane; South African Centre for Digital Language Resources (SADiLaR); Department of Science and Innovation; Pan South African Language Board (PanSALB), 2021-07)
COVID-19 multilingual terminology list document in all the South African languages. The development of this terminology list was initiated by City of ...
Autshumato Monolingual Setswana Corpus
(CTexT® (Centre for Text Technology, North-West University), 2022-09-30)
Monolingual corpus for Setswana. The data is given as a single UTF-8 text file, with each segment on a newline. The data was specifically selected and ...
Lara2
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Tool for annotating texts with lemma, part of speech and morphological analysis information
NCHLT Setswana FLAIR-backward embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Contextual word/string embeddings for the backward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector ...
High quality TTS data for four South African languages (af, st, tn, xh)
(Google; North-West University, 2017) ~ - Resource Catalogue
This data set contains multi-speaker TTS high quality transcribed audio data for four languages of South Africa: Afrikaans, Sesotho, Setswana and isiXhosa. ...
NCHLT Setswana FLAIR-forward embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Contextual word/string embeddings for the forward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector ...
Corpus of multilingual code-switched soap opera speech
(Stellenbosch University, 2020-02-28)
The corpus comprises 26.9 hours of annotated multilingual speech that contains examples of code-switching in isiZulu, isiXhosa, Setswana, Sesotho and ...
NCHLT Setswana Auxiliary Speech Corpus
(CSIR Meraka Institute; North-West University, 2019-06-01) ~ - Resource Catalogue
The corpus contains orthographically transcribed broadband speech in each of South Africa's eleven official languages. Transcriptions are provided in ...
Lwazi II Setswana TTS Corpus
(Meraka Institute, CSIR; North-West University, 2015-11-20) ~ - Resource Catalogue
Orthographic and phonemically aligned transcriptions.
Setswana Spelling Checker 1.1
(Centre for Text Technology (CTexT), 2013-07-01) ~ - Resource Index
Spelling checkers and hyphenators for South African languages compatible with Microsoft® Office 2000, XP, 2003, 2007, 2010 or Microsoft® Office 2013. ...