Search
Now showing items 21-30 of 51
Core technologies for conjunctively written South African languages
(North-West University, Centre for Language Technology (CTexT), 2021-03-31)
During this SADiLaR-funded project, enriched corpora for the four official South African languages with a conjunctive orthography, i.e. isiNdebele ...
Lwazi isiNdebele Pronunciation Dictionary
(Meraka Institute, CSIR, 2013-04-01) - Resource Catalogue
General phonemic pronunciations for frequently occurring words in SA languages. Dictionaries were developed to be practically usable for speech technology ...
NCHLT isiNdebele Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) - Resource Catalogue
Lemmatised, part-of-speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
PHONAAS
(North-West University; Centre for Text Technology (CTexT), 2015-06-30) - Resource Catalogue
PHONAAS is a graphical user interface (GUI) tool, written in Perl and GTK2, using the R programming language and PRAAT to extract vowel formant data.
NCHLT isiNdebele FLAIR-forward embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01)
Contextual word/string embeddings for the forward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector ...
Human Language Technology Audit 2017/18
(CSIR, 2018-08-31)
This document reports on all work conducted in the 2017/18 Audit of human language technology (HLT) resources available in South Africa project. The ...
W-NORM
(North-West University; Centre for Text Technology (CTexT), 2015-06-30) - Resource Catalogue
W-NORM is a graphical user interface (GUI), written in Perl and GTK2, for the Vowels 1.2 package. Vowels 1.2 is written in the R programming language ...
DictionaryMaker
(Meraka Institute, CSIR, 2013-07-15) - Resource Catalogue
The purpose of the DictionaryMaker system is to facilitate the creation of an electronic pronunciation dictionary in a target language, as originally ...
NCHLT isiNdebele Phrase Chunk Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) - Resource Catalogue
Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
Lwazi isiNdebele ASR corpus
(Meraka Institute, CSIR, 2013-04-02) - Resource Catalogue
Complete audio recordings and orthographic transcriptions used for Lwazi speech recognition systems.