South African Multilingual Proper Names (Multipron) Corpus

Etienne Barnard; Marelie Davel; Oluwapelumi Giwa; Nadia Barnard; Jean-Pierre Martens; Derik Thirion (Molo Afrika Speech Technologies, 2013-10-03) ~ Resource Catalogue

Audio, orthographic transcriptions, and auditorily verified broad phonemic transcriptions of proper names in four languages, produced by speakers of the same four languages.

Lara2

Martin Puttkammer; Martin Schlemmer (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue

Tool for annotating texts with lemma, part-of-speech and morphological analysis information.

Lwazi Sesotho ASR corpus

Charl van Heerden; Etienne Barnard; Jaco Badenhorst; Marelie Davel (Meraka Institute, CSIR, 2013-04-02) ~ Resource Catalogue

Complete audio recordings and orthographic transcriptions used for Lwazi speech recognition systems.

NCHLT Sesotho Annotated Text Corpora

Martin Puttkammer; Martin Schlemmer; Ruan Bekker (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue

Lemmatised, part-of-speech tagged and morphologically analysed corpora developed during the NCHLT Text project.

High quality TTS data for four South African languages (af, st, tn, xh)

Unknown author (Google; North-West University, 2017) ~ Resource Catalogue

This data set contains high-quality, multi-speaker transcribed audio data for TTS in four languages of South Africa: Afrikaans, Sesotho, Setswana and isiXhosa. ...

Lwazi II Sesotho TTS Corpus

Daniel van Niekerk; Georg Schlünz (Meraka Institute, CSIR; North-West University, 2015-11-20) ~ Resource Catalogue

Orthographic and phonemically aligned transcriptions.

NCHLT Sesotho GloVe embeddings

Roald Eiselen (North-West University; Centre for Text Technology (CTexT), 2023-05-01)

Static word embedding model based on the Global Vectors architecture (Pennington et al., 2014). The embeddings provide real-valued vector representations ...

Corpus of multilingual code-switched soap opera speech

van der Westhuizen, Ewald; Niesler, Thomas (Stellenbosch University, 2020-02-28)

The corpus comprises 26.9 hours of annotated multilingual speech that contains examples of code-switching in isiZulu, isiXhosa, Setswana, Sesotho and ...

Sesotho syllabification systems

Sibeko, Johannes; van Zaanen, Menno (South African Centre for Digital Language Resources, 2022-02-03)

This package contains two syllabification systems for Sesotho (rule-based and TeX-based).

Mburisano Covid-19 multilingual corpus

Marais, Laurette (CSIR Voice Computing, 2020-12-04)

This corpus was created to aid development of the AwezaMed Covid-19 speech-to-speech mobile application. The project within which it was created, ...
