AwezaMed automatic speech recognition (ASR) test data
The corpus contains orthographically transcribed broadband speech in four official languages of South Africa: Afrikaans, English, isiXhosa and isiZulu. Respondents read a number of ASR prompts (10 or 20) in a real-world environment. The dataset includes 1 hour of test data.
Karen Calteaux
KCalteaux@csir.co.za
Voice Computing (VC) Research Group at the CSIR Nextgen Enterprises and Institutions (NGEI)
Creative Commons Attribution 3.0 Unported (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/legalcode
Afrikaans; English; isiXhosa; isiZulu
Bandehorst, Jaco
Van Niekerk, Nina; Calteaux, Karen
ASR; Test data; Speech corpora; AwezaMed
https://hdl.handle.net/20.500.12185/688
Annotated Speech Corpus
2025-02-12T11:12:33Z
2025-02-12T11:12:33Z
2020-12
Level 0


This item appears in the following Collection(s)

  • Resource Index [414]
    A collection of language resource metadata, mostly collected during the NHN-funded technology audit of 2009 and the SADiLaR technology audit of 2018. Not all resources in this collection are available for download.