Please do not copy the URL from the browser for citation. The correct URL is 'https://hdl.handle.net/20.500.12185/688'

AwezaMed automatic speech recognition (ASR) test data

Files

AwezaMed_ASR_test_data_corpus.zip (54.39 MB)

README_Mburisano_AwezaMed_asr_test_data_mandetory_fields.txt (1.11 KB)

README_Mburisano_AwezaMed_asr_test_data.txt (5.4 KB)

LICENSE.txt (1.99 KB)

Deposit Licenses

license.txt (3.22 KB)

Date

2020-12

Authors

Badenhorst, Jaco

Publisher

Voice Computing (VC) Research Group at the CSIR Nextgen Enterprises and Institutions (NGEI)

Description

The corpus contains orthographically transcribed broadband speech in four official languages of South Africa: Afrikaans, English, isiXhosa and isiZulu. Respondents read a number of ASR prompts (10 or 20) in a real-world environment. The dataset includes 1 hour of test data.

Keywords

ASR, Test data, Speech corpora, AwezaMed

License

Creative Commons Attribution 3.0 Unported (CC BY 3.0)

URI

https://hdl.handle.net/20.500.12185/688

Verification status

Level 0