South African Directory Enquiries (SADE) Name Corpus
"Audio and tagged orthographic transcriptions of South African names produced by first-language speakers of 4 languages: Afrikaans, English, isiZulu, Sesotho. Utterances are tagged with speaker language, word language, speaker identity, speaker gender, broad phonemic pronunciation and pronunciation modality ('intended language')."
Marelie H. Davel
North-West University; Molo Afrika Speech Technologies; IntSyst Labs CC
Creative Commons Attribution 3.0 Unported License (CC BY 3.0)
Afrikaans; English; isiZulu; Sesotho
Charl van Heerden; Marelie Davel; Oluwapelumi Giwa; J.W.F. Thirion
Anina Lambrechts; Bulelwa Matjene; Etienne Barnard; Marelie H. Davel; Nadia Barnard; Sarina le Roux; and various language practitioners from 'The Translation World'.
Thirion, J.W., van Heerden, C., Giwa, O. and Davel, M.H. 2019. The South African directory enquiries (SADE) name corpus. Language Resources and Evaluation, pp.1-30.
Multilingual speech corpora: annotated
494 MB (zipped)
13h56m09s (40 speakers, each producing 400 utterances, 16,000 utterances in total)
Text; Microsoft WAV files
South African Directory Enquiry System
Telephone recordings
Resource Catalogue
Resource Index
afr; eng; zul; sot
2018-02-05T20:21:10Z; 2018-03-05T17:48:33Z

This item appears in the following Collection(s)

  • Resource Catalogue [235]
    A collection of language resources available for download from the RMA of SADiLaR. The collection mostly consists of resources developed with funding from the Department of Arts and Culture.
  • Resource Index [375]
    A collection of language resource metadata, mostly collected during the NHN-funded technology audit of 2009 and the SADiLaR technology audit of 2018. Not all resources in this collection are available for download.