South African Broadcast News (SABN) Corpus
The corpus consists of approximately 20 hours of audio recordings from SAFM, one of South Africa's main radio news channels. The bulletins were broadcast between 1996 and 2006 and are a mix of news-reader speech, interviews, and crossings to reporters.
Febe de Wet
Stellenbosch University; CSIR
broadcast news transcription; South African English; accents of English; under-resourced languages
Speech corpora
20 hours
The data comprises a collection of audio files, each corresponding to one news bulletin. Transcriptions of the audio are included in the data set in TextGrid format. All 27 speakers are adults (8 male, 19 female).
Monolingual : Annotated : Unaligned
Resource Index

Files in this item


There are no files associated with this item.

This item appears in the following Collection(s)

  • Resource Index [412]
    A collection of language resource metadata, mostly collected during the NHN-funded technology audit of 2009 and the SADiLaR technology audit of 2018. Not all resources in this collection are available for download.
