NCHLT Speech II Corpus

Jaco Badenhorst; Febe de Wet; Neil Kleynhans; Thipe Modipa

Please do not copy the URL from the browser for citation. The correct URL is 'https://hdl.handle.net/20.500.12185/273'

Files

nchlt_speech_ii_corpus.zip (4.34 GB)

Date

2016-05-09

Authors

Jaco Badenhorst

Febe de Wet

Neil Kleynhans

Thipe Modipa

Publisher

Meraka Institute, CSIR

Description

The speech corpus, generated from audio samples from the National Parliament that were aligned using Hansard transcriptions, is provided as audio and transcriptions. The XML files provide the following metadata for each session:
- audio filename
- audio orthography
- GOP (goodness of pronunciation) score
- start time (seconds)
- end time (seconds)

The audio files are formatted as 16-bit signed integer PCM, single channel, with a 16 kHz sample rate.
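The stated audio format and the start/end times in seconds can be checked and used with a short script. The following is a minimal sketch in Python, not part of the corpus distribution. It assumes the audio is stored in WAV containers, which the description does not state, and the function names check_wav_format and read_segment are hypothetical helpers introduced here for illustration.

```python
import wave

# Format stated in the corpus description: 16-bit PCM, mono, 16 kHz.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate_hz": 16000}


def check_wav_format(source):
    """Return {field: (actual, expected)} for every field that differs.

    `source` is a file path or a binary file-like object. An empty dict
    means the file matches the stated format. Assumes a WAV container.
    """
    with wave.open(source, "rb") as wav:
        actual = {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
        }
    return {
        key: (actual[key], expected)
        for key, expected in EXPECTED.items()
        if actual[key] != expected
    }


def read_segment(source, start_s, end_s):
    """Return the raw PCM bytes between start_s and end_s (in seconds)."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        total = wav.getnframes()
        # Convert seconds to frame indices and clamp to the file length.
        start = min(max(int(round(start_s * rate)), 0), total)
        end = min(max(int(round(end_s * rate)), start), total)
        wav.setpos(start)
        return wav.readframes(end - start)
```

At 16 kHz with 2 bytes per frame, a segment from 0.25 s to 0.75 s spans 8000 frames, or 16000 bytes. The start and end values would come from the XML metadata; parsing is left out here because the element and attribute names are not given in this description.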

License

Creative Commons Attribution 3.0 South Africa (CC BY 3.0 ZA)

URI

https://hdl.handle.net/20.500.12185/273

Collections

Resource Catalogue
Resource Index

Verification status

Level 0
