Creative Commons Attribution 3.0 South Africa (CC BY 3.0 ZA): http://creativecommons.org/licenses/by/3.0/za/

Jaco Badenhorst, Febe de Wet, Neil Kleynhans, Thipe Modipa, Alfred Tshoane, Georg Schlunz, Stanly Ramunyisi, Raymond Molapo, Nic de Vries

2018-02-06, 2018-03-05, 2018-02-06, 2018-03-05, 2016-05-09

https://hdl.handle.net/20.500.12185/273

The speech corpus, generated from audio samples from National Parliament aligned using Hansard transcriptions, is provided as audio files and transcriptions.

The XML files provide the following metadata for each session:
- audio filename
- audio orthography
- GOP (goodness of pronunciation) score
- start time (seconds)
- end time (seconds)

The audio files are formatted as 16-bit signed integer PCM, single channel, with a 16 kHz sample rate.

5.6 Gb
Text
16 kHz
16 bit
*.wav
eng
NCHLT Speech II Corpus
Data
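
The audio format stated in the description (16-bit PCM, single channel, 16 kHz) can be checked on a downloaded file before further processing. The following is a minimal sketch using Python's standard `wave` module; the function names `read_wav_format` and `matches_expected_format` and the `EXPECTED` table are hypothetical helpers introduced here for illustration and are not part of the corpus distribution.

```python
import wave

# Format stated in the corpus description: 16-bit PCM, mono, 16 kHz.
# (Assumption: 16 bits corresponds to a sample width of 2 bytes.)
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate_hz": 16000}


def read_wav_format(path):
    """Return the basic format parameters and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": sample_rate,
            "duration_s": wav.getnframes() / sample_rate,
        }


def matches_expected_format(path):
    """Return True if the file has the channel count, width, and rate above."""
    fmt = read_wav_format(path)
    return all(fmt[key] == value for key, value in EXPECTED.items())
```

Calling `matches_expected_format("some_file.wav")` on each file (the filename here is a placeholder) gives a quick sanity check; `read_wav_format` also reports the duration, which could be compared against the start and end times recorded in the XML metadata.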