NCHLT Speech II Corpus

dc.contact.email: KCalteaux@csir.co.za
dc.contact.name: Karen Calteaux
dc.contributor.author: Jaco Badenhorst
dc.contributor.author: Febe de Wet
dc.contributor.author: Neil Kleynhans
dc.contributor.author: Thipe Modipa
dc.contributor.other: Alfred Tshoane
dc.contributor.other: Georg Schlunz
dc.contributor.other: Stanly Ramunyisi
dc.contributor.other: Raymond Molapo
dc.contributor.other: Nic de Vries
dc.database: Monolingual Speech Corpora: Annotated
dc.date.accessioned: 2018-02-06T09:46:40Z
dc.date.accessioned: 2018-03-05T15:23:12Z
dc.date.available: 2018-02-06T09:46:40Z
dc.date.available: 2018-03-05T15:23:12Z
dc.date.issued: 2016-05-09
dc.description: The speech corpus, generated from audio samples from the National Parliament aligned with Hansard transcriptions, is provided as audio and transcriptions. The XML files provide the following metadata for each session:
- audio filename
- audio orthography
- GOP (goodness of pronunciation) score
- start time (seconds)
- end time (seconds)
The audio files are formatted as 16-bit signed integer PCM, single channel, with a 16 kHz sample rate.
dc.format.extent: 5.6 Gb
dc.format.medium: Text
dc.format.medium: 16 kHz
dc.format.medium: 16 bit
dc.format.medium: *.wav
dc.identifier.uri: https://hdl.handle.net/20.500.12185/273
dc.language.iso: eng
dc.languages: English
dc.media.category: Monolingual speech corpora: Annotated
dc.media.type: Speech
dc.project: NCHLT Speech II
dc.publisher: Meraka Institute, CSIR
dc.rights.license: Creative Commons Attribution 3.0 South Africa (CC BY 3.0 ZA): http://creativecommons.org/licenses/by/3.0/za/
dc.source: Audio recordings smartphone-collected in non-studio environment
dc.source: Text prompts from various sources, predominantly from .gov.za (web)
dc.title: NCHLT Speech II Corpus
dc.type: Data
dc.version: 1
local.collection.primary: Resource Catalogue
local.collection.secondary: Resource Index

Files

Original bundle

Name: nchlt_speech_ii_corpus.zip
Size: 4.34 GB
Format: ZIP (an archive file format that supports lossless data compression; a ZIP file may contain one or more files or directories that may have been compressed)
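
The record describes the audio as 16-bit PCM, single channel, 16 kHz. After extracting the archive, that claim can be spot-checked per file. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_wav_format` and the constant names are illustrative assumptions, not part of the corpus or its tooling.

```python
import wave

# Expected values, taken from the description field of this record.
EXPECTED_CHANNELS = 1          # single channel
EXPECTED_SAMPLE_WIDTH = 2      # 16-bit samples = 2 bytes per sample
EXPECTED_SAMPLE_RATE = 16000   # 16 kHz


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the stated format.

    An empty list means the header agrees with the record's description.
    (Hypothetical helper, written for illustration only.)
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channels: {wav.getnchannels()}")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width (bytes): {wav.getsampwidth()}")
        if wav.getframerate() != EXPECTED_SAMPLE_RATE:
            problems.append(f"sample rate (Hz): {wav.getframerate()}")
    return problems
```

Calling `check_wav_format` on the path of an extracted `.wav` file returns an empty list when the header matches. The check reads header fields only; it does not inspect the samples or the XML metadata, whose element names are not given in this record.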