
NCHLT Afrikaans Speech Corpus

dc.contact.email: KCalteaux@csir.co.za
dc.contact.name: Karen Calteaux
dc.contributor.author: Charl van Heerden
dc.contributor.author: Etienne Barnard
dc.contributor.author: Jaco Badenhorst
dc.contributor.author: Marelie Davel
dc.contributor.author: Alta de Waal
dc.contributor.other: Willem Basson
dc.contributor.other: Nic de Vries
dc.contributor.other: Febe de Wet
dc.contributor.other: Thipe Modipa
dc.contributor.other: Gerhard van Huyssteen
dc.database: Monolingual Speech Corpora: Annotated
dc.date.accessioned: 2018-02-06T08:51:42Z
dc.date.accessioned: 2018-03-05T17:34:09Z
dc.date.available: 2018-02-06T08:51:42Z
dc.date.available: 2018-03-05T17:34:09Z
dc.date.issued: 2014-07-08
dc.description: Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
dc.format.extent: 4.6 GB
dc.format.medium: UTF8
dc.format.medium: 16 kHz
dc.format.medium: 16 bit
dc.format.size: 2.348611111
dc.identifier.citation: N.J. de Vries, M.H. Davel, J. Badenhorst, W.D. Basson, F. de Wet, E. Barnard and A. de Waal, "A smartphone-based ASR data collection tool for under-resourced languages", Speech Communication, Volume 56, January 2014, pp. 119–131.
dc.identifier.islrn: 644-175-105-852-3
dc.identifier.uri: https://hdl.handle.net/20.500.12185/280
dc.language.iso: afr
dc.languages: Afrikaans
dc.media.category: Monolingual speech corpora: Annotated
dc.media.type: Speech
dc.project: NCHLT Speech
dc.publisher: Meraka Institute, CSIR
dc.publisher: North-West University
dc.rights.license: Creative Commons Attribution 3.0 Unported License (CC BY 3.0): http://creativecommons.org/licenses/by/3.0/legalcode
dc.source: Audio recordings collected by smartphone in a non-studio environment
dc.source: Text prompts from various sources, predominantly from .gov.za (web)
dc.stratum: 210 speakers (98 female/112 male). Prompted speech (3-5 word utterances read from a smartphone screen)
dc.title: NCHLT Afrikaans Speech Corpus
dc.type: Data
dc.version: 1
local.collection.primary: Resource Catalogue
local.collection.secondary: Resource Index

Files

Original bundle

Name: nchlt.speech.corpus.afr.zip
Size: 4.51 GB
Format: ZIP is an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed.