NCHLT isiNdebele Speech Corpus

Charl van Heerden; Etienne Barnard; Jaco Badenhorst; Marelie Davel; Alta de Waal


To cite this resource, use the persistent URL 'https://hdl.handle.net/20.500.12185/272' rather than the address shown in the browser.

dc.contact.email	KCalteaux@csir.co.za
dc.contact.name	Karen Calteaux
dc.contributor.author	Charl van Heerden
dc.contributor.author	Etienne Barnard
dc.contributor.author	Jaco Badenhorst
dc.contributor.author	Marelie Davel
dc.contributor.author	Alta de Waal
dc.contributor.other	Willem Basson
dc.contributor.other	Nic de Vries
dc.contributor.other	Febe de Wet
dc.contributor.other	Thipe Modipa
dc.contributor.other	Gerhard van Huyssteen
dc.database	Monolingual Speech Corpora: Annotated
dc.date.accessioned	2018-02-06T09:44:28Z
dc.date.accessioned	2018-03-05T15:20:08Z
dc.date.available	2018-02-06T09:44:28Z
dc.date.available	2018-03-05T15:20:08Z
dc.date.issued	2014-07-08
dc.description	Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
dc.format.extent	4.5 GB
dc.format.medium	UTF8
dc.format.medium	16 kHz
dc.format.medium	16 bit
dc.format.size	2.343055556
dc.identifier.citation	N.J. de Vries, M.H. Davel, J. Badenhorst, W.D. Basson, F. de Wet, E. Barnard and A. de Waal, "A smartphone-based ASR data collection tool for under-resourced languages", Speech Communication, Volume 56, January 2014, pp. 119–131.
dc.identifier.islrn	818-971-863-561-0
dc.identifier.uri	https://hdl.handle.net/20.500.12185/272
dc.language.iso	nbl
dc.languages	isiNdebele
dc.media.category	Monolingual speech corpora: Annotated
dc.media.type	Speech
dc.project	NCHLT Speech
dc.publisher	Meraka Institute, CSIR
dc.publisher	North-West University
dc.rights.license	Creative Commons Attribution 3.0 Unported License (CC BY 3.0): http://creativecommons.org/licenses/by/3.0/legalcode
dc.source	Audio recordings collected on smartphones in non-studio environments
dc.source	Text prompts from various sources, predominantly from .gov.za (web)
dc.stratum	148 speakers (78 female/70 male). Prompted speech (3-5 word utterances read from a smartphone screen).
dc.title	NCHLT isiNdebele Speech Corpus
dc.type	Data
dc.version	1
local.collection.primary	Resource Catalogue
local.collection.secondary	Resource Index
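
Several fields in this record repeat (for example dc.contributor.author and dc.format.medium), so a reader loading it programmatically may want to keep every value per field rather than only the last one. The following is a minimal sketch, assuming the record has been saved as plain text with one tab-separated field/value pair per line. The function name parse_record and the sample string are hypothetical and are not part of the corpus or the repository.

```python
from collections import defaultdict


def parse_record(text):
    """Group tab-separated 'field<TAB>value' lines by field name.

    Repeated fields keep all of their values, in order of appearance.
    Lines without a tab (blank lines, headings) are skipped.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        if "\t" not in line:
            continue
        key, value = line.split("\t", 1)
        fields[key.strip()].append(value.strip())
    return dict(fields)


# Hypothetical sample copied from a few lines of the record above.
sample = (
    "dc.format.medium\tUTF8\n"
    "dc.format.medium\t16 kHz\n"
    "dc.format.medium\t16 bit\n"
    "dc.language.iso\tnbl\n"
)

record = parse_record(sample)
print(record["dc.format.medium"])
print(record["dc.language.iso"])
```

Every field maps to a list, including fields that occur once, so callers can treat single and repeated fields uniformly.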

Files

Original bundle


Name: nchlt.speech.corpus.nbl.zip
Size: 4.35 GB
Format: ZIP (archive format with lossless compression)


Collections

Resource Catalogue
Resource Index