NCHLT Siswati Auxiliary Speech Corpus
Please do not copy the URL from the browser for citation. The correct URL is 'https://hdl.handle.net/20.500.12185/515'
dc.contact.email | KCalteaux@csir.co.za | |
dc.contact.name | Karen Calteaux | |
dc.contributor.author | Febe de Wet | |
dc.contributor.author | Laura Martinus | |
dc.contributor.author | Jaco Badenhorst | |
dc.contributor.other | Charl van Heerden | |
dc.contributor.other | Etienne Barnard | |
dc.contributor.other | Marelie Davel | |
dc.contributor.other | Alta de Waal | |
dc.date.accessioned | 2019-07-17T06:49:56Z | |
dc.date.available | 2019-07-17T06:49:56Z | |
dc.date.issued | 2019-06-01 | |
dc.description | The corpus contains orthographically transcribed broadband speech in each of South Africa's eleven official languages. Transcriptions are provided in XML format. | |
dc.format.extent | Aux 1: 78:48:56, Aux 2: 167:42:11 | |
dc.format.medium | N/A | |
dc.format.size | Aux 1: 6.17 GB, Aux 2: 13.1 GB | |
dc.identifier.citation | Jaco Badenhorst, Laura Martinus and Febe de Wet, "BLSTM harvesting of auxiliary NCHLT speech data", In Proceedings of SAUPEC/ROBMECH/PRASA 2019, Bloemfontein, South Africa, January 2019. | |
dc.identifier.citation | Etienne Barnard, Marelie H. Davel, Charl van Heerden, Febe de Wet and Jaco Badenhorst, "The NCHLT Speech Corpus of the South African languages", In Proc. 4th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), St Petersburg, Russia, May 2014. | |
dc.identifier.citation | Charl van Heerden, Marelie H. Davel and Etienne Barnard, "The semi-automated creation of stratified speech corpora", In Proc. Pattern Recognition Association of South Africa annual symposium (PRASA), Johannesburg, South Africa, Dec 2013, pp. 115-119. | |
dc.identifier.citation | N.J. de Vries, M.H. Davel, J. Badenhorst, W.D. Basson, F. de Wet, E. Barnard and A. de Waal, "A smartphone-based ASR data collection tool for under-resourced languages", Speech Communication, Volume 56, January 2014, pp. 119-131. | |
dc.identifier.citation | Marelie H. Davel, Charl van Heerden and Etienne Barnard, "Validating Smartphone-Collected Speech Corpora", In Proc. 3rd International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), Cape Town, South Africa, May 2012, pp. 68-75. | |
dc.identifier.citation | C. van Heerden, M.H. Davel and E. Barnard, "Medium-Vocabulary Speech Recognition for Under-Resourced Languages", In Proc. 3rd International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), Cape Town, South Africa, May 2012, pp. 146-151. | |
dc.identifier.citation | J. Badenhorst, A. de Waal and F. de Wet, "Quality measurements for mobile data collection in the developing world", In Proc. 3rd International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), Cape Town, South Africa, May 2012, pp. 139-145. | |
dc.identifier.uri | https://hdl.handle.net/20.500.12185/515 | |
dc.language.iso | ssw | |
dc.languages | Siswati | |
dc.media.category | Annotated Monolingual Speech Corpus | |
dc.media.type | Speech | |
dc.project | NCHLT Speech | |
dc.publisher | CSIR Meraka Institute | |
dc.publisher | North-West University | |
dc.rights.license | Creative Commons Attribution 3.0 Unported (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/legalcode | |
dc.subject | Siswati; Speech corpora; Transcribed | |
dc.title | NCHLT Siswati Auxiliary Speech Corpus | |
dc.version | 1 | |
local.collection.primary | Resource Catalogue | |
local.collection.secondary | Resource Index | |
Files
Original bundle
- Name: ssw-aux1.zip
- Size: 6.18 GB
- Format: ZIP is an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed.
- Name: ssw-aux2.zip
- Size: 13.14 GB
- Format: ZIP is an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed.