Title: NCHLT-inlang Pronunciation Dictionaries
Type: Data
Language: afr
Identifier: 744-144-734-416-8
Handle: https://hdl.handle.net/20.500.12185/365
License: Creative Commons Attribution 3.0 Unported License (CC BY 3.0): http://creativecommons.org/licenses/by/3.0/legalcode
Contributors: Marelie Davel; Charl van Heerden; Willem Basson; Simon Kemisho; Thipe Modipa; Mpho Kgampe; Etienne Barnard; Martin Puttkammer; various language practitioners from C-Trans (NWU); Translation World
Dates: 2018-02-05; 2018-03-05; 2018-02-05; 2018-03-05; 2014-07-04
Citation: E. Barnard, M. H. Davel, C. van Heerden, F. de Wet and J. Badenhorst, "The NCHLT corpus of the South African languages", in Proc. SLTU, May 2014.
Description: Broad phonemic transcriptions for 15,000 generic words in each of 11 languages. Each dictionary has an associated rule set for generating pronunciations for unseen words.
Extent: 15,000 words per language
Size: 1.1 Mb
Format:
  Text: UTF-8, tab-delimited text
  Pronunciations: X-SAMPA
  Audio: 44,100 bps, 16-bit mono wav encoding
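The record states only that the dictionary text is UTF-8 and tab-delimited, with pronunciations in X-SAMPA. As a minimal sketch of how such a file might be read, the Python below assumes (this is not stated in the record) that each line holds a word, a tab, and then its pronunciation. The function name `load_pron_dict` and the sample entries are hypothetical and are not taken from the dictionaries themselves.

```python
from collections import defaultdict


def load_pron_dict(lines):
    """Parse 'word<TAB>pronunciation' lines into {word: [pronunciations]}.

    Assumed layout (not confirmed by the record): one entry per line,
    word in the first column, X-SAMPA pronunciation in the second.
    """
    entries = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        word, sep, pron = line.partition("\t")
        if not sep:
            continue  # skip lines that contain no tab
        entries[word].append(pron.strip())
    return dict(entries)


# Made-up sample lines for illustration only.
sample = ["kat\tk a t\n", "\n", "kat\tk A t\n"]
lexicon = load_pron_dict(sample)
```

For a real file, the same function could be fed an open handle, for example `open(path, encoding="utf-8")`, where `path` is whatever file name the download actually uses. Keeping a list per word allows for words with more than one pronunciation, should the dictionaries contain any.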