    NCHLT Speech II Corpus

    Download
    nchlt_speech_ii_corpus.zip (4.336Gb)
    MD5: 78b9cd0bd557fc0d101b00d0e1053c86
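Because the archive is large, it can be worth checking the download against the MD5 value listed above before unpacking it. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `verify_download` are illustrative, not part of any SADiLaR tooling, and the local file path is an assumption.

```python
import hashlib

# Checksum as published in the catalogue record above.
EXPECTED_MD5 = "78b9cd0bd557fc0d101b00d0e1053c86"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Stream the file so a multi-gigabyte archive is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


# Example call, assuming the archive was saved in the working directory:
# verify_download("nchlt_speech_ii_corpus.zip")
```

A mismatch usually indicates an incomplete or corrupted download, in which case the file should be fetched again.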

    License agreement

    By downloading this resource I accept and agree to the terms of use and the associated license conditions under which the resource is distributed.

    URI
    https://hdl.handle.net/20.500.12185/273
    Collections
    • Resource Catalogue [251]
    • Resource Index [409]
    Author(s)
    Jaco Badenhorst
    Febe de Wet
    Neil Kleynhans
    Thipe Modipa
    Description
    The speech corpus was generated by aligning audio recordings from the National Parliament with Hansard transcriptions, and is provided as audio files with accompanying transcriptions. The XML files provide the following metadata for each session:
    - audio filename
    - audio orthography
    - GOP (goodness of pronunciation) score
    - start time (seconds)
    - end time (seconds)
    The audio files are formatted as 16-bit signed integer PCM, single channel, with a 16 kHz sample rate.
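The stated audio format can be checked programmatically before the files are fed into a speech pipeline. The sketch below assumes the audio is stored in WAV containers, which the record does not state explicitly. The function name `matches_corpus_format` is hypothetical. It uses Python's standard `wave` module, which reads uncompressed PCM only; 16-bit PCM in WAV files is signed by convention.

```python
import wave


def matches_corpus_format(path):
    """Return True if a WAV file is mono, 16-bit PCM, sampled at 16 kHz."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getnchannels() == 1          # single channel
            and wav.getsampwidth() == 2      # 2 bytes per sample = 16 bits
            and wav.getframerate() == 16000  # 16 kHz sample rate
        )
```

Files that fail this check would need resampling or channel mixing before use alongside the rest of the corpus.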
    Contact person
    Karen Calteaux
    Contact person's e-mail address
    KCalteaux@csir.co.za
    Publisher(s)
    Meraka Institute, CSIR
    License
    Creative Commons Attribution 3.0 South Africa (CC BY 3.0 ZA)

    Copyright © 2018  SADiLaR. All Rights Reserved.
    Contact Us | Send Feedback
     

     

