    High quality TTS data for four South African languages (af, st, tn, xh)

    Download
    Audio files and transcriptions for Afrikaans (906.7Mb)
    MD5: 51bb040ada5a69f298ba0c6073d294a7

    License agreement

    By downloading this resource I accept and agree to the terms of use and the associated license conditions under which the resource is distributed.

    Audio files and transcriptions for Sesotho (690.8Mb)
    MD5: 707db230fda449ba31b445e2b32f3f73

    License agreement

    By downloading this resource I accept and agree to the terms of use and the associated license conditions under which the resource is distributed.

    Audio files and transcriptions for Setswana (695.6Mb)
    MD5: abd84abaca6e40666bcbb1c65c2a0429

    License agreement

    By downloading this resource I accept and agree to the terms of use and the associated license conditions under which the resource is distributed.

    Audio files and transcriptions for isiXhosa (865.4Mb)
    MD5: f5cdbe3d0fbf6b8440ab3e6017200b18

    License agreement

    By downloading this resource I accept and agree to the terms of use and the associated license conditions under which the resource is distributed.
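Each download above is listed with an MD5 checksum, so a downloaded archive can be checked for corruption before use. The following is a minimal sketch of such a check, not part of the resource itself. The function names and the local file name in the usage note are hypothetical; only the checksum values come from this page.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, published_md5):
    """Compare a file's MD5 digest with a published value,
    ignoring surrounding whitespace and letter case."""
    return md5_of_file(path) == published_md5.strip().lower()
```

For example, assuming the Afrikaans archive was saved locally under the placeholder name `afrikaans_download`, calling `matches_published_md5("afrikaans_download", "51bb040ada5a69f298ba0c6073d294a7")` should return `True` for an intact download. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.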

    URI
    https://hdl.handle.net/20.500.12185/527
    Collections
    • Resource Catalogue [251]
    • Resource Index [409]
    Author(s)
    Google
    North-West University
    Description
    This data set contains high-quality, multi-speaker transcribed audio data for text-to-speech (TTS) in four languages of South Africa: Afrikaans, Sesotho, Setswana and isiXhosa. The data set consists of wave files and a TSV file transcribing the audio. In each folder, the file line_index.tsv lists, for each audio file, a FileID (which in turn contains the UserID) and the transcription of the audio in that file. The data set has undergone some quality checks, but it may still contain errors. This data set was collected as a collaboration between North-West University and Google. See the LICENSE.txt file for license information. Copyright 2017 Google, Inc.
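The layout described above can be read with a few lines of Python. This is a sketch under stated assumptions rather than a documented loader: it assumes that line_index.tsv has no header row, that the first tab-separated field is the FileID and the last field is the transcription, and that the FileID is underscore-separated with the UserID as its second part. The function names and the sample FileIDs in the usage note are hypothetical, so check them against the actual files.

```python
from collections import defaultdict


def read_line_index(lines):
    """Parse tab-separated rows into (file_id, transcription) pairs.

    Assumption: first field is the FileID, last field is the
    transcription, and there is no header row.
    """
    rows = []
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        rows.append((fields[0].strip(), fields[-1].strip()))
    return rows


def user_id_from_file_id(file_id):
    """Extract the UserID from a FileID.

    Assumption: FileIDs look like <language>_<user>_<utterance>;
    this pattern is hypothetical and not stated on this page.
    """
    parts = file_id.split("_")
    if len(parts) < 3:
        raise ValueError(f"unexpected FileID format: {file_id!r}")
    return parts[1]


def group_by_speaker(rows):
    """Group (file_id, transcription) pairs by the UserID in the FileID."""
    by_speaker = defaultdict(list)
    for file_id, transcription in rows:
        by_speaker[user_id_from_file_id(file_id)].append((file_id, transcription))
    return dict(by_speaker)
```

To use it on a real folder, open line_index.tsv with `encoding="utf-8"` and pass the file object to `read_line_index`. Grouping by speaker is useful for a multi-speaker set like this one, for instance to keep each speaker's recordings in a single train or test split.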
    Contact person
    Daniel Povey
    Contact person's e-mail address
    dpovey@gmail.com
    Publisher(s)
    Google
    North-West University
    License
    Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

    Copyright © 2018  SADiLaR. All Rights Reserved.