  •   SADiLaR
  • Language Resource Management Agency
  • Resource Catalogue
  • Search

    Now showing items 1-10 of 63


    Lwazi III English TTS Corpus 

    Aby Louw; Georg Schlünz (Meraka Institute, CSIR, 2016-06-17) ~ Resource Catalogue
    Complete audio recordings with orthographic transcriptions. TTS corpus for standard SA dialect. This corpus was created to enable the building of a TTS voice.

    NCHLT English Speech Corpus 

    Charl van Heerden; Etienne Barnard; Jaco Badenhorst; Marelie Davel; Alta de Waal (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue
    Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.

    NCHLT Speech II Corpus 

    Jaco Badenhorst; Febe de Wet; Neil Kleynhans; Thipe Modipa (Meraka Institute, CSIR, 2016-05-09) ~ Resource Catalogue
    The speech corpus, generated from aligned audio samples from National Parliament using Hansard transcriptions, is provided in terms of audio and ...

    African Speech Technology English-English Speech Corpus 

    CatchWord Language and Speech Technologies (Pty) Ltd (North-West University; Stellenbosch University; University of Transkei; University of Free State (Qwa-Qwa campus); Rhodes University; University of KwaZulu-Natal; University of Western Cape, 2014-12-11) ~ Resource Catalogue
    African Speech Technology speech and transcription data for the English-English database. The "speech" directory contains English speech as spoken by ...

    Lwazi Telephony Platform 

    Mixo Shiburi; Louis Joubert; Richard Carlson; Tshepo Moganedi (Meraka Institute, CSIR, 2013-07-15) ~ Resource Catalogue
    Lwazi is a robust telephony platform aiming to facilitate speedy development of experimental applications without sacrificing power by combining Asterisk ...

    African Speech Technology Indian-English Speech Corpus 

    CatchWord Language and Speech Technologies (Pty) Ltd (North-West University; Stellenbosch University; University of Transkei; University of Free State (Qwa-Qwa campus); Rhodes University; University of KwaZulu-Natal; University of Western Cape, 2014-12-11) ~ Resource Catalogue
    African Speech Technology speech and transcription data for the Indian-English database. The "speech" directory contains English speech as spoken by ...

    NCHLT English Text Corpora 

    Martin Puttkammer; Martin Schlemmer; Wikus Pienaar; Ruan Bekker (North-West University; Centre for Text Technology (CTexT), 2016-09-09) ~ Resource Catalogue
    Collection consisting of a clean corpus, lexicon, frequency list and named-entity lists developed during the NCHLT Text project.

    NCHLT Optical Character Recognition for South African Languages 

    Martin Puttkammer; Justin Hocking; Roald Eiselen (North-West University; Centre for Text Technology (CTexT), 2017-02-23) ~ Resource Catalogue
    An OCR system is an application that enables one to convert scanned paper documents into editable and searchable texts. The engine analyses the structure ...

    NCHLT South African Language Identifier 

    Martin Puttkammer; Justin Hocking; Roald Eiselen (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
    A graphical user interface and command line tool to automatically classify a document, paragraph, sentence or phrase as one of the eleven official South ...

    NCHLT-inlang Pronunciation Dictionaries 

    Marelie Davel (Meraka Institute, CSIR; North-West University, 2014-07-04) ~ Resource Catalogue
    Broad phonemic transcriptions for 15,000 generic words in each of 11 languages. Each dictionary has an associated rule set for generating pronunciations ...

    Copyright © 2018  SADiLaR. All Rights Reserved.
    Contact Us | Send Feedback
     

     

