Welcome to the Language Resource Management Agency of SADiLaR. This repository provides access to all of the collections, data sets, tools and other language resources that are distributed by SADiLaR.

The repository will eventually replace all of the functionality of the original RMA site, and all of the resources available from the RMA will also be available from this repository.

Select a community to browse its collections.

Language Resource Management Agency [327]
  • CTexTools 2 

    Eiselen, Roald, et al. (North-West University, Centre for Text Technology (CTexT); South African Department of Arts and Culture, 2018-06) ~ Resource Catalogue
    CTexTools is a corpus query and manipulation tool primarily for the official South African languages. The tool supports the creation of frequency and ...
  • Afrikaans speaking children's first lexical items 

    Brink, Nina (North-West University, 2018)
    Data collected for a master's study in Afrikaans linguistics. The data consist of the first lexical items of 21 Afrikaans-speaking children. The lexical ...
  • Setswana Test suite and Treebank 

    Berg, Ansu (North-West University, 2018) ~ Resource Catalogue
    The main aim of the PhD study "A computational syntactic analysis of Setswana" (AS Berg, May 2018) is the computational syntactic analysis of the Setswana ...
  • Lwazi III isiZulu TTS Corpus 

    Aby Louw, et al. (Meraka Institute, CSIR, 2016-06-17) ~ Resource Catalogue
    Complete audio recordings with orthographic transcriptions. TTS corpus for standard SA dialect. This corpus was created to enable the building of a TTS voice.
  • Lwazi III isiXhosa TTS Corpus 

    Aby Louw, et al. (Meraka Institute, CSIR, 2016-06-17) ~ Resource Catalogue
    Complete audio recordings with orthographic transcriptions. TTS corpus for standard SA dialect. This corpus was created to enable the building of a TTS voice.
  • Lwazi III English TTS Corpus 

    Aby Louw, et al. (Meraka Institute, CSIR, 2016-06-17) ~ Resource Catalogue
    Complete audio recordings with orthographic transcriptions. TTS corpus for standard SA dialect. This corpus was created to enable the building of a TTS voice.
  • Lwazi III Afrikaans TTS Corpus 

    Aby Louw, et al. (Meraka Institute, CSIR, 2016-06-17) ~ Resource Catalogue
    Complete audio recordings with orthographic transcriptions. TTS corpus for standard SA dialect. This corpus was created to enable the building of a TTS voice.
  • NCHLT Speech II Corpus 

    Jaco Badenhorst, et al. (Meraka Institute, CSIR, 2016-05-09) ~ Resource Catalogue
    The speech corpus, generated from aligned audio samples from the National Parliament using Hansard transcriptions, is provided in terms of audio and ...
  • NCHLT isiNdebele Speech Corpus 

    Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue
    Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
  • NCHLT Siswati Speech Corpus 

    Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue
    Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
