    • NCHLT Tshivenda Lemmatiser 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatiser developed during the NCHLT Text project.
    • NCHLT Tshivenda Named Entity Annotated Corpus 

      S.L. Tshikota, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
    • NCHLT Tshivenda Phrase Chunk Annotated Corpus 

      S.L. Tshikota, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
    • NCHLT Tshivenda Speech Corpus 

      Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue
      Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
    • NCHLT Tshivenda Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
    • NCHLT Xitsonga Morphological Decomposer 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Morphological decomposer developed during the NCHLT Text project.
    • NCHLT Xitsonga Annotated Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatised, part of speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
    • NCHLT Xitsonga Auxiliary Speech Corpus 

      Febe de Wet, et al. (CSIR Meraka Institute; North-West University, 2019-06-01) ~ Resource Catalogue
      The corpus contains orthographically transcribed broadband speech in each of South Africa's eleven official languages. Transcriptions are provided in ...
    • NCHLT Xitsonga Lemmatiser 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatiser developed during the NCHLT Text project.
    • NCHLT Xitsonga Named Entity Annotated Corpus 

      N.C.P. Golele, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
    • NCHLT Xitsonga Phrase Chunk Annotated Corpus 

      N.C.P. Golele, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
    • NCHLT Xitsonga Speech Corpus 

      Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue
      Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
    • NCHLT Xitsonga Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
    • NCHLT-inlang Pronunciation Dictionaries 

      Marelie Davel (Meraka Institute, CSIR; North-West University, 2014-07-04) ~ Resource Catalogue
      Broad phonemic transcriptions for 15,000 generic words in each of 11 languages. Each dictionary has an associated rule set for generating pronunciations ...
    • PHONAAS 

      Wikus Pienaar, et al. (North-West University; Centre for Text Technology (CTexT), 2015-06-30) ~ Resource Catalogue
      PHONAAS is a graphical user interface (GUI) tool, written in Perl and GTK2, using the R programming language and PRAAT to extract vowel formant data.
    • Read Afrikaans Normal/ Read Afrikaans Fast 

      Wissing, Daan (Centre for Text Technology, North-West University, 2019-05-28) ~ Resource Catalogue
      The corpus contains speech of 127 mother tongue speakers of Afrikaans. Every speaker was asked to read a text fragment from a book or a newspaper (about ...
    • SADE Municipality Hotline IVR Prompts 

      Charl van Heerden, et al. (North-West University; Molo Afrika Speech Technologies; IntSyst Labs CC, 2015-09-07) ~ Resource Catalogue
      Audio and corresponding transcriptions for the SADE Municipality Hotline IVR prompts in English, Sesotho and isiZulu. The English SADE municipality ...
    • SADE v.1.0 Platform 

      Charl van Heerden, et al. (North-West University; Molo Afrika Speech Technologies; IntSyst Labs CC, 2015-09-07) ~ Resource Catalogue
      End-to-end directory enquiries application (using Asterisk, UniMRCP and Kaldi). The municipality hotline example is implemented as an Asterisk Gateway ...
    • Sepedi Custom Dictionary for Government Domain 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2013-02-22) ~ Resource Catalogue
      Word list developed as a custom dictionary for use in the spelling checkers as part of the spelling checker project for the Department of Arts and ...
    • Sesotho Custom Dictionary for Government Domain 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2013-02-22) ~ Resource Catalogue
      Word list developed as a custom dictionary for use in the spelling checkers as part of the spelling checker project for the Department of Arts and ...