    • NCHLT Text Web Services 

      Roald Eiselen (SADiLaR; North-West University, 2018-03-01) ~ Resource Index
      A web service that provides access to seven core technologies in ten South African languages, including: * Tokenisers * Sentence separators * ...
    • NCHLT Tshivenda Morphological Decomposer 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Morphological decomposer developed during the NCHLT Text project.
    • NCHLT Tshivenda Annotated Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatised, part-of-speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
    • NCHLT Tshivenda Auxiliary Speech Corpus 

      Febe de Wet, et al. (CSIR Meraka Institute; North-West University, 2019-06-01) ~ Resource Catalogue
      The corpus contains orthographically transcribed broadband speech in each of South Africa's eleven official languages. Transcriptions are provided in ...
    • NCHLT Tshivenda Lemmatiser 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatiser developed during the NCHLT Text project. Available in the Readme.txt - Input format: Text data (encoding: UTF8 without BOM), one ...
    • NCHLT Tshivenda Named Entity Annotated Corpus 

      S.L. Tshikota, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
    • NCHLT Tshivenda Phrase Chunk Annotated Corpus 

      S.L. Tshikota, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
    • NCHLT Tshivenda Speech Corpus 

      Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue
      Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
    • NCHLT Tshivenda Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Collection of source text documents, genre-classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
    • NCHLT Xitsonga Morphological Decomposer 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Morphological decomposer developed during the NCHLT Text project.
    • NCHLT Xitsonga Annotated Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatised, part-of-speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
    • NCHLT Xitsonga Auxiliary Speech Corpus 

      Febe de Wet, et al. (CSIR Meraka Institute; North-West University, 2019-06-01) ~ Resource Catalogue
      The corpus contains orthographically transcribed broadband speech in each of South Africa's eleven official languages. Transcriptions are provided in ...
    • NCHLT Xitsonga Lemmatiser 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatiser developed during the NCHLT Text project. Available in the Readme.txt - Input format: Text data (encoding: UTF8 without BOM), one ...
    • NCHLT Xitsonga Named Entity Annotated Corpus 

      N.C.P. Golele, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
    • NCHLT Xitsonga Phrase Chunk Annotated Corpus 

      N.C.P. Golele, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
    • NCHLT Xitsonga Speech Corpus 

      Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue
      Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
    • NCHLT Xitsonga Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Collection of source text documents, genre-classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
    • NCHLT-inlang Pronunciation Dictionaries 

      Marelie Davel (Meraka Institute, CSIR; North-West University, 2014-07-04) ~ Resource Catalogue
      Broad phonemic transcriptions for 15,000 generic words in each of 11 languages. Each dictionary has an associated rule set for generating pronunciations ...
    • NHN Zulu corpora 

      Unknown author (University of the Witwatersrand, 2015-01-07) ~ Resource Index
      A first step to building a corpus of POS-annotated Zulu texts.
    • NoteTaker (vSep2009) 

      Unknown author (Meraka Institute, CSIR, 2013-07-01) ~ Resource Index
      Replaces a number of dedicated devices for the blind. The NoteTaker is really a communication and computing device for the blind and visually impaired ...