Filter by:

Now showing items 288-307 of 411

Filter options

    • NCHLT Tshivenda Auxiliary Speech Corpus 

      Febe de Wet, et al. (CSIR Meraka Institute; North-West University, 2019-06-01) ~ Resource Catalogue
      The corpus contains orthographically transcribed broadband speech in each of South Africa's eleven official languages. Transcriptions are provided in ...
    • NCHLT Tshivenda Lemmatiser 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatiser developed during the NCHLT Text project. \n\n Available in the Readme.txt - Input format: Text data (encoding: UTF8 without BOM), one ...
    • NCHLT Tshivenda Named Entity Annotated Corpus 

      S.L. Tshikota, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
    • NCHLT Tshivenda Phrase Chunk Annotated Corpus 

      S.L. Tshikota, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
    • NCHLT Tshivenda Speech Corpus 

      Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue
      Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
    • NCHLT Tshivenda Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
    • NCHLT Xitsonga Morphological Decomposer 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Morphological decomposer developed during the NCHLT Text project.
    • NCHLT Xitsonga Annotated Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatised, part of speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
    • NCHLT Xitsonga Auxiliary Speech Corpus 

      Febe de Wet, et al. (CSIR Meraka Institute; North-West University, 2019-06-01) ~ Resource Catalogue
      The corpus contains orthographically transcribed broadband speech in each of South Africa's eleven official languages. Transcriptions are provided in ...
    • NCHLT Xitsonga Lemmatiser 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatiser developed during the NCHLT Text project. \n\n Available in the Readme.txt - Input format: Text data (encoding: UTF8 without BOM), one ...
    • NCHLT Xitsonga Named Entity Annotated Corpus 

      N.C.P. Golele, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
    • NCHLT Xitsonga Phrase Chunk Annotated Corpus 

      N.C.P. Golele, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
    • NCHLT Xitsonga Speech Corpus 

      Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue
      Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
    • NCHLT Xitsonga Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
    • NCHLT-inlang Pronunciation Dictionaries 

      Marelie Davel (Meraka Institute, CSIR; North-West University, 2014-07-04) ~ Resource Catalogue
      Broad phonemic transcriptions for 15,000 generic words in each of 11 languages. Each dictionary has an associated rule set for generating pronunciations ...
    • NHN Zulu corpora 

      Unknown author (University of the Witwatersrand, 2015-01-07) ~ Resource Index
      A first step to building a corpus of POS-annotated Zulu texts.
    • NoteTaker (vSep2009) 

      Unknown author (Meraka Institute, CSIR, 2013-07-01) ~ Resource Index
      Replaces a number of dedicated devices for the blind. The Notetaker is really a communication and computing device for the blind and visually impaired ...
    • NWU TransTips 1.0 

      Unknown author (Centre for Text Technology (CTexT), 2013-07-01) ~ Resource Index
      TransTips is a PHP programming script that browses a web page for terms in the database. The translation of words contained in the database is linked ...
    • N|uu language archive 

      Collins, Christopher T, et al. (South African Centre for Digital Language Resources, 2022-11-15)
      This data collection contains recordings and transcriptions of the N|uu language. This includes N|uu recordings, South African Nama and a local variety ...
    • N|uu language archive 

      Collins, Chris, et al. (Collins, Chris; Sands, Bonny; Jones, Kerry, 2022-08-11)
      This collection contains information that forms the basis of the N|uu dictionary which contains a word list for N|uu with translations into Afrikaans, ...