Now showing items 182-201 of 353

    • NCHLT Afrikaans Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
    • NCHLT Siswati Morphological Decomposer 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Morphological decomposer developed during the NCHLT Text project.
    • NCHLT Afrikaans Annotated Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatised, part-of-speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
    • NCHLT Afrikaans Lemmatiser 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatiser developed during the NCHLT Text project.
    • NCHLT Afrikaans Morphological Decomposer 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Morphological decomposer developed during the NCHLT Text project.
    • NCHLT Afrikaans Named Entity Annotated Corpus 

      Gerhard van Huyssteen, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
    • NCHLT Afrikaans Phrase Chunk Annotated Corpus 

      Gerhard van Huyssteen, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
    • NCHLT Afrikaans Speech Corpus 

      Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue
      Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
    • NCHLT English Speech Corpus 

      Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue
      Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
    • NCHLT English Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2016-09-09) ~ Resource Catalogue
      Collection consisting of a clean corpus, lexicon, frequency list and named-entity lists developed during the NCHLT Text project.
    • NCHLT isiNdebele Annotated Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatised, part-of-speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
    • NCHLT isiNdebele Lemmatiser 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatiser developed during the NCHLT Text project.
    • NCHLT isiNdebele Morphological Decomposer 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Morphological decomposer developed during the NCHLT Text project.
    • NCHLT isiNdebele Named Entity Annotated Corpus 

      K.S. Mahlangu, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
    • NCHLT isiNdebele Phrase Chunk Annotated Corpus 

      K.S. Mahlangu, et al. (North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ Resource Catalogue
      Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
    • NCHLT isiNdebele Speech Corpus 

      Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue
      Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
    • NCHLT isiNdebele Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
    • NCHLT isiXhosa Annotated Text Corpora 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatised, part-of-speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
    • NCHLT isiXhosa Lemmatiser 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Lemmatiser developed during the NCHLT Text project.
    • NCHLT isiXhosa Morphological Decomposer 

      Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue
      Morphological decomposer developed during the NCHLT Text project.