Browsing Resource Index by Title
NCHLT isiXhosa Morphological Decomposer
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) Morphological decomposer developed during the NCHLT Text project.
NCHLT isiXhosa Named Entity Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
NCHLT isiXhosa Phrase Chunk Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
NCHLT isiXhosa Speech Corpus
(Meraka Institute, CSIR; North-West University, 2014-07-08) Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
NCHLT isiXhosa Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
NCHLT isiZulu Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) Lemmatised, part of speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
NCHLT isiZulu Auxiliary Speech Corpus
(CSIR Meraka Institute; North-West University, 2019-06-01) The corpus contains orthographically transcribed broadband speech in each of South Africa's eleven official languages. Transcriptions are provided in ...
NCHLT isiZulu Lemmatiser
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) Lemmatiser developed during the NCHLT Text project. Available in the Readme.txt - Input format: Text data (encoding: UTF-8 without BOM), one ...
NCHLT isiZulu Morphological Decomposer
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) Morphological decomposer developed during the NCHLT Text project.
NCHLT isiZulu Named Entity Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
NCHLT isiZulu Phrase Chunk Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
NCHLT isiZulu Speech Corpus
(Meraka Institute, CSIR; North-West University, 2014-07-08) Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.
NCHLT isiZulu Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
NCHLT Optical Character Recognition for South African Languages
(North-West University; Centre for Text Technology (CTexT), 2017-02-23) An OCR system is an application that enables one to convert scanned paper documents into editable and searchable texts. The engine analyses the structure ...
NCHLT Part of Speech Taggers
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) Part of speech taggers developed during the NCHLT Text project. Available for the following languages: Afrikaans, English, isiNdebele, isiXhosa, isiZulu, ...
NCHLT Sepedi Morphological Decomposer
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) Morphological decomposer developed during the NCHLT Text project.
NCHLT Sepedi Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) Lemmatised, part of speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
NCHLT Sepedi Auxiliary Speech Corpus
(CSIR Meraka Institute; North-West University, 2019-06-01) The corpus contains orthographically transcribed broadband speech in each of South Africa's eleven official languages. Transcriptions are provided in ...
NCHLT Sepedi Lemmatiser
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) Lemmatiser developed during the NCHLT Text project. Available in the Readme.txt - Input format: Text data (encoding: UTF-8 without BOM), one ...
NCHLT Sepedi Named Entity Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.