Search
Now showing items 1-10 of 23
NCHLT isiNdebele Phrase Chunk Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...
NCHLT isiNdebele Named Entity Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
NCHLT isiNdebele Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Lemmatised, part of speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
NCHLT isiNdebele Morphological Decomposer
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Morphological decomposer developed during the NCHLT Text project.
NCHLT isiNdebele Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
NCHLT Optical Character Recognition for South African Languages
(North-West University; Centre for Text Technology (CTexT), 2017-02-23) ~ - Resource Catalogue
An OCR system is an application that enables one to convert scanned paper documents into editable and searchable texts. The engine analyses the structure ...
NCHLT Part of Speech Taggers
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Part of speech taggers developed during the NCHLT Text project.
Available for the following languages: Afrikaans, English, isiNdebele, isiXhosa, isiZulu, ...
NCHLT South African Language Identifier
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
A graphical user interface and command line tool to automatically classify a document, paragraph, sentence or phrase as one of the eleven official South ...
Autshumato Multilingual Word and Phrase Translations
(North-West University; Centre for Text Technology (CTexT), 2016-01-20) ~ - Resource Catalogue
Word and phrase lists aligned from English to the other official South African languages.
Autshumato Text Anonymiser
(North-West University; Centre for Text Technology (CTexT), 2013-06-20) ~ - Resource Catalogue
Anonymises text by classifying and replacing sensitive information such as person names, business names, place names, monetary values, phone numbers, ...