Search
Now showing items 1-10 of 32
NCHLT Optical Character Recognition for South African Languages
(North-West University; Centre for Text Technology (CTexT), 2017-02-23) ~ - Resource Catalogue
An OCR system is an application that enables one to convert scanned paper documents into editable and searchable texts. The engine analyses the structure ...
NCHLT Tshivenda Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed ...
NCHLT Tagger
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
A graphical user interface and command line tool to automatically annotate running text with one or more linguistic tags:\n* Part of Speech\n* Named ...
NCHLT South African Language Identifier
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
A graphical user interface and command line tool to automatically classify a document, paragraph, sentence or phrase as one of the eleven official South ...
Autshumato TMS
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Terminology Management System. Web application used by Terminologists and Administrators to capture, edit and export terminology.
Autshumato PDF Text Extractor
(North-West University; Centre for Text Technology (CTexT), 2013-06-20) ~ - Resource Catalogue
Utility application for extracting text out of a PDF document. The pages can also be extracted as images.
NCHLT Tshivenda Morphological Decomposer
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Morphological decomposer developed during the NCHLT Text project.
COVID-19 Multilingual Terminology
(City of Tshwane; South African Centre for Digital Language Resources (SADiLaR); Department of Science and Innovation; Pan South African Language Board (PanSALB), 2021-07)
COVID-19 multilingual terminology list document in all the South African languages. The development of this terminology list was initiated by City of ...
Lara2
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Tool for annotating texts with lemma, part of speech and morphological analysis information
NCHLT Tshivenda Phrase Chunk Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...