Search
Now showing items 1-10 of 42
NCHLT Optical Character Recognition for South African Languages
(North-West University; Centre for Text Technology (CTexT), 2017-02-23) ~ - Resource Catalogue
An OCR system is an application that enables one to convert scanned paper documents into editable and searchable texts. The engine analyses the structure ...
NCHLT Tagger
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
A graphical user interface and command line tool to automatically annotate running text with one or more linguistic tags:\n* Part of Speech\n* Named ...
NCHLT South African Language Identifier
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
A graphical user interface and command line tool to automatically classify a document, paragraph, sentence or phrase as one of the eleven official South ...
Lwazi Setswana ASR corpus
(Meraka Institute, CSIR, 2013-04-02) ~ - Resource Catalogue
Complete audio recordings and orthographic transcriptions used for Lwazi speech recognition systems.
Autshumato TMS
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Terminology Management System. Web application used by Terminologists and Administrators to capture, edit and export terminology.
Autshumato PDF Text Extractor
(North-West University; Centre for Text Technology (CTexT), 2013-06-20) ~ - Resource Catalogue
Utility application for extracting text out of a PDF document. The pages can also be extracted as images.
NCHLT-inlang Pronunciation Dictionaries
(Meraka Institute, CSIR; North-West University, 2014-07-04) ~ - Resource Catalogue
Broad phonemic transcriptions for 15,000 generic words in each of 11 languages. Each dictionary has an associated rule set for generating pronunciations ...
NCHLT Setswana Annotated Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Lemmatised, part of speech tagged and morphologically analysed corpora developed during the NCHLT Text project.
Lwazi Setswana TTS corpus
(Meraka Institute, CSIR, 2013-03-27) ~ - Resource Catalogue
Orthographic and phonemically aligned transcriptions
Lara2
(North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ - Resource Catalogue
Tool for annotating texts with lemma, part of speech and morphological analysis information