Search
Now showing items 1-10 of 50
Format Normaliser 1.0.
(North-West University; Centre for Text Technology (CTexT), 2015-01-30) ~ - Resource Index
Normalises input files to txt, utf8, replaces smart quotes with straight quotes, removes empty lines, etc.
NCHLT Optical Character Recognition for South African Languages
(North-West University; Centre for Text Technology (CTexT), 2017-02-23) ~ - Resource Catalogue
An OCR system is an application that enables one to convert scanned paper documents into editable and searchable texts. The engine analyses the structure ...
Autshumato English-Xitsonga Parallel Corpora
(CTexT® (Centre for Text Technology, North-West University), 2022-09-30)
Aligned parallel corpora for the language pair English-Xitsonga. The data is given as two separate UTF-8 text files, with each aligned segment on a ...
NCHLT Tagger
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
A graphical user interface and command line tool to automatically annotate running text with one or more linguistic tags:\n* Part of Speech\n* Named ...
Autshumato Xitsonga Frequency Word List
(North-West University; Centre for Text Technology (CTexT), 2014-12-12) ~ - Resource Catalogue
A list of the most frequent Xitsonga words as deliverable of the Autshumato project.
TurboAnnotate1.0
(Centre for Text Technology (CTexT), 2013-07-01) ~ - Resource Index
TurboAnnotate is a user-friendly annotating environment (i.e. tool) for bootstrapping linguistic data for machine-learning purposes, or for manually ...
NCHLT South African Language Identifier
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) ~ - Resource Catalogue
A graphical user interface and command line tool to automatically classify a document, paragraph, sentence or phrase as one of the eleven official South ...
Autshumato TMS
(North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ - Resource Catalogue
Terminology Management System. Web application used by Terminologists and Administrators to capture, edit and export terminology.
Autshumato PDF Text Extractor
(North-West University; Centre for Text Technology (CTexT), 2013-06-20) ~ - Resource Catalogue
Utility application for extracting text out of a PDF document. The pages can also be extracted as images.
Multilingual Soccer Terminology List
(Terminology Coordination Section of the National Language Service, Department of Arts and Culture, 2017-03-03) ~ - Resource Index
297 English source terms with their equivalents in the ten other official South African languages. On the eve of the 2010 FIFA World Cup, the list was ...