Calomo

Date

2015-01-30

Authors

Menno van Zaanen

Publisher

North-West University
Centre for Text Technology (CTexT)

Description

Calomo is a hyphenator for Afrikaans that can be integrated into any NLP system. It takes a string as input and returns the analysed string, without any tags. For example, the string "hondehokdak" ('dog house roof') is analysed as "hon-de-hok-dak", where each hyphen marks a syllable boundary at the orthographic level. Calomo is a C5 classifier trained on approximately 40,000 words. The resulting decision tree and cases can be converted to C code by a script written by M.M. van Zaanen, and this C code can then be embedded in any other system.
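To illustrate what "a decision tree converted to C code" can look like in practice, here is a minimal sketch. It is not Calomo's trained tree or its generated code: the function names (`sketch_hyphenate`, `is_boundary`, `is_vowel`), the features used, and the single hand-written rule are all hypothetical. It also assumes that hyphenation is framed as a yes/no decision at each letter position, which the description above does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: orthographic vowel test, lowercase ASCII only. */
static int is_vowel(char c)
{
    return c != '\0' && strchr("aeiouy", c) != NULL;
}

/*
 * Toy stand-in for a generated decision tree (NOT the real Calomo model).
 * Decides whether a syllable boundary falls directly before `cur`.
 * A generated tree would be a much deeper nest of tests like these.
 */
static int is_boundary(int seen_vowel, char cur, char next)
{
    if (!seen_vowel) {
        return 0;               /* no syllable nucleus to the left yet */
    }
    if (is_vowel(cur)) {
        return 0;               /* never split directly before a vowel */
    }
    if (is_vowel(next)) {
        return 1;               /* consonant + vowel opens a new syllable */
    }
    return 0;
}

/*
 * Copies `word` into `out`, inserting '-' at every predicted boundary.
 * Returns 0 on success, -1 if `out` is too small.
 */
int sketch_hyphenate(const char *word, char *out, size_t out_size)
{
    assert(word != NULL && out != NULL);
    if (out_size == 0) {
        return -1;
    }

    size_t n = strlen(word);
    size_t j = 0;
    int seen_vowel = 0;

    for (size_t i = 0; i < n; i++) {
        /* word[i + 1] is the terminating '\0' for the last letter. */
        if (is_boundary(seen_vowel, word[i], word[i + 1])) {
            if (j + 1 >= out_size) {
                return -1;
            }
            out[j++] = '-';
        }
        if (j + 1 >= out_size) {
            return -1;
        }
        out[j++] = word[i];
        if (is_vowel(word[i])) {
            seen_vowel = 1;
        }
    }
    out[j] = '\0';
    return 0;
}
```

Calling `sketch_hyphenate("hondehokdak", buf, sizeof buf)` with a sufficiently large buffer reproduces the "hon-de-hok-dak" example, but only because the toy rule happens to fit that word; it will mis-hyphenate many others. A tree learned from the 40,000-word training set would replace the body of `is_boundary` while the surrounding loop stays the same, which is what makes plain C output easy to embed in a host system.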

Verification status

Level 0