CKarma
Title | CKarma |
Description | CKarma is a compound analyser for Afrikaans, used to detect word boundaries within compounds. It takes a string as input and outputs an analysed string without any tags. For example, the string "hondehokdak" ('dog house roof') is analysed as "hond _ e + hok + dak", where the plus sign marks the beginning of an independent constituent and the underscore marks the beginning of a dependent constituent (i.e. a valence morpheme). CKarma is a C5 classifier, trained on data consisting of circa 47,000 compounds and 7,000 non-compounds. The resulting decision tree and cases can be converted to C code by means of a script written by MM van Zaanen. This C code can then be integrated into any other system. |
Contact name | Martin Puttkammer |
Contact email | Martin.Puttkammer@nwu.ac.za |
Publisher(s) | North-West University; Centre for Text Technology (CTexT) |
Language(s) | Afrikaans |
URI | https://hdl.handle.net/20.500.12185/145 |
Media type | Text |
Type | Modules |
Media category | Compound Analyser |
Version | N/A |
Primary collection | Resource Index |
ISO639 code | afr |
Submit date | 2018-02-05T07:33:07Z; 2018-03-05T14:58:06Z |
Date available | 2018-02-05T07:33:07Z; 2018-03-05T14:58:06Z |
Date created | 2015-01-30 |
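
The output notation given in the Description ("hond _ e + hok + dak") can be read back into labelled constituents with a few lines of code. The following is a minimal sketch, not part of CKarma itself: the function names `parse_analysis` and `surface_form` are hypothetical, and it assumes that markers and constituents are separated by whitespace and that the first constituent is independent.

```python
from typing import List, Tuple


def parse_analysis(analysed: str) -> List[Tuple[str, str]]:
    """Split a CKarma-style analysed string into (constituent, kind) pairs.

    Assumptions (not confirmed by the resource description):
    - markers and constituents are whitespace-separated;
    - the first constituent counts as independent.
    """
    constituents = []
    kind = "independent"
    for token in analysed.split():
        if token == "+":
            # A plus sign starts an independent constituent.
            kind = "independent"
        elif token == "_":
            # An underscore starts a dependent constituent (valence morpheme).
            kind = "dependent"
        else:
            constituents.append((token, kind))
    return constituents


def surface_form(analysed: str) -> str:
    """Rebuild the unanalysed compound by joining the constituents."""
    return "".join(text for text, _ in parse_analysis(analysed))


if __name__ == "__main__":
    for text, kind in parse_analysis("hond _ e + hok + dak"):
        print(f"{text}\t{kind}")
    print(surface_form("hond _ e + hok + dak"))
```

Joining the constituents again is a simple way to check that an analysis matches the original input string.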
Files in this item
There are no files associated with this item.
This item appears in the following Collection(s)
- Resource Index [412]
A collection of language resource metadata, mostly collected during the NHN-funded technology audit of 2009 and the SADiLaR technology audit of 2018. Not all resources in this collection are available for download.