Linguistically enriched corpora for conjunctively written South African languages
License agreement
By downloading this resource I accept and agree to the terms of use and the associated license conditions under which the resource is distributed.
Download
MD5: 31c8f46a70b36c61d9a9e7c12f5b4cf9
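The MD5 value above can be used to confirm that the download arrived intact. The following is a minimal sketch in Python, not part of the resource itself. The archive name `corpora.zip` is a hypothetical placeholder, since this page does not state the file name, and the helper names `md5_of_file` and `matches_expected` are illustrative only.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this page.
EXPECTED_MD5 = "31c8f46a70b36c61d9a9e7c12f5b4cf9"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest against the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical file name: replace with the actual downloaded file.
    archive = Path("corpora.zip")
    if archive.exists():
        print("checksum OK" if matches_expected(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

A mismatch usually indicates an incomplete or corrupted download, in which case downloading the file again is the simplest remedy.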
Collections
- Resource Index [412]
Author(s)
Puttkammer, Martin
Gaustad, Tanja
Description
This resource contains linguistically annotated data for four official South African languages with a conjunctive orthography from the Nguni family (isiNdebele, isiXhosa, isiZulu and Siswati) as well as English. The data set is parallel for all five languages and the Nguni languages have been annotated for three different types of linguistic information: morphology, part-of-speech and lemmas. We have also included the protocols and tagsets used during annotation.
Contact person
Tanja Gaustad
Contact person's e-mail address
tanja.gaustad@nwu.ac.za
Publisher(s)
North-West University, Centre for Language Technology (CTexT)