Please do not copy the URL from the browser for citation. The correct URL is 'https://hdl.handle.net/20.500.12185/546'
Linguistically enriched corpora for conjunctively written South African languages
Date
2021-09
Authors
Puttkammer, Martin
Gaustad, Tanja
Publisher
North-West University, Centre for Language Technology (CTexT)
Description
This resource contains linguistically annotated data for four official South African languages with a conjunctive orthography from the Nguni family (isiNdebele, isiXhosa, isiZulu and Siswati), as well as English. The data set is parallel across all five languages, and the Nguni languages have been annotated for three types of linguistic information: morphology, part-of-speech and lemmas. We have also included the protocols and tagsets used during annotation.
Citation
https://doi.org/10.1016/j.dib.2022.107994
Verification status
Level 0