Creative Commons Attribution 4.0 International
Gaustad, Tanja
McKellar, Cindy
Gent, Sunny
2026-03-26
2026-03-26
2026-03-31
https://hdl.handle.net/20.500.12185/697

This deliverable contains part-of-speech (POS) tagged data from five different text types for Xitsonga. The text types included are:
- CAPS gr12 (Academic)
- MA/PhD Theses (Academic)
- Magazines (Non-Academic)
- News (Non-Academic)
- Novels (Fiction)

The data is provided as txt files in which each line contains a token and its corresponding POS tag, separated by a tab. Each text type data file contains 11,000+ tokens, amounting to a total of 57,537 tokens for the language. Please see the included protocol for more details on the POS tags used.

For more detailed information, please see: Tanja Gaustad, Roald Eiselen, Cindy McKellar (2026). Extension of Linguistic Resources for South African Languages: Part-of-Speech Annotated Domain-Specific Data. Proceedings of the Seventh Workshop on Resources for African Indigenous Languages (RAIL) (collocated with LREC 2026).

text
57537 tokens
N/A
Xitsonga, POS annotated, domain-specific, annotated corpus
Xitsonga Domain corpus POS annotated (5 domains)
1 Mb
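
The description states that each line of a data file holds a token and its POS tag, tab separated. A minimal Python sketch of reading that layout and counting tags is given below. It is an illustration only and not part of the deliverable: the function names are hypothetical, the UTF-8 encoding is an assumption, and skipping blank lines is an assumption about how the files might be laid out. The tags `TAG_A` and `TAG_B` are placeholders; the actual tagset is defined in the included protocol.

```python
from collections import Counter
from pathlib import Path


def read_tagged_lines(lines):
    """Yield (token, tag) pairs from lines of the form 'token<TAB>tag'."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # Assumption: blank lines, if present, carry no token.
            continue
        # Split on the last tab so the final field is taken as the tag.
        token, sep, tag = line.rpartition("\t")
        if not sep:
            raise ValueError(f"expected a tab-separated line, got {line!r}")
        yield token, tag


def read_tagged_file(path, encoding="utf-8"):
    """Read one data file into a list of (token, tag) pairs.

    The UTF-8 default is an assumption, not something the record states.
    """
    with Path(path).open(encoding=encoding) as handle:
        return list(read_tagged_lines(handle))


def tag_counts(pairs):
    """Count how often each POS tag occurs."""
    return Counter(tag for _, tag in pairs)


if __name__ == "__main__":
    # Placeholder lines standing in for real file content.
    sample = ["token1\tTAG_A\n", "token2\tTAG_B\n", "token3\tTAG_A\n"]
    pairs = list(read_tagged_lines(sample))
    print(len(pairs), "tokens")
    print(tag_counts(pairs).most_common())
```

Calling `read_tagged_file` on each of the five text type files and summing the lengths of the returned lists gives a token total that can be compared against the 57,537 tokens stated in the description.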