
Afrikaans Domain corpus POS annotated (5 domains)

Abstract

Description

This deliverable contains part-of-speech (POS) tagged data for Afrikaans from five text types:

- CAPS gr12 (Academic)
- MA/PhD Theses (Academic)
- Magazines (Non-Academic)
- News (Non-Academic)
- Novels (Fiction)

The data are provided as txt files in which each line contains a token and its POS tag, separated by a tab. Each text type's data file contains 11,000+ tokens, for a total of 60,809 tokens for the language. See the included protocol for details on the POS tags used.
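The one-token-per-line, tab-separated layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of the deliverable: the function names `read_tagged_tokens` and `tag_counts` are hypothetical, the sample tags are placeholders rather than tags from the included protocol, and skipping blank lines is an assumption about how the files may be laid out.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def read_tagged_tokens(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse 'token<TAB>tag' lines into (token, tag) pairs."""
    pairs = []
    for line_number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            # Assumption: blank lines (if any) carry no token and are skipped.
            continue
        parts = line.split("\t")
        if len(parts) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 tab-separated fields, "
                f"got {len(parts)}"
            )
        pairs.append((parts[0], parts[1]))
    return pairs


def tag_counts(pairs: Iterable[Tuple[str, str]]) -> Counter:
    """Count how often each POS tag occurs."""
    return Counter(tag for _, tag in pairs)


# Made-up sample lines; TAG_A etc. are placeholders, not real protocol tags.
sample = ["Die\tTAG_A\n", "kat\tTAG_B\n", "\n", "slaap\tTAG_C\n", ".\tTAG_D\n"]
pairs = read_tagged_tokens(sample)
print(len(pairs))
print(tag_counts(pairs))
```

To read one of the actual data files, an open file handle can be passed in place of `sample`, for example `open(path, encoding="utf-8")`, where `path` and the UTF-8 encoding are assumptions to check against the included protocol. Summing `len(pairs)` over the five files should then reproduce the stated total if the files contain no extra lines.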

Citation

License

Creative Commons Attribution 4.0 International

Verification

Level 0