Morphologically annotated corpus for Xitsonga
Please do not copy the URL from the browser for citation. The correct URL is 'https://hdl.handle.net/20.500.12185/672'
dc.contact.email | tanja.gaustad@nwu.ac.za | en_ZA |
dc.contact.name | T. Gaustad | en_ZA |
dc.contributor.author | Gaustad, Tanja | |
dc.contributor.other | McKellar, Cindy | |
dc.date.accessioned | 2024-03-27T08:25:13Z | |
dc.date.available | 2024-03-27T08:25:13Z | |
dc.date.issued | 2024-01-31 | |
dc.description | NCHLT corpus of morphologically annotated Xitsonga tokens, converted to the tags used during phases 1 and 2 of the SADiLaR-II project. The data is provided as plain-text (txt) files. Each line consists of a token and its corresponding morphological analysis, separated by a tab. The file for Xitsonga contains a total of 69,584 tokens. All data was converted automatically, then manually checked and, where necessary, re-annotated by linguistic experts, and finally quality controlled. Please see the included protocol for more details on the morphological tags used. | en_ZA |
dc.format | text | en_ZA |
dc.format.extent | 69,584 tokens | en_ZA |
dc.format.medium | N/A | en_ZA |
dc.format.size | 2 MB | en_ZA |
dc.identifier.uri | https://hdl.handle.net/20.500.12185/672 | |
dc.languages | Xitsonga | en_ZA |
dc.media.category | annotated text corpus | en_ZA |
dc.media.type | Text | en_ZA |
dc.project | Linguistic corpus enrichment for South African languages | en_ZA |
dc.publisher | Centre for Text Technology (CTexT) | en_ZA |
dc.rights.license | CC BY 4.0 | en_ZA |
dc.subject | morphology | en_ZA |
dc.subject | annotated | en_ZA |
dc.title | Morphologically annotated corpus for Xitsonga | en_ZA |
dc.version | 1.0 | en_ZA |
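
The description above states that each line of the data file holds a token and its morphological analysis, separated by a tab. A minimal sketch of reading such a file in Python follows. It rests on assumptions not confirmed by this record: the file is UTF-8 encoded, has no header row, and blank lines (if any) can be skipped. The function names `parse_lines` and `read_corpus` are hypothetical and are not part of the dataset or any SADiLaR tooling.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_lines(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (token, analysis) pairs from tab-separated lines.

    Assumption: blank lines carry no data and are skipped.
    The analysis string is returned unparsed; see the included
    protocol document for the meaning of the tags.
    """
    for line_number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue
        # Split on the first tab only, so any further tabs stay in the analysis.
        token, separator, analysis = line.partition("\t")
        if not separator:
            raise ValueError(f"line {line_number}: no tab separator found")
        yield token, analysis


def read_corpus(path: str, encoding: str = "utf-8") -> List[Tuple[str, str]]:
    """Read a whole corpus file into a list of (token, analysis) pairs.

    Assumption: the file is UTF-8 encoded; pass another encoding if not.
    """
    with open(path, encoding=encoding) as handle:
        return list(parse_lines(handle))
```

Calling `read_corpus("SADII-Ext.MorphDataNCHLTConverted.Final.2023-08-31.ts.txt")` on the downloaded data file should return one pair per annotated token. If the assumptions above hold, the length of the result can be compared against the 69,584 tokens stated in this record as a quick sanity check.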
Files
Original bundle
- Name: README.Morph.Final.2024-01-31.txt
- Size: 2.4 KB
- Format: Plain Text
- Description: Read Me
- Name: Protocol.SADiLaR.MorphologicalAnalysisXitsonga.Final.2023-08-29.doc
- Size: 359.5 KB
- Format: Microsoft Word
- Description: Morphological Annotation Protocol for Xitsonga
- Name: SADII-Ext.MorphDataNCHLTConverted.Final.2023-08-31.ts.txt
- Size: 1.65 MB
- Format: Plain Text
- Description: Morphologically annotated corpus for Xitsonga
License bundle
- Name: license.txt
- Size: 3.22 KB
- Format: Item-specific license agreed upon to submission
- Description: