Autshumato Monolingual Tshivenḓa Corpus
Title | Autshumato Monolingual Tshivenḓa Corpus |
Description | Monolingual corpus for Tshivenḓa. The data is given as a single UTF-8 text file, with each segment on a newline. |
Contact name | Sunny Gent |
Contact email | sunny.gent@nwu.ac.za |
Publisher(s) | North-West University; Centre for Text Technology (CTexT) |
License | Creative Commons Attribution 4.0 International |
Language(s) | Tshivenḓa |
Author(s) | McKellar, Cindy |
Contributor | Puttkammer, Martin; Gaustad, Tanja; Gent, Sunny; van Heerden, Jacques |
Subject | Autshumato V; Tshivenḓa; Monolingual text corpora |
URI | https://hdl.handle.net/20.500.12185/681 |
Media type | Text |
Media category | Multilingual text corpora |
Format extent | 141,426 Tshivenḓa Segments & 2,870,916 Tshivenḓa Words |
Version | 3.0 (Final) |
Format size | 5.83 MB |
Project | Autshumato |
Submit date | 2024-03-27T08:27:10Z |
Date available | 2024-03-27T08:27:10Z |
Date created | 2023-12-12 |
This item appears in the following Collection(s)
- Resource Catalogue [349]
A collection of language resources available for download from the RMA of SADiLaR. The collection mostly consists of resources developed with funding from the Department of Arts and Culture.
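Because the description states that the corpus is a single UTF-8 text file with one segment per line, a reader could check the segment and word totals with a short script. The following is a minimal sketch under stated assumptions: the function name `corpus_stats` and the file name in the usage note are hypothetical, and words are counted by splitting on whitespace, which may not match the tokenisation used to produce the figures in the Format extent field.

```python
def corpus_stats(path):
    """Return (segments, words) for a one-segment-per-line UTF-8 text file.

    Assumptions (not confirmed by the catalogue record): blank lines are
    not counted as segments, and a word is any whitespace-separated token.
    """
    segments = 0
    words = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            segment = line.strip()
            if not segment:
                continue  # skip empty lines
            segments += 1
            words += len(segment.split())
    return segments, words
```

For example, `corpus_stats("corpus.ven.txt")` (a placeholder file name) returns a pair of counts that can be compared with the segment and word totals listed above. Small differences would most likely come from the tokenisation assumption rather than from the data.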