Autshumato Setswana Monolingual Corpora

dc.contact.email: sunny.gent@nwu.ac.za
dc.contact.name: Sunny Gent
dc.contributor.author: Cindy McKellar
dc.contributor.other: Roald Eiselen
dc.contributor.other: Wikus Pienaar
dc.database: Multilingual Text Corpora: Aligned
dc.date.accessioned: 2018-02-05T20:22:41Z
dc.date.accessioned: 2018-03-05T17:50:20Z
dc.date.available: 2018-02-05T20:22:41Z
dc.date.available: 2018-03-05T17:50:20Z
dc.date.issued: 2016-10-28
dc.description: Setswana monolingual corpus, a deliverable of the Autshumato project. The data is provided as a UTF-8 text file with each sentence on a new line. NOTE: There is a newer version for the English-Setswana Monolingual Corpus. See https://hdl.handle.net/20.500.12185/584
dc.format.extent: 1.52 MB (zipped)
dc.format.medium: Text
dc.format.medium: UTF8
dc.format.size: Monolingual Lines: 38 205. Monolingual Words (excludes punctuation and numbers): 879 248
dc.identifier.islrn: 818-990-855-312-3
dc.identifier.uri: https://hdl.handle.net/20.500.12185/413
dc.language.iso: tsn
dc.languages: Setswana
dc.media.category: Monolingual text corpora: Unannotated
dc.media.type: Text
dc.project: Autshumato
dc.publisher: North-West University
dc.publisher: Centre for Text Technology (CTexT)
dc.rights.license: Creative Commons Attribution 2.5 South Africa License: http://creativecommons.org/licenses/by/2.5/za/legalcode
dc.source: Government Documents
dc.stratum: Details provided in documentation.
dc.title: Autshumato Setswana Monolingual Corpora
dc.type: Data
dc.version: 1
local.collection.primary: Resource Catalogue
local.collection.secondary: Resource Index

Files

Original bundle

Name: autshumato_setswana_monolingual_corpora.zip
Size: 1.52 MB
Format: ZIP is an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed.
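
Since the record describes the data as UTF-8 text with one sentence per line inside a ZIP archive, the corpus can probably be read without unpacking it first. The following is a minimal sketch under assumptions, not an official loader: the names of the files inside the archive are not listed in this record, so the code simply reads every member ending in `.txt`, and the function names `iter_sentences` and `corpus_stats` are hypothetical.

```python
import io
import zipfile
from typing import Iterator


def iter_sentences(zip_path: str) -> Iterator[str]:
    """Yield each non-empty line from every .txt member of the archive.

    Assumption: the corpus text lives in one or more .txt members;
    the record does not state the member names.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".txt"):
                continue  # skip documentation or other non-text members
            with archive.open(name) as raw:
                # utf-8-sig decodes UTF-8 and drops a leading BOM if present
                for line in io.TextIOWrapper(raw, encoding="utf-8-sig"):
                    sentence = line.strip()
                    if sentence:
                        yield sentence


def corpus_stats(zip_path: str) -> tuple[int, int]:
    """Return (line count, whitespace-separated token count)."""
    lines = 0
    tokens = 0
    for sentence in iter_sentences(zip_path):
        lines += 1
        tokens += len(sentence.split())
    return lines, tokens
```

Calling `corpus_stats("autshumato_setswana_monolingual_corpora.zip")` on the downloaded file would give a line count and a token count. The token count here splits on whitespace only, so it includes punctuation and numbers and should not be expected to match the word figure stated in dc.format.size.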