
Autshumato English-Setswana Parallel Corpora

dc.contact.email: sunny.gent@nwu.ac.za
dc.contact.name: Sunny Gent
dc.contributor.author: Cindy McKellar
dc.contributor.other: Roald Eiselen
dc.contributor.other: Wikus Pienaar
dc.database: Multilingual Text Corpora: Aligned
dc.date.accessioned: 2018-02-05T20:22:42Z
dc.date.accessioned: 2018-03-05T17:49:36Z
dc.date.available: 2018-02-05T20:22:42Z
dc.date.available: 2018-03-05T17:49:36Z
dc.date.issued: 2016-10-28
dc.description: Aligned English-Setswana parallel corpus. This set contains data that was translated by professional translators, data that was sourced as translated file pairs from translators, and data obtained from Government websites and documents. The data is given as six separate UTF-8 text files, with each aligned sentence pair on a new line.
dc.format.extent: 9.02 MB (zipped)
dc.format.medium: Text
dc.format.medium: UTF8
dc.format.size: 159 000 bilingual segments; 2 037 173 English words and 2 596 023 Setswana words (both counts excluding punctuation and numbers).
dc.identifier.islrn: 379-219-829-093-2
dc.identifier.uri: https://hdl.handle.net/20.500.12185/404
dc.language.iso: eng
dc.language.iso: tsn
dc.languages: English
dc.languages: Setswana
dc.media.category: Multilingual text corpora: Aligned
dc.media.type: Text
dc.project: Autshumato
dc.publisher: North-West University
dc.publisher: Centre for Text Technology (CTexT)
dc.rights.license: Creative Commons Attribution 2.5 South Africa License: http://creativecommons.org/licenses/by/2.5/za/legalcode
dc.source: Based on documents from the South African government domain crawled from gov.za websites and collected from various language units.
dc.stratum: Details provided in documentation.
dc.title: Autshumato English-Setswana Parallel Corpora
dc.type: Data
dc.version: 1
local.collection.primary: Resource Catalogue
local.collection.secondary: Resource Index

Files

Original bundle

Name: autshumato_english-setswana_parallel_corpora.zip
Size: 9.02 MB
Format: ZIP, an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed.
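
The record describes the corpus as UTF-8 text files inside a ZIP archive, with aligned sentence pairs given line by line. The following is a minimal Python sketch of how such data might be read. It rests on an assumption this record does not confirm: that the English and Setswana sides sit in separate files aligned by line number. The helper names `read_lines` and `pair_lines` and the member names in the usage note are hypothetical; the actual file names and layout are given in the documentation shipped with the corpus.

```python
import io
import zipfile


def read_lines(archive: zipfile.ZipFile, member: str) -> list[str]:
    """Return the lines of one UTF-8 text member, without line endings."""
    with archive.open(member) as raw:
        text = io.TextIOWrapper(raw, encoding="utf-8")
        return [line.rstrip("\n") for line in text]


def pair_lines(source: list[str], target: list[str]) -> list[tuple[str, str]]:
    """Pair segments by line number (ASSUMED layout, not stated in the record).

    Raises ValueError if the two sides differ in length, since a length
    mismatch would mean the files are not line-aligned as assumed.
    """
    if len(source) != len(target):
        raise ValueError(
            f"line counts differ: {len(source)} vs {len(target)}"
        )
    return list(zip(source, target))
```

A possible use, with placeholder member names, would be to open the archive with `zipfile.ZipFile("autshumato_english-setswana_parallel_corpora.zip")`, inspect `namelist()` to find the real member names, and then call `pair_lines(read_lines(zf, "<english file>"), read_lines(zf, "<setswana file>"))`.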