
Autshumato English-Xitsonga Manually Translated Parallel Corpora

dc.contact.emailsunny.gent@nwu.ac.za
dc.contact.nameSunny Gent
dc.contributor.authorWikus Pienaar
dc.contributor.authorWildrich Fourie
dc.contributor.authorCindy McKellar
dc.databaseMultilingual Text Corpora: Aligned
dc.date.accessioned2018-02-05T20:20:44Z
dc.date.accessioned2018-03-05T17:49:37Z
dc.date.available2018-02-05T20:20:44Z
dc.date.available2018-03-05T17:49:37Z
dc.date.issued2014-12-12
dc.descriptionAligned English-Xitsonga parallel corpus. The data is given as two separate UTF-8 text files, with each segment on a new line.
dc.format.extent2.74 MB
dc.format.mediumText
dc.format.mediumUTF8
dc.format.size92,396 bilingual segments. 771,688 English Words (excluding punctuation and numbers). 880,149 Xitsonga Words (excluding punctuation and numbers).
dc.identifier.citationMcKellar, C.A. 2014. An English-Xitsonga SMT system for the government domain. (In: Proceedings of the 2014 PRASA, RobMech and AfLaT International Joint Symposium, Cape Town, South Africa).
dc.identifier.islrn463-910-862-996-9
dc.identifier.urihttps://hdl.handle.net/20.500.12185/405
dc.language.isoeng
dc.language.isotso
dc.languagesEnglish
dc.languagesXitsonga
dc.media.categoryMultilingual text corpora: Aligned
dc.media.typeText
dc.projectAutshumato
dc.publisherNorth-West University
dc.publisherCentre for Text Technology (CTexT)
dc.rights.licenseCreative Commons Attribution 2.5 South Africa License: http://creativecommons.org/licenses/by/2.5/za/legalcode
dc.sourceBased on documents from the South African government domain crawled from gov.za websites and collected from various language units.
dc.stratumDetails provided in documentation.
dc.titleAutshumato English-Xitsonga Manually Translated Parallel Corpora
dc.typeData
dc.version1
local.collection.primaryResource Catalogue
local.collection.secondaryResource Index

Files

Original bundle

Name: en-ts.translationsonlycorpus.zip
Size: 2.74 MB
Format: ZIP is an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed.
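
The description states that the corpus is given as two UTF-8 text files with one segment per line, so segment N in the English file corresponds to segment N in the Xitsonga file. A minimal sketch of reading such a pair of files follows. The function name `read_parallel` and the file paths passed to it are hypothetical; this record does not list the names of the files inside the archive.

```python
from itertools import zip_longest

# Sentinel used to detect that one file ended before the other.
_MISSING = object()


def read_parallel(en_path, ts_path):
    """Yield (english, xitsonga) segment pairs from two line-aligned files.

    Assumes both files are UTF-8 with one segment per line and the same
    number of lines. Raises ValueError if the line counts differ.
    """
    with open(en_path, encoding="utf-8") as en_file, \
         open(ts_path, encoding="utf-8") as ts_file:
        pairs = zip_longest(en_file, ts_file, fillvalue=_MISSING)
        for line_no, (en, ts) in enumerate(pairs, start=1):
            if en is _MISSING or ts is _MISSING:
                raise ValueError(
                    f"files differ in length at line {line_no}"
                )
            # Strip only the trailing newline so inner whitespace is kept.
            yield en.rstrip("\n"), ts.rstrip("\n")
```

After extracting the archive, calling `list(read_parallel(english_file, xitsonga_file))` with the two extracted paths would give one tuple per bilingual segment. Checking that both files have the same number of lines is a cheap guard against a truncated download or a misaligned copy.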