NCHLT isiNdebele Text Corpora

dc.contact.email: Martin.Puttkammer@nwu.ac.za
dc.contact.name: Martin Puttkammer
dc.contributor.author: Martin Puttkammer
dc.contributor.author: Martin Schlemmer
dc.contributor.author: Wikus Pienaar
dc.contributor.author: Ruan Bekker
dc.date.accessioned: 2018-02-05T20:25:51Z
dc.date.accessioned: 2018-03-05T17:46:18Z
dc.date.available: 2018-02-05T20:25:51Z
dc.date.available: 2018-03-05T17:46:18Z
dc.date.issued: 2014-05-30
dc.description: Collection of source text documents, genre classified text documents, raw corpus, clean corpus, lexicon, frequency list and named-entity lists developed during the NCHLT Text project.
dc.format.extent: 9.16 Mb
dc.format.medium: Text
dc.format.medium: UTF8
dc.identifier.citation: Eiselen, E.R. & Puttkammer, M.J. 2014. Developing text resources for ten South African languages. (In Proceedings of the 9th International Conference on Language Resources and Evaluation, Reykjavik, Iceland. p. 3698-3703)
dc.identifier.islrn: 858-844-618-880-3
dc.identifier.uri: https://hdl.handle.net/20.500.12185/308
dc.language.iso: nbl
dc.languages: isiNdebele
dc.media.category: Monolingual text corpora: Unannotated
dc.media.type: Text
dc.project: NCHLT Text
dc.publisher: North-West University
dc.publisher: Centre for Text Technology (CTexT)
dc.rights.license: Creative Commons Attribution 2.5 South Africa License: http://creativecommons.org/licenses/by/2.5/za/legalcode
dc.source: Based on documents from the South African government domain crawled from gov.za websites and collected from various language units.
dc.stratum: Details provided in documentation.
dc.title: NCHLT isiNdebele Text Corpora
dc.type: Data
dc.version: 1
local.collection.primary: Resource Catalogue
local.collection.secondary: Resource Index

Files

Original bundle

Name:
corpora.nchlt.nr.zip
Size:
9.17 MB
Format:
ZIP is an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed.