NCHLT Siswati Named Entity Annotated Corpus

dc.contact.email: Martin.Puttkammer@nwu.ac.za
dc.contact.name: Martin Puttkammer
dc.contributor.author: B.B. Malangwane
dc.contributor.author: M.N. Kekana
dc.contributor.author: S.S. Sedibe
dc.contributor.author: B.C. Ndhlovu
dc.contributor.author: Roald Eiselen
dc.database: Monolingual Text Corpora: Annotated
dc.date.accessioned: 2018-02-05T20:22:31Z
dc.date.accessioned: 2018-03-05T17:47:16Z
dc.date.available: 2018-02-05T20:22:31Z
dc.date.available: 2018-03-05T17:47:16Z
dc.date.issued: 2016-04-29
dc.description: Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.
dc.format.extent: 20.8 Mb (zipped)
dc.format.medium: Text
dc.format.medium: UTF8
dc.format.size: 18,201 annotated tokens (estimated 180,000 total tokens)
dc.identifier.citation: Eiselen, R. 2016. Government domain named entity recognition for South African languages. Proceedings of the 10th Language Resource and Evaluation Conference, Portorož, Slovenia.
dc.identifier.islrn: 436-529-883-400-6
dc.identifier.uri: https://hdl.handle.net/20.500.12185/346
dc.language.iso: ssw
dc.languages: Siswati
dc.media.category: Monolingual text corpora: Annotated
dc.media.type: Text
dc.project: NCHLT Text II
dc.publisher: North-West University
dc.publisher: Centre for Text Technology (CTexT)
dc.rights.license: Creative Commons Attribution 2.5 South Africa License: http://creativecommons.org/licenses/by/2.5/za/legalcode
dc.source: Based on documents from the South African government domain crawled from gov.za websites and collected from various language units.
dc.stratum: Details provided in documentation.
dc.title: NCHLT Siswati Named Entity Annotated Corpus
dc.type: Data
dc.version: 1
local.collection.primary: Resource Catalogue
local.collection.secondary: Resource Index

Files

Original bundle

Name:
nchlt_siswati_named_entity_annotated_corpus.zip
Size:
20.87 MB
Format:
ZIP is an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed.