
NCHLT Afrikaans Phrase Chunk Annotated Corpus

dc.contact.email: Martin.Puttkammer@nwu.ac.za
dc.contact.name: Martin Puttkammer
dc.contributor.author: Gerhard van Huyssteen
dc.contributor.author: Martin Puttkammer
dc.contributor.author: E.B. Trollip
dc.contributor.author: J.C. Liversage
dc.contributor.author: Roald Eiselen
dc.database: Monolingual Text Corpora: Annotated
dc.date.accessioned: 2018-02-05T20:22:34Z
dc.date.accessioned: 2018-03-05T17:45:54Z
dc.date.available: 2018-02-05T20:22:34Z
dc.date.available: 2018-03-05T17:45:54Z
dc.date.issued: 2016-04-29
dc.description: Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens annotated during the NCHLT text resource development project and consists of a minimum of 15,000 tokens annotated as one of the six phrase types described in the protocol.
dc.format.extent: 1.70 Mb (zipped)
dc.format.medium: Text
dc.format.medium: UTF8
dc.format.size: 15,660 Phrase chunk token count
dc.identifier.citation: Eiselen, R. 2016. South African language resources: phrase chunkers. Proceedings of the 10th Language Resource and Evaluation Conference, Portorož, Slovenia.
dc.identifier.islrn: 214-658-983-483-8
dc.identifier.uri: https://hdl.handle.net/20.500.12185/300
dc.language.iso: afr
dc.languages: Afrikaans
dc.media.category: Monolingual text corpora: Annotated
dc.media.type: Text
dc.project: NCHLT Text II
dc.publisher: North-West University
dc.publisher: Centre for Text Technology (CTexT)
dc.rights.license: Creative Commons Attribution 2.5 South Africa License: http://creativecommons.org/licenses/by/2.5/za/legalcode
dc.source: Based on documents from the South African government domain crawled from gov.za websites and collected from various language units.
dc.stratum: Details provided in documentation.
dc.title: NCHLT Afrikaans Phrase Chunk Annotated Corpus
dc.type: Data
dc.version: 1
local.collection.primary: Resource Catalogue
local.collection.secondary: Resource Index

Files

Original bundle

Name: nchlt_afrikaans_phrase_chunk_annotated_corpus.zip
Size: 1.71 MB
Format: ZIP is an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed.