NCHLT isiNdebele RoBERTa language model

Roald Eiselen

For citation, please use the persistent URL 'https://hdl.handle.net/20.500.12185/637' rather than the URL shown in the browser.

Files

nr.RoBERTa.tar.gz (236.06 MB)

Date

2023-05-01

Authors

Roald Eiselen

Publisher

North-West University; Centre for Text Technology (CTexT)

Description

Contextual masked language model based on the RoBERTa architecture (Liu et al., 2019). The model is trained as a masked language model and is not fine-tuned for any downstream task. It can be used either as a masked LM or as an embedding model that provides real-valued vector representations of words or string sequences for isiNdebele text.
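The embedding use mentioned in the description could look roughly like the sketch below. It is only an illustration under stated assumptions: the record does not say how the archive is laid out, so the sketch assumes that unpacking nr.RoBERTa.tar.gz yields a directory that the Hugging Face transformers library can load. The function names (extract_archive, mean_pool, embed_text) and the choice of mean pooling over token vectors are hypothetical and not part of the resource.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, target_dir):
    """Unpack the downloaded .tar.gz archive and return the target directory."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(target)
    return target


def mean_pool(token_vectors, attention_mask):
    """Average token vectors over the positions whose mask value is non-zero."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    # zip(*kept) iterates over vector dimensions, giving one mean per dimension.
    return [sum(column) / len(kept) for column in zip(*kept)]


def embed_text(text, model_dir):
    """Return one fixed-size vector for `text`.

    Assumption: `model_dir` contains tokenizer and model files in a layout
    that Hugging Face transformers can load. The record does not confirm this.
    """
    # Imported here so the helpers above work without these packages installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModel.from_pretrained(model_dir)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        # Shape of the selected tensor: (number of tokens, hidden size).
        hidden = model(**inputs).last_hidden_state[0]
    return mean_pool(hidden.tolist(), inputs["attention_mask"][0].tolist())
```

If the assumption about the archive layout holds, calling embed_text on an isiNdebele sentence with the directory returned by extract_archive would return a list of floats whose length equals the model's hidden size. Mean pooling is one common way to turn per-token vectors into a sequence vector; other pooling choices are equally possible.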

License

Creative Commons Attribution 4.0 International (CC-BY 4.0)

URI

https://hdl.handle.net/20.500.12185/637

Collections

Resource Catalogue

Verification status

Level 0
