Department of Science, Technology and Innovation
CLARIN in South Africa
 

CTexT Afrikaans fastText CBoW String Embeddings

Date

2022-01-10

Authors

Eiselen, Roald

Publisher

Centre for Text Technology (CTexT)

Description

The CTexT Afrikaans fastText CBoW String Embeddings is a 300-dimensional Afrikaans embedding model based on the Continuous Bag of Words (CBoW) fastText architecture. It provides real-valued vector representations for Afrikaans text and was trained on a corpus of 230 million words.
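
As an illustration of how such embeddings are typically consumed, the following is a minimal sketch that assumes the vectors are available in fastText's plain-text format (a header line with the token count and dimension, then one token and its values per line). This record does not state which file formats are distributed, so the format is an assumption, the function names `load_vec` and `cosine` are hypothetical helpers, and the three-dimensional values are toy data rather than values from the model.

```python
import io
import math


def load_vec(handle):
    """Parse fastText text-format vectors from an open text handle.

    Assumed layout: first line "<count> <dim>", then "<token> v1 ... v_dim".
    """
    header = handle.readline().split()
    dim = int(header[1])
    vectors = {}
    for line in handle:
        line = line.rstrip()  # drops the newline and any trailing space
        if not line:
            continue
        # Split from the right so the last `dim` fields are the values.
        parts = line.rsplit(" ", dim)
        if len(parts) != dim + 1:
            raise ValueError("unexpected number of fields for: " + parts[0])
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy stand-in for a real vector file; the actual model would have
    # a dimension of 300 and a much larger vocabulary.
    toy_file = io.StringIO("2 3\nkat 1.0 0.0 0.0 \nhond 0.0 1.0 0.0 \n")
    toy_vectors = load_vec(toy_file)
    print(cosine(toy_vectors["kat"], toy_vectors["hond"]))
```

If the model is instead distributed as a fastText binary, the official `fasttext` Python package offers `fasttext.load_model(path)` and `model.get_word_vector(word)`, which can also produce vectors for out-of-vocabulary strings from their character n-grams.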

Verification status

Level 0