CTexT Afrikaans fastText CBoW String Embeddings
Title | CTexT Afrikaans fastText CBoW String Embeddings |
Description | The CTexT Afrikaans fastText CBoW String Embeddings is a 300-dimensional Afrikaans embedding model based on the Continuous Bag of Words (CBoW) fastText architecture. It provides real-valued vector representations for Afrikaans text and was trained on a corpus of 230 million words. |
Contact name | Roald Eiselen |
Contact email | Roald.Eiselen@nwu.ac.za |
Publisher(s) | Centre for Text Technology (CTexT) |
License | Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/ |
Language(s) | Afrikaans |
Author(s) | Eiselen, Roald |
Subject | Word embeddings; String embeddings; fastText |
URI | https://hdl.handle.net/20.500.12185/550 |
Media type | Text |
Media category | Word embedding |
Format extent | 230 million words |
Version | 0.1 |
Format size | 3.31 GB |
Format medium | N/A |
Submit date | 2022-02-02T06:56:56Z |
Date available | 2022-02-02T06:56:56Z |
Date created | 2022-01-10 |
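The description above refers to string embeddings built on the fastText architecture. As a rough illustration only, the sketch below shows how fastText-style models generally compose a vector for an arbitrary string from its character n-grams. It does not load the actual model: the n-gram range of 3 to 6 is the fastText library default and is an assumption here, not a documented setting of this resource, and `toy_ngram_vector` is a hypothetical stand-in that returns pseudo-random values in place of trained n-gram vectors. Only the dimensionality of 300 comes from the record.

```python
import random

DIM = 300  # dimensionality stated in the record


def char_ngrams(word, minn=3, maxn=6):
    """Return character n-grams of `word`, using fastText-style
    boundary markers '<' and '>'. minn/maxn are assumed defaults."""
    marked = f"<{word}>"
    grams = []
    for n in range(minn, maxn + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


def toy_ngram_vector(ngram, dim=DIM):
    """Hypothetical stand-in for a trained n-gram vector:
    deterministic pseudo-random values seeded by the n-gram."""
    rng = random.Random(ngram)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def string_vector(text, dim=DIM, minn=3, maxn=6):
    """Average the vectors of all character n-grams in `text`.
    Strings too short to yield any n-gram get a zero vector."""
    grams = char_ngrams(text, minn, maxn)
    if not grams:
        return [0.0] * dim
    total = [0.0] * dim
    for gram in grams:
        for i, value in enumerate(toy_ngram_vector(gram, dim)):
            total[i] += value
    return [value / len(grams) for value in total]


if __name__ == "__main__":
    print(char_ngrams("kat", 3, 3))
    print(len(string_vector("huis")))
```

Because the vector is assembled from subword units, a string that never occurred in the training corpus can still receive a representation, which is the usual motivation for this design. In a trained model the in-vocabulary word's own vector is typically averaged in as well; the sketch leaves that out for brevity.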
This item appears in the following Collection(s)
Resource Catalogue [349]
A collection of language resources available for download from the RMA of SADiLaR. The collection mostly consists of resources developed with funding from the Department of Arts and Culture.