Now showing items 41-50 of 79

NCHLT Afrikaans Annotated Text Corpora

Martin Puttkammer; Martin Schlemmer; Ruan Bekker (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue

Lemmatised, part-of-speech tagged, and morphologically analysed corpora developed during the NCHLT Text project.

CTexT fastText Skipgram String Embeddings

Eiselen, Roald (Centre for Text Technology (CTexT), 2022-01-10)

The CTexT Afrikaans fastText Skipgram String Embeddings is a 300-dimensional Afrikaans embedding model based on the Skipgram fastText architecture that ...

Lwazi II Afrikaans TTS Corpus

Daniel van Niekerk; Alta de Waal; Georg Schlünz (Meraka Institute, CSIR; North-West University, 2015-11-20) ~ Resource Catalogue

Orthographic and phonemically aligned transcriptions

Gerhard van Huyssteen; Walter Daelemans; Ben Verhoeven (North-West University; Centre for Text Technology (CTexT); CLiPS Research Center, University of Antwerp, Belgium, 2015-01-07) ~ Resource Catalogue

The AuCoPro Semantics dataset serves for the automatic semantic analysis of compounds. It contains semantically annotated noun-noun compounds (NN) from ...

NCHLT Afrikaans word2vec-Skipgram embeddings

Roald Eiselen (North-West University; Centre for Text Technology (CTexT), 2023-05-01)

Static word embeddings for the Skipgram flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides real-valued vector ...

NCHLT Afrikaans FLAIR-backward embeddings

Roald Eiselen (North-West University; Centre for Text Technology (CTexT), 2023-05-01)

Contextual word/string embeddings for the backward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector ...

South African Multilingual Learner Corpus of Academic Texts (SAMuLCAT)

Van Dyk, Tobie (ICELDA; SADiLaR, 2021)

NOTE: THIS HAS BEEN SUPERSEDED (see https://hdl.handle.net/20.500.12185/585). The South African Multilingual Learner Corpus of Academic Texts (SAMuLCAT) ...

Autshumato English-Afrikaans Parallel Corpora

D.P. Snyman; Cindy McKellar; Handré Groenewald (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue

Parallel corpora aligned on sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from ...

Lwazi Afrikaans TTS corpus

Daniel van Niekerk; Etienne Barnard; Marelie Davel; Aby Louw; Alta de Waal (Meraka Institute, CSIR, 2013-03-27) ~ Resource Catalogue

Orthographic and phonemically aligned transcriptions

Speect

Daniel van Niekerk; Aby Louw (Meraka Institute, CSIR, 2013-07-15) ~ Resource Catalogue

Speect is a multilingual text-to-speech (TTS) system. It offers a full TTS system (text analysis, which decodes the text, and speech synthesis, which ...
