Resource Catalogue: Recent submissions
Now showing items 11-20 of 350
- Morphologically annotated corpus for Sepedi
(Centre for Text Technology (CTexT), 2024-01-31) NCHLT corpus of morphologically annotated tokens in Sepedi converted to the tags used during phases 1 and 2 of the SADiLaR-II project. The data is ...
- Morphologically annotated corpus for Setswana
(Centre for Text Technology (CTexT), 2024-01-31) NCHLT corpus of morphologically annotated tokens in Setswana converted to the tags used during phases 1 and 2 of the SADiLaR-II project. The data is ...
- Morphologically annotated corpus for Tshivenḓa
(Centre for Text Technology (CTexT), 2024-01-31) NCHLT corpus of morphologically annotated tokens in Tshivenḓa converted to the tags used during phases 1 and 2 of the SADiLaR-II project. The data is ...
- Morphologically annotated corpus for Xitsonga
(Centre for Text Technology (CTexT), 2024-01-31) NCHLT corpus of morphologically annotated tokens in Xitsonga converted to the tags used during phases 1 and 2 of the SADiLaR-II project. The data is ...
- POS annotated corpus with 5 different text types for isiZulu
(Centre for Text Technology (CTexT), 2024-01-31) This is a POS annotated corpus with 5 different text types for isiZulu. The text types included are: - CAPS gr12 (Academic) - https://www.educat ...
- POS annotated corpus in 5 different genres for Sepedi
(Centre for Text Technology (CTexT), 2024-01-31) This corpus contains POS annotated data in 5 different genres for Sepedi. The text types included are: - CAPS gr12 (Academic) - https://www.educ ...
- Multilingual Linguistic Terminology
(UNISA, 2022-09-20) Multilingual Linguistic Terminology Project Termbanks of Linguistic terminology for South African languages Version 1.0 https://linguistictermino ...
- USAf National Language Resources Audit 2023
(South African Centre for Digital Language Resources, 2023-10) This report documents the findings of a comprehensive language resources audit conducted by the South African Centre for Digital Language Resources ...
- Generic Multilingual Academic Wordlists with Definitions
(SADiLaR; ICELDA, 2022) This multilingual generic academic wordlist has been developed to serve as a resource to students to assist with building a vocabulary and decoding ...
- NCHLT isiZulu word2vec-Skipgram embeddings
(North-West University; Centre for Text Technology (CTexT), 2023-05-01) Static word embeddings for the Skipgram flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides real-valued vector ...