Resource Index: Recent submissions
Autshumato English-Sesotho Parallel Corpora
(CTexT® (Centre for Text Technology, North-West University), 2022-09-30) Aligned parallel corpora for the language pair English-Sesotho. The data is given as two separate UTF-8 text files, with each aligned segment on a ...
Autshumato English-Sepedi Parallel Corpora
(CTexT® (Centre for Text Technology, North-West University), 2022-09-30) Aligned parallel corpora for the language pair English-Sepedi. The data is given as two separate UTF-8 text files, with each aligned segment on a newline. ...
Autshumato English-isiZulu Parallel Corpora
(CTexT® (Centre for Text Technology, North-West University), 2022-09-30) Aligned parallel corpora for the language pair English-isiZulu. The data is given as two separate UTF-8 text files, with each aligned segment on a ...
Autshumato English-Afrikaans Parallel Corpora
(CTexT® (Centre for Text Technology, North-West University), 2022-09-30) Aligned parallel corpora for the language pair English-Afrikaans. The data is given as two separate UTF-8 text files, with each aligned segment on a ...
Autshumato Monolingual isiNdebele Corpus
(North-West University; Centre for Text Technology (CTexT), 2021-01-31) Monolingual corpus for isiNdebele. The data is given as a single UTF-8 text file, with each segment on a newline.
Autshumato English-isiNdebele Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2021-01-31) Aligned parallel corpora for the following language pair: English-isiNdebele. Data was crawled from various multilingual government websites, sourced ...
Autshumato Monolingual Tshivenḓa Corpus
(North-West University; Centre for Text Technology (CTexT), 2020-09-30) Monolingual corpus for Tshivenḓa. The data is given as a single UTF-8 text file, with each segment on a newline.
Autshumato Monolingual Xitsonga Corpus
(CTexT® (Centre for Text Technology, North-West University), 2022-09-30) Monolingual corpus for Xitsonga. The data is given as a single UTF-8 text file, with each segment on a newline. The data was specifically selected and ...
Autshumato English-Tshivenḓa Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2020-09-30) Aligned parallel corpora for the following language pair: English-Tshivenḓa. Data was crawled from various multilingual government websites, sourced ...
Final year high school examination texts of South African home and first additional language subjects
(South African Centre for Digital Language Resources, 2022-11-16) This data collection consists of reading comprehension and summary writing texts. The texts comprise the final year high school exam texts for ...
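
Several of the parallel corpora listed above are described as two separate UTF-8 text files with each aligned segment on a newline. The sketch below shows one way such a pair of files might be read into aligned segment pairs, assuming that line i of one file corresponds to line i of the other. The function name read_parallel and the file names in the usage note are hypothetical and are not taken from any of the listed resources.

```python
from typing import List, Tuple


def read_parallel(src_path: str, tgt_path: str) -> List[Tuple[str, str]]:
    """Pair line i of the source file with line i of the target file.

    Assumption: both files are UTF-8 and hold one segment per line,
    in the same order. This is an illustrative helper, not part of
    any listed corpus distribution.
    """
    with open(src_path, encoding="utf-8") as src_file:
        # Strip only the line terminator so leading/trailing spaces survive.
        src_lines = [line.rstrip("\n") for line in src_file]
    with open(tgt_path, encoding="utf-8") as tgt_file:
        tgt_lines = [line.rstrip("\n") for line in tgt_file]

    # A length mismatch means the line-by-line alignment cannot be trusted.
    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"segment count mismatch: {len(src_lines)} vs {len(tgt_lines)}"
        )
    return list(zip(src_lines, tgt_lines))
```

With hypothetical file names, a call such as read_parallel("corpus.en.txt", "corpus.st.txt") would return a list of (English, Sesotho) tuples. Raising an error on a line-count mismatch is a deliberate choice here, since silently truncating to the shorter file could hide a misaligned download.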