Browsing Resource Catalogue by Title

African Wordnet: Setswana 1.0

African Wordnet Project (UNISA, 2017-06-20) ~ Resource Catalogue

Developed using the expand model with Princeton WordNet 2.0 as basis. Each wordnet contains synsets with at least the following fields: Word form (lemma; ...

African Wordnet: Tshivenda 1.0

African Wordnet Project (UNISA, 2017-06-20) ~ Resource Catalogue

Developed using the expand model with Princeton WordNet 2.0 as basis. Each wordnet contains synsets with at least the following fields: Word form (lemma; ...

Afrikaans Custom Dictionary for Government Domain

Gerhard van Huyssteen, et al. (North-West University; Centre for Text Technology (CTexT), 2013-02-22) ~ Resource Catalogue

Word list developed as a custom dictionary for use in the spelling checkers as part of the spelling checker project for the Department of Arts and ...

Afrikaans Genre Classification Corpus

Gerhard van Huyssteen, et al. (Trifonius, 2013-06-19) ~ Resource Catalogue

Contains training and testing data for Genre Classification for Afrikaans.

Afrikaans linking element dataset

Trollip, EB (North-West University, 2019) ~ Resource Catalogue

(Afrikaans follows English) This data set was compiled for a study in which the possible semantic content of Afrikaans linking elements was investigated. ...

Afrikaans speaking children's first lexical items

Brink, Nina (North-West University, 2018-05-17) ~ Resource Catalogue

Data collected for a master's study in Afrikaans linguistics. The data consist of the first lexical items of 21 Afrikaans-speaking children. The lexical ...

Afrikaans text unit identification data

Puttkammer, Martin (Centre for Text Technology, North-West University, 2006) ~ Resource Catalogue

This dataset was developed during a master's degree and used in the development of a text unit identifier capable of tagging sentences, named entities, ...

AuCoPro Semantics Dataset

Gerhard van Huyssteen, et al. (North-West University; Centre for Text Technology (CTexT); CLiPS Research Center, University of Antwerp, Belgium, 2015-01-07) ~ Resource Catalogue

The AuCoPro Semantics dataset serves for the automatic semantic analysis of compounds. It contains semantically annotated noun-noun compounds (NN) from ...

AuCoPro Splitting Dataset

Gerhard van Huyssteen, et al. (North-West University; Centre for Text Technology (CTexT); Tilburg Centre for Cognition and Communication, 2015-01-07) ~ Resource Catalogue

The AuCoPro Splitting dataset contains compounds annotated with their compound boundaries and linking morphemes for Afrikaans and Dutch.

Autshumato Afrikaans-English Translation Memory

Cindy McKellar, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue

Translation memory from Afrikaans to English (EN-GB), in the government domain for use in the Autshumato ITE application.

Autshumato English-Afrikaans Parallel Corpora

D.P. Snyman, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue

Parallel corpora aligned at sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from ...

Autshumato English-Afrikaans Translation Memory

Cindy McKellar, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue

Translation memory from English (EN-GB) to Afrikaans, in the government domain for use in the Autshumato ITE application.

Autshumato English-isiZulu Parallel Corpora

D.P. Snyman, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue

Parallel corpora aligned at sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from ...

Autshumato English-isiZulu Translation Memory

Cindy McKellar, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue

Translation memory from English (EN-GB) to isiZulu, in the government domain for use in the Autshumato ITE application.

Autshumato English-Sesotho sa Leboa Parallel Corpora

D.P. Snyman, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue

Parallel corpora aligned at sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from ...

Autshumato English-Sesotho sa Leboa Translation Memory

Cindy McKellar, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue

Translation memory from English (EN-GB) to Sesotho sa Leboa, in the government domain for use in the Autshumato ITE application.

Autshumato English-Setswana Parallel Corpora

Cindy McKellar (North-West University; Centre for Text Technology (CTexT), 2016-10-28) ~ Resource Catalogue

Aligned English-Setswana parallel corpus. This set contains data that was translated by professional translators, data that was sourced as translated ...

Autshumato English-Tshivenḓa Parallel Corpora

McKellar, Cindy (North-West University; Centre for Text Technology (CTexT), 2023-12-12)

Aligned parallel corpora for the following language pair: English-Tshivenḓa. Data was crawled from various multilingual government websites, sourced ...

Autshumato English-Xitsonga Manually Translated Parallel Corpora

Wikus Pienaar, et al. (North-West University; Centre for Text Technology (CTexT), 2014-12-12) ~ Resource Catalogue

Aligned English-Xitsonga parallel corpus. The data is given as two separate UTF-8 text files, with each segment on a new line.

Autshumato English-Xitsonga Parallel Corpora

Wikus Pienaar, et al. (North-West University; Centre for Text Technology (CTexT), 2014-12-11) ~ Resource Catalogue

Aligned English-Xitsonga parallel corpus. The data is given as two separate UTF-8 text files, with each segment on a new line.
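
The English-Xitsonga entries above describe their layout as two UTF-8 text files with one segment per line. A minimal sketch of how such a pair of files might be read into aligned segment pairs follows. It assumes that line N of one file corresponds to line N of the other, which the catalogue text implies but does not state outright. The function name and any file names are hypothetical and are not taken from the catalogue.

```python
def read_parallel_segments(source_path, target_path):
    """Pair up segments from two line-aligned UTF-8 text files.

    Assumption (not confirmed by the catalogue): line N of the source
    file is the translation counterpart of line N of the target file.
    """
    with open(source_path, encoding="utf-8") as source_file:
        source_segments = [line.rstrip("\n") for line in source_file]
    with open(target_path, encoding="utf-8") as target_file:
        target_segments = [line.rstrip("\n") for line in target_file]

    # A differing line count means the files cannot be paired line by line.
    if len(source_segments) != len(target_segments):
        raise ValueError(
            f"Segment counts differ: {len(source_segments)} source "
            f"vs {len(target_segments)} target"
        )

    return list(zip(source_segments, target_segments))
```

Calling `read_parallel_segments("english.txt", "xitsonga.txt")` with the two downloaded files (names here are placeholders) would return a list of `(english, xitsonga)` tuples. The explicit length check is a design choice: it fails loudly on a misaligned download rather than silently truncating to the shorter file.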