Now showing items 21-40 of 349

    • African Wordnet: Setswana 1.0 

      African Wordnet Project (UNISA, 2017-06-20) ~ Resource Catalogue
      Developed using the expand model with Princeton WordNet 2.0 as basis. Each wordnet contains synsets with at least the following fields: Word form (lemma; ...
    • African Wordnet: Tshivenda 1.0 

      African Wordnet Project (UNISA, 2017-06-20) ~ Resource Catalogue
      Developed using the expand model with Princeton WordNet 2.0 as basis. Each wordnet contains synsets with at least the following fields: Word form (lemma; ...
    • Afrikaans Custom Dictionary for Government Domain 

      Gerhard van Huyssteen, et al. (North-West University; Centre for Text Technology (CTexT), 2013-02-22) ~ Resource Catalogue
      Word list developed as a custom dictionary for use in the spelling checkers as part of the spelling checker project for the Department of Arts and ...
    • Afrikaans Genre Classification Corpus 

      Gerhard van Huyssteen, et al. (Trifonius, 2013-06-19) ~ Resource Catalogue
      Contains training and testing data for Genre Classification for Afrikaans.
    • Afrikaans linking element dataset 

      Trollip, EB (North-West University, 2019) ~ Resource Catalogue
      (Afrikaans follows English) This data set was compiled for a study in which the possible semantic content of Afrikaans linking elements was investigated. ...
    • Afrikaans speaking children's first lexical items 

      Brink, Nina (North-West University, 2018-05-17) ~ Resource Catalogue
      Data collected for a master's study in Afrikaans linguistics. The data consist of the first lexical items of 21 Afrikaans speaking children. The lexical ...
    • Afrikaans text unit identification data 

      Puttkammer, Martin (Centre for Text Technology, North-West University, 2006) ~ Resource Catalogue
      This dataset was developed during a master's degree and used in the development of a text unit identifier capable of tagging sentences, named-entities, ...
    • AuCoPro Semantics Dataset 

      Gerhard van Huyssteen, et al. (North-West University; Centre for Text Technology (CTexT); CLiPS Research Center, University of Antwerp, Belgium, 2015-01-07) ~ Resource Catalogue
      The AuCoPro Semantics dataset serves for the automatic semantic analysis of compounds. It contains semantically annotated noun-noun compounds (NN) from ...
    • AuCoPro Splitting Dataset 

      Gerhard van Huyssteen, et al. (North-West University; Centre for Text Technology (CTexT); Tilburg Centre for Cognition and Communication, 2015-01-07) ~ Resource Catalogue
      The AuCoPro Splitting dataset contains compounds annotated with their compound boundaries and linking morphemes for Afrikaans and Dutch.
    • Autshumato Afrikaans-English Translation Memory 

      Cindy McKellar, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue
      Translation memory from Afrikaans to English (EN-GB), in the government domain for use in the Autshumato ITE application.
    • Autshumato English-Afrikaans Parallel Corpora 

      D.P. Snyman, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue
      Parallel corpora aligned on sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from ...
    • Autshumato English-Afrikaans Translation Memory 

      Cindy McKellar, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue
      Translation memory from English (EN-GB) to Afrikaans, in the government domain for use in the Autshumato ITE application.
    • Autshumato English-isiZulu Parallel Corpora 

      D.P. Snyman, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue
      Parallel corpora aligned on sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from ...
    • Autshumato English-isiZulu Translation Memory 

      Cindy McKellar, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue
      Translation memory from English (EN-GB) to isiZulu, in the government domain for use in the Autshumato ITE application.
    • Autshumato English-Sesotho sa Leboa Parallel Corpora 

      D.P. Snyman, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue
      Parallel corpora aligned on sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from ...
    • Autshumato English-Sesotho sa Leboa Translation Memory 

      Cindy McKellar, et al. (North-West University; Centre for Text Technology (CTexT), 2013-06-19) ~ Resource Catalogue
      Translation memory from English (EN-GB) to Sesotho sa Leboa, in the government domain for use in the Autshumato ITE application.
    • Autshumato English-Setswana Parallel Corpora 

      Cindy McKellar (North-West University; Centre for Text Technology (CTexT), 2016-10-28) ~ Resource Catalogue
      Aligned English-Setswana parallel corpus. This set contains data that was translated by professional translators, data that was sourced as translated ...
    • Autshumato English-Tshivenḓa Parallel Corpora 

      McKellar, Cindy (North-West University; Centre for Text Technology (CTexT), 2023-12-12)
      Aligned parallel corpora for the following language pair: English-Tshivenḓa. Data was crawled from various multilingual government websites, sourced ...
    • Autshumato English-Xitsonga Manually Translated Parallel Corpora 

      Wikus Pienaar, et al. (North-West University; Centre for Text Technology (CTexT), 2014-12-12) ~ Resource Catalogue
      Aligned English-Xitsonga parallel corpus. The data is given as two separate UTF-8 text files, with each segment on a new line.
    • Autshumato English-Xitsonga Parallel Corpora 

      Wikus Pienaar, et al. (North-West University; Centre for Text Technology (CTexT), 2014-12-11) ~ Resource Catalogue
      Aligned English-Xitsonga parallel corpus. The data is given as two separate UTF-8 text files, with each segment on a new line.
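
The two-file layout described for the English-Xitsonga corpora (two separate UTF-8 text files, one segment per line) could be read into sentence pairs roughly as follows. This is a minimal sketch, not part of the catalogue: the function name `read_aligned_pairs` and the file names in the usage note are hypothetical, and it assumes that line N of one file corresponds to line N of the other.

```python
from typing import List, Tuple


def read_segments(path: str) -> List[str]:
    """Read one UTF-8 text file and return its segments, one per line."""
    with open(path, encoding="utf-8") as handle:
        # Strip only the line terminator so that any spacing inside a
        # segment is preserved.
        return [line.rstrip("\n") for line in handle]


def read_aligned_pairs(source_path: str, target_path: str) -> List[Tuple[str, str]]:
    """Pair up segments from two line-aligned files (hypothetical helper).

    Assumption: segment N in the source file is the translation
    counterpart of segment N in the target file.
    """
    source = read_segments(source_path)
    target = read_segments(target_path)
    if len(source) != len(target):
        # A length mismatch means the files cannot be line-aligned as assumed.
        raise ValueError(
            f"segment counts differ: {len(source)} vs {len(target)}"
        )
    return list(zip(source, target))
```

A call such as `read_aligned_pairs("corpus.en.txt", "corpus.ts.txt")` (placeholder file names) would return a list of (English, Xitsonga) tuples. Raising an error on a length mismatch is a design choice: silently truncating to the shorter file would hide alignment problems.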