
    • Ex Machina: Using NLP and statistical learning models to model eyewitness statements and choosing behaviour 

      Nortje, Alicia, et al. (SADiLaR, 2019-05-07)
      This curated database includes data from various empirical studies in which eyewitness statements and descriptions were collected. The original studies, ...
    • Autshumato English-Tshivenḓa Parallel Corpora 

      McKellar, Cindy (North-West University; Centre for Text Technology (CTexT), 2023-12-12)
      Aligned parallel corpora for the following language pair: English-Tshivenḓa. Data was crawled from various multilingual government websites, sourced ...
    • Autshumato Monolingual Tshivenḓa Corpus 

      McKellar, Cindy (North-West University; Centre for Text Technology (CTexT), 2023-12-12)
      Monolingual corpus for Tshivenḓa. The data is given as a single UTF-8 text file, with each segment on a new line.
    • Morphologically annotated corpus for isiNdebele 

      Gaustad, Tanja (Centre for Text Technology (CTexT), 2024-01-31)
      NCHLT corpus of morphologically annotated tokens in isiNdebele converted to the tags used during phases 1 and 2 of the SADiLaR-II project. The data ...
    • Morphologically annotated corpus for isiXhosa 

      Gaustad, Tanja (Centre for Text Technology (CTexT), 2024-01-31)
      NCHLT corpus of morphologically annotated tokens in isiXhosa converted to the tags used during phases 1 and 2 of the SADiLaR-II project. The data is ...
    • Morphologically annotated corpus for isiZulu 

      Gaustad, Tanja (Centre for Text Technology (CTexT), 2024-01-31)
      NCHLT corpus of morphologically annotated tokens in isiZulu converted to the tags used during phases 1 and 2 of the SADiLaR-II project. The data is ...
    • Morphologically annotated corpus for Siswati 

      Gaustad, Tanja (Centre for Text Technology (CTexT), 2024-01-31)
      NCHLT corpus of morphologically annotated tokens in Siswati converted to the tags used during phases 1 and 2 of the SADiLaR-II project. The data is ...
    • Morphologically annotated corpus for Sesotho 

      Gaustad, Tanja (Centre for Text Technology (CTexT), 2024-01-31)
      NCHLT corpus of morphologically annotated tokens in Sesotho converted to the tags used during phases 1 and 2 of the SADiLaR-II project. The data is ...
    • Morphologically annotated corpus for Sepedi 

      Gaustad, Tanja (Centre for Text Technology (CTexT), 2024-01-31)
      NCHLT corpus of morphologically annotated tokens in Sepedi converted to the tags used during phases 1 and 2 of the SADiLaR-II project. The data is ...
    • Morphologically annotated corpus for Setswana 

      Gaustad, Tanja (Centre for Text Technology (CTexT), 2024-01-31)
      NCHLT corpus of morphologically annotated tokens in Setswana converted to the tags used during phases 1 and 2 of the SADiLaR-II project. The data is ...