Recently added

NCHLT Tshivenda Speech Corpus

Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue

Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.

NCHLT isiZulu Speech Corpus

Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue

Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.

NCHLT English Speech Corpus

Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue

Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.

NCHLT Setswana Speech Corpus

Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue

Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.

NCHLT Afrikaans Speech Corpus

Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue

Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.

NCHLT isiXhosa Speech Corpus

Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue

Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.

NCHLT Sesotho Speech Corpus

Charl van Heerden, et al. (Meraka Institute, CSIR; North-West University, 2014-07-08) ~ Resource Catalogue

Orthographically transcribed broadband speech corpus of approximately 56 hours, including a test suite of 8 speakers.

NCHLT isiZulu Lemmatiser

Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue

Lemmatiser developed during the NCHLT Text project. Available in the Readme.txt - Input format: Text data (encoding: UTF8 without BOM), one ...

NCHLT Afrikaans Lemmatiser

Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue

Lemmatiser developed during the NCHLT Text project. Available in the Readme.txt - Input format: Text data (encoding: UTF8 without BOM), one lowercase ...

NCHLT Part of Speech Taggers

Martin Puttkammer, et al. (North-West University; Centre for Text Technology (CTexT), 2014-05-30) ~ Resource Catalogue

Part of speech taggers developed during the NCHLT Text project. Available for the following languages: Afrikaans, English, isiNdebele, isiXhosa, isiZulu, ...
