Unisa South African Spoken and Signed Language Corpus
This resource comprises annotated transcriptions of audio and video segments from the Xhosa section of the spoken corpus project SOUTHTALK (Southern African Spoken Language Corpus), carried out under the auspices of the University of South Africa and the University of Gothenburg.
Gideon Kotzé
University of South Africa
corpus; Xhosa; transcribed audio; transcribed video
Monolingual text
524 KB (audio); 800 KB (video), including all annotations and headers
5246 untokenized words (audio); 34432 untokenized words (video). These counts include the annotations embedded in the text, which were not removed.
Not yet available. However, the header section of each file records the gender, age, and level of education of each speaker.
