Mburisano Covid-19 multilingual corpus
Please do not copy the URL from the browser for citation. The correct URL is 'https://hdl.handle.net/20.500.12185/536'
dc.contact.email | laurette.p@gmail.com | en_ZA |
dc.contact.name | Laurette Marais | en_ZA |
dc.contributor.author | Marais, Laurette | |
dc.contributor.other | Wilken, Ilana | |
dc.contributor.other | Van Niekerk, Nina | |
dc.contributor.other | Calteaux, Karen | |
dc.date.accessioned | 2021-03-02T09:39:39Z | |
dc.date.available | 2021-03-02T09:39:39Z | |
dc.date.issued | 2020-12-04 | |
dc.description | This corpus was created to aid development of the AwezaMed Covid-19 speech-to-speech mobile application. The project within which it was created, Mburisano, was funded by the Department of Sport, Arts and Culture (DSAC). A selection of English sentences was generated in consultation with medical domain experts, and these sentences were manually translated into all other official South African languages. The sentences formed the basis for the rapid development of Grammatical Framework (GF) application grammars for all the languages, to aid spoken communication about Covid-19, with a particular focus on screening and triage. The corpus is presented as a limited-domain, manually translated parallel corpus in all 11 official South African languages. The AwezaMed Covid-19 application can be found [here](https://play.google.com/store/apps/details?id=za.co.aweza.covid19&gl=ZA). | en_ZA |
dc.format | csv | en_ZA |
dc.format.extent | 283 x 11 utterances | en_ZA |
dc.format.size | 150kB | en_ZA |
dc.identifier.uri | https://hdl.handle.net/20.500.12185/536 | |
dc.languages | Afrikaans | en_ZA |
dc.languages | English | en_ZA |
dc.languages | isiNdebele | en_ZA |
dc.languages | isiXhosa | en_ZA |
dc.languages | isiZulu | en_ZA |
dc.languages | Sepedi | en_ZA |
dc.languages | Setswana | en_ZA |
dc.languages | Sesotho | en_ZA |
dc.languages | Siswati | en_ZA |
dc.languages | Tshivenda | en_ZA |
dc.languages | Xitsonga | en_ZA |
dc.media.category | multilingual text corpus | en_ZA |
dc.media.type | Text | en_ZA |
dc.project | Mburisano | en_ZA |
dc.publisher | CSIR Voice Computing | en_ZA |
dc.rights.license | Creative Commons Attribution 3.0 Unported (CC BY 3.0): https://www.creativecommons.org/licenses/by/3.0/ | en_ZA |
dc.subject | Covid-19 | en_ZA |
dc.title | Mburisano Covid-19 multilingual corpus | en_ZA |
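The record describes the corpus as a csv file of 283 x 11 utterances, one column per language. As a minimal sketch of how such a parallel file might be read and checked for alignment, the following assumes a comma-delimited, UTF-8 file whose first row names the languages; none of this is confirmed by the record. The function name `load_parallel_corpus` and the placeholder cell values are hypothetical.

```python
import csv
import io


def load_parallel_corpus(handle):
    """Read a parallel-corpus CSV into {column_name: [utterance, ...]}.

    Assumption: the first row is a header naming each language column.
    """
    reader = csv.reader(handle)
    header = next(reader)
    columns = {name: [] for name in header}
    for row_number, row in enumerate(reader, start=2):
        # Every row must have one cell per language to stay aligned.
        if len(row) != len(header):
            raise ValueError(
                f"row {row_number}: expected {len(header)} cells, got {len(row)}"
            )
        for name, cell in zip(header, row):
            columns[name].append(cell)
    return columns


# Placeholder data standing in for the real file (not actual corpus content).
sample = io.StringIO("English,Afrikaans,isiZulu\nen-1,af-1,zu-1\nen-2,af-2,zu-2\n")
corpus = load_parallel_corpus(sample)
for language, utterances in corpus.items():
    print(language, len(utterances))
```

For the downloaded file, the same function could be called on a handle opened with `open(path, encoding="utf-8", newline="")`, where `path` is wherever the csv was saved. If the layout matches the assumptions, each of the 11 columns should hold 283 utterances.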
Files
License bundle
- Name: license.txt
- Size: 3.23 KB
- Format: Item-specific license agreed upon to submission