Multilingual spelling checker lexicons
Please do not copy the URL from the browser for citation. The correct URL is 'https://hdl.handle.net/20.500.12185/694'
| dc.contact.email | martin.puttkammer@nwu.ac.za | |
| dc.contact.name | Martin Puttkammer | |
| dc.contributor.author | Centre for Text Technology, CTexT® | |
| dc.date.accessioned | 2025-11-21T14:11:54Z | |
| dc.date.available | 2025-11-21T14:11:54Z | |
| dc.date.issued | 2022-06-30 | |
| dc.description | Spelling checker lexicons for 10 South African languages. The lexicons were created by collecting data from various sources and were manually reviewed by language experts according to the standard written orthography. For each language there are four lexicon files: (1) abbreviations.<lang>.txt: abbreviations and abbreviation compounds; (2) lowercase.<lang>.txt: words that are correct when written in lower case; (3) offensive.<lang>.txt: words that are potentially offensive, obscene, or racist, or that should not be suggested by a spelling checker for some other reason; (4) uppercase.<lang>.txt: words that should only be written with one or more capitalised characters, such as person and place names. | |
| dc.format | text | |
| dc.format.medium | N/A | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12185/694 | |
| dc.languages | Afrikaans | |
| dc.languages | isiNdebele | |
| dc.languages | isiXhosa | |
| dc.languages | isiZulu | |
| dc.languages | Sepedi | |
| dc.languages | Setswana | |
| dc.languages | Sesotho | |
| dc.languages | Siswati | |
| dc.languages | Tshivenda | |
| dc.languages | Xitsonga | |
| dc.languages.other | N/A | |
| dc.media.category | multilingual lexicons | |
| dc.media.type | Text | |
| dc.project | CTexT Spelling Checkers | |
| dc.publisher | CTexT® (Centre for Text Technology) | |
| dc.rights.license | Creative Commons Attribution 4.0 International | |
| dc.subject | multilingual | |
| dc.subject | lexicon | |
| dc.subject | offensive | |
| dc.subject | abbreviations | |
| dc.title | Multilingual spelling checker lexicons | |
| dc.version | 1.0 |
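
The four-file layout in the description suggests a simple way to use the lexicons in a checker. The following is a minimal sketch, not the CTexT implementation. It assumes that lexicons.zip has been extracted into a single directory, that each file is UTF-8 text with one entry per line, and that `<lang>` is whatever language tag the file names actually use; none of these details is stated in this record. The names `load_lexicons`, `is_accepted`, and `may_suggest` are hypothetical, and the case-handling policy is an illustrative choice.

```python
from pathlib import Path

# The four file categories named in the dataset description.
CATEGORIES = ("abbreviations", "lowercase", "offensive", "uppercase")


def load_lexicons(directory, lang):
    """Load the four lexicon files for one language into sets of entries.

    Assumes files named <category>.<lang>.txt, UTF-8, one entry per line.
    A missing file yields an empty set rather than an error.
    """
    lexicons = {}
    for category in CATEGORIES:
        path = Path(directory) / f"{category}.{lang}.txt"
        if path.exists():
            with path.open(encoding="utf-8") as handle:
                lexicons[category] = {line.strip() for line in handle if line.strip()}
        else:
            lexicons[category] = set()
    return lexicons


def is_accepted(word, lexicons):
    """Return True if the word is spelled correctly under an assumed policy.

    Entries in the uppercase and abbreviations files must match exactly;
    entries in the lowercase file are also accepted when capitalised
    (for example at the start of a sentence).
    """
    if word in lexicons["uppercase"] or word in lexicons["abbreviations"]:
        return True
    return word.lower() in lexicons["lowercase"]


def may_suggest(word, lexicons):
    """Return False for words listed in the offensive file."""
    offensive = lexicons["offensive"]
    return word not in offensive and word.lower() not in offensive
```

A checker built this way would call `is_accepted` to flag misspellings and filter its candidate corrections through `may_suggest`, so that offensive entries are never offered as suggestions.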
Files
Original bundle
- Name: lexicons.zip
- Size: 4.73 MB
- Format: ZIP, an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed.
License bundle
- Name: license.txt
- Size: 3.22 KB
- Format: Item-specific license agreed to upon submission


