License: Creative Commons Attribution 4.0 International
Authors: McKellar, Cindy; Puttkammer, Martin; Gaustad, Tanja; van Heerden, Jacques; Gent, Sunny
Dates: 2022-12-15; 2022-12-15; 2020-09-30
Identifier: https://hdl.handle.net/20.500.12185/569
Description: Aligned parallel corpora for the following language pair: English-Tshivenḓa. Data was crawled from various multilingual government websites, sourced from translated material and created by translating English sentences into Tshivenḓa. The data is given as two separate UTF-8 text files, with each aligned segment on a newline.
Format: Txt
Extent: Segments: 124,791; English Words: 2,003,583; Tshivenḓa Words: 2,523,402
Keywords: Autshumato; Tshivenḓa; Parallel Corpora
Title: Autshumato English-Tshivenḓa Parallel Corpora
Size: 10 Mb
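
The description says the corpus ships as two separate UTF-8 text files with one aligned segment per line. A minimal Python sketch of reading such a pair follows. It assumes that line N of the English file corresponds to line N of the Tshivenḓa file; the function name `read_aligned_pairs` and any file names passed to it are hypothetical, since the record does not name the files.

```python
from itertools import zip_longest
from typing import Iterator, Tuple


def read_aligned_pairs(src_path: str, tgt_path: str) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) segments from two line-aligned UTF-8 text files.

    Assumption: segment N in the source file aligns with segment N in the
    target file, so both files must have the same number of lines.
    """
    with open(src_path, encoding="utf-8") as src, \
         open(tgt_path, encoding="utf-8") as tgt:
        # zip_longest pads the shorter file with None, which lets us detect
        # a length mismatch instead of silently truncating.
        for line_no, (s, t) in enumerate(zip_longest(src, tgt), start=1):
            if s is None or t is None:
                raise ValueError(f"files differ in length at line {line_no}")
            # Strip only the trailing newline; keep other whitespace intact.
            yield s.rstrip("\n"), t.rstrip("\n")
```

Calling `read_aligned_pairs` with the paths of the two downloaded files yields one tuple per segment. If the files are intact, counting the tuples should give the 124,791 segments stated in the record.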