Title: NCHLT Sesotho word2vec-CBOW embeddings
Authors: Roald Eiselen; Rico Koen; Albertus Kruger; Jacques van Heerden
License: Creative Commons Attribution 4.0 International (CC-BY 4.0)
Dates: 2023-05-01; 2023-07-28
Identifier: https://hdl.handle.net/20.500.12185/650
Language: st
Type: Modules
Size: 75.87 MB (zipped)

Description: Static word embeddings trained with the continuous bag-of-words (CBoW) variant of the word2vec (w2v) architecture (Mikolov et al., 2013). The embeddings provide real-valued vector representations for Sesotho text.

Training data:
Paragraphs: 535,853
Token count: 17,425,650
Vocab size: 34,888
Embedding dimensions: 600
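The record does not say how the vectors are stored inside the archive. As a minimal sketch, assuming the file uses the common word2vec text layout (a header line with vocabulary size and dimension count, then one token and its values per line), the embeddings could be read and compared as below. The function names, the toy sample, and the placeholder tokens `w1` to `w3` are illustrative and not taken from the resource; the float32 storage assumption in the size estimate is likewise a guess.

```python
import io
import math


def read_word2vec_text(stream):
    """Parse word2vec text format: "<vocab> <dims>" header, then
    one "<token> <v1> ... <vN>" line per word (assumed layout)."""
    header = stream.readline().split()
    dims = int(header[1])
    vectors = {}
    for line in stream:
        parts = line.rstrip().split(" ")
        if len(parts) != dims + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dims, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def dense_size_bytes(vocab_size, dims, bytes_per_value=4):
    """Size of a dense embedding matrix; 4 bytes assumes float32 storage."""
    return vocab_size * dims * bytes_per_value


# Toy 3-dimensional sample with placeholder tokens (not real Sesotho data).
SAMPLE = "3 3\nw1 1.0 0.0 0.0\nw2 0.8 0.6 0.0\nw3 0.0 0.0 1.0\n"
dims, vectors = read_word2vec_text(io.StringIO(SAMPLE))
similarity = cosine(vectors["w1"], vectors["w2"])

# Figures from the record: 34,888 tokens, 600 dimensions.
estimated_bytes = dense_size_bytes(34_888, 600)
```

With the figures in the record, the estimate comes to 83,731,200 bytes for the raw matrix under the float32 assumption, which is in the same range as the 75.87 MB zipped download. For the real file, replace the in-memory sample with an opened file handle once the actual storage format has been checked.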