    NCHLT Optical Character Recognition for South African Languages

    Download
    nchlt_optical_character_recognition.zip (103.8Mb)
    MD5: 024ec2bec53429e3aa221b4a916cd7e5
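The MD5 value above can be used to check that the archive downloaded intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_expected` are illustrative and not part of the resource, and the local file path is an assumption about where the archive was saved.

```python
import hashlib

# Checksum as listed on this page for nchlt_optical_character_recognition.zip.
EXPECTED_MD5 = "024ec2bec53429e3aa221b4a916cd7e5"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a ~100 MB archive is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()


# Usage (assumes the archive sits in the current directory):
# print(matches_expected("nchlt_optical_character_recognition.zip"))
```

A mismatch usually indicates an incomplete or corrupted download, in which case the archive should be fetched again.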

    License agreement

    By downloading this resource I accept and agree to the terms of use and the associated license conditions under which the resource is distributed.

    URI
    https://hdl.handle.net/20.500.12185/322
    Collections
    • Resource Catalogue [251]
    • Resource Index [409]
    Author(s)
    Martin Puttkammer
    Justin Hocking
    Roald Eiselen
    Description
    An OCR system is an application that converts scanned paper documents into editable, searchable text. The engine analyses the structure of the document image and divides the page into elements such as blocks of text, tables and images. Within these blocks it identifies character image patterns, which are used to advance several hypotheses about which character each pattern might be. These hypotheses are combined into alternative readings at character, word and line level, each with an associated probability. The set of hypotheses is then searched for the most likely combination of characters, words and lines, which is output as the textual representation of the image.
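The final search step described above can be illustrated with a toy example. The sketch below is not the engine distributed in this resource; it assumes that each character position comes with a small set of candidate characters and probabilities, and that a hypothetical function `bigram_logprob` scores how plausible one character is after another. It then uses a Viterbi-style dynamic programme to pick the most likely character sequence. All names and numbers are invented for illustration, and the sketch covers only the character level, not words or lines.

```python
import math


def best_path(hypotheses, bigram_logprob):
    """Return the most likely string given per-position character hypotheses.

    hypotheses: list of dicts, one per character position, mapping a
        candidate character to its recognition probability (> 0).
    bigram_logprob: function (previous_char, current_char) -> log probability.
    """
    if not hypotheses:
        return ""
    # best[c] = (log score, text) of the best sequence so far ending in c.
    best = {c: (math.log(p), c) for c, p in hypotheses[0].items()}
    for position in hypotheses[1:]:
        new_best = {}
        for c, p in position.items():
            # Choose the predecessor sequence that best explains candidate c.
            score, text = max(
                (prev_score + bigram_logprob(prev_text[-1], c), prev_text)
                for prev_score, prev_text in best.values()
            )
            new_best[c] = (score + math.log(p), text + c)
        best = new_best
    return max(best.values())[1]


def toy_bigram_logprob(prev, cur):
    """Hypothetical context model: favours the pairs 'ca' and 'at'."""
    return math.log(0.9) if prev + cur in {"ca", "at"} else math.log(0.1)


# The image evidence slightly prefers 'e' in the first position, but the
# context model makes 'c' the better choice overall.
candidates = [
    {"c": 0.45, "e": 0.55},
    {"a": 0.5, "o": 0.5},
    {"t": 0.7, "l": 0.3},
]
print(best_path(candidates, toy_bigram_logprob))  # prints "cat"
```

Working in log probabilities avoids numerical underflow on long lines, and keeping only the best sequence per final character keeps the search linear in the number of positions.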
    Contact person
    Martin Puttkammer
    Contact person's e-mail address
    Martin.Puttkammer@nwu.ac.za
    Publisher(s)
    North-West University
    Centre for Text Technology (CTexT)
    License
    Creative Commons Attribution 3.0 Unported License (CC BY 3.0)

    Copyright © 2018  SADiLaR. All Rights Reserved.
    Contact Us | Send Feedback