OCR gremlins

From SikhiWiki

When old articles in English are converted into electronic form using optical character recognition (OCR), various errors are common. This article lists some of the errors that have been detected so far:

Real character | Erroneous OCR character | Examples
m | rn (lowercase "RN") | "him" is recognised as "hirn"; also "learn" as "leam"
th | dl | "this" is recognised as "dhis"
in | m | "insert" is recognised as "msert"
ri | n | "rift" is recognised as "nft"
u | ii | "must" is recognised as "miist"; also "ii)." as "u)."
i | l (lowercase "L") | "missed" is recognised as "mlssed"
1 | l (lowercase "L") | "145" is recognised as "l45"; also "learned" as "1earned"
Z | 2 | "4325" is recognised as "43Z5"; also "Zebra" as "2ebra"
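
The confusions in the table above can be used to suggest corrections for words that look wrong. The following is a minimal Python sketch, not an established tool: the names `CONFUSIONS`, `candidates`, `suggest` and the small `KNOWN_WORDS` list are hypothetical and chosen for illustration. It assumes each bad word contains only one misread, and that a list of correct words is available to check the candidates against.

```python
# Confusion pairs from the table above, written as (misread text, intended text).
# Pairs appear in both directions where the table's examples show both.
CONFUSIONS = [
    ("rn", "m"), ("m", "rn"),
    ("dl", "th"),
    ("m", "in"),
    ("n", "ri"),
    ("ii", "u"), ("u", "ii"),
    ("l", "i"),
    ("l", "1"), ("1", "l"),
    ("Z", "2"), ("2", "Z"),
]

# Hypothetical word list; a real check would load a full dictionary.
KNOWN_WORDS = {"him", "learn", "insert", "rift", "must", "missed",
               "learned", "Zebra"}


def candidates(token):
    """Return every string made by undoing one confusion at one position."""
    found = set()
    for wrong, right in CONFUSIONS:
        start = token.find(wrong)
        while start != -1:
            found.add(token[:start] + right + token[start + len(wrong):])
            start = token.find(wrong, start + 1)
    return found


def suggest(token, known_words=KNOWN_WORDS):
    """Suggest known words for a token; a token that is already known needs none."""
    if token in known_words:
        return []
    return sorted(c for c in candidates(token) if c in known_words)


print(suggest("hirn"))     # ['him']
print(suggest("msert"))    # ['insert']
print(suggest("1earned"))  # ['learned']
```

Checking candidates against a word list matters because most of these substitutions are only wrong in context: "m" is usually a genuine "m", so blindly replacing every occurrence would damage correct text.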