Ocr gremlins

From SikhiWiki
Revision as of 13:00, 20 March 2010 by Hari singh (talk | contribs)

When old articles in English are converted into electronic form using optical character recognition (OCR), certain errors are common. This article lists some of the errors that have been detected so far:

Real character | Erroneous OCR character | Examples
m | rn (lowercase "RN") | "him" is recognised as "hirn"; also "learn" as "leam"
th | dl | "this" is recognised as "dlis"
in | m | "insert" is recognised as "msert"
ri | n | "rift" is recognised as "nft"
u | ii | "must" is recognised as "miist"; also "ii)." as "u)."
i | l (lowercase "L") | "missed" is recognised as "mlssed"
1 | l (lowercase "L") | "145" is recognised as "l45"; also "learned" as "1earned"
Z | 2 | "Zebra" is recognised as "2ebra"; also "4325" as "43Z5"
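
The confusions listed in the table above can be used to suggest corrections automatically. The following is only a minimal sketch in Python, not a method described on this page: the names CONFUSIONS, candidates and suggest are hypothetical, the word list is an assumption supplied by the caller, and only one confusion is undone per word.

```python
# Confusion pairs taken from the table: (what the OCR printed, what was meant).
# Several pairs appear in both directions because the examples show
# the mix-up happening either way.
CONFUSIONS = [
    ("rn", "m"), ("m", "rn"),
    ("dl", "th"),
    ("m", "in"),
    ("n", "ri"),
    ("ii", "u"), ("u", "ii"),
    ("l", "i"),
    ("l", "1"), ("1", "l"),
    ("2", "Z"), ("Z", "2"),
]


def candidates(word):
    """Return every string obtained by undoing one confusion at one position."""
    found = set()
    for wrong, right in CONFUSIONS:
        start = word.find(wrong)
        while start != -1:
            found.add(word[:start] + right + word[start + len(wrong):])
            start = word.find(wrong, start + 1)
    return found


def suggest(word, vocabulary):
    """Keep only the candidates found in a known word list (assumed input)."""
    vocabulary = set(vocabulary)
    if word in vocabulary:
        return [word]  # already a known word, nothing to repair
    return sorted(candidates(word) & vocabulary)


words = {"him", "learn", "must", "insert", "rift", "missed"}
print(suggest("hirn", words))   # ['him']
print(suggest("leam", words))   # ['learn']
print(suggest("miist", words))  # ['must']
```

Checking candidates against a word list matters because most substitutions produce nonsense (for example "leam" also yields "1eam"). A word damaged in two places would need the substitutions applied repeatedly, which this sketch does not attempt.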