Accomplishments
An Overview of Feature Extraction Techniques in OCR for Indian Scripts Focused on Offline Handwriting
- Abstract
Optical Character Recognition (OCR) is an interesting and challenging field of research in pattern recognition, artificial intelligence and machine vision, and it is used in many real-life applications. OCR is a type of document analysis in which a scanned document image containing either machine-printed or handwritten script is input to an OCR software engine and translated into an editable, machine-readable digital text format. With the spread of computers in the public and private sectors and in individual homes, automatic processing of tabular application forms, bank cheques, tax forms, census forms and postal mail has gained importance. Such automation requires research and development in handwritten character and numeral recognition for different languages and scripts. The field of OCR is divided into two parts: recognition of machine-printed characters and recognition of handwritten characters. Recognizing handwritten text is an important area of research because of its many potential applications. Feature extraction is a very important step in the OCR process. This manuscript presents a comparative review of the different feature extraction techniques used in OCR.