Title
Bilingual document image analysis with font independent Arabic text recognition
Publisher
Ahmed Mahmoud El-Gammal
Author
El-Gammal, Ahmed Mahmoud.
Preparation committee
Supervisor / Mohamed Abdel Hamid Ismail
Researcher / Ahmed Mahmoud El-Gammal
Examiner / Abdel Moneim Belal
Examiner / Magdy Nagi
Subject
Digital techniques. Image processing.
Publication date
1996
Number of pages
170 p.
Language
English
Degree
Master's
Specialization
Engineering
Approval date
1/1/1996
Place of approval
Alexandria University - Faculty of Engineering - Computers and Automatic Control

Abstract

This thesis focuses on two important components of bilingual (Arabic/Latin) document image analysis. The first is the construction of a font- and size-independent Arabic character recognition system. The second is the analysis of documents that contain a hybrid-language (Arabic/Latin) environment.
A structural analysis approach is used to find a set of structural descriptors that represent each script independently of the font or size used. This set of structural descriptors is quantified to obtain a set of features. Two different classifiers are used in the system: the first for script classification and the second for dot and diacritic classification. The results of the two classifiers are associated with each other according to topological and linguistic rules to generate character candidates. Final character recognition then takes place using a regular grammar that describes how each character is composed from the basic scripts.
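The idea of composing a character from a basic script plus its dots can be illustrated with a minimal sketch. The abstract does not list the thesis's script classes, dot classes, or grammar rules, so the labels `beh_body` and `hah_body`, the position names, and the function `compose_character` are all hypothetical, and a flat lookup table stands in for the regular grammar. Only the letter facts themselves (for example, that one dot below the shared body gives ب and two dots above give ت) come from the Arabic alphabet.

```python
from typing import Optional

# Hypothetical rule table: (script label, number of dots, dot position) -> character.
# The label and position names are assumptions made for illustration only.
COMPOSITION_RULES = {
    ("beh_body", 1, "below"): "ب",  # beh
    ("beh_body", 2, "above"): "ت",  # teh
    ("beh_body", 3, "above"): "ث",  # theh
    ("hah_body", 0, "none"): "ح",   # hah
    ("hah_body", 1, "below"): "ج",  # jeem
    ("hah_body", 1, "above"): "خ",  # khah
}


def compose_character(script_label: str, dot_count: int, dot_position: str) -> Optional[str]:
    """Return the character for a script/dot combination, or None if no rule matches."""
    return COMPOSITION_RULES.get((script_label, dot_count, dot_position))


if __name__ == "__main__":
    # Outputs of the two (hypothetical) classifiers are combined into one character.
    print(compose_character("beh_body", 2, "above"))
    print(compose_character("hah_body", 1, "above"))
```

Returning `None` for an unmatched combination lets a caller reject a candidate whose script and dot classifications are inconsistent, which is one plausible way such rules could prune character candidates.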
Because Arabic and Latin text have different characteristics, the language of the text must be identified prior to the recognition phase. A set of measures for discriminating between text written in Arabic and text written in English is suggested and evaluated. Solving this problem is essential to building a bilingual document analysis system capable of processing documents that mix Arabic and Latin text.
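The abstract does not say which measures the thesis uses. As an illustrative assumption only, the sketch below computes one measure of the kind often used for this task: the share of ink pixels that fall in the densest pixel row of a binarized text line. Connected Arabic text tends to concentrate ink along a single baseline, so a higher value may hint at Arabic. The function name `peak_row_fraction` and the input format (rows of 0/1 values, 1 meaning ink) are hypothetical, and no decision threshold is suggested here.

```python
from typing import Sequence


def peak_row_fraction(line_image: Sequence[Sequence[int]]) -> float:
    """Share of all ink pixels that lie in the single densest pixel row.

    line_image is a binarized text line given as rows of 0/1 values (1 = ink).
    Returns 0.0 for an image with no ink.
    """
    row_sums = [sum(row) for row in line_image]  # horizontal projection profile
    total_ink = sum(row_sums)
    if total_ink == 0:
        return 0.0
    return max(row_sums) / total_ink


if __name__ == "__main__":
    # Toy 3x4 line image: most ink sits in the middle row.
    toy_line = [
        [0, 0, 0, 0],
        [1, 1, 1, 1],
        [0, 1, 0, 0],
    ]
    print(peak_row_fraction(toy_line))
```

A single measure like this would not be reliable on its own; it is shown only to make concrete what a discriminating measure computed from a text line could look like.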