Title
EXTRACTING A PHYSICAL DATA MODEL BASED ON SOME DATA MINING TECHNIQUES IN AIRLINES INDUSTRY
Author
Badawi, Hanaa Maher Mohamed Mohamed.
Preparation committee
Researcher / Hanaa Maher Mohamed Mohamed Badawi
Supervisor / هيام عبد العزيز علي الزاهد
Supervisor / شاهيناز محمود الطباخ
Examiner / هيام عبد العزيز علي الزاهد
Publication date
2018.
Number of pages
97 p.
Language
English
Degree
Master's
Specialization
Information Systems
Approval date
1/1/2018
Place of approval
Ain Shams University - Faculty of Women - Physics - Computer and Its Applications
Index
Only 14 of 97 pages are available for public view

Abstract

In the airline industry, large-scale data is generated every hour. It may be in a structured format (uniform data tables) or an unstructured format (emails, SMS, photos, tweets, etc.), and it is distributed among many repositories. Extracting useful information from these enormous data sources is essential for building promising marketing strategies, and the datasets have to be combined to build a successful marketing plan. Understanding customers' needs is crucial to business growth. However, some customers experience a problem during their journey, for example a flight delay at the station, a fraudulent payment in the booking process, or missing bags on arrival. These customers are becoming a critical obstacle on the road to success in today's airline business.
Data mining can help obtain useful knowledge from customer data, for example by extracting information from customer emails or by predicting flight delays. In this work we study two issues that strongly affect customer satisfaction as subjects of the data mining process: flight delay, as a sample of structured data, and grievance handling, as a sample of unstructured data.
In part one, we focused on applying classification techniques to analyse the flight delay pattern in Egypt Airline's flight dataset. We compared eight classifier algorithms from the WEKA data mining tool: four decision-tree-based classifiers (REPTree, Forest, Stump and J48) and four rule-based classifiers (PART, DecisionTable, OneR and JRip). Among the four decision-tree classifiers, REPTree had the best accuracy (80.3%) compared with Forest, Stump and J48. Among the four rule-based classifiers, PART provided the best accuracy (83.1%).
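To illustrate what a rule-based classifier of this kind does, here is a minimal stand-alone sketch of the OneR idea in Python: choose the single attribute whose "value -> majority class" rule makes the fewest training errors, then measure accuracy. This is not WEKA's implementation and not the thesis code. The attribute names (`period`, `weather`, `route`), the labels, and all records are hypothetical, invented only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy records (invented for illustration, not from the thesis dataset).
FLIGHTS = [
    ({"period": "morning", "weather": "clear", "route": "domestic"}, "on_time"),
    ({"period": "morning", "weather": "storm", "route": "domestic"}, "delayed"),
    ({"period": "evening", "weather": "clear", "route": "international"}, "on_time"),
    ({"period": "evening", "weather": "storm", "route": "international"}, "delayed"),
    ({"period": "morning", "weather": "clear", "route": "international"}, "on_time"),
    ({"period": "evening", "weather": "storm", "route": "domestic"}, "delayed"),
    ({"period": "evening", "weather": "clear", "route": "domestic"}, "on_time"),
    ({"period": "morning", "weather": "storm", "route": "international"}, "delayed"),
]


def one_r(records):
    """Return (attribute, rule, default_label) for the best single-attribute rule."""
    best = None
    for attr in sorted(records[0][0]):
        # Count class labels for every value of this attribute.
        counts = defaultdict(Counter)
        for features, label in records:
            counts[features[attr]][label] += 1
        # Each attribute value predicts its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(label != rule[features[attr]] for features, label in records)
        if best is None or errors < best[0]:
            best = (errors, attr, rule)
    # Fall back to the overall majority class for unseen attribute values.
    default = Counter(label for _, label in records).most_common(1)[0][0]
    return best[1], best[2], default


def predict(model, features):
    attr, rule, default = model
    return rule.get(features[attr], default)


def accuracy(model, records):
    hits = sum(predict(model, features) == label for features, label in records)
    return hits / len(records)


model = one_r(FLIGHTS)
train_accuracy = accuracy(model, FLIGHTS)
```

On this toy data the rule is built on `weather`, because that attribute alone separates the two labels. Accuracy is measured here on the training records only; a real comparison would evaluate on held-out data or with cross-validation.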
In part two, we built a model to extract useful information from customer grievance data, to be used as a business guide. The customer grievance system at EGYPTAIR, called We-care, receives large feeds of data that are collected into datasets through various channels such as e-mail, the website or mobile app, call-centre phone calls, etc. The incoming data are then analysed and assessed by the organization's support teams in order to fulfil the grievance request. Each grievance is assigned to the related department through a manual classification system, and finally a solution to the issue is provided. Because grievance categorization was handled manually, it was a time-consuming process; we decided to improve the We-care system's workflow by categorizing grievances automatically, for better performance. Classification methods are used to sort the data into categories across the various touch points. The system has more than 150 categories of problems, but for experimental purposes we decided to study only 6 categories. We applied four commonly used classifiers (SVM, KNN, Naive Bayes and Decision Tree) to our dataset to classify new grievances, and then selected the best of them to be the grievance classifier in our system. Among the four classifiers applied to the dataset, KNN achieved the highest average accuracy (98%), followed by SVM (SMO) with an accuracy of 97%.
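As a rough illustration of how a KNN classifier can assign a category to grievance text, the following Python sketch represents each text as a bag of words, ranks training examples by cosine similarity, and takes a majority vote among the `k` nearest. It is an assumption-laden sketch, not the thesis pipeline: the example texts, the three category names (`baggage`, `delay`, `payment`), the tokenization, and `k=3` are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical labelled grievances (invented for illustration only).
TRAINING = [
    ("my bag was missing on arrival at the airport", "baggage"),
    ("suitcase lost and bag not delivered after landing", "baggage"),
    ("damaged bag received at baggage claim", "baggage"),
    ("flight delayed for five hours at the station", "delay"),
    ("departure delayed and no announcement about the flight", "delay"),
    ("long delay before boarding the flight", "delay"),
    ("card charged twice during online booking payment", "payment"),
    ("payment deducted but booking not confirmed", "payment"),
    ("refund for cancelled booking payment not received", "payment"),
]


def vectorize(text):
    """Bag-of-words term counts over lowercase alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(text, training=TRAINING, k=3):
    """Label a text by majority vote among its k most similar training texts."""
    query = vectorize(text)
    scored = sorted(
        ((cosine(query, vectorize(doc)), label) for doc, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


baggage_label = knn_classify("bag missing on arrival")
delay_label = knn_classify("flight delayed for hours")
payment_label = knn_classify("payment charged twice")
```

With real grievance data, stop-word removal, term weighting such as TF-IDF, and a tuned `k` would likely matter; those steps are left out here to keep the sketch short.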
The benefits of performing a thorough analysis of problems include a better understanding of service performance.
Keywords: Airlines Grievance; Flight Delay; WEKA; Classification; Data Mining; Text Mining.