Author: Marie, Mohamed Ibrahim. / Title: Data Mining and Knowledge Discovery on the Web

Title

Data Mining and Knowledge Discovery on the Web /

Author

Marie, Mohamed Ibrahim.

Subject

World Wide Web. Web search engines. Research.

Publication date

2004.

Number of pages

I-XIII, 249 p.

Abstract

Data mining and knowledge discovery on the Web, known as Web mining, is a recent and important research topic. It combines two active research areas: data mining and the World Wide Web. Web mining can be classified into three domains: Web content mining, Web structure mining, and Web usage mining. Our research is concerned with Web usage mining, also known as Web log mining or server log mining. We focus on Web log mining, the data mining techniques associated with it, prior research on these techniques, and the different types of server logs.
In the empirical study, we prepared a server log mining model that processes server logs in order to reveal how users use Web sites. The model consists of three sub-models: a statistical sub-model, a user/session identification sub-model, and an association rules sub-model. We also propose a logical model for a server log mining database, a detailed description of each entity in this model, a Perl program for reading and parsing server logs, and MySQL DDL for the server log database.
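To illustrate what reading and parsing a server log involves, the following is a minimal sketch in Python. It is not the thesis's Perl program. It assumes the NCSA Combined Log Format, which this record does not confirm as the format used, and the names `LOG_PATTERN` and `parse_log_line` as well as the sample line are hypothetical.

```python
import re
from datetime import datetime

# Assumption: entries follow the NCSA Combined Log Format; the referrer and
# user-agent fields are optional so Common Log Format lines also match.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # A size of "-" means no bytes were returned.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    # The request field looks like: METHOD URL PROTOCOL
    parts = record["request"].split()
    record["method"] = parts[0] if len(parts) >= 1 else None
    record["url"] = parts[1] if len(parts) >= 2 else None
    return record


# Hypothetical sample entry, not taken from the thesis data.
sample = (
    '192.0.2.10 - - [10/Oct/2004:13:55:36 +0200] '
    '"GET /index.html HTTP/1.0" 200 2326 '
    '"http://example.org/start.html" "Mozilla/4.08 [en] (Win98)"'
)
record = parse_log_line(sample)
print(record["host"], record["url"], record["status"])
```

Each parsed record could then be loaded into a log database table, with one column per field.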
The statistical sub-model computes general server log statistics such as Web server traffic, server successes and errors, user agents, and top requested pages. We also present new server log statistics. The user/session identification sub-model addresses two critical problems: 1) user identification and 2) session identification. Server logs are also cleaned of unneeded data such as error records and graphics records.
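The cleaning and user/session identification steps can be sketched as follows. This is an illustration under stated assumptions, not the method of the thesis: it treats each (host, user agent) pair as one user and starts a new session after 30 minutes of inactivity, both common heuristics in Web usage mining. The extension list, the timeout value, and all function names and sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

GRAPHIC_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".ico")  # assumed list
SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity threshold


def clean(records):
    """Drop error responses (status >= 400) and requests for graphics files."""
    return [
        r for r in records
        if r["status"] < 400
        and not r["url"].lower().endswith(GRAPHIC_EXTENSIONS)
    ]


def identify_sessions(records):
    """Group hits by the assumed user key (host, agent), then split on idle gaps."""
    by_user = defaultdict(list)
    for r in records:
        by_user[(r["host"], r["agent"])].append(r)

    sessions = []
    for user, hits in by_user.items():
        hits.sort(key=lambda r: r["time"])
        current = [hits[0]]
        for prev, nxt in zip(hits, hits[1:]):
            if nxt["time"] - prev["time"] > SESSION_TIMEOUT:
                sessions.append((user, current))
                current = []
            current.append(nxt)
        sessions.append((user, current))
    return sessions


def hit(host, agent, hhmm, url, status):
    """Build one hypothetical, already-parsed log record."""
    time = datetime.strptime("2004-10-10 " + hhmm, "%Y-%m-%d %H:%M")
    return {"host": host, "agent": agent, "time": time, "url": url, "status": status}


log = [
    hit("192.0.2.10", "Mozilla/4.08", "13:00", "/index.html", 200),
    hit("192.0.2.10", "Mozilla/4.08", "13:00", "/logo.gif", 200),
    hit("192.0.2.10", "Mozilla/4.08", "13:05", "/missing.html", 404),
    hit("192.0.2.10", "Mozilla/4.08", "13:10", "/papers.html", 200),
    hit("192.0.2.10", "Mozilla/4.08", "14:30", "/index.html", 200),
    hit("192.0.2.10", "Opera/7.0", "13:02", "/index.html", 200),
]
sessions = identify_sessions(clean(log))
for user, hits in sessions:
    print(user, [h["url"] for h in hits])
```

In this sample, the two agents on the same host count as two users, and the 80-minute gap splits the first user's hits into two sessions. Proxies and shared machines can defeat the (host, agent) heuristic, which is one reason user identification is a hard problem.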