Title
Semantic Anonymity For Data Privacy Preserving
Author
Mubark, Ahmed Ali Abdu Abdallah.
Preparation Committee
Researcher / Ahmed Ali Abdu Abdallah Mubark
Supervisor / Hatem Mohamed Sayed Ahmed
Examiner / Emad Said Abdel Alim El-Abd
Examiner / Arabi El-Sayed Keshk
Subject
Database security. Confidential communications.
Publication Date
2016.
Number of Pages
106 p.
Language
English
Degree
Master's
Specialization
Information Systems
Approval Date
6/4/2016
Place of Approval
Menoufia University - Faculty of Computers and Information - Department of Information Systems
Table of Contents
Only 14 pages are available for public view.

Abstract

The advancement of information technologies has enabled various organizations (e.g., census agencies, hospitals) to collect large volumes of sensitive personal data (e.g., census data, medical records). Data in its original form, however, typically contains sensitive information about individuals, and publishing such data would violate individual privacy and pose potential privacy risks. This is a major concern when data are shared or published among multiple sources for research and data analysis; the sensitive information of data owners must be protected.
To deal with these privacy issues, data must be anonymized so that no sensitive information about individuals can be disclosed from the published data, while data distortion is minimized to preserve the usefulness of the data in practice. A large number of data publishing models and methods have been proposed to protect personal privacy and security; in particular, researchers have proposed k-anonymity, ℓ-diversity, and t-closeness for data privacy.
The k-anonymity privacy requirement for publishing microdata requires that each equivalence class (i.e., a set of records that are indistinguishable from each other with respect to certain “identifying” attributes) contains at least k records. Recently, several researchers have recognized that k-anonymity cannot prevent attribute disclosure. The method of ℓ-diversity has been proposed to address this; ℓ-diversity requires that each equivalence class have at least ℓ well-represented values for each sensitive attribute.
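To make the two requirements concrete, the following Python sketch checks k-anonymity and the simplest (distinct) form of ℓ-diversity on a toy table; the attribute names and generalized values are illustrative examples, not data from the thesis.

# A minimal sketch (not the thesis implementation) of checking k-anonymity and
# distinct ℓ-diversity on a toy microdata table.
from collections import defaultdict

records = [
    # (zip, age) are the generalized quasi-identifiers; "disease" is the sensitive attribute.
    {"zip": "476**", "age": "2*", "disease": "Heart Disease"},
    {"zip": "476**", "age": "2*", "disease": "Viral Infection"},
    {"zip": "476**", "age": "2*", "disease": "Cancer"},
    {"zip": "479**", "age": "3*", "disease": "Cancer"},
    {"zip": "479**", "age": "3*", "disease": "Cancer"},
    {"zip": "479**", "age": "3*", "disease": "Heart Disease"},
]
QI = ("zip", "age")      # quasi-identifying attributes
SENSITIVE = "disease"    # sensitive attribute

def equivalence_classes(rows, qi):
    """Group records sharing the same (generalized) quasi-identifier values."""
    classes = defaultdict(list)
    for r in rows:
        classes[tuple(r[a] for a in qi)].append(r)
    return classes

def is_k_anonymous(rows, qi, k):
    """Every equivalence class must contain at least k records."""
    return all(len(c) >= k for c in equivalence_classes(rows, qi).values())

def is_l_diverse(rows, qi, sensitive, l):
    """Every equivalence class must contain at least l distinct sensitive values."""
    return all(len({r[sensitive] for r in c}) >= l
               for c in equivalence_classes(rows, qi).values())

print(is_k_anonymous(records, QI, k=3))           # True: each class has 3 records
print(is_l_diverse(records, QI, SENSITIVE, l=2))  # True: each class has >= 2 diseases

With k = 3 and ℓ = 2 both checks pass for this table, yet, as the next paragraph notes, this alone does not rule out a similarity attack.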
However, a major drawback of these techniques is that they cannot prevent a similarity attack on data privacy, because they do not consider the semantic relations between the values of the sensitive attribute.
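As an illustration (the disease names and category mapping below are assumed examples, not data from the thesis), an equivalence class can satisfy distinct 3-diversity while every sensitive value still falls into the same semantic category, so an adversary learns that category anyway:

# A toy illustration of a similarity attack: the class below is 3-diverse in
# the 'distinct' sense, yet every value is a stomach-related disease, so an
# adversary still learns that the victim has a stomach problem.
leaked_class = ["Gastric Ulcer", "Gastritis", "Stomach Cancer"]
SEMANTIC_CATEGORY = {  # hypothetical domain knowledge
    "Gastric Ulcer": "stomach disease",
    "Gastritis": "stomach disease",
    "Stomach Cancer": "stomach disease",
    "Flu": "respiratory disease",
}

distinct_values = set(leaked_class)                        # 3 distinct values
categories = {SEMANTIC_CATEGORY[v] for v in leaked_class}  # but only 1 category
print(len(distinct_values) >= 3)  # True  -> passes distinct 3-diversity
print(len(categories) == 1)       # True  -> similarity attack succeeds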
This thesis presents an extensive study of this problem. It focuses primarily on notions of anonymity that are defined with respect to individual identity, or with respect to the value of a sensitive attribute.
In this thesis, a semantic anonymization approach is proposed. The approach is based on domain semantic rules and data-owner rules to overcome similarity attacks. It caps the belief of an adversary inferring a sensitive value in a published data set at the level of an inference based on the relationship between sensitive values. The semantic meaning is that when an adversary sees a record in a published data set, s/he will have a lower confidence that the record belongs to a victim than that it does not.
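A minimal sketch of the general idea is shown below, assuming the domain and data-owner rules are supplied as a mapping from sensitive values to semantic categories. The rule used here (each equivalence class must cover at least two distinct categories) is an illustrative criterion, not necessarily the exact one used in the thesis.

# A sketch of semantic-rule checking, assuming category_of encodes the domain
# and data-owner rules (e.g. a lookup into a table like SEMANTIC_CATEGORY above).
from collections import defaultdict

def violates_semantic_rule(equivalence_class, sensitive, category_of, min_categories=2):
    """Flag classes whose sensitive values all fall into too few semantic
    categories, i.e. classes vulnerable to a similarity attack."""
    cats = {category_of(r[sensitive]) for r in equivalence_class}
    return len(cats) < min_categories

def semantically_safe(rows, qi, sensitive, category_of, min_categories=2):
    """True only if no equivalence class violates the semantic rule."""
    classes = defaultdict(list)
    for r in rows:
        classes[tuple(r[a] for a in qi)].append(r)
    return not any(violates_semantic_rule(c, sensitive, category_of, min_categories)
                   for c in classes.values())

# Example use (hypothetical):
# semantically_safe(records, ("zip", "age"), "disease",
#                   lambda d: SEMANTIC_CATEGORY.get(d, "other"))

A class whose sensitive values are all, say, stomach diseases would be flagged and would have to be merged or further generalized before the data set is released.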
Finally, the performance of the traditional model and the semantic anonymization model is evaluated by measuring information loss, utility metrics, and the privacy level. In these situations, the data distributor often faces a quandary: on one hand, it is important to protect the anonymity and personal information of individuals, while on the other hand, it is also important to preserve the utility of the data for research. The simulation results of the proposed model show a significant enhancement in the privacy level, at the cost of decreased data utility.
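As a rough indication of how such a comparison can be carried out (the specific metrics below are common choices assumed for illustration, not necessarily those used in the thesis), information loss can be scored with the discernibility metric and the privacy level with the minimum number of semantic categories per equivalence class:

# Assumed evaluation metrics for comparing anonymized tables.
from collections import defaultdict

def discernibility(rows, qi):
    """Information-loss score: sum of |E|^2 over equivalence classes E
    (larger classes mean records are less distinguishable, i.e. more distortion)."""
    sizes = defaultdict(int)
    for r in rows:
        sizes[tuple(r[a] for a in qi)] += 1
    return sum(s * s for s in sizes.values())

def privacy_level(rows, qi, sensitive, category_of):
    """Privacy score: the minimum number of distinct semantic categories in any
    equivalence class (higher means safer against similarity attacks)."""
    classes = defaultdict(list)
    for r in rows:
        classes[tuple(r[a] for a in qi)].append(category_of(r[sensitive]))
    return min(len(set(cats)) for cats in classes.values())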