Author: Abd El-Rahman, Badrya Dahy. / Title: Semantic Similarity between Arabic Sentences


Title

Semantic Similarity between Arabic Sentences /

Author

Abd El-Rahman, Badrya Dahy.

Committee

Supervisor / Badrya Dahy Abd El-Rahman

Supervisor / Khaled Fathy Hassan

Examiner / Samia Abdel Fattah

Examiner / Mamdouh Farouk Mohamed

Subject

Natural Language Processing (NLP).

Publication date

2022.

Number of pages

84 p.

Language

English

Degree

Master's

Specialization

Computer Science

Approval date

22/9/2022

Place of approval

Assiut University - Faculty of Computers and Information - Computer Science


Only 14 pages are available for public view


Abstract

Semantic textual similarity is an important field in Natural Language Processing (NLP). It is useful in a variety of NLP applications, including information retrieval, plagiarism detection, data extraction, and machine translation. Sentence similarity in the Arabic language has not been investigated deeply because of the lack of Arabic language resources. This thesis presents a new Arabic dataset for the sentence similarity task. The dataset can be used to help develop sentence similarity approaches, and its main purpose is to evaluate them. It was collected from Wikipedia, an intermediate lexicon, and other web resources. The thesis describes the process of collecting the data, filtering and preprocessing the sentence pairs, and reports some statistics about the dataset. The dataset is available for future research in the field. Moreover, this thesis proposes an approach to calculate the semantic similarity between Arabic sentences. The proposed approach uses two types of word embedding to measure similarity between words: context-independent word embedding (word2vec) and contextual embedding (BERT).
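The abstract does not say how word-level similarities are combined into a sentence score, so the following is only a minimal sketch under stated assumptions, not the thesis's method. It assumes that each word already has an embedding vector (from word2vec or BERT, for instance), that word similarity is the cosine of two vectors, and that a sentence score is the symmetric average of each word's best match in the other sentence. The names `cosine_similarity`, `sentence_similarity`, and `toy_embeddings` are hypothetical, and the three-dimensional vectors are invented for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector carries no direction
    return dot / (norm_u * norm_v)


def sentence_similarity(words_a, words_b, embeddings):
    """Symmetric best-match average (an assumed aggregation, not the thesis's)."""

    def directed(src, dst):
        # For each known word in src, keep its best cosine match in dst.
        known_dst = [w for w in dst if w in embeddings]
        scores = []
        for word in src:
            if word not in embeddings:
                continue  # out-of-vocabulary words are skipped
            best = max(
                (cosine_similarity(embeddings[word], embeddings[w]) for w in known_dst),
                default=0.0,
            )
            scores.append(best)
        return sum(scores) / len(scores) if scores else 0.0

    return (directed(words_a, words_b) + directed(words_b, words_a)) / 2.0


# Invented toy vectors; real word2vec or BERT vectors have hundreds of dimensions.
toy_embeddings = {
    "كتاب": [0.9, 0.1, 0.0],   # "book"
    "رواية": [0.8, 0.2, 0.1],  # "novel"
    "سيارة": [0.0, 0.1, 0.9],  # "car"
}

related = sentence_similarity(["كتاب"], ["رواية"], toy_embeddings)
unrelated = sentence_similarity(["كتاب"], ["سيارة"], toy_embeddings)
print(related > unrelated)
```

With a context-independent model such as word2vec, the lookup table holds one fixed vector per word. With a contextual model such as BERT, the vector for a word depends on the sentence it appears in, so the table would have to be built per sentence pair.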