Title
Investigation about Employing GMM for Voice Conversion Techniques for Arabic Spoken Words
Author
El-Manfaloty, Rania Abdou Gaber.
Committee
Researcher / Rania Abdou Gaber El-Manfaloty
rania-elmanfaloty@yahoo.com
Supervisor / El-Sayed Ahmed Youssef
Supervisor / Noha Othman Korany
nokorany@hotmail.com
Supervisor / Mona Hamed Lotfy
Examiner / Said El-Sayed El-Khamy
Examiner / El-Sayed Mahmoud El-Rabiee
Subject
Voice Conversion - Techniques.
Publication date
2013.
Number of pages
108 p.
Language
English
Degree
Doctorate
Specialization
Electrical and Electronic Engineering
Approval date
1/12/2012
Place of approval
Alexandria University - Faculty of Engineering - Electrical Engineering
Index
Only 14 of 130 pages are available for public view.

Abstract

This thesis employs a Gaussian Mixture Model (GMM) for voice conversion of Arabic spoken words and compares it with a second technique, Pitch Synchronous Overlap Add (PSOLA) with resampling, which relies on pitch shifting. It also proposes two compression techniques for the residual signal, which otherwise requires large data storage.

The first technique transforms the spectral envelope, which is represented by Line Spectral Frequency (LSF) coefficients. The transformation function is implemented using a joint-density Gaussian Mixture Model trained on aligned LSF vectors. Several residual prediction techniques (copying the source residual, copying a reference residual, and residual selection) are used to predict the target LPC residuals. The first technique is also implemented using MFCC features instead of LSF. The second technique is PSOLA with resampling: the pitch is shifted using time-domain PSOLA, and the signal is then resampled to return it to its original length.

The two techniques are investigated on Arabic spoken words that contain the three vowels (a, e, o), and subjective and objective evaluations are used to evaluate and compare them. These evaluations show that the first technique, using LSF features with residual selection or using MFCC, gives better results than the second technique. Because the residual selection method in the first technique requires a large amount of storage, the Multi-pulse Excitation Model and the Wavelet Transform are used to compress the residual before storing it. This achieves a space saving of between 73% and 89% with good quality for the transformed Arabic words.

The thesis also proposes a new technique for voice conversion between genders, called Dynamic Pitch Shifting (DPS). The proposed technique aims to minimize storage in the voice conversion system by eliminating the need to save the target residual signal; only the pitch mark positions or the pitch periods of the target signal are saved.
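
To illustrate the joint-density GMM mapping mentioned in the abstract, the following is a minimal sketch of how one source feature vector (for example, an LSF frame) could be mapped to a target vector once a GMM has been trained on aligned source/target pairs. It is not the thesis's implementation. It assumes the mixture parameters are already available and, for simplicity, that each component's covariance blocks are diagonal; the function and parameter names (`convert_frame`, `weights`, `mu_x`, `mu_y`, `var_x`, `cov_yx`) are hypothetical.

```python
import math


def gaussian_log_pdf_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def convert_frame(x, weights, mu_x, mu_y, var_x, cov_yx):
    """Estimate a target feature vector from a source vector x.

    Assumed (hypothetical) parameter layout, one entry per mixture component:
      weights[m]  mixture weight
      mu_x[m]     source mean vector
      mu_y[m]     target mean vector
      var_x[m]    diagonal of the source covariance block
      cov_yx[m]   diagonal of the target/source cross-covariance block
    """
    # Posterior p(m | x), computed from the source marginal of each component.
    log_post = [
        math.log(w) + gaussian_log_pdf_diag(x, mx, vx)
        for w, mx, vx in zip(weights, mu_x, var_x)
    ]
    peak = max(log_post)  # subtract the maximum for numerical stability
    post = [math.exp(lp - peak) for lp in log_post]
    total = sum(post)
    post = [p / total for p in post]

    # Posterior-weighted sum of the per-component linear regressions.
    y = [0.0] * len(x)
    for p, mx, my, vx, cyx in zip(post, mu_x, mu_y, var_x, cov_yx):
        for d in range(len(x)):
            y[d] += p * (my[d] + cyx[d] / vx[d] * (x[d] - mx[d]))
    return y
```

In a complete system, this function would be applied frame by frame to the source features, and the converted spectral envelope would then be combined with a predicted residual to synthesize the output speech.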