Summary of the paper

Title Diacritization and Transliteration of Proper Nouns from Arabic to English
Authors Hamdy S. Mubarak, Mohamed Al Sharqawy and Esraa Al Masry
Abstract This paper proposes a complete system for the automatic diacritization and transliteration of proper nouns from Arabic to English, using a database of Arabic-English name pairs. The system consists of three phases: Correction, Diacritization, and Transliteration. The Correction phase uses normalization to correct the Common Arabic Mistakes (initial Hamza, final Yaa, and final Taa errors) and also corrects ordinary concatenation errors. When an input token exactly matches a stored normalized token generated from the proper names database, its most frequent transliteration is used. Missing diacritics are restored using Sakhr's Morphological Analyzer for tokens it can analyze, or otherwise from the best match against patterns (for Arabic and non-Arabic names) and consecutive characters obtained from the diacritized proper names. Transliteration rules are then applied to the diacritized proper name to obtain its English equivalent. Our results show an average accuracy of 89% on blind test sets with forced spelling mistakes, and 95% for correct input.
Topics Machine translation to or from Arabic
Full paper Diacritization and Transliteration of Proper Nouns from Arabic to English
Bibtex @InProceedings{MUBARAK09.81,
  author = {Hamdy S. Mubarak and Mohamed Al Sharqawy and Esraa Al Masry},
  title = {Diacritization and Transliteration of Proper Nouns from Arabic to English},
  booktitle = {Proceedings of the Second International Conference on Arabic Language Resources and Tools},
  year = {2009},
  month = {April},
  date = {22-23},
  address = {Cairo, Egypt},
  editor = {Khalid Choukri and Bente Maegaard},
  publisher = {The MEDAR Consortium},
  isbn = {2-9517408-5-9},
  language = {english}
  }
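
The Correction phase and the exact-match lookup described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the normalization rules, so the mappings below (hamza-carrying alef forms to bare alef, final alef maqsura to yaa, final taa marbuta to haa) are assumptions based on common Arabic normalization practice, not the authors' implementation. The function names and the sample name pairs are hypothetical.

```python
from collections import Counter, defaultdict

# Assumed normalization targets; the paper's actual rules may differ.
HAMZA_FORMS = "\u0623\u0625\u0622"  # alef with hamza above, hamza below, madda
ALEF = "\u0627"                     # bare alef
ALEF_MAQSURA = "\u0649"             # often typed in place of final yaa
YAA = "\u064A"
TAA_MARBUTA = "\u0629"              # often typed as final haa
HAA = "\u0647"


def normalize(token: str) -> str:
    """Collapse initial-hamza, final-yaa and final-taa variants to one form."""
    if not token:
        return token
    chars = list(token)
    if chars[0] in HAMZA_FORMS:
        chars[0] = ALEF
    if chars[-1] == ALEF_MAQSURA:
        chars[-1] = YAA
    elif chars[-1] == TAA_MARBUTA:
        chars[-1] = HAA
    return "".join(chars)


def build_index(name_pairs):
    """Map each normalized Arabic token to a count of its English spellings."""
    index = defaultdict(Counter)
    for arabic, english in name_pairs:
        index[normalize(arabic)][english] += 1
    return index


def most_frequent_transliteration(index, token):
    """Return the most frequent English spelling on an exact normalized match."""
    counts = index.get(normalize(token))
    if not counts:
        return None  # no exact match; a full system would fall back to rules
    return counts.most_common(1)[0][0]


# Hypothetical name pairs standing in for the proper names database.
pairs = [
    ("\u0623\u062D\u0645\u062F", "Ahmed"),  # written with initial hamza
    ("\u0627\u062D\u0645\u062F", "Ahmed"),  # written without hamza
    ("\u0623\u062D\u0645\u062F", "Ahmad"),
]
index = build_index(pairs)
```

Because both the stored names and the query are normalized the same way, an input typed with the wrong initial hamza form still reaches the same index entry, which is the effect the abstract attributes to matching against saved normalized tokens.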
