Summary of the paper

Title Collaborative Construction of Arabic Lexical Resources
Authors Mohammad Daoud, Daoud Daoud and Christian Boitet
Abstract The absence of free usable lexical and syntactic resources and tools for Arabic makes it a "pi-language" (poorly informatized). This may be the main reason why transferring knowledge into Arabic is so difficult, especially in technical domains. To illustrate the size of the problem: translating EOLSS (Encyclopedia of Life Support Systems; EOLSS 2008), which contains about 200,000 pages, probably requires finding or creating translations for at least 250,000 terms, an effort estimated at about 25,000 working hours. We propose a collaborative methodology that involves domain experts, linguists, terminologists and ordinary Internet users in developing a domain-dedicated Arabic terminological database by facilitating their contribution and collaboration. The collaborative process would replace the traditional approach, which is expensive and infeasible (especially for Arabic). As an intermediate phase towards an Arabic terminological database, we aim to construct a preterminological database (pTMDB) whose contributability should be far higher than that of a true terminological database.
Topics Exploitation of LRs in different types of applications (information extraction, information retrieval, speech dictation, translation, summarisation, web services, semantic web, etc.),
Roadmapping for Arabic language technology,
Extraction and acquisition of knowledge (e.g. terms, lexical information, language modelling) from LRs
