ATTENTION: The CLEF Test Suite (ELRA-E0008) is NOW AVAILABLE in the ELRA catalog!
CLEF - Cross Language Evaluation Forum
The CLEF Text Retrieval System Evaluation activity is co-ordinated in Europe by the DELOS Network of Excellence for Digital Libraries and organised in collaboration with the US National Institute of Standards and Technology (NIST) and the TREC Conferences.
The CLEF series of system evaluation campaigns aims at promoting research and development in Cross-Language Information Retrieval (CLIR) by:
- (i) providing an infrastructure for the testing and evaluation of information retrieval systems operating on European languages in monolingual, multilingual and cross-language contexts, and
- (ii) creating test-suites of reusable data which can be employed by system developers for benchmarking purposes.
In the framework of CLEF, ELDA is in charge of conducting the user needs survey, identifying the data and negotiating their distribution rights with the owners, and participating in the production of an exit plan that will set some evaluation procedures and recommendations regarding the evaluation of multilingual and cross-lingual systems.
The CLEF 2006 evaluation campaign
The results of the CLEF 2006 campaign were presented at the annual Workshop in Alicante, Spain, 20-22 September 2006, immediately following the 10th European Conference on Digital Libraries (ECDL’06). This year again, the Workshop was attended by nearly 150 researchers and system developers.
The aim of the workshop was to present and discuss the results of the CLEF 2006 evaluation campaign. Over the years CLEF has gradually increased the number of different tracks and tasks offered in order to facilitate experimentation with all kinds of multilingual information access.
This year again, eight evaluation tracks were offered to evaluate the performance of systems for:
- mono-, bi- and multilingual document retrieval on news collections (Ad-Hoc)
- mono- and cross-language information retrieval on structured scientific data (Domain-Specific)
- interactive cross-language information retrieval (iCLEF)
- multiple language question answering (QA@CLEF)
- cross-language retrieval in image collections (ImageCLEF)
- cross-language spoken document retrieval (CL-SR)
- multilingual web track (WebCLEF)
- cross-language geographical retrieval (GeoCLEF)
Although these tracks are the same as those offered last year, many of the tasks in these tracks are new in 2006.
For instance, for the Multilingual Question Answering track (QA@CLEF), in addition to the main task, three pilot tasks were proposed:
- WiQA, a task that assessed question answering using Wikipedia;
- an Answer Validation exercise;
- a Real-Time exercise, held for the first time, on the morning just before the start of the Workshop.
A large number of document collections were used in CLEF 2006 to build the CLEF test collections:
- the CLEF multilingual comparable corpus of more than 2 million news docs in 12 languages;
- the CLEF domain-specific collection, consisting of the GIRT-4 social science database in English and German and two Russian databases: the Russian Social Science Corpus and the Russian ISISS collection for sociology and economics (new in 2006);
- four collections for the ImageCLEF track: the ImageCLEFmed radiological medical database; the IRMA collection in English and German; the IAPR TC-12 database of 25,000 photographs with captions in English, German and Spanish; and a general photographic collection;
- the Malach collection of spontaneous conversational speech derived from the Shoah archives in English and Czech, used for the speech retrieval track;
- EuroGOV, a collection crawled from European governmental sites, used for the WebCLEF track.
The groups participating in the CLEF 2006 Campaign
Geographical distribution of groups that submitted runs in CLEF

| Region | 2006 | 2005 | 2004 |
|---|---|---|---|
| Europe | 60 | 43 | 37 |
| North America | 15 | 19 | 12 |
| Asia | 10 | 10 | 5 |
| South America | 4 | 1 | 1 |
| Australia | 2 | 1 | 1 |
| Total | 91 | 74 | 56 |
As in previous years, the participating groups were a mix of newcomers (34) and groups that had participated in one or more previous editions (56). Most of the groups came from academia; only 9 came from industry.
The work of the groups participating in the 2006 campaign was presented in plenary and parallel paper sessions and in a poster session. On the last day of the Workshop, there were also breakout sessions for more in-depth discussion of the results of individual tracks and intentions for the future. The concluding session included discussions on ideas for new tracks in future campaigns.
The final program and the Working Notes are available on the CLEF website.
ELDA Contacts
Khalid Choukri, ELDA CEO
Christelle Ayache, Junior Project Manager
ELDA
55-57 rue Brillat Savarin
75013 Paris (France)
Tel : +33 1 43 13 33 33
Fax : +33 1 43 13 33 30
Co-ordinator
DELOS Network of Excellence for Digital Libraries
CLEF Steering Committee
- Martin Braschler, Zurich, Switzerland
- Amedeo Cappelli, ISTI-CNR & CELCT, Italy
- Hsin-Hsi Chen, National Taiwan University, Taipei, Taiwan
- Khalid Choukri, Evaluations and Language resources Distribution Agency, Paris, France
- Paul Clough, University of Sheffield, UK
- Thomas Deselaers, RWTH Aachen University, Germany
- David A. Evans, Clairvoyance Corporation, USA
- Marcello Federico, ITC-irst, Trento, Italy
- Christian Fluhr, CEA-LIST, Fontenay-aux-Roses, France
- Norbert Fuhr, University of Duisburg, Germany
- Frederic C. Gey, U.C. Berkeley, USA
- Julio Gonzalo, LSI-UNED, Madrid, Spain
- Donna Harman, National Institute of Standards and Technology, USA
- Gareth Jones, Dublin City University, Ireland
- Franciska de Jong, University of Twente, Netherlands
- Noriko Kando, National Institute of Informatics, Tokyo, Japan
- Jussi Karlgren, Swedish Institute of Computer Science, Sweden
- Michael Kluck, German Institute for International and Security Affairs, Berlin, Germany
- Natalia Loukachevitch, Moscow State University, Russia
- Bernardo Magnini, ITC-irst, Trento, Italy
- Paul McNamee, Johns Hopkins University, USA
- Henning Müller, University & University Hospitals of Geneva, Switzerland
- Douglas W. Oard, University of Maryland, USA
- Maarten de Rijke, University of Amsterdam, Netherlands
- Diana Santos, Linguateca, Sintef, Oslo, Norway
- Jacques Savoy, University of Neuchatel, Switzerland
- Peter Schäuble, Eurospider Information Technologies, Switzerland
- Richard Sutcliffe, University of Limerick, Ireland
- Max Stempfhuber, Informationszentrum Sozialwissenschaften Bonn, Germany
- Hans Uszkoreit, German Research Center for Artificial Intelligence (DFKI), Germany
- Felisa Verdejo, LSI-UNED, Madrid, Spain
- José Luis Vicedo, University of Alicante, Spain
- Ellen Voorhees, National Institute of Standards and Technology, USA
- Christa Womser-Hacker, University of Hildesheim, Germany
CLEF 2005
The results of the sixth campaign of the Cross-Language Evaluation Forum were presented at a two-and-a-half-day workshop held in Vienna, Austria, 21-23 September, immediately following the ninth European Conference on Digital Libraries (ECDL 2005). The Workshop was attended by nearly 150 researchers and system developers. The aim of the workshop was to present and discuss the results of the CLEF 2005 evaluation campaign.
Over the years CLEF has gradually increased the number of different tracks and tasks offered in order to facilitate experimentation with all kinds of multilingual information access. This year, eight evaluation tracks were offered to evaluate the performance of systems for:
- mono-, bi- and multilingual document retrieval on news collections (Ad-Hoc)
- mono- and cross-language information retrieval on structured scientific data (Domain-Specific)
- interactive cross-language information retrieval (iCLEF)
- multiple language question answering (QA@CLEF)
- cross-language retrieval in image collections (ImageCLEF)
- cross-language spoken document retrieval (CL-SR)
- multilingual web track (WebCLEF)
- cross-language geographical retrieval (GeoCLEF)
Over the years, the CLEF test set collection has expanded dramatically and now counts 2 million news documents in 12 languages. In 2005, 2 new collections (Bulgarian and Hungarian newspapers) were added to the multilingual corpus.
A total of 74 groups submitted runs in CLEF 2005, as opposed to the 54 groups of CLEF 2004: 43 (37) from Europe, 19 (12) from North America, 10 (5) from Asia and 1 each from South America and Australia (2004 figures in parentheses). The introduction of new tracks this year has clearly had a big impact, both on the number of participants and on the range of expertise, making CLEF an increasingly multidisciplinary forum. The work of the groups participating in this year’s campaign was presented in plenary paper and poster sessions. There were also break-out sessions for more in-depth discussion of the results of individual tracks and intentions for the future. The final sessions included discussions on ideas for new tracks in future campaigns.
Overall, the Workshop provided a broad overview of the current state of the art and the latest research directions in multilingual information retrieval.
The presentations given at the CLEF Workshops and detailed reports on the experiments of CLEF 2005 and previous years can be found on the CLEF website.