W0004 : ECI - European Corpus Initiative
The European Corpus Initiative (ECI) was founded to oversee the
acquisition and preparation of a large multilingual corpus, and
supports existing and projected national and international efforts
to carefully design, collect and publish large-scale multilingual
written and spoken corpora. ECI has produced the Multilingual
Corpus I (ECI/MCI) of over 98 million words, covering most of
the major European languages, as well as Turkish, Japanese, Russian,
Chinese, Malay and more. The primary focus in this effort is
on textual material of all kinds, including transcriptions of
spoken material.
Just a sampling of the contents of the CD-ROM:
- German newspaper texts from the Frankfurter Rundschau from
July 1992 to March 1993. Provided by Universität Gesamthochschule,
Paderborn, Germany. Approximately 34 million words.
- French newspaper texts from Le Monde, consisting of material
from September 1989, October 1989, and January 1990. Provided
by LIMSI CNRS, France. Approximately 4.1 million words.
- Extracts from the Leiden Corpus of Dutch, consisting of newspapers,
transcribed speech, etc. Provided by Instituut voor Nederlandse
Lexicologie, Leiden, Holland. Approximately 5.5 million words.
- International Labor Organisation (ILO) "Official Bulletin,
B Series". Vols LXVII(1984) - LXXII(1989). Parallel texts
in English, French and Spanish provided by the International
Labor Organisation. Approximately 5 million words.
The ECI/MCI is available from ELSNET.
Copyright © 1996-2001 ELRA/ELDA