W0027: An-Nahar Newspaper Text Corpus

The An-Nahar Lebanon Newspaper Text Corpus comprises articles in Standard Arabic from 1995 to 2000 (6 years), stored as HTML files on CD-ROM. Each year contains 45,000 articles and 24 million words. Each article includes metadata such as the title, newspaper name, date, country, type, and page. The size of each year's data, in megabytes, is as follows:

1995: 128 MB
1996: 138 MB
1997: 152 MB
1998: 140 MB
1999: 130 MB
2000: 118 MB
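
For readers planning storage or processing, the figures above can be combined into overall totals. The following is a minimal sketch that uses only the numbers quoted in this description; the variable names are illustrative, and the per-year article and word counts are treated as the same for every year, as the description states.

```python
# Per-year sizes in MB, copied from the list above.
sizes_mb = {1995: 128, 1996: 138, 1997: 152, 1998: 140, 1999: 130, 2000: 118}

# Per-year counts as quoted in the description (assumed identical for each year).
ARTICLES_PER_YEAR = 45_000
WORDS_PER_YEAR = 24_000_000

years = len(sizes_mb)
total_mb = sum(sizes_mb.values())
total_articles = ARTICLES_PER_YEAR * years
total_words = WORDS_PER_YEAR * years
words_per_article = WORDS_PER_YEAR / ARTICLES_PER_YEAR

print(f"Total size: {total_mb} MB over {years} years")
print(f"Total articles: {total_articles}")
print(f"Total words: {total_words}")
print(f"Average words per article: {words_per_article:.0f}")
```

Under these figures the six years add up to 806 MB, 270,000 articles, and 144 million words.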



Copyright © 1996-2001 ELRA/ELDA - Webmaster