Spoken French Corpus 
(GARS/DELIC Corpus, since 1978)
Texts transcribed as plain ASCII text files (.txt), in a format developed by UPRO
Corpora              Words per corpus (approx.)   Total words (approx.)
500 short corpora    2,000                        1,000,000
42 long corpora      12,000                       500,000
28 large corpora     18,000                       500,000
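
The figures above can be cross-checked with a little arithmetic. The sketch below assumes that the middle figure in each row is the approximate length of one corpus and that the last figure is the rounded total for that category; this reading is an interpretation, not something the source states. The names `rows` and `check_totals` are hypothetical and used only for illustration.

```python
# Each row: (category, number of corpora, approx. words per corpus, rounded total as listed)
rows = [
    ("short", 500, 2_000, 1_000_000),
    ("long", 42, 12_000, 500_000),
    ("large", 28, 18_000, 500_000),
]


def check_totals(rows):
    """Return, per category, the computed total (count * words per corpus)
    next to the listed total, plus the two grand totals."""
    report = {}
    for category, count, words_each, listed in rows:
        report[category] = (count * words_each, listed)
    computed_sum = sum(computed for computed, _ in report.values())
    listed_sum = sum(listed for _, listed in report.values())
    return report, computed_sum, listed_sum


report, computed_sum, listed_sum = check_totals(rows)
for category, (computed, listed) in report.items():
    print(f"{category}: computed {computed:,} vs listed {listed:,}")
print(f"all categories: computed {computed_sum:,} vs listed {listed_sum:,}")
```

Under this reading, the long and large categories each come to 504,000 words, which the listing rounds to 500,000, and the collection as a whole comes to roughly two million words.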
 
  • Variation in structure: interviews, monologues, dialogues;
  • Variation in content: real-life situations, professional experiences, political discussions, public speeches, etc.;
  • Variation in speakers: age, education, social and geographic origin.
Français de référence (Reference French)

500,000 words
45 hours recorded in 40 main towns; three age groups, three production types