Corpora of Spontaneous Spoken Italian LABLITA 
(since the early 1970s)
Four corpora, mainly transcribed in CHAT format (MacWhinney, 1994):
  1. An open corpus of spontaneous adult spoken language: 120 texts of variable length (from 5 minutes to two hours), totaling about 80 hours and 600,000 transcribed words in different formats.
  2. Longitudinal corpora of Italian acquisition: about 95 hours and 650,000 words.
  3. Corpus of movie language transcriptions (12 significant films in the history of Italian cinema, 1948-1994): about 21 hours and 115,000 words.
  4. Samples of media language (radio and TV): 92,000 words.
 
LABLITA Corpus of Italian Spontaneous Speech in CHAT format
                        FAMILY                    PRIVATE                   PUBLIC
                        Words     Duration (min)  Words     Duration (min)  Words     Duration (min)
                                  Trans.  Total             Trans.  Total             Trans.  Total
Dialogue Free           19,339    124     241     1,327     112     233     8,477     67      67
Dialogue Regulated      -         -       -       31,928    135     257     18,349    135     190
Conversation Free       34,489    173     296     49,464    375     619     20,849    152     152
Conversation Regulated  -         -       -       6,528     48      161     47,744    378     439
Monologue Free          18,834    153     275     -         -       -       -         -       -
Monologue Regulated     -         -       -       2,093     16      70      14,798    133     195
Totals                  72,662    450     812     91,340    686     1,340   110,217   865     1,043
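
The Totals row can be cross-checked against the individual rows. The following is a minimal sketch in Python, assuming that the separators in the word counts are thousands separators and that "-" marks an empty cell. The names ROWS, STATED_TOTALS and column_totals are illustrative and are not part of any LABLITA tooling.

```python
# Each row holds three (words, transcribed minutes, total minutes) cells,
# one per domain, copied from the table. None stands for a "-" cell.
DOMAINS = ["FAMILY", "PRIVATE", "PUBLIC"]

ROWS = {
    "Dialogue Free": [(19339, 124, 241), (1327, 112, 233), (8477, 67, 67)],
    "Dialogue Regulated": [None, (31928, 135, 257), (18349, 135, 190)],
    "Conversation Free": [(34489, 173, 296), (49464, 375, 619), (20849, 152, 152)],
    "Conversation Regulated": [None, (6528, 48, 161), (47744, 378, 439)],
    "Monologue Free": [(18834, 153, 275), None, None],
    "Monologue Regulated": [None, (2093, 16, 70), (14798, 133, 195)],
}

# Totals row as stated in the table.
STATED_TOTALS = [(72662, 450, 812), (91340, 686, 1340), (110217, 865, 1043)]


def column_totals(rows):
    """Sum words, transcribed minutes and total minutes for each domain."""
    totals = [[0, 0, 0] for _ in DOMAINS]
    for cells in rows.values():
        for domain_index, cell in enumerate(cells):
            if cell is None:  # empty cell, nothing to add
                continue
            for field_index, value in enumerate(cell):
                totals[domain_index][field_index] += value
    return [tuple(t) for t in totals]


computed = column_totals(ROWS)
for name, got, stated in zip(DOMAINS, computed, STATED_TOTALS):
    status = "matches" if got == stated else "differs from"
    print(f"{name}: computed {got} {status} stated {stated}")

# Sum across the three domains: (words, transcribed minutes, total minutes).
grand = tuple(sum(t[k] for t in computed) for k in range(3))
print("All domains:", grand)
```

Under these assumptions the computed sums agree with the stated Totals row, and the three domains together come to 274,219 words, 2,001 transcribed minutes and 3,195 total minutes.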