Corpora of Spontaneous Spoken Italian LABLITA (since the beginning of the 1970s)

Four corpora, mainly transcribed in CHAT format (MacWhinney, 1994):
- An open corpus of spontaneous adult spoken language: 120 texts of variable length (from 5 minutes to two hours), about 80 hours and 600,000 transcribed words, in different formats.
- Longitudinal corpora of Italian acquisition: about 95 hours and 650,000 words.
- Corpus of movie language transcriptions (12 significant films in the history of Italian cinema, 1948-1994): about 21 hours and 115,000 words.
- Samples of media language (radio and TV): 92,000 words.

LABLITA Corpus of Italian Spontaneous Speech in CHAT format

Durations are in minutes (Trans. / Total).

| | FAMILY words | FAMILY Trans. (min) | FAMILY Total (min) | PRIVATE words | PRIVATE Trans. (min) | PRIVATE Total (min) | PUBLIC words | PUBLIC Trans. (min) | PUBLIC Total (min) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Dialogue, Free | 19,339 | 124 | 241 | 1,327 | 112 | 233 | 8,477 | 67 | 67 |
| Dialogue, Regulated | - | - | - | 31,928 | 135 | 257 | 18,349 | 135 | 190 |
| Conversation, Free | 34,489 | 173 | 296 | 49,464 | 375 | 619 | 20,849 | 152 | 152 |
| Conversation, Regulated | - | - | - | 6,528 | 48 | 161 | 47,744 | 378 | 439 |
| Monologue, Free | 18,834 | 153 | 275 | - | - | - | - | - | - |
| Monologue, Regulated | - | - | - | 2,093 | 16 | 70 | 14,798 | 133 | 195 |
| Totals | 72,662 | 450 | 812 | 91,340 | 686 | 1,340 | 110,217 | 865 | 1,043 |