Human-machine dialogues (ITC-Irst).
(under development since 1999)
Between 100 and 200 human-computer dialogues will be collected for each language. The ITC-Irst speech recognition technology will be adapted to handle mixed-initiative interactions in several languages. Once the recordings are complete, the corpora will be made available with texts and waveform files. Log files of the interactions, containing the recognised texts and the recognition grammars, will also be provided. Recordings will be made during the first six months of 2002 through an automatic service reachable via a toll-free telephone number.

For each language, a set of tasks will be defined according to a given semantic domain. For Italian, the chosen domain is access to tourism information, and a preliminary collection has just been carried out for it. Each caller was given two or three tasks that consisted of asking for information about hotel availability, available services, sports grounds, localities in general, etc. So far, about 200 dialogues have been collected.