S0077 : Telephone Speech Data Collection for Czech
This speech database contains the recordings of 1,227 speakers
(590 males and 637 females) recorded over the fixed telephone network using
an ISDN interface.
The speech data were collected in the Czech Republic during the summer
of 1999.
Speech is stored as uncompressed 8-bit, 8 kHz A-law samples.
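As an illustration of the storage format, the sketch below expands 8-bit A-law bytes to 16-bit linear PCM using the standard G.711 expansion. It assumes the speech files are headerless (raw A-law bytes only), which this description does not state explicitly; the function names and the file name in the usage note are hypothetical.

```python
SAMPLE_RATE_HZ = 8000  # sampling rate stated in this description


def alaw_to_linear(code: int) -> int:
    """Expand one 8-bit G.711 A-law code to a signed 16-bit linear sample."""
    code ^= 0x55                      # undo the even-bit inversion applied by A-law
    magnitude = (code & 0x0F) << 4    # 4-bit mantissa
    segment = (code & 0x70) >> 4      # 3-bit segment (exponent)
    if segment == 0:
        magnitude += 8
    else:
        magnitude += 0x108
        magnitude <<= segment - 1
    # In A-law, a set sign bit marks a positive sample.
    return magnitude if (code & 0x80) else -magnitude


def decode_alaw(data: bytes) -> list[int]:
    """Decode raw A-law bytes (assumed headerless) to linear samples."""
    return [alaw_to_linear(b) for b in data]


def duration_seconds(data: bytes) -> float:
    """One byte is one sample, so duration is byte count over sample rate."""
    return len(data) / SAMPLE_RATE_HZ
```

For example, `decode_alaw(open("utterance.raw", "rb").read())` would return the samples of one utterance, where `utterance.raw` stands in for an actual speech file from the database.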
Each prompted utterance is stored in a separate file. Each
speech file is accompanied by an ASCII SAM label file that follows the specifications
of the SpeechDat project (URL: http://www.speechdat.com).
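A minimal sketch of reading such a label file, assuming each line has the form `KEY: value` and that a key may occur more than once. The exact set of keys is defined by the SpeechDat specifications and is not reproduced here; the sample text is made up purely to exercise the parser.

```python
def parse_label_text(text: str) -> dict[str, list[str]]:
    """Collect `KEY: value` lines into a dict of key -> list of values."""
    fields: dict[str, list[str]] = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only, so values may contain colons.
        key, value = line.split(":", 1)
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


# Hypothetical sample, not taken from the database.
sample = "KEY: first value\nKEY: second value\nOTHER: a, b: c\n"
fields = parse_label_text(sample)
```

Values are kept as lists so that repeated keys are not silently overwritten.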
Each speaker uttered the following items:
- connected digits (prompt sheet number, telephone number, credit card number)
- sequences of isolated digits (5 digits)
- answers to yes/no questions
- common application words and phrases
The speakers are distributed by age as follows: 36 speakers
are under 16, 537 speakers are between 16 and 30, 306 speakers are between 31
and 45, 259 speakers are between 46 and 60, 88 speakers are over 60, and the
age of 1 speaker is unknown.
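As a quick consistency check, the age groups and the gender split can be summed and compared with the stated total of 1,227 speakers. The variable names below are illustrative only.

```python
# Counts taken directly from this description.
age_groups = {
    "under 16": 36,
    "16-30": 537,
    "31-45": 306,
    "46-60": 259,
    "over 60": 88,
    "unknown": 1,
}
gender = {"male": 590, "female": 637}

age_total = sum(age_groups.values())
gender_total = sum(gender.values())

# Share of each age group as a percentage of all speakers.
age_share = {group: 100 * n / age_total for group, n in age_groups.items()}

print(age_total, gender_total)  # both 1227
```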
The transcription included in this database is an orthographic,
lexical transcription with a few additional markers for audible acoustic events
(speech and non-speech) present in the corresponding waveform files. This database
follows the SpeechDat recommendations.