S0092 : Portuguese SpeechDat(II) FDB-4000
The Portuguese SpeechDat(II) FDB-4000 database contains the
recordings of 4,027 Portuguese speakers (1,861 males, 2,166 females) recorded
over the Portuguese fixed telephone network. This database is partitioned into
11 CDs.
The speech is stored as 8-bit A-law samples at a sampling rate of 8 kHz.
Each prompted utterance is stored in a separate file. Each signal file is accompanied
by an ASCII SAM label file which contains the relevant descriptive information.
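As an illustration of the sample format, the following is a minimal sketch of how 8-bit A-law bytes could be expanded to 16-bit linear PCM, following the standard ITU-T G.711 A-law expansion. It assumes the signal files are headerless raw A-law data, which this description does not state; the function names `alaw_to_linear` and `decode_alaw` are hypothetical and are not part of the database distribution.

```python
# Sketch only: G.711 A-law expansion to 16-bit linear PCM.
# Assumption: signal files are raw A-law bytes with no header.

SAMPLE_RATE_HZ = 8000  # stated in the description above


def alaw_to_linear(a_val: int) -> int:
    """Expand one A-law byte (0..255) to a signed 16-bit linear value."""
    a_val ^= 0x55                  # undo the even-bit inversion used by A-law
    t = (a_val & 0x0F) << 4        # 4-bit mantissa
    seg = (a_val & 0x70) >> 4      # 3-bit segment (exponent)
    if seg == 0:
        t += 8
    elif seg == 1:
        t += 0x108
    else:
        t = (t + 0x108) << (seg - 1)
    # In A-law, a set sign bit marks a positive sample.
    return t if a_val & 0x80 else -t


def decode_alaw(data: bytes) -> list[int]:
    """Decode a buffer of A-law bytes into linear PCM samples."""
    return [alaw_to_linear(b) for b in data]


def duration_seconds(data: bytes) -> float:
    """One byte per sample, so duration is byte count over sample rate."""
    return len(data) / SAMPLE_RATE_HZ
```

With this layout, one second of speech occupies 8,000 bytes, so a file's duration can be estimated directly from its size.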
This speech database was validated by SPEX (the Netherlands)
to assess its compliance with the SpeechDat format and content specifications.
Each speaker uttered the following items:
- 1 isolated single digit
- 1 sequence of 10 isolated digits
- 4 connected digits (1 sheet number of 5 digits, 1 telephone number of 9/11 digits,
1 credit card number of 14/16 digits, 1 PIN code of 6 digits)
- 1 currency money amount
- 1 natural number
- 3 dates (1 spontaneous date e.g. birthday, 1 prompted date, 1 relative
or general date expression)
- 2 time phrases (1 spontaneous time of day, 1 word style time phrase)
- 3 spelled words (1 spontaneous e.g. own forename, 1 city name, 1 real
word for coverage)
- 5 directory assistance names (1 spontaneous e.g. own forename, 1 city
of birth/growing up, 1 frequent city name, 1 frequent company name, 1 common
forename and surname)
- 2 yes/no questions (1 predominantly "yes" question,
1 predominantly "no" question)
- 3 application words
- 1 keyword phrase using an embedded application word
- 4 phonetically rich words
- 9 phonetically rich sentences
The following age distribution has been obtained: 241 speakers
are under 16, 1,404 speakers are between 16 and 30, 1,532 speakers are between
31 and 45, 711 speakers are between 46 and 60, and 139 speakers are over 60.
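The age and gender figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts given in this description; the variable names are illustrative.

```python
# Counts taken from the description above.
age_groups = {
    "under 16": 241,
    "16-30": 1404,
    "31-45": 1532,
    "46-60": 711,
    "over 60": 139,
}
gender = {"male": 1861, "female": 2166}

total_by_age = sum(age_groups.values())
total_by_gender = sum(gender.values())

# Share of each age group, as a percentage of all speakers.
age_share = {
    group: round(100 * count / total_by_age, 1)
    for group, count in age_groups.items()
}

print(total_by_age, total_by_gender)
print(age_share)
```

Both totals come to 4,027, matching the stated number of speakers.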
A pronunciation lexicon with a phonemic transcription in SAMPA
is also included.