;;
;; TC-STAR 2007 ASR Evaluation submission protocol
;;

I> Introduction

This document describes the formats and naming conventions for the
2007 TC-STAR ASR evaluation.

II> Result submission

II.1> Naming conventions

An experiment is identified by the following name:

    EXP-ID := _____

where:

     = ibm | limsi | rwth | uka | irst | upc | atr | lium | daedalus
     = eval07
     = bn | parl
     = en | es | zh
     = restricted | public | open
     = primary | contrast-xxx | contrast-yyy

The possible combinations of the bn | parl field and the en | es | zh
field are:

    bn_zh
    parl_en
    parl_es

II.2> Systems description

For each experiment, a one-page system description must be provided
describing the data used, the approaches (algorithms), the
configuration, the processing time, etc.

The document should be structured as follows:

    1. EXP-ID
    2. Acoustic front-end
    3. Acoustic model
    4. Language model
    5. Recognition lexicon description
    6. Recognition process
    7. Execution time
    8. References

The file should be named .txt

II.3> Structure of the archive

For each experiment, the submitted archive should contain 2 files,
laid out as follows:

    /.txt
    /.ctm

The CTM file is a 5- or 6-column UNIX text file in the following format:

    FILENAME CHANNEL STARTTIME DURATION WORD [CONFIDENCE_SCORE]

Confidence scores are optional.

The file must be sorted by the first three columns: the first and the
second in ASCII order, and the third in numeric order. The UNIX sort
command:

    "sort +0 -1 +1 -2 +2nb -3"

will sort the words into the appropriate order.

Here is an example of a CTM file with a punctuation mark:

    ;;File ID                 Channel starttime duration word  confidence_score
    20050907_0900_1235_OR_SAT 1       322.768   0.120    who   0.9929
    20050907_0900_1235_OR_SAT 1       322.889   0.118    is    0.9908
    20050907_0900_1235_OR_SAT 1       323.011   0.194    with  0.9893
    20050907_0900_1235_OR_SAT 1       323.207   0.140    me    0.9874
    20050907_0900_1235_OR_SAT 1       323.353   0.470    today 0.9680
    20050907_0900_1235_OR_SAT 1       323.823   0.00     .

The archive should be a tgz, tar or zip file.

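The naming and CTM rules of section II can be checked mechanically before an archive is built. The Python sketch below is illustrative only and is not part of the evaluation protocol. It assumes that the six EXP-ID fields appear in the order in which their values are listed in II.1, that "contrast-xxx" stands for "contrast-" followed by a free label, and that lines starting with ";;" are comments. The function names (check_exp_id, parse_ctm_line, sort_ctm) and the field names used in the code are hypothetical. The sort mirrors the command quoted in II.3, which in the newer key syntax should correspond to "sort -k1,1 -k2,2 -k3nb,3".

```python
import re

# Values taken from II.1; the field names below are assumptions.
SITES = {"ibm", "limsi", "rwth", "uka", "irst", "upc", "atr", "lium", "daedalus"}
DOMAIN_LANG = {("bn", "zh"), ("parl", "en"), ("parl", "es")}
CONDITIONS = {"restricted", "public", "open"}


def check_exp_id(exp_id):
    """Return True if exp_id fits the six-field pattern assumed here."""
    parts = exp_id.split("_")
    if len(parts) != 6:
        return False
    site, evaluation, domain, lang, condition, system = parts
    return (
        site in SITES
        and evaluation == "eval07"
        and (domain, lang) in DOMAIN_LANG
        and condition in CONDITIONS
        # Assumption: "contrast-xxx" means "contrast-" plus any non-empty label.
        and re.fullmatch(r"primary|contrast-.+", system) is not None
    )


def parse_ctm_line(line):
    """Split one CTM line into typed fields; raise ValueError if malformed."""
    fields = line.split()
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 columns, got {len(fields)}")
    filename, channel, start, duration, word = fields[:5]
    # The confidence score is optional, so it may be absent.
    confidence = float(fields[5]) if len(fields) == 6 else None
    return filename, channel, float(start), float(duration), word, confidence


def sort_ctm(lines):
    """Skip blank and ';;' lines, then sort by file, channel, start time."""
    rows = [
        parse_ctm_line(line)
        for line in lines
        if line.strip() and not line.lstrip().startswith(";;")
    ]
    # File name and channel compare as strings (code-point order, which is
    # ASCII order for ASCII text); the start time compares numerically.
    return sorted(rows, key=lambda row: (row[0], row[1], row[2]))


if __name__ == "__main__":
    sample = [
        ";;File ID Channel starttime duration word confidence_score",
        "20050907_0900_1235_OR_SAT 1 323.011 0.194 with 0.9893",
        "20050907_0900_1235_OR_SAT 1 322.768 0.120 who 0.9929",
        "20050907_0900_1235_OR_SAT 1 323.823 0.00 .",
    ]
    for row in sort_ctm(sample):
        print(row)
    print(check_exp_id("limsi_eval07_parl_en_public_primary"))
```

Sorting on the parsed start time rather than on the raw text keeps a word starting at 99.5 ahead of one starting at 100.2, which a purely textual comparison of the third column would not.
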
III> Submission

Submissions must be sent by email to the following address:

    mostefa@elda.org

with the subject submission and with the archive file attached.
Each submission should be sent in a separate email.

The deadline is Sunday 28th of January, 23h59 CET
(5h59 pm for Pittsburgh and Yorktown).

A return receipt will be sent within 24 hours.