Development of Communication Support System using Lip Reading


This page presents several demonstration videos. All files are in MPEG-4 format. Click a file name to watch the video.

Table 1
file format size[MB] time[sec] note
extraction_process.m4v MPEG-4 2.35 22 This video shows the step-by-step extraction process. The speaker utters the five Japanese vowels /a/, /i/, /u/, /e/, and /o/. The yellow rectangle is the face region detected by the Viola-Jones face detector. The green lines with red dots are the result of the face AAM extraction, and the cyan lines with blue dots are the result of the lip AAM extraction.
utterance_section_detection.m4v MPEG-4 3.51 34 This video shows the automatic utterance section extraction. The speaker utters several phrases. Each pink region is an extracted utterance section.
registration_process.m4v MPEG-4 4.62 45 This video shows the registration mode. The user utters the phrase displayed at the top of the screen. A dialog is displayed after the utterance section is detected. The user presses the OK button, and the phrase is registered.
J50_JN(sitting)_A.m4v MPEG-4 4.43 43 This video shows the recognition experiment by speaker A. He was seated on a chair. The content of his utterances is unrelated to the sentence displayed at the top of the screen. He utters w09(/o-ha-yo-u-go-za-i-ma-su/), w22(/ko-n-ni-chi-wa/), w12(/o-me-de-to-u/), w48(/yo-ro-shi-ku-o-ne-ga-i-shi-ma-su/), w18(/ku-ri-ka-e-shi-o-ne-ga-i-shi-ma-su/), w31(/ta-su-ke-te-ku-da-sa-i/), and w39(/ma-ta-a-i-ma-syo-u/).
J50_JN(sitting)_B.m4v MPEG-4 10.1 98 This video shows the recognition experiment by speaker B. He was seated on a chair. He utters w01(/a-ta-ta-ka-i-de-su/), w04(/a-ri-ga-to-u/), w06(/i-i-te-n-ki-de-su/), w08(/o-ge-n-ki-de-su-ka/), w09(/o-ha-yo-u-go-za-i-ma-su/), w12(/o-me-de-to-u/), w13(/o-ya-su-mi-na-sa-i/), w19(/ge-n-ki-de-su/), w22(/ko-n-ni-chi-wa/), w25(/sa-yo-u-na-ra/), w29(/su-mi-ma-se-n/), and w30(/da-i-jo-u-bu-de-su/).
J50_JN(sitting)_C.m4v MPEG-4 5.11 50 This video shows the recognition experiment by speaker C. He was seated on a chair. He utters w21(/go-me-n-na-sa-i/), w22(/ko-n-ni-chi-wa/), w23(/ko-n-ba-n-wa/), w24(/sa-mu-i-de-su/), w25(/sa-yo-u-na-ra/), w26(/shi-tsu-re-i-shi-ma-su/), and w27(/shi-ri-ma-se-n/).
J50_JN(sitting)_D.m4v MPEG-4 4.77 46 This video shows the recognition experiment by speaker D. He was seated on a chair. He utters w04(/a-ri-ga-to-u/), w05(/i-i-de-su/), w06(/i-i-te-n-ki-de-su/), w07(/i-ssyo-ni-do-u-de-su-ka/), w36(/i-tsu-de-su-ka/), and w08(/o-ge-n-ki-de-su-ka/).
J50_JN(bed)_A.m4v MPEG-4 2.80 27 This video shows the recognition experiment by speaker A. He was in a supine position on a bed. He utters w09(/o-ha-yo-u-go-za-i-ma-su/), w22(/ko-n-ni-chi-wa/), w23(/ko-n-ba-n-wa/), and w13(/o-ya-su-mi-na-sa-i/).
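
The page does not state how utterance sections are detected. As a rough illustration only, the idea of marking a span of frames as an utterance section can be sketched as follows. This is a minimal sketch under stated assumptions, not the system's actual method: the per-frame lip-opening signal, the threshold, the minimum length, and the function name detect_utterance_sections are all hypothetical.

```python
from typing import List, Tuple


def detect_utterance_sections(
    lip_opening: List[float],
    threshold: float,
    min_frames: int = 3,
) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, end exclusive.

    Assumption: lip_opening holds one hypothetical lip-opening value per
    video frame. A section is a run of consecutive frames whose value
    exceeds threshold and that lasts at least min_frames frames.
    """
    sections: List[Tuple[int, int]] = []
    start = None  # index of the first frame of the current run, if any
    for i, value in enumerate(lip_opening):
        if value > threshold:
            if start is None:
                start = i  # a new run begins
        elif start is not None:
            # The run ended at frame i; keep it only if long enough.
            if i - start >= min_frames:
                sections.append((start, i))
            start = None
    # Close a run that extends to the last frame.
    if start is not None and len(lip_opening) - start >= min_frames:
        sections.append((start, len(lip_opening)))
    return sections


# Hypothetical signal: two long runs and one short spike at frame 7.
signal = [0.0, 0.1, 0.9, 0.8, 0.7, 0.1, 0.0, 0.9, 0.1, 0.6, 0.7, 0.8, 0.9]
sections = detect_utterance_sections(signal, threshold=0.5, min_frames=3)
print(sections)
```

The minimum-length check is a design choice for this sketch: it discards brief spikes, such as a single frame of lip movement, so that they are not reported as utterance sections.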