Photo/Illustration: Takeshi Saito, right, an associate professor at the Kyushu Institute of Technology, explains his AI-powered lip-reading app. (Kei Yoshida)

An app powered by artificial intelligence (AI) that can read speakers' lips even when they do not voice the words is available to those willing to take part in a study aimed at facilitating smooth conversation for people with speech difficulties.

The lip-reading AI technology, developed by Takeshi Saito, an associate professor of intelligence information studies at the Kyushu Institute of Technology, can recognize the numbers zero through nine as well as 15 words and phrases used in the study, such as “a-ri-ga-to-u” (thank you) and “ha-ji-me-ma-shi-te” (nice to meet you).

The app uses video to determine a user's words based on combinations of estimated syllables, which it derives by analyzing the movements of 20 points around the speaker's mouth.
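
The article does not describe the implementation, but the pipeline it outlines, tracking points around the mouth frame by frame and estimating syllables from their movement, can be sketched roughly in Python. This sketch assumes MediaPipe's face mesh as the landmark detector and an arbitrary choice of 20 lip indices; neither detail is confirmed by the source.

import cv2
import mediapipe as mp

# The article says the app tracks 20 points around the mouth; it does not
# name the landmark detector or the exact points, so MediaPipe's face mesh
# and this index list are assumptions made for illustration.
MOUTH_POINTS = [61, 291, 0, 17, 13, 14, 78, 308, 81, 311,
                178, 402, 37, 267, 84, 314, 40, 270, 91, 321]

def mouth_trajectories(video_path):
    """Collect per-frame (x, y) coordinates of the 20 mouth points."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    with mp.solutions.face_mesh.FaceMesh(max_num_faces=1) as mesh:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            result = mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if result.multi_face_landmarks:
                lm = result.multi_face_landmarks[0].landmark
                frames.append([(lm[i].x, lm[i].y) for i in MOUTH_POINTS])
    cap.release()
    # A downstream model (not shown) would map these trajectories to syllables.
    return frames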

For example, when the app interprets the syllables as “a, i, ga, to, u,” it can figure out that the intended word is “a-ri-ga-to-u” because “a, i, ga, to, u” has no meaning in Japanese.
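
As a toy illustration of this dictionary-matching step, one could score each vocabulary entry by its similarity to the estimated syllable sequence and pick the best match. The entries below are only a fragment of the study's vocabulary, and the third entry is invented for illustration; this is a sketch, not the app's actual method.

from difflib import SequenceMatcher

# Illustrative fragment of the vocabulary (the study covers the digits zero
# to nine plus 15 words and phrases; only a few entries are shown here, and
# "o-ha-yo-u" is a hypothetical addition).
VOCAB = [
    "a-ri-ga-to-u",        # thank you
    "ha-ji-me-ma-shi-te",  # nice to meet you
    "o-ha-yo-u",           # good morning (hypothetical entry)
]

def closest_word(syllables):
    """Return the vocabulary entry whose syllables best match the
    (possibly noisy) syllables estimated from lip movements."""
    return max(VOCAB,
               key=lambda w: SequenceMatcher(None, syllables, w.split("-")).ratio())

# The misread sequence "a, i, ga, to, u" still maps to "a-ri-ga-to-u".
print(closest_word(["a", "i", "ga", "to", "u"]))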

Because it analyzes only lip movements, the app can determine particular words even when no sound is produced.

After training the AI on data collected from 48 students, the study team achieved a correct interpretation rate of 71 percent.

The primary goal of the study is to develop technologies to help disabled people who have difficulty voicing words.

People who have lost their voices to laryngeal cancer or other causes can feel frustrated or lonely because they can still move their lips but are unable to speak, according to Saito.

In the field of voice recognition, researchers have developed AI technologies that can determine not only words but also whole sentences with a high correct interpretation rate.

Smartphones and car navigation systems employ voice-recognition technology equipped with AI to analyze a speaker's voice, follow instructions and even have a conversation.

Homonyms can also be read correctly by such AI-powered tech, which determines the meaning based on context by analyzing word sequences.
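
The mechanism being described, choosing among candidate readings by how well each fits the surrounding words, can be illustrated with a toy bigram model. All words and counts below are invented, and this is not the method of any particular product.

# Toy bigram counts: "hashi" in Japanese can mean "bridge" or "chopsticks,"
# and the preceding word usually settles which is meant. All words and
# counts here are invented for illustration.
BIGRAMS = {
    ("cross", "bridge"): 40,
    ("cross", "chopsticks"): 1,
    ("hold", "chopsticks"): 30,
    ("hold", "bridge"): 2,
}

def disambiguate(prev_word, candidates):
    """Pick the candidate sense that most often follows prev_word."""
    return max(candidates, key=lambda c: BIGRAMS.get((prev_word, c), 0))

print(disambiguate("cross", ["bridge", "chopsticks"]))  # -> "bridge"
print(disambiguate("hold", ["bridge", "chopsticks"]))   # -> "chopsticks"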

Research on technologies that recognize a speaker's utterances from the movements of facial features has also advanced.

By combining such technology with lip reading, Saito aims to develop a system that enables people who have difficulty speaking to hold smooth conversations.

In Saito's envisioned technology, when people who have lost their voices move their lips in front of a smartphone camera, AI will swiftly read the movements and convert them into sounds played over a speaker.

The technology could also be applied in other settings, such as car navigation systems that would otherwise fail to recognize a driver's instructions over noise from music or conversation among passengers.

It could also help conversations at noisy construction sites.

However, there are still challenges to overcome.

With the assistance of a hospital in Okayama Prefecture, the team used the app to analyze the lip movements of elderly people who had lost their voices. However, the correct interpretation rate for these elderly participants was considerably lower than that for the students.

The team attributed the gap to differences in how young and elderly people move their lips.

The angle of a speaker's face toward a camera also affects the reading performance of the app. If the angle changes significantly, it becomes difficult for the app to read lip movements.

Such issues are expected to be resolved by having the AI learn from a larger data set.

Collecting samples from a wide variety of people is thus one of the objectives of the study, and the app is being offered to the public to bring the lip-reading technology to as many people as possible.

To avoid mischief or other trouble, those who would like to try the app are required to register their name and address before joining the study.

"We offer a big welcome to people with a serious interest in the app," said Saito.

For more information, visit the official website (Japanese only) at https://demo.slab.ces.kyutech.ac.jp/.