Title :	Standalone Domain Specific Speech to Speech translator for English, Hindi and Tamil Languages
Area of research :	Computer Sciences and Information Technology
Focus area :	Natural Language Processing and Artificial Intelligence
Principal Investigator :	Dr. Mahadeva Prasanna, Indian Institute Of Technology (IIT) Dharwad, Karnataka
Timeline Start Year :	2023
Timeline End Year :	2028
Contact info :	prasanna@iitdh.ac.in

Details

Executive Summary :

The project aims to design and develop standalone, domain-specific speech-to-speech translators (S2ST) for English, Hindi and Tamil. The proposed S2ST is expected to run on an embedded device with limited computing power, memory, and battery life. Its operation will be demonstrated in the tourism domain as a case study. Although tourism is the domain considered here, the S2ST framework is generic and useful for many related activities such as travel, shopping, and hotel and taxi booking. The S2ST will follow the standard cascade of automatic speech recognition (ASR), machine translation (MT), and text-to-speech synthesis (TTS). Translation is from English to Hindi and Tamil. Based on the available resources and the commercial interest of industry partners, prototypes for these three languages (English, Hindi and Tamil) are in scope for this project; prototypes for other Indian languages will be considered in the next phase.

The working principle of the proposed domain-specific, standalone S2ST is as follows. The user (a tourist) who wants information about something speaks into the S2ST in English. The English ASR system converts the speech into English text. The MT system translates the English text into the target Indian language, Hindi or Tamil. The TTS system, the third and final module, synthesizes speech in the Indian language and plays it out. The listener understands the speech played in the Indian language and takes suitable action. The only constraint imposed by domain specificity is that the user's sentences stay mostly within the domain.

Several variants of the proposed standalone S2ST are possible. The first version will be a one-way S2ST, intended for cases where the listener only needs to perform the necessary action. The second version can improve performance by adding internet access wherever it is available, which will significantly increase the capability of the S2ST, since resources available over the network can be exploited. Work on all these variants will start in parallel, and prototype delivery will be prioritized as follows: standalone one-way translators first, then internet-enabled one-way translators.
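The ASR, MT and TTS cascade described in the summary can be illustrated with a minimal sketch. This is not the project's implementation: the class name `CascadeS2ST`, the language codes `hi` and `ta`, and the three stub stages are all hypothetical placeholders, chosen only to show how the output of one stage feeds the next. Real ASR, MT and TTS models would replace the stubs.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical language codes for the two target languages named in the summary.
SUPPORTED_TARGETS = {"hi", "ta"}  # Hindi, Tamil


@dataclass
class CascadeS2ST:
    """One-way English-to-target speech translator built from three stages."""
    asr: Callable[[bytes], str]        # English audio -> English text
    mt: Callable[[str, str], str]      # (English text, target) -> target text
    tts: Callable[[str, str], bytes]   # (target text, target) -> target audio

    def translate(self, audio: bytes, target_lang: str) -> bytes:
        if target_lang not in SUPPORTED_TARGETS:
            raise ValueError(f"unsupported target language: {target_lang}")
        text_en = self.asr(audio)                 # stage 1: recognition
        text_tgt = self.mt(text_en, target_lang)  # stage 2: translation
        return self.tts(text_tgt, target_lang)    # stage 3: synthesis


# Stub stages (assumptions for illustration only, not real models).
def stub_asr(audio: bytes) -> str:
    # Pretend the audio bytes already hold the spoken English text.
    return audio.decode("utf-8")


def stub_mt(text: str, target_lang: str) -> str:
    # Tag the text instead of translating it.
    return f"[{target_lang}] {text}"


def stub_tts(text: str, target_lang: str) -> bytes:
    # Pretend the encoded text is the synthesized waveform.
    return text.encode("utf-8")


pipeline = CascadeS2ST(asr=stub_asr, mt=stub_mt, tts=stub_tts)
output_audio = pipeline.translate(b"where is the hotel", "hi")
```

Keeping each stage behind a plain callable means a standalone on-device model and an internet-backed one could be swapped in without changing the pipeline, which matches the standalone and internet-enabled variants described above.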

Co-PI:

Dr. Kavi Mahesh, Indian Institute Of Information Technology Dharwad, Karnataka-580009
Dr. Gayathri Ananthanarayanan, Indian Institute Of Technology Dharwad, Karnataka, Dharwad-580011
Dr. Bharath B N, Indian Institute Of Information Technology Dharwad, Karnataka-580009
Prof. Nagarajan T, SSN College Of Engineering, Kalavakkam, Tamil Nadu-603110
Dr. Sunil Soumya, Indian Institute Of Information Technology Dharwad, Karnataka-580009
Dr. Deepak K T, Indian Institute Of Information Technology Dharwad, Karnataka-580009
Prof. Vijayalakshmi P, SSN College Of Engineering, Kalavakkam, Tamil Nadu-60311
Dr. Rajshekhar V Bhat, Indian Institute Of Technology Dharwad, Karnataka-580011

Total Budget (INR):

2,23,67,858

Organizations involved

Implementing Agency :	Indian Institute Of Technology Dharwad
Funding Agency :	Anusandhan National Research Foundation/Science and Engineering Research Board
Source :	Anusandhan National Research Foundation/Science and Engineering Research Board (SERB), DST 2023-24