Computer Sciences and Information Technology
Title : | Automatic Parts-of-Speech Tagger Based on BIS Tagset in Assamese |
Area of research : | Computer Sciences and Information Technology |
Focus area : | Computational Linguistics, Artificial Intelligence |
Principal Investigator : | Dr. Nomi Baruah, Dibrugarh University, Assam |
Timeline Start Year : | 2023 |
Timeline End Year : | 2026 |
Contact info : | baruahnomi@gmail.com |
Details
Executive Summary : | Part-of-speech (POS) tagging is a challenging task in Natural Language Processing (NLP) because it requires deep linguistic knowledge of the specific language, particularly when applied to large volumes of data. Despite the growing body of work on POS tagging for Indian languages such as Hindi and Bengali, resources for Assamese remain scarce, even though it is one of India's officially recognised languages, with 15.3 million speakers worldwide. As NLP research on Assamese grows, a high-accuracy automatic POS tagger is needed. A dataset annotated with the BIS tagset will be developed from Assamese novels, news articles, and sports texts, which will be among the pioneering works of its kind for Assamese and Indian languages. The POS tagger will be implemented using RNN-based deep learning methods and a newly designed hybrid method. The outputs and performance of these methods will be critically analyzed to assess their effectiveness. |
Total Budget (INR): | 13,69,500 |
Organizations involved