Title :

Online Subset Selection Algorithms for Data-centric Responsible AI and Efficient Computer Vision

Area of research :

Engineering Sciences

Principal Investigator :

Dr. Sourangshu Bhattacharya, Indian Institute of Technology (IIT) Kharagpur, West Bengal

Timeline Start Year :

2024

Timeline End Year :

2027

Executive Summary :

AI techniques are increasingly used in practical applications, and two aspects are gaining prominence: developing robust and responsible models, and building efficient AI models for deployment on edge devices. A data-centric approach to responsible AI requires explanations of model predictions in terms of training datapoints, measured by different value functions calculated on a validation dataset. The first task is Data Valuation, which involves selecting and ranking training datapoints so that the resulting model has high value. Designing data subset selection algorithms that can scale to large training datasets and complex models is a key challenge. The second task is designing a filter selection/pruning scheme for training smaller models focused on a given task defined by a value function. Pruning reduces computational cost and overall power consumption, and increases the number of frames processed per second on edge devices. The key challenge is determining the optimal number of CNN filters that can be pruned from each layer of the deep neural network to arrive at an optimal configuration. Both tasks involve selecting a subset of items (datapoints or filters) from a given large set so that the trained model has high value with respect to a given value function. Techniques such as submodular optimization, sparse approximation, and convex optimization are used to solve these problems. The project will further study subset selection algorithms in online settings, where items arrive sequentially rather than being available up front. It also addresses the bilevel optimization problem, where the first level is to learn optimal parameters and the second level is to select the optimal data subset that optimizes the value function.
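
As an illustration of the subset selection problem described above, the following is a minimal sketch of greedy selection under a budget. It is not the project's algorithm. The facility-location value function, the toy similarity matrix, and the names `facility_location` and `greedy_select` are all assumptions chosen for illustration. Facility location is a standard example of a monotone submodular function, for which greedy selection is a common baseline.

```python
from typing import Callable, List, Sequence, Set


def facility_location(similarity: Sequence[Sequence[float]]) -> Callable[[Set[int]], float]:
    """Build a value function f(S) = sum_i max_{j in S} similarity[i][j].

    Hypothetical stand-in for a value function; row i is a reference item
    and column j is a candidate item that may be selected.
    """
    def value(subset: Set[int]) -> float:
        if not subset:
            return 0.0
        return sum(max(row[j] for j in subset) for row in similarity)

    return value


def greedy_select(n_items: int, value: Callable[[Set[int]], float], budget: int) -> List[int]:
    """Repeatedly add the item with the largest marginal gain in value."""
    selected: List[int] = []
    chosen: Set[int] = set()
    current = value(chosen)
    for _ in range(min(budget, n_items)):
        best_item, best_gain = -1, 0.0
        for j in range(n_items):
            if j in chosen:
                continue
            gain = value(chosen | {j}) - current
            if gain > best_gain:
                best_item, best_gain = j, gain
        if best_item < 0:
            break  # no remaining item improves the value
        chosen.add(best_item)
        selected.append(best_item)
        current += best_gain
    return selected


# Toy data (assumed): items 0 and 1 are similar to each other, as are items 2 and 3.
similarity = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.6],
    [0.0, 0.1, 0.8, 1.0],
]
f = facility_location(similarity)
subset = greedy_select(len(similarity), f, budget=2)
print(subset, f(set(subset)))
```

On this toy input the greedy pass picks one item from each similar group, which is the behaviour a diversity-rewarding value function is meant to encourage. An online variant would have to decide on each item as it arrives instead of scanning all remaining candidates.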

Total Budget (INR):

28,27,000

Organizations involved