Title :

Machine learning model of specificity anchors and clustered binding sites to unravel DNA binding of proteins and drugs

Area of research :

Life Sciences & Biotechnology

Principal Investigator :

Dr. Devesh Bhimsaria, Indian Institute of Technology (IIT) Roorkee, Uttarakhand

Timeline Start Year :

2024

Timeline End Year :

2026

Contact info :

Equipment :

Executive Summary :

Proteins and DNA-binding small-molecule drugs interact with DNA through several types of contacts, and these interactions are essential for regulating gene transcription and controlling cellular processes. DNA-binding proteins, mainly transcription factors (TFs), have a unique DNA-binding domain (DBD) that recognizes a specific DNA sequence or motif. The nucleotides of this motif are called "specificity anchors", and the binding affinity at these anchors varies with the properties of the neighboring DNA, such as shape and flexibility. A single model, most often a position weight matrix (PWM), is typically used to capture both binding specificity and affinity, but it cannot separate the individual contributions of the two or predict their effect on binding. The project aims to develop a deep neural network-based machine learning model that relates DNA binding to specificity anchors and DNA properties for improved predictions.

For many TFs and sequence-specific drugs, a single binding site is not enough to trigger binding, and a cluster of binding sites is required. The project therefore also aims to create a clustered binding site model using linear regression and neural networks, modeling the binding of TFs and small-molecule drugs according to their preference for a single or a clustered binding site pattern. This distance-based model will help study diseases caused by repeat expansion and support targeted therapeutics.

An improved model of binding affinity and clustered binding sites will help predict DNA binding inside cells, improve related genetic networks, find associations between TFs and genetic diseases caused by mutations in non-coding regions, and design DNA-binding drugs.
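For readers unfamiliar with the PWM baseline mentioned in the summary, the following is a minimal Python sketch of how a PWM scores candidate sites and how nearby sites could be counted as a cluster. It is not the project's model. The 4-bp motif, its probabilities, the uniform background, the score threshold, and all function names (`build_pwm`, `scan`, `site_positions`, `count_clustered_pairs`) are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical position frequency matrix for a 4-bp motif (illustrative values only).
# Each row holds the assumed probability of A, C, G, T at one motif position.
PFM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background frequency for every base


def build_pwm(pfm, background=BACKGROUND):
    """Convert base probabilities into log2-odds weights against the background."""
    return [{base: math.log2(p / background) for base, p in row.items()} for row in pfm]


def scan(sequence, pwm):
    """Return (start, score) for every window of motif width in the sequence."""
    width = len(pwm)
    return [
        (start, sum(pwm[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


def site_positions(sequence, pwm, threshold):
    """Start positions of windows whose PWM score reaches the threshold."""
    return [start for start, score in scan(sequence, pwm) if score >= threshold]


def count_clustered_pairs(positions, max_gap):
    """Count pairs of sites whose start positions lie within max_gap base pairs."""
    return sum(
        1
        for i, p in enumerate(positions)
        for q in positions[i + 1:]
        if q - p <= max_gap
    )


if __name__ == "__main__":
    pwm = build_pwm(PFM)
    sites = site_positions("AGCTTTAGCT", pwm, threshold=4.0)
    print("sites:", sites)
    print("clustered pairs within 10 bp:", count_clustered_pairs(sites, max_gap=10))
```

Note that the PWM sums independent per-position weights into a single score, which is why it cannot by itself separate the contribution of the anchor nucleotides from that of the neighboring DNA. The pair count is only one simple way to express a distance-based cluster feature.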

Total Budget (INR):

24,75,000

Organizations involved