Executive Summary : | Transcription factors (TFs) are proteins that regulate gene transcription and therefore play a crucial role in the transfer of genetic information from DNA to messenger RNA. A single TF can regulate many genes at once in a given cell type, and the sites it binds are not identical in sequence. Discovering these transcription factor binding sites (TFBSs) can help explain gene expression and the mechanisms of diseases such as COVID-19, which is caused by SARS-CoV-2. Numerous computational techniques exist to identify TFBSs in DNA sequences, but they have drawbacks, especially when applied to the large volumes of data produced by ChIP-seq technology. Computational modeling that predicts TFBSs is an alternative, and deep learning has proved a successful method for this purpose. Transformer-based learning has been used successfully in a variety of tasks, including textual entailment, learning task-independent sentence representations, reading comprehension, and abstractive summarization. This study aims to identify TFBSs in human genes that interact with the SARS-CoV-2 spike glycoprotein, identify the TFs corresponding to enriched genes, and validate the binding between TFs and human genes at the TFBSs using in-silico methods. Understanding the role of TFs in driving differential regulation in COVID-19 patients is essential for uncovering the mechanisms of viral infection. The project will focus on: (1) using Transformer-based learning to predict TFBSs in human genes that interact with the SARS-CoV-2 spike glycoprotein; (2) selecting human genes enriched in the predicted TFBSs; (3) identifying the TFs corresponding to the enriched human genes; (4) validating the binding between TFs and human genes at the TFBSs using in-silico analysis; (5) applying enrichment analysis to the enriched human genes; and (6) extending the work to tropical diseases such as chikungunya and dengue. |