Title :

Enabling Interactive Big Data Analytics and Visualization at Exascale via In Situ Efficient Statistical Data Modeling

Area of research :

Engineering Sciences

Principal Investigator :

Prof. Soumya Dutta, Indian Institute of Technology Kanpur (IITK), Uttar Pradesh

Timeline Start Year :

2023

Timeline End Year :

2025

Contact info :

Equipment :

Executive Summary :

The coming exascale computing era holds the key to solving grand-challenge science problems such as accurate weather prediction, understanding the origin of the universe, developing commercially viable fusion energy, and understanding complex phenomena in aerospace engineering and combustion science. However, as we build the nation's first exascale machine, we must address the data deluge that exascale applications will bring. Traditional analysis strategies can move only a subset of the data to storage for in-depth offline analysis, so large portions of the data must be discarded without ever being analyzed. Solving the grand-challenge problems, however, requires access to simulation output at a much finer spatiotemporal resolution than can be stored on disk, and this gap disrupts the scientific discovery process. This research proposes a practical solution to the extreme-scale data problem by developing novel, storage-efficient statistical data models that are generated during the simulation run itself. This paradigm of processing data while the simulation is running, known as in situ analysis, minimizes expensive data movement and copes with the extreme data size. The research will focus on developing scalable in situ distribution-based data models that work across spatial, (multi)variable, and temporal domains, producing statistically salient and compact data models that are significantly smaller than the raw simulation output. GPU-accelerated probabilistic data analytics and visualization methods will be built on top of these distribution-based data models and will run on standard desktop workstations to answer domain-specific science questions.
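As a rough illustration of what a distribution-based data model can look like, the sketch below summarizes each block of a one-dimensional field as a fixed-size histogram and then answers a probabilistic query from the histograms alone. It is a minimal sketch under stated assumptions, not the project's method: the function names `block_histograms` and `prob_above`, the block size, the bin count, and the synthetic Gaussian field are all hypothetical choices made for this example.

```python
import random


def block_histograms(values, block_size, num_bins, lo, hi):
    """Summarize each block of `values` as bin counts over the range [lo, hi]."""
    width = (hi - lo) / num_bins
    models = []
    for start in range(0, len(values), block_size):
        counts = [0] * num_bins
        for v in values[start:start + block_size]:
            # Clamp the index so that v == hi lands in the last bin.
            idx = min(int((v - lo) / width), num_bins - 1)
            counts[max(idx, 0)] += 1
        models.append(counts)
    return models


def prob_above(counts, threshold, lo, hi):
    """Estimate P(value >= threshold) from one histogram.

    Assumes values are spread uniformly inside each bin.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    width = (hi - lo) / len(counts)
    mass = 0.0
    for i, c in enumerate(counts):
        left = lo + i * width
        right = left + width
        if threshold <= left:
            mass += c  # whole bin lies above the threshold
        elif threshold < right:
            mass += c * (right - threshold) / width  # partial bin
    return mass / total


# Synthetic stand-in for one time step of simulation output (an assumption).
random.seed(0)
field = [random.gauss(0.0, 1.0) for _ in range(4096)]
lo, hi = min(field), max(field)

# In a real in situ setting this step would run inside the simulation loop,
# and only `models` would be written to storage.
models = block_histograms(field, block_size=512, num_bins=16, lo=lo, hi=hi)
stored_numbers = sum(len(m) for m in models)

# Post hoc query answered from the compact models, without the raw field.
p_first_block = prob_above(models[0], 1.0, lo, hi)
```

Here 4096 raw values are reduced to 8 histograms of 16 counts each. The trade-off is that queries such as `prob_above` return estimates whose accuracy depends on the bin count and on how well the within-bin uniformity assumption holds.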

Total Budget (INR):

29,67,630

Organizations involved