Executive Summary

This research project aims to develop and evaluate compression techniques for deep neural networks in Artificial Intelligence of Things (AIoT) applications, addressing the challenge of deploying such networks on low-powered devices with limited memory and processing capability. The scientific objective is to investigate how effectively individual model compression techniques, and combinations of them, reduce network size while maintaining acceptable accuracy, inference time, and energy consumption. The hypothesis is that combining techniques such as pruning, quantization, weight sharing, knowledge distillation, low-rank factorization, structured sparsity, and compact convolutional filters can significantly reduce the size of deep neural networks without unacceptable loss of performance. The main experiments will evaluate different combinations of these techniques on benchmark datasets and AIoT applications, running on low-powered machine-learning-capable IoT development boards. The project's significance lies in its potential to advance AIoT by enabling the deployment of deep neural networks on resource-constrained devices. The research outcomes can inform the development of new compression techniques optimized for specific AIoT applications and guide the design of future machine-learning-capable IoT development boards.
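To make two of the listed techniques concrete, the sketch below illustrates magnitude pruning (zeroing the smallest-magnitude weights) and linear 8-bit quantization on a flat list of weights. This is a minimal, hypothetical illustration: the function names, the 50% sparsity target, and the uint8 range are assumptions for exposition, not the project's actual method or any specific library's API.

```python
def prune_by_magnitude(weights, sparsity=0.5):
    """Zero out roughly `sparsity` fraction of weights, smallest magnitudes first.

    Ties at the threshold may prune slightly more than the target fraction;
    a real implementation would typically break ties deterministically.
    """
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]


def quantize_uint8(weights):
    """Map float weights linearly onto integers 0..255; return (ints, scale, offset)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 if hi > lo else 1.0
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo


def dequantize(q, scale, lo):
    """Recover approximate float weights from the quantized representation."""
    return [v * scale + lo for v in q]


if __name__ == "__main__":
    w = [0.8, -0.05, 0.3, -0.6, 0.02, 0.1]
    pruned = prune_by_magnitude(w, sparsity=0.5)   # half the weights become 0.0
    q, scale, lo = quantize_uint8(pruned)          # each weight now fits in one byte
    restored = dequantize(q, scale, lo)            # approximate reconstruction
```

In practice the two techniques compose: pruning produces sparsity that can be stored compactly, and quantization shrinks each remaining weight from 32-bit float to 8-bit integer, so their size reductions multiply rather than merely add.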