Adaptive Compression of Supervised and Self-Supervised Models for Green Speech Recognition
Conference: Communications with proceedings at an international congress
Computational power is crucial for the development
and deployment of artificial intelligence capabilities, as the large
size of deep learning models often requires significant resources.
Compression methods aim to reduce model size, making artificial
intelligence more sustainable and accessible. However, compression techniques are often applied uniformly across model layers, without
considering their individual characteristics. In this paper, we
introduce a customized approach that optimizes compression for
each layer individually. Some layers undergo both pruning and
quantization, while others are only quantized, with fuzzy logic
guiding these decisions. The quantization precision is further adjusted based on the importance of each layer. Our experiments on
both supervised and self-supervised models using the LibriSpeech
dataset show only a slight decrease in performance, with a
memory footprint reduction of about 85%.
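
The abstract describes per-layer decisions (prune and quantize vs. quantize only) guided by fuzzy logic, with the quantization bit-width scaled by layer importance. The following is a minimal illustrative sketch of that idea, not the authors' implementation: the importance metric (mean absolute weight), the membership thresholds, the pruning ratio, and the 4-to-8-bit range are all assumptions chosen for the example.

```python
# Illustrative sketch (assumed names and thresholds, not the paper's code):
# per-layer compression where fuzzy-style membership scores decide whether a
# layer is pruned before quantization or only quantized, and where the
# quantization bit-width grows with the layer's estimated importance.
import numpy as np


def layer_importance(weights: np.ndarray) -> float:
    """Proxy importance score: mean absolute weight (an assumption, not the paper's metric)."""
    return float(np.mean(np.abs(weights)))


def fuzzy_memberships(score: float, low: float = 0.01, high: float = 0.05):
    """Simple linear memberships for 'low' and 'high' importance (illustrative thresholds)."""
    m_high = float(np.clip((score - low) / (high - low), 0.0, 1.0))
    return 1.0 - m_high, m_high  # (membership_low, membership_high)


def compress_layer(weights: np.ndarray):
    """Prune less important layers, then quantize with an importance-dependent bit-width."""
    score = layer_importance(weights)
    m_low, m_high = fuzzy_memberships(score)
    w = weights.copy()

    # Less important layers: prune the smallest-magnitude weights before quantizing.
    pruned = m_low > 0.5
    if pruned:
        threshold = np.quantile(np.abs(w), 0.5)  # prune 50% of weights (illustrative ratio)
        w[np.abs(w) < threshold] = 0.0

    # Bit-width interpolated between 4 and 8 bits from the 'high importance' membership.
    bits = int(round(4 + 4 * m_high))
    half_levels = 2 ** (bits - 1) - 1
    w_max = float(np.max(np.abs(w))) or 1.0
    q = np.round(w / w_max * half_levels)        # symmetric uniform quantization
    w_quant = q / half_levels * w_max
    return w_quant, bits, pruned


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical layer names and weight scales, only to exercise both branches.
    for name, scale in [("encoder.layer1", 0.005), ("encoder.layer6", 0.08)]:
        w = rng.normal(0.0, scale, size=(256, 256))
        _, bits, pruned = compress_layer(w)
        print(f"{name}: {bits}-bit quantization, pruned={pruned}")
```

In this sketch the fuzzy memberships simply interpolate between two behaviors; the paper's actual rule base, importance measure, and bit-width schedule may differ.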