OPTIMIZATION OF MEMORY CONSUMPTION WHEN WORKING WITH NEURAL NETWORKS FOR SCANNING

Authors

H. SEREDIUK, V. GARMASH

DOI:

https://doi.org/10.31891/2307-5732-2025-355-76

Keywords:

deep neural networks, quantization, model pruning, compression, image scanning, optical character recognition, memory optimization

Abstract

This paper investigates memory optimization methods for deep neural networks in image scanning and text recognition tasks. The research examines the application of neural networks to optical character recognition (OCR), document analysis, barcode scanning, and QR code identification, all of which are critical components of modern scanning systems. Because mobile and embedded devices impose strict memory constraints, optimizing these models is essential for practical deployment. The study systematically analyzes three approaches to memory optimization: network pruning, weight quantization, and model compression techniques. Network pruning eliminates connections with negligible weight values, converting dense weight matrices into sparse representations. Quantization reduces the precision of weight representation from 32-bit floating-point numbers to 8-bit integers, decreasing model size by a factor of four. Huffman coding provides additional compression by assigning shorter codes to frequently occurring weight values. Experimental results confirm that a combined approach integrating pruning, quantization, and Huffman coding can reduce model size by 35-49 times while keeping accuracy degradation below 1%. A detailed comparative analysis of Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT) shows that QAT better preserves accuracy (0.3% loss versus 0.5% for PTQ). For ResNet-50 adapted for document scanning, combining 90% connection pruning with 8-bit QAT reduces memory requirements by 40 times at the cost of only 0.9% accuracy. In practical terms, energy consumption is reduced by 56% and inference speed improves by 43%, making these optimization methods particularly valuable for portable scanning devices operating under real-time constraints. The research demonstrates that even complex neural network architectures can be effectively deployed on resource-constrained devices through appropriate optimization techniques.
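As a concrete illustration of the pruning and quantization steps described above, the following minimal NumPy sketch applies 90% magnitude pruning followed by 8-bit affine quantization to a single weight matrix. The matrix size, sparsity level, and function names are illustrative assumptions and do not reproduce the authors' implementation.

import numpy as np

def prune_by_magnitude(weights, sparsity):
    # Zero out the fraction `sparsity` of weights with the smallest magnitudes,
    # turning the dense matrix into a sparse representation.
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize_affine_uint8(weights):
    # Post-training affine quantization: map float32 values to 8-bit integers
    # with a per-tensor scale and zero point (4x smaller storage than float32).
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0 or 1.0
    zero_point = int(round(-w_min / scale))
    q = np.clip(np.round(weights / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # Recover approximate float32 weights for inference.
    return (q.astype(np.float32) - zero_point) * scale

# Hypothetical 512x512 dense layer: prune 90% of connections, then quantize to 8 bits.
w = np.random.randn(512, 512).astype(np.float32)
w_pruned = prune_by_magnitude(w, sparsity=0.9)
q, scale, zp = quantize_affine_uint8(w_pruned)
print("max reconstruction error:", np.abs(dequantize(q, scale, zp) - w_pruned).max())

Huffman coding, the third stage described in the abstract, would then be applied to the stream of quantized values; it benefits from the fact that the pruned zeros and a small number of quantization levels dominate the value distribution.
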

Published

2025-08-28

How to Cite

SEREDIUK, H., & GARMASH, V. (2025). OPTIMIZATION OF MEMORY CONSUMPTION WHEN WORKING WITH NEURAL NETWORKS FOR SCANNING. Herald of Khmelnytskyi National University. Technical Sciences, 355(4), 542-545. https://doi.org/10.31891/2307-5732-2025-355-76