METHODOLOGY FOR OPTIMIZING A NEURAL NETWORK WITH INTEGRATED PERCEPTRON COMPONENTS IMPLEMENTED ON FPGA
DOI: https://doi.org/10.31891/2307-5732-2025-351-51

Keywords: neural network, perceptron, digital components, probabilistic estimates, field-programmable gate arrays (FPGA)

Abstract
This work develops a new methodology for optimizing a neural network so that it can perform real-time inference on field-programmable gate arrays (FPGAs). The central idea is to shift part of the computation from online inference to the offline training stage using explainable-AI methods. Estimating the importance of each feature in advance induces sparsity in the data and reduces the cost of local computation, so the local neural network can be made significantly lighter, substantially lowering its memory, energy, and compute requirements while maintaining high inference accuracy.

The methodology accounts for the heterogeneity of the input data, separating key features, which are retained locally, from less important ones, which are transmitted to a remote server for further processing. The local and remote neural networks are integrated through a unified loss function that combines an accuracy criterion with a requirement for asymmetry in feature importance. Preliminary processing by a feature extractor, together with adaptive network partitioning, yields an optimal balance between inference accuracy and computational cost. The study demonstrates that the proposed approach can reduce inference latency to tens of milliseconds, which is critical for IoT, autonomous robotics, medical devices, and other resource-constrained systems.

Particular attention is paid to incorporating both single-layer and multi-layer perceptrons as primary components of the neural networks. They can be used in the stages of preliminary signal and data processing, in particular for assessing data heterogeneity, as well as in forming local inferences. Improved energy efficiency and optimized use of local memory further expand the scope for deploying advanced artificial-intelligence techniques directly at the FPGA level. The methodology enables precise feature attribution using integrated gradients, which supports adaptive tuning of the parameters of both the local and remote networks, minimizing data-transmission delays and reducing energy consumption. Overall, the contribution of this work is a flexible neural-network architecture that effectively balances inference accuracy against computational cost, opening new prospects for the practical application of AI across the modern IT industry.
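The abstract names integrated gradients as the offline feature-attribution step. The sketch below is a minimal, illustrative rendering of that step, not the authors' implementation: the stand-in model net, the zero baseline, and the top-k retention rule are all assumptions introduced here for illustration.

    # Hypothetical sketch: integrated-gradients attribution computed offline to
    # decide which input features the lightweight FPGA-side network retains.
    import torch

    def integrated_gradients(net, x, baseline=None, steps=50):
        """Approximate IG attributions for a single 1-D input x."""
        if baseline is None:
            baseline = torch.zeros_like(x)
        # Riemann approximation of the path integral from baseline to x.
        alphas = torch.linspace(0.0, 1.0, steps).unsqueeze(1)   # (steps, 1)
        path = baseline + alphas * (x - baseline)               # (steps, d)
        path.requires_grad_(True)
        out = net(path).sum()                                   # scalar for grad
        grads = torch.autograd.grad(out, path)[0]               # (steps, d)
        return (x - baseline) * grads.mean(dim=0)               # (d,)

    # Toy usage with a small stand-in MLP and a random input.
    torch.manual_seed(0)
    net = torch.nn.Sequential(torch.nn.Linear(16, 8), torch.nn.ReLU(),
                              torch.nn.Linear(8, 1))
    x = torch.randn(16)
    phi = integrated_gradients(net, x)
    # Keep the k most important features on the FPGA; offload the rest.
    k = 4
    local_idx = phi.abs().topk(k).indices
    print("features retained locally:", sorted(local_idx.tolist()))

Because this ranking is computed once, during offline training, the FPGA never pays for attribution at inference time; it only sees the already-pruned feature set.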
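The unified loss function mentioned above is not spelled out on this page. One plausible reading, offered here only as a hedged sketch, combines a standard accuracy term with a regularizer that penalizes attribution mass on features offloaded to the remote server; the weight \lambda and the normalized-attribution form are assumptions, not the authors' exact formulation:

    \mathcal{L} = \mathcal{L}_{\mathrm{acc}}(y, \hat{y})
                  + \lambda \cdot \frac{\sum_{i \in R} |\phi_i|}{\sum_{j} |\phi_j|}

Here \phi_i is the integrated-gradients attribution of feature i and R indexes the features transmitted to the remote network. Minimizing the second term drives attribution toward the locally retained features, which is one way to realize the asymmetry in feature importance that the abstract requires.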
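The abstract also emphasizes single-layer perceptrons as lightweight FPGA building blocks. As a minimal sketch of why they are cheap to host in fabric, the following quantizes a trained perceptron to 8-bit fixed point and evaluates it with integer arithmetic only; the Q-format, weights, and helper names are illustrative assumptions, not the paper's design.

    # Hypothetical sketch: 8-bit fixed-point evaluation of a single-layer
    # perceptron, the kind of integer MAC datapath that maps well to FPGA fabric.
    import numpy as np

    def to_fixed_point(w, frac_bits=6, word_bits=8):
        """Round values to signed fixed point with frac_bits fractional bits."""
        scale = 1 << frac_bits
        lo, hi = -(1 << (word_bits - 1)), (1 << (word_bits - 1)) - 1
        return np.clip(np.round(w * scale), lo, hi).astype(np.int16), scale

    # Stand-in trained float weights for a 4-input perceptron.
    w = np.array([0.42, -1.10, 0.05, 0.77])
    b = 0.3
    wq, scale = to_fixed_point(w)
    bq, _ = to_fixed_point(np.array([b]))

    def perceptron_fixed(x_q):
        """Integer multiply-accumulate plus threshold, no floating point."""
        acc = int(np.dot(wq.astype(np.int32), x_q)) + int(bq[0]) * scale
        return 1 if acc >= 0 else 0

    x = np.array([0.5, 0.1, -0.3, 0.9])
    x_q, _ = to_fixed_point(x)
    print("fixed-point output:", perceptron_fixed(x_q))
    print("float reference  :", int(np.dot(w, x) + b >= 0))

The sign-threshold decision is scale-invariant, so the quantized unit agrees with the float reference while storing each weight in a single byte, which is where the memory and energy savings claimed for the local network come from.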
License
Copyright (c) 2025 ВАСИЛЬ ШЕКЕТА, ВОЛОДИМИР ПІХ, МАРЯН СЛАБІНОГА, ЮРІЙ СТРІЛЕЦЬКИЙ, МАРІЯ ПІХ (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.