PRINCIPLE OF LONG-AND-CLOSE-RANGE ACTION IN STRUCTURIZATION PROBLEMS AND TRAINING OF ARTIFICIAL NEURAL NETWORKS
DOI: https://doi.org/10.31891/2307-5732-2024-337-3-54

Keywords: Artificial Neural Networks, long-and-close-range action principle, unsupervised learning, nonlinear convolutional networks, parametric sigmoid, transition matrices

Abstract
Classical artificial neural networks in the general case require learning a significant number of transition-matrix parameters between adjacent layers. The main idea of this work is to set these matrices once and rigidly according to certain "reasonable considerations", so that only the neurons of the network themselves need to be trained. Here the principle of long-and-close-range action plays the role of such a "reasonable consideration". The essence of this principle is that the closer two layers are to each other, the stronger the influence between their neurons. A radial topology is proposed for the geometric arrangement of the layers. This arrangement ensures the balance condition of the network: the total effect of the neurons of the previous layer on each neuron of the adjacent layer is constant, regardless of the neurons' ordinal numbers.
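The fixed, distance-dependent transition matrix described above can be sketched as follows. This is only an illustration of the idea, not the article's actual construction: the exponential decay law, the placement of neurons at equal angles on concentric circles, and the row normalization that enforces the balance condition are all assumptions made here for concreteness.

```python
import numpy as np

def fixed_transition_matrix(n_prev, n_next, decay=1.0):
    """Hypothetical distance-based transition matrix (a sketch).

    Neurons of each layer are placed at equidistant angles on
    concentric circles (a radial topology); the connection weight
    decays exponentially with the angular distance between neurons,
    so closer neurons influence each other more strongly.
    """
    # Angular positions of neurons on the two circles.
    a_prev = 2 * np.pi * np.arange(n_prev) / n_prev
    a_next = 2 * np.pi * np.arange(n_next) / n_next
    # Pairwise angular distance, wrapped into [0, pi].
    d = np.abs(a_next[:, None] - a_prev[None, :])
    d = np.minimum(d, 2 * np.pi - d)
    w = np.exp(-decay * d)  # closer => stronger influence
    # Normalize each row so the total incoming effect is the same
    # for every neuron of the next layer (the "balance" condition).
    return w / w.sum(axis=1, keepdims=True)
```

Because the matrix depends only on layer geometry, it is computed once and never trained, which is the point of the approach.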
In mathematical terms, the proposed artificial neural networks based on the long-and-close-range action principle can be classified as a specific subclass of nonlinear convolutional networks. The nonlinear convolutions are implemented by means of kernel-based discrete transforms, in which the transition matrices of connections between adjacent layers serve as the transform kernels.
As activation functions, parametric sigmoids are considered, which have only one free parameter: the nonlinearity coefficient.
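A one-parameter sigmoid of this kind might look as follows. The specific logistic form chosen here is an assumption; the abstract states only that the activation has a single free parameter, the nonlinearity coefficient (called `alpha` below).

```python
import numpy as np

def parametric_sigmoid(x, alpha=1.0):
    """One-parameter sigmoid; `alpha` is the nonlinearity coefficient.

    Larger alpha makes the transition around x = 0 steeper;
    as alpha -> 0 the function flattens toward the constant 0.5.
    """
    return 1.0 / (1.0 + np.exp(-alpha * np.asarray(x, dtype=float)))
```

With a single scalar per activation, the trainable state of the whole network stays small once the transition matrices are fixed.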
The developed algorithms and programs are applied to a problem of unsupervised learning, namely clustering. The well-known MNIST set of handwritten digits was chosen as the test data set. The problem was solved on an ordinary computer using only the CPU (no GPU was used).
Validation of the obtained distribution of 50,000 MNIST samples over 1000 clusters yielded very encouraging results: the combined time for the learning and pure clustering tasks is under 10 minutes, and the accuracy of correct assignment to clusters at the validation stage reaches 97%.