COMPARATIVE ANALYSIS OF METHODS FOR TESTING THE INDEPENDENCE OF PSEUDO-RANDOM NUMBER SEQUENCES
DOI:
https://doi.org/10.31891/2307-5732-2023-329-6-258-265
Keywords:
random numbers, simulation modeling, correlation, statistical independence, random number generators, uniform distribution, Kolmogorov criterion
Abstract
Very long sequences of pseudo-random numbers can be used in statistical and simulation modeling tasks. At the same time, there is a need to solve problems of the Big Data class. Solving such problems sometimes requires sacrificing some accuracy in order to obtain practically acceptable research results in an acceptable time.
Currently, a significant number of systems have been developed for testing sequences of pseudo-random numbers (PRN) for compliance with the rather conventional concept of "randomness", that is, the impossibility of predicting their individual values. However, PRNs are generated using regular, deterministic algorithms. The paradigm of this work is the requirement that the empirical distribution functions of PRN match the theoretical distribution functions.
For one-dimensional (marginal) distributions, this problem is solved quite simply. Establishing the statistical dependence or independence of PRN is more difficult. For a pair of PRN sequences of length N, the most natural approach is the method of complete verification of their independence (CVM). The essence of this method is to establish the deviation of the products of the empirical values of the probabilities from the theoretical ones. This method reduces to algorithms of order N×N.
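The idea of a complete check can be illustrated with a minimal sketch. The abstract does not give the exact CVM statistic, so this example assumes one plausible reading: the empirical joint distribution function of the pair is compared with the product of the two empirical marginal distribution functions at every one of the N×N grid points, and the largest deviation is reported. The function name `cvm_statistic` and the choice of the maximum deviation are illustrative assumptions, not the authors' implementation.

```python
import random


def cvm_statistic(x, y):
    """Largest gap between the empirical joint CDF and the product of the
    empirical marginal CDFs, evaluated on the full N x N grid (assumed form)."""
    n = len(x)
    # Indices of the points ordered by their x value.
    order_x = sorted(range(n), key=lambda k: x[k])
    # 0-based rank of every point by its y value (ties broken by position).
    rank_y = [0] * n
    for r, k in enumerate(sorted(range(n), key=lambda k: y[k])):
        rank_y[k] = r

    col_counts = [0] * n  # points seen so far, counted per y-rank
    worst = 0.0
    for i, k in enumerate(order_x):
        col_counts[rank_y[k]] += 1
        running = 0
        for j in range(n):
            running += col_counts[j]
            joint = running / n                      # F_xy at grid point (i, j)
            product = ((i + 1) / n) * ((j + 1) / n)  # F_x(i) * F_y(j)
            worst = max(worst, abs(joint - product))
    return worst


if __name__ == "__main__":
    rng = random.Random(0)
    a = [rng.random() for _ in range(200)]
    b = [rng.random() for _ in range(200)]
    print("independent pair:", cvm_statistic(a, b))
    print("identical pair:  ", cvm_statistic(a, a))
```

The nested loop visits every grid point once, which is where the N×N cost comes from.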
In previous works, the algorithms of the sum criterion method (SCM) were considered; this method reduces to the analysis of the sums of PRN values. In the case of statistical independence, the distribution of the sum has the form of a convolution of the marginal distributions. The order of this algorithm is only N.
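A minimal sketch of the sum-based check follows. It assumes both marginals are uniform on [0, 1), in which case the convolution is the triangular distribution on [0, 2], and it measures the deviation with a Kolmogorov-type statistic (both terms appear in the keywords, but the exact statistic used by the authors is not stated here). The names `triangular_cdf` and `scm_statistic` are illustrative. This version sorts the sums, so it runs in O(N log N) rather than strictly linear time; a binned variant would avoid the sort.

```python
import random


def triangular_cdf(s):
    """CDF of the sum of two independent U(0, 1) variables."""
    if s <= 0.0:
        return 0.0
    if s >= 2.0:
        return 1.0
    if s <= 1.0:
        return 0.5 * s * s
    return 1.0 - 0.5 * (2.0 - s) ** 2


def scm_statistic(x, y):
    """Kolmogorov distance between the empirical CDF of x[i] + y[i]
    and the triangular CDF expected under independence (assumed form)."""
    sums = sorted(a + b for a, b in zip(x, y))
    n = len(sums)
    worst = 0.0
    for i, s in enumerate(sums):
        f = triangular_cdf(s)
        # The empirical CDF jumps from i/n to (i+1)/n at s.
        worst = max(worst, (i + 1) / n - f, f - i / n)
    return worst


if __name__ == "__main__":
    rng = random.Random(1)
    a = [rng.random() for _ in range(2000)]
    b = [rng.random() for _ in range(2000)]
    print("independent pair:", scm_statistic(a, b))
    print("identical pair:  ", scm_statistic(a, a))
```

Only one pass over the N sums is needed after sorting, in contrast to a check over all N×N pairs of values.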
In this paper, a comparative analysis of CVM and SCM was performed with respect to reliability and speed. Artificial dependence between PRNs was modeled by introducing a certain level of correlation. The comparative analysis showed that both methods are approximately the same in terms of reliability. In terms of speed, SCM is orders of magnitude (in proportion to N) more efficient than CVM.
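The abstract does not say how the correlation was introduced. One simple way to produce a pair of uniform sequences with a chosen correlation level, shown here only as an assumed illustration, is a mixture: with probability rho the second value copies the first, otherwise it is drawn independently. Both marginals stay uniform on [0, 1) and the Pearson correlation equals rho. The names `correlated_uniform_pair` and `pearson` are hypothetical helpers.

```python
import math
import random


def correlated_uniform_pair(n, rho, seed=0):
    """Two U(0, 1) sequences whose Pearson correlation is rho (0 <= rho <= 1).
    Mixture construction; an assumption, not the method used in the paper."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.random()
        # Copy x with probability rho, otherwise draw an independent value.
        y = x if rng.random() < rho else rng.random()
        xs.append(x)
        ys.append(y)
    return xs, ys


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    for rho in (0.0, 0.1, 0.5):
        xs, ys = correlated_uniform_pair(20000, rho)
        print(rho, round(pearson(xs, ys), 3))
```

Feeding such pairs to an independence test at several values of rho gives a rough picture of how small a dependence the test can still detect.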
Finally, it was concluded that SCM should be preferred for Big Data modeling tasks.