ANALYSIS OF SMALL SAMPLES OF MULTIDIMENSIONAL DATA ON THE EXAMPLE OF TIMES OF RECOVERY OF PATIENTS FROM COVID-19
DOI:
https://doi.org/10.31891/2307-5732-2024-341-5-44
Keywords:
small sample, multivariate data, normal distribution, gender differences, Covid-19
Abstract
The analysis of multidimensional data available only in small volumes, that is, small samples, is often used in publishing, pedagogy, and sociology, mainly to understand an existing situation and to choose and make appropriate decisions. The main goal of such an analysis is to identify how various factors affect the object of study so that further steps in a given activity can be adjusted. The authors of this article worked with medical and biological indicators of 19 patients with Covid-19 who underwent a course of treatment; the data describe their condition at the time of recovery. The data comprise the duration of each patient's recovery in bed days and 29 indicators of physical and physiological state, divided into five groups: physical characteristics (age, height, weight), cardiovascular system, respiratory system, immune system, and circulatory system. The analysis began by checking whether the recovery times follow a normal distribution. A quantile-quantile (Q-Q) plot and the Shapiro-Wilk, Kolmogorov-Smirnov, and Anderson-Darling tests were used for this purpose. Based on the results of these methods, it was concluded that the data follow the normal distribution law. Because the sample includes both women and men, with men about half as numerous as women, it was necessary to check for a gender difference in recovery time. Here the authors used a visual comparison with a boxplot and Student's t-test. In the visual comparison, the differences between the minimums and between the first quartiles are greater than the differences between the medians, third quartiles, and maximums; according to the t-test, however, the mean values of the two samples are equal and differ only by chance. The relationship between the indicators and recovery time was determined as follows.
For each group of indicators, an individual multivariate average was determined for every patient, and the range of recovery times was divided into four subintervals. Patients, together with their multivariate averages for each group of indicators, were assigned to these subintervals. Each subinterval was then matched with the mean of the individual multivariate averages of its patients. The data are presented in a table with the subintervals as levels of the main factor and the group-wise means of the individual multivariate averages as variables. One-way analysis of variance applied to this table showed that the means of the indicator groups are equal and that the diagnostic indicators have only a weak influence on recovery time. The work is both scientific and practical in nature and may be useful in similar situations.
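
The workflow described in the abstract can be sketched in Python with NumPy and SciPy. This is only an illustration under stated assumptions, not the authors' implementation: all numbers are hypothetical placeholders rather than the study data, the 6/13 split between men and women is assumed, the "individual multivariate average" is assumed to be the mean of a patient's z-scored indicators within one group, and the four subintervals are assumed to be of equal width. The Kolmogorov-Smirnov and Anderson-Darling tests and the Q-Q plot are omitted for brevity.

```python
import numpy as np
from scipy import stats

# HYPOTHETICAL recovery times in bed days for 19 patients (not the study data).
recovery_days = np.array([8, 9, 10, 10, 11, 12, 12, 13, 13, 14,
                          14, 15, 15, 16, 17, 18, 19, 20, 22], dtype=float)

# HYPOTHETICAL sex labels: 6 men and 13 women (assumed split).
is_male = np.zeros(recovery_days.size, dtype=bool)
is_male[[1, 4, 8, 11, 15, 17]] = True

# Step 1: normality check of recovery time with the Shapiro-Wilk test.
shapiro_res = stats.shapiro(recovery_days)

# Step 2: Student's t-test (equal variances) comparing men and women.
ttest_res = stats.ttest_ind(recovery_days[is_male], recovery_days[~is_male])

# Step 3: one HYPOTHETICAL indicator group with three indicators per patient.
rng = np.random.default_rng(0)
indicators = rng.normal(size=(recovery_days.size, 3))

# Assumed definition of the individual multivariate average:
# z-score each indicator across patients, then average per patient.
z = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0, ddof=1)
patient_avg = z.mean(axis=1)

# Step 4: split the recovery-time range into four equal-width subintervals.
edges = np.linspace(recovery_days.min(), recovery_days.max(), 5)
labels = np.digitize(recovery_days, edges[1:-1])  # values 0..3

# Step 5: mean of the individual averages per subinterval (one table column)
# and one-way ANOVA with the subinterval as the factor.
groups = [patient_avg[labels == k] for k in range(4)]
table_row = [float(g.mean()) for g in groups]
anova_res = stats.f_oneway(*groups)

print("Shapiro-Wilk p-value:", shapiro_res.pvalue)
print("t-test p-value:", ttest_res.pvalue)
print("Subinterval means:", table_row)
print("ANOVA p-value:", anova_res.pvalue)
```

In this sketch, a p-value above the chosen significance level (for example 0.05) in each step would correspond to the conclusions reported in the abstract: no evidence against normality, no significant difference between men and women, and no significant effect of the indicator group on recovery time. With samples this small, the tests have limited power, so such results should be read with caution.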