METHOD OF DEPENDENCES MINING IN BIG DATA SETS WITH OMISSIONS AS AN EXAMPLE OF DATA ON THE SPREAD OF COVID
DOI:
https://doi.org/10.31891/2307-5732-2024-333-2-1
Keywords:
extraction of data dependencies, large data sets, associative rule
Abstract
This article develops a method for analyzing big data with omissions, using as an example the construction of advisory rules on adequate state policy to reduce the spread of new COVID-19 cases. Association rule generation in big data faces a number of challenges, the main one being the presence of large numbers of vectors and multivalued datasets. For this reason, the rule system in this paper is built on a new ensemble of machine learning techniques: association rules, a regression tree, and clustering. The study used pooled data from the Government's COVID-19 Response Tracker and ECDC COVID-19 data. Clustering was performed with the k-means method. The gap statistic was used to find an appropriate number of clusters, and three clusters were selected in the case study. The clusters differ in the recommendations and actions of the corresponding governments. Countries in the first cluster chose school closures and international travel controls as their main measures; countries in the second cluster recommended staying at home; and governments of countries in the third cluster mainly recommended staying at home and cancelling public events. The same country can be assigned to different clusters in different time intervals, so clustering by country is not unambiguous. For this reason, the time series for an individual country is of interest and will be the subject of further study. The impact of such clustering on the spread of COVID-19, on the position and duration of the peak, and on the mortality rate will also be investigated in further work. A regression decision tree was built, a set of rules was extracted from it, and association rule generation was then applied to obtain associative dependencies. The resulting dependencies can be used for strategic planning in the healthcare system.
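To make the notion of an associative dependency concrete, the following is a minimal sketch of association rule mining by brute-force enumeration. It is not the authors' pipeline, which derives rules from a regression tree first. The toy records, the item names (`school_closing`, `cases_down`, and so on), the thresholds, and the helper functions `support` and `mine_rules` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each record lists the policy measures active in one
# country-week together with an outcome label. All values are invented.
records = [
    {"school_closing", "travel_control", "cases_down"},
    {"school_closing", "travel_control", "cases_down"},
    {"stay_home", "cancel_events", "cases_down"},
    {"stay_home", "cases_up"},
    {"school_closing", "cases_up"},
    {"stay_home", "cancel_events", "cases_down"},
]


def support(itemset, data):
    """Fraction of records that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in data) / len(data)


def mine_rules(data, target, min_support=0.3, min_confidence=0.8, max_len=2):
    """Enumerate rules `antecedent -> target` that pass both thresholds."""
    items = sorted(set().union(*data) - {target})
    target_support = support({target}, data)
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            s_antecedent = support(antecedent, data)
            s_rule = support(set(antecedent) | {target}, data)
            # Skip antecedents that never occur or rules that are too rare.
            if s_antecedent == 0 or s_rule < min_support:
                continue
            confidence = s_rule / s_antecedent
            if confidence >= min_confidence:
                lift = confidence / target_support
                rules.append((antecedent, s_rule, confidence, lift))
    return rules


rules = mine_rules(records, target="cases_down")
for antecedent, s, c, lift in rules:
    print(f"{' & '.join(antecedent)} -> cases_down "
          f"(support={s:.2f}, confidence={c:.2f}, lift={lift:.2f})")
```

Support measures how often a rule applies, confidence how often its consequent holds when the antecedent does, and lift how much more often than the consequent's base rate. Exhaustive enumeration like this grows combinatorially with the number of items, which is one reason a real analysis of large, multivalued data would need a more selective way of proposing candidate rules.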