METHOD FOR INCREASING THE EFFICIENCY OF ETL/ELT PROCESSING BASED ON METADATA PARAMETERS
DOI:
https://doi.org/10.31891/2307-5732-2026-361-47

Keywords:
ETL, ELT, metadata, declarative approach, data engineering, data processing, data pipeline, database, data warehouse

Abstract
This paper addresses one of the key challenges in contemporary data engineering: the limited flexibility and high maintenance costs of traditional ETL/ELT processes. The issue arises because, in common imperative approaches, the logic of data extraction, transformation, and loading is rigidly hard-coded within numerous SQL scripts or program modules. As a result, even minor changes in business requirements become complex engineering tasks requiring code modification, thorough testing, and redeployment.
The study proposes and validates a novel metadata-driven method for managing data processing workflows that employs a purely declarative paradigm. The scientific novelty of the approach lies in the fundamental separation of concerns between the logical configuration of a task and the mechanism of its physical execution. Unlike existing methods, the entire pipeline logic is externalized from the codebase into a dedicated metadata layer. This layer consists of specialized relational tables that declaratively define what should be executed — including the data source, target, and loading strategy (e.g., Full reload, Incremental update, or PartitionReplace mode).
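For illustration only, the following is a minimal sketch of what such a metadata layer might look like in T-SQL; the schema, table, and column names are hypothetical and are not taken from the paper:

-- Hypothetical metadata table: each row declaratively describes one
-- data-movement task ("what" to execute), with no procedural code.
CREATE TABLE meta.DataMovementTask (
    TaskId          INT IDENTITY(1,1) PRIMARY KEY,
    SourceObject    NVARCHAR(256) NOT NULL,  -- fully qualified source table or view
    TargetObject    NVARCHAR(256) NOT NULL,  -- fully qualified target table
    LoadStrategy    NVARCHAR(32)  NOT NULL   -- declarative loading strategy
        CHECK (LoadStrategy IN (N'Full', N'Incremental', N'PartitionReplace')),
    WatermarkColumn NVARCHAR(128) NULL,      -- change-tracking column for incremental loads
    IsActive        BIT NOT NULL DEFAULT 1
);

-- Registering a new pipeline is a data operation, not a coding task.
INSERT INTO meta.DataMovementTask (SourceObject, TargetObject, LoadStrategy, WatermarkColumn)
VALUES (N'src.Sales', N'dwh.FactSales', N'Incremental', N'ModifiedDate');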
The imperative execution layer is represented by a unified processing engine (implemented as a stored procedure, e.g., usp_Execute_Data_Movement_Task) that contains no embedded business logic. During execution, this engine dynamically reads configuration parameters from the metadata tables and generates the required SQL statements at runtime. Consequently, a major behavioral change — for instance, switching from incremental to full load mode — can be achieved through a simple update of a metadata field, without any modification to the underlying code or infrastructure.
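A hedged sketch of how such an engine might be structured follows; only the procedure name usp_Execute_Data_Movement_Task comes from the paper, while the body, the meta schema, and all column names continue the hypothetical example above:

-- Hypothetical sketch of the unified engine. It contains no business logic
-- of its own: it reads the task's declarative configuration and assembles
-- the required SQL at runtime. Identifier quoting (QUOTENAME) and
-- NULL-watermark handling are omitted for brevity.
CREATE OR ALTER PROCEDURE meta.usp_Execute_Data_Movement_Task
    @TaskId INT
AS
BEGIN
    DECLARE @Source    NVARCHAR(256),
            @Target    NVARCHAR(256),
            @Strategy  NVARCHAR(32),
            @Watermark NVARCHAR(128),
            @Sql       NVARCHAR(MAX);

    -- Read the declarative configuration for this task.
    SELECT @Source    = SourceObject,
           @Target    = TargetObject,
           @Strategy  = LoadStrategy,
           @Watermark = WatermarkColumn
    FROM meta.DataMovementTask
    WHERE TaskId = @TaskId AND IsActive = 1;

    -- Generate the SQL statement dictated by the metadata.
    IF @Strategy = N'Full'
        SET @Sql = N'TRUNCATE TABLE ' + @Target + N'; '
                 + N'INSERT INTO ' + @Target + N' SELECT * FROM ' + @Source + N';';
    ELSE IF @Strategy = N'Incremental'
        SET @Sql = N'INSERT INTO ' + @Target
                 + N' SELECT s.* FROM ' + @Source + N' AS s'
                 + N' WHERE s.' + @Watermark
                 + N' > (SELECT MAX(' + @Watermark + N') FROM ' + @Target + N');';

    EXEC sys.sp_executesql @Sql;
END;

Under this sketch, the behavioral change described above is a single row update rather than a code change:

UPDATE meta.DataMovementTask SET LoadStrategy = N'Full' WHERE TaskId = 1;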
The proposed approach demonstrates significant advantages over both traditional scripting and modern low-code/no-code platforms (e.g., Azure Data Factory), which, despite offering declarative design, often lead to vendor lock-in and reduced flexibility. Experimental validation has confirmed that the metadata-driven method reduces the time required for pipeline development and modification by more than 90% compared to conventional approaches. Moreover, it introduces negligible performance overhead while achieving a substantial reduction in total cost of ownership.
The results of this study illustrate a successful and practical transition from the imperative to the declarative paradigm in data engineering, enabling the creation of more flexible, transparent, scalable, and cost-efficient analytical systems.
License
Copyright (c) 2026 ВОЛОДИМИР СОЛОГУБ, ВОЛОДИМИР ПАШКЕВИЧ (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.