ARCHITECTURE OF A MACHINE LEARNING SYSTEM FOR CREATING PARALLEL BILINGUAL TEXT CORPORA

MYKOLA FANT

doi:10.31891/2307-5732-2023-321-3-314-319

Authors

MYKOLA FANT, State University «Zhytomyrska Politekhnika», https://orcid.org/0000-0002-4994-8009

DOI:

https://doi.org/10.31891/2307-5732-2023-321-3-314-319

Keywords:

machine learning, model, architecture, text alignment, CAT-tool

Abstract

The text alignment service is an essential part of any Computer-Aided Translation (CAT) tool and is also important for other tasks that involve transforming text from one language to another. This article proposes a unique architecture for a text alignment service based on machine learning technologies. The suggested architecture follows current approaches to building microservice systems, with attention to both easy deployment and easy maintenance. The article elaborates on the requirements for the text alignment system as a crucial precondition for developing the architecture. The established requirements take into account both sides of the system: the system as a machine learning application and the system as a CAT service. The suggested architecture makes it possible to build a universal system with several entry points for end customers, system administrators, and data scientists. It also preserves different ways of using the system, e.g. through its own user interfaces or through REST API calls from a third-party server. The system contains three different user interfaces designed for ordinary users, system administrators, and data scientists. This heterogeneous UX approach is crucial for secure yet flexible system maintenance. A service built on the proposed architecture will be able to cover different user scenarios: using a general model to produce predictions for customers’ own bilingual text corpora, training their own model, or just using the service as storage for aligned bilingual texts. To achieve such universality of usage, great emphasis is placed on model versioning support, since the system must manage different parallel versions of the predicting models. The system is planned as a microservice architecture with an orchestrator as its central component.
An important part of the system is the monitoring service, which will estimate the efficiency of trained models and collect user feedback based on user actions after model predictions. The article suggests the technology stack needed for easy and secure development, deployment, and delivery of the product with zero downtime using the blue-green deployment model.
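
The abstract's emphasis on managing parallel model versions and on blue-green switching can be illustrated with a minimal in-process sketch. Everything in it is an assumption made for illustration: the names `ModelRegistry`, `positional_aligner`, and `stripping_aligner` do not come from the article, the "models" are toy functions rather than trained aligners, and a real blue-green deployment would switch traffic at the infrastructure level rather than inside one Python object.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical model interface: takes source and target sentences,
# returns a list of aligned (source, target) pairs.
Aligner = Callable[[List[str], List[str]], List[Tuple[str, str]]]


@dataclass
class ModelRegistry:
    """Illustrative registry that keeps several model versions side by side."""
    versions: Dict[str, Aligner] = field(default_factory=dict)
    live: Optional[str] = None    # version currently serving requests
    staged: Optional[str] = None  # version deployed but not yet serving

    def register(self, version: str, model: Aligner) -> None:
        # The first registered version goes live; later ones are staged.
        self.versions[version] = model
        if self.live is None:
            self.live = version
        else:
            self.staged = version

    def promote(self) -> None:
        # Swap live and staged; the old live version stays available for rollback.
        if self.staged is None:
            raise RuntimeError("no staged version to promote")
        self.live, self.staged = self.staged, self.live

    def align(self, source: List[str], target: List[str],
              version: Optional[str] = None) -> List[Tuple[str, str]]:
        # Callers may pin a version; otherwise the live one is used.
        model = self.versions[version or self.live]
        return model(source, target)


def positional_aligner(source: List[str], target: List[str]) -> List[Tuple[str, str]]:
    # Toy "model": pairs sentences by position.
    return list(zip(source, target))


def stripping_aligner(source: List[str], target: List[str]) -> List[Tuple[str, str]]:
    # Toy "model": pairs by position and trims surrounding whitespace.
    return [(s.strip(), t.strip()) for s, t in zip(source, target)]


registry = ModelRegistry()
registry.register("v1", positional_aligner)
registry.register("v2", stripping_aligner)

pairs_before = registry.align([" Hello "], [" Привіт "])   # served by v1
registry.promote()                                          # v2 goes live
pairs_after = registry.align([" Hello "], [" Привіт "])    # served by v2
pinned = registry.align([" Hello "], [" Привіт "], version="v1")
```

Keeping the previous version registered after `promote()` is what allows a customer-specific model and the general model to coexist, and it makes rollback a second call to `promote()`.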

ARCHITECTURE OF A MACHINE LEARNING SYSTEM FOR TEXT ALIGNMENT
