APPROXIMATE SIMULATION PRETRAINING AS A WAY TO REDUCE SOFT ACTOR-CRITIC EXPLORATION NEEDS IN BIOREACTOR CONTROL
DOI: https://doi.org/10.31891/2307-5732-2025-351-49

Keywords: adaptive control, bioprocess control, bioreactor, reinforcement learning, Soft Actor-Critic, offline-to-online learning

Abstract
Bioreactors are at the heart of most biotechnological processes: they make it possible to grow stem cells and artificial organs, process waste and pollutants, and produce biofuels, pharmaceuticals, vaccines, and popular foods and beverages. Because of the limitations of the dominant approaches to autonomous bioreactor control, which are based on proportional-integral-derivative (PID) laws, fuzzy logic, and predictive models, there is steadily growing theoretical and practical interest in smart reinforcement-learning-based controllers that can learn the controlled system on their own and adapt to changes in real time without requiring an accurate bioprocess model. In practice, however, implementing RL-based controllers brings many challenges, with the high cost and long duration of environment exploration among the most significant. This article proposes an approximate simulation pretraining method that yields a flexible initial control policy; deploying this policy on a real bioreactor significantly reduces the need for expensive environment exploration and the probability that the controller accidentally drives the system into an irreversible critical state. Using a baker's yeast bioreactor simulation as an example, it is demonstrated that with this approach the RL agent converges to an optimal policy significantly faster and, given enough pretraining, avoids exploring potentially dangerous states of the system, even though none of the approximate simulations used for pretraining accurately reflected the target system's dynamics. The agent adapted to the "real" environment 9 times faster, reducing the MSE by a factor of 50 and the ITSE by a factor of 160. The method simplifies the implementation of reinforcement-learning-based controllers for many bioprocesses, reducing the cost, complexity, and labor of developing autonomous control systems for industrial bioreactors, which in turn will increase the production volume, quality, and availability of many valuable products.
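The following is a minimal sketch of the offline-to-online idea described above: a SAC agent is pretrained on several approximate simulations with randomly perturbed kinetics and is then fine-tuned on a "real" environment whose parameters none of the pretraining simulations match. It assumes the gymnasium and stable-baselines3 libraries; the Monod-kinetics fed-batch model, parameter ranges, reward, and timestep budgets are illustrative placeholders, not the setup used in the article.

import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import SAC


class YeastReactorEnv(gym.Env):
    """Hypothetical fed-batch yeast reactor with Monod growth; the action is the substrate feed rate."""

    def __init__(self, mu_max=0.4, K_s=0.5, Y_xs=0.5, setpoint=5.0, dt=0.1):
        self.mu_max, self.K_s, self.Y_xs = mu_max, K_s, Y_xs   # kinetic parameters
        self.setpoint, self.dt = setpoint, dt                   # biomass target, step [h]
        self.observation_space = spaces.Box(0.0, np.inf, shape=(2,), dtype=np.float32)
        self.action_space = spaces.Box(0.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.X, self.S, self.t = 0.5, 2.0, 0.0                  # biomass, substrate, time
        return np.array([self.X, self.S], dtype=np.float32), {}

    def step(self, action):
        feed = float(action[0])
        mu = self.mu_max * self.S / (self.K_s + self.S)         # Monod specific growth rate
        dX = mu * self.X                                        # biomass growth
        dS = feed - dX / self.Y_xs                              # substrate fed minus consumed
        self.X += self.dt * dX
        self.S = max(self.S + self.dt * dS, 0.0)
        self.t += self.dt
        reward = -(self.X - self.setpoint) ** 2                 # penalize setpoint tracking error
        terminated = self.X > 3 * self.setpoint                 # stand-in for a critical state
        truncated = self.t >= 24.0                              # end of the batch
        obs = np.array([self.X, self.S], dtype=np.float32)
        return obs, reward, terminated, truncated, {}


rng = np.random.default_rng(0)
agent = SAC("MlpPolicy", YeastReactorEnv(), verbose=0)

# Pretraining: several approximate simulations, each with perturbed kinetics.
for _ in range(5):
    params = dict(mu_max=rng.uniform(0.3, 0.6), K_s=rng.uniform(0.2, 1.0),
                  Y_xs=rng.uniform(0.3, 0.7))
    agent.set_env(YeastReactorEnv(**params))
    agent.learn(total_timesteps=20_000, reset_num_timesteps=False)

# Online fine-tuning: the "real" reactor, whose parameters none of the
# pretraining simulations reproduced exactly.
agent.set_env(YeastReactorEnv(mu_max=0.45, K_s=0.7, Y_xs=0.55))
agent.learn(total_timesteps=5_000, reset_num_timesteps=False)

Reusing one agent across the learn() calls keeps its replay buffer and network weights, so experience gathered in the approximate simulations carries over into online fine-tuning, which is what lets the fine-tuning budget stay small relative to training from scratch.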
License
Copyright (c) 2025 ОЛЕКСАНДР ПЕТРОВСЬКИЙ (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.