Last revised: 2026-05-24
Conference abstract
Introduction. The primary contributions of this research can be summarized as follows.
1. The capacity configuration of the photovoltaic energy storage charging station (PESCS) takes into account the state of health of the energy storage system (ESS) battery. An optimal configuration model that minimizes the total annual cost of the PESCS is set up, and the effects of factors such as the battery's state of health on the configuration outcomes are examined.
2. A reasonable operation strategy based on the time-of-use electricity price helps preserve the battery's state of health. A capacity configuration model for the ESS is then established, taking into account the system power balance constraints, the ESS constraints, and the cost associated with the battery's state of health.
3. An optimal capacity configuration model that considers battery health is constructed and solved with a deep Q-network (DQN) algorithm. The resulting capacity configuration method slows the decline of the battery's state of health, adapts better to the complementary characteristics of the energy sources, and improves the economy of the PESCS [1].
The output power of the PV system is primarily influenced by factors such as light intensity and temperature. The output power $P_{PV}(t)$ at time $t$ can be expressed as $P_{PV}(t) = I_{ns}(t)\, A\, \eta_c\, \eta_{pc}\, [1 + v\,(T_c(t) - T_{STC})]$,
where $I_{ns}(t)$ is the solar irradiance (W/m²), $A$ is the area of the photovoltaic module, $\eta_c$ is the photoelectric conversion efficiency of the module, $\eta_{pc}$ is the MPPT efficiency of the DC conversion link, $v$ is the weather correlation coefficient, $T_c(t)$ is the temperature of the PV cell in period $t$, and $T_{STC}$ is the temperature of the PV cell under standard test conditions.
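The PV output expression above can be evaluated directly. The following is a minimal sketch in Python, not the authors' implementation: the function name, argument names, and all numerical values are illustrative assumptions, and the default of 25 °C for the standard-test-condition cell temperature is the usual convention rather than a value stated in this abstract.

```python
def pv_output_power(irradiance, area, eta_c, eta_pc, v, t_cell, t_stc=25.0):
    """Return the PV output power in watts for one time step.

    Evaluates irradiance * area * eta_c * eta_pc * [1 + v * (t_cell - t_stc)],
    i.e. the PV output expression given in the text.

    irradiance : solar irradiance in W/m^2
    area       : module area in m^2
    eta_c      : photoelectric conversion efficiency of the module (0..1)
    eta_pc     : MPPT efficiency of the DC conversion link (0..1)
    v          : correlation coefficient applied to the temperature difference (1/degC)
    t_cell     : PV cell temperature in degC
    t_stc      : cell temperature under standard test conditions (assumed 25 degC)
    """
    return irradiance * area * eta_c * eta_pc * (1.0 + v * (t_cell - t_stc))


# Illustrative values only (assumptions, not data from the study).
power_w = pv_output_power(
    irradiance=800.0, area=100.0, eta_c=0.18, eta_pc=0.95, v=-0.004, t_cell=45.0
)
print(f"PV output: {power_w / 1000:.1f} kW")
```

With these assumed inputs the array delivers roughly 12.6 kW; the negative coefficient reduces the output by 8 % because the cell is 20 °C above the standard-test temperature.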
Purpose and objectives. The fundamental structure of the PESCS is depicted in Fig. 1. It primarily consists of a PV system, an ESS, an energy management system for the PESCS, and an EV load section. Every component is linked to the DC bus and engages in energy exchange with the communication system via the energy management system [2, 3].
First, estimating the battery's state of health makes it possible to evaluate how operation affects the battery and to choose an appropriate way to prolong its service life. Second, an effective operation strategy improves the economy and reliability of the system. Battery state of health estimation and integrated energy configuration optimization are therefore closely related: an accurate estimate of the battery's service life allows a reasonable operation strategy to be formulated, so that the system operates optimally and efficiently.
Figure 1. Basic structure of PESCS
The energy management strategy of the PESCS is the basis both for the energy exchange between the PESCS and the power grid and for configuring the ESS capacity. The basic components of RL are the state space $s_t$, the action space $a_t$, and the reward function $r(t)$, which together describe the agent's interaction with the environment. To suit the solution characteristics of RL, the ESS capacity configuration optimization model in this paper is cast in a DQN framework. Through repeated training of the deep reinforcement learning model, the optimal strategy that maximizes the return over the entire scheduling cycle of the microgrid is obtained. The state space, action space, and reward function are the core elements of this process and together constitute the deep reinforcement learning framework for optimal microgrid scheduling, which is composed of agents and an environment [4, 5].
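To make the roles of state, action, and reward concrete, the sketch below shows one possible environment step for ESS scheduling. It is a hypothetical illustration only: the abstract does not specify the state variables, the action set, or the reward, so the `State` fields, the three power levels in `ACTIONS_KW`, the capacity, and the per-kWh wear cost are all assumptions chosen for readability.

```python
from dataclasses import dataclass

# Hypothetical discrete action set: ESS power in kW
# (negative = charge, zero = idle, positive = discharge).
ACTIONS_KW = (-50.0, 0.0, 50.0)


@dataclass(frozen=True)
class State:
    hour: int       # hour of day, 0..23
    soc: float      # ESS state of charge, 0..1
    price: float    # time-of-use price, currency units per kWh
    pv_kw: float    # PV output power in kW
    load_kw: float  # EV charging load in kW


def step(state, action_index, capacity_kwh=200.0, wear_cost_per_kwh=0.02, dt_h=1.0):
    """Apply one action and return (next_soc, reward).

    The reward is the negative of the grid purchase cost plus an assumed
    battery wear cost, so a higher reward means cheaper operation.
    """
    ess_kw = ACTIONS_KW[action_index]
    # Clip the requested power so the state of charge stays within [0, 1].
    max_discharge_kw = state.soc * capacity_kwh / dt_h
    max_charge_kw = (1.0 - state.soc) * capacity_kwh / dt_h
    ess_kw = max(-max_charge_kw, min(max_discharge_kw, ess_kw))
    next_soc = state.soc - ess_kw * dt_h / capacity_kwh
    # Power balance: the grid supplies whatever PV and ESS do not cover.
    grid_kw = state.load_kw - state.pv_kw - ess_kw
    energy_cost = state.price * max(grid_kw, 0.0) * dt_h  # exports earn nothing here
    wear_cost = wear_cost_per_kwh * abs(ess_kw) * dt_h
    return next_soc, -(energy_cost + wear_cost)


# Peak-price hour with no PV: discharging beats staying idle.
peak = State(hour=18, soc=0.5, price=1.0, pv_kw=0.0, load_kw=80.0)
print(step(peak, 2))  # discharge
print(step(peak, 1))  # idle
```

In this toy setting the wear term plays the part of the state-of-health cost: the agent is rewarded for shifting energy away from peak-price hours only when the saving exceeds the assumed cost of cycling the battery.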
Research material and results. A DQN is developed by combining the Q-learning algorithm with a deep neural network. By using two independent networks to estimate the action-value function, the DQN evaluates action values more accurately and reduces overestimation, thereby improving RL performance. The DQN also handles larger volumes of data than traditional solvers, which gives it practical significance. Fig. 2 depicts the reward obtained during training of the proposed enhanced DQN algorithm. In the early phase of training, when few training samples are available, the agent actively explores the environment with a high learning rate. As samples accumulate, the reward curve climbs significantly and tends to converge.
Figure 2. Improved DQN algorithm training reward curve
As the number of training rounds increases, the reward curve stabilizes and the agent completes the learning of the optimal mapping relationship. This ensures that the decisions of each agent remain reliable and stable in a dynamic, uncertain environment [6, 7].
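The two-network value estimate mentioned in this section can be illustrated with a short sketch of the training target. The abstract does not give the exact update rule, so the code below assumes a Double-DQN-style target, in which the online network selects the next action and the separate target network scores it. The function name, the discount factor, and the Q-values in the example are hypothetical.

```python
def td_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Return a Double-DQN-style training target for one transition.

    next_q_online : Q-values of the next state from the online network
    next_q_target : Q-values of the next state from the target network
    The online network picks the action; the target network evaluates it,
    which avoids taking a max over the same noisy estimates.
    """
    if done:
        return reward  # no bootstrapping past the end of an episode
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for three actions in the next state.
q_online = [1.0, 3.0, 2.0]
q_target = [0.5, 1.0, 4.0]
print(td_target(-31.0, q_online, q_target, gamma=0.9))
```

Here the online network prefers the second action, so the target bootstraps from the target network's value for that action (1.0) rather than from the target network's own maximum (4.0), which is how the decoupling limits overestimation.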
The capacity configuration of a PESCS largely determines its operational mode and economic benefits. The approach takes into account the battery's state of health and makes use of the flexible complementary capacities of the ESS to improve the operational economic efficiency of the PESCS. The case analysis leads to the following conclusions:
1. The DQN algorithm, relying on a time-of-use pricing energy management strategy, relieves the load demand of the PESCS and significantly improves its adjustable capacity during power shortages at peak electricity price periods through the ESS.
2. The capacity configuration model of the PESCS comprehensively considers the battery's state-of-health, reduces ESS replacement costs, and increases the long-term economic benefits of the PESCS.
3. The reinforcement-learning-based capacity configuration method for the PESCS, which combines the flexible complementary capability of the ESS with the battery's state of health, can effectively enhance the economic efficiency of the PESCS.
References:
1. S. Jang, A. Yoon, et al. “Optimal capacity determination of photovoltaic and energy storage systems for electric vehicle charging stations”, Journal of Energy Storage, vol. 106, 114730, 2025.
2. Y. Liu, P. Li, et al. “Research on Microgrid Superconductivity-Battery Energy Storage Control Strategy Based on Adaptive Dynamic Programming”, IEEE Transactions on Applied Superconductivity, vol. 34, no. 8, pp. 1-4, 2024.
3. J. Zhang, L. Hou, et al. “Optimal operation of energy storage system in photovoltaic-storage charging station based on intelligent reinforcement learning”, Energy and Buildings, vol. 299, 113570, 2023.
4. X. Dong, J. Shen, et al. “Simultaneous capacity configuration and scheduling optimization of an integrated electrical vehicle charging station with photovoltaic and battery energy storage system”, Energy, vol. 289, 129991, 2024.
5. Y. Liu, X. Han, et al. “Research on Control Strategy of Hybrid Superconducting Energy Storage Based on Reinforcement Learning Algorithm”, IEEE Transactions on Applied Superconductivity, vol. 34, no. 8, pp. 1-4, 2024.
6. G. Chen, J. Li, et al. “Optimal Configuration of Renewable Energy DGs Based on Improved Northern Goshawk Optimization Algorithm Considering Load and Generation Uncertainties”, Engineering Letters, vol. 31, no. 2, pp. 511-530, 2023.
7. Y. Wang, S. Guo, et al. “A comprehensive review of machine learning-based state of health estimation for lithium-ion batteries: data, features, algorithms, and future challenges”, Renewable and Sustainable Energy Reviews, vol. 224, 116125, 2025.