Determining Cost-Efficient Controls of Electrical Energy Storages Using Dynamic Programming

Volatile electrical energy prices are a challenge and an opportunity for small and medium-sized companies in energy-intensive industries. By using electrical energy storage and/or adapting production processes, companies can profit significantly from time-dependent energy prices and reduce their energy costs. We consider a time-discrete optimal control problem to reach a desired final state of the energy storage at a certain time step. The energy input is discrete, since only multiples of 100 kWh can be purchased at the EPEX SPOT market. We use available price estimations to minimize the total energy cost by a rounding-based dynamic programming approach. With our model, non-linear energy loss functions of the storage can be considered, and we obtain a significant speed-up compared to the integer (linear) programming formulation.


Introduction
The climate targets for Germany to become climate-neutral by 2045, which were reinforced in 2021, imply that the ambitions for expanding renewable energy must be further increased. As a result, the share of volatile power generation will rise and energy storage will become increasingly important. In 2020, there were already almost 300 hours with negative electricity prices on the day-ahead market [www.epexspot.com]. This results, among other things, from the oversupply by renewable power generation. Energy storage can contribute significantly to reversing the trend towards an increasing number of hours with negative electricity prices. Energy storage systems can assume different functions in the energy system. In multi-use approaches, it is increasingly being investigated how battery storage can be used in times when it is not required for its primary purpose, such as primary control power generation or self-supply optimization [1]. One option is to trade energy on EPEX SPOT and take advantage of the price spread between two points in time. Besides mixed-integer optimization models [2], various (approximate) dynamic programming approaches (see, e.g., [3][4][5][6][7]) are used to determine cost-efficient controls of electrical storages and/or grids. Most approaches use approximate dynamic programming [8] in order to avoid the "curses of dimensionality" by approximating the value function in each state. In [9] it is shown that the energy storage problem can be solved in polynomial time in a deterministic setting, while it is an NP-complete problem if prices and energy production are stochastic.
In this paper we consider a time-discrete optimal control problem of an electrical energy storage device and present a rounding-based dynamic programming approach, which considers a discrete state space by rounding the energy level in the storage. This also reduces the computation time significantly compared to solving the mixed-integer programming problem. The rounding of the state space enables us to optimize the control over longer time periods (up to one year), which is of particular interest for the layout of the storage device within a retrospective analysis.
The calculations shown here are based on the example of electro-chemical energy storage. In principle, however, these considerations apply to all forms of energy storage (electrical, electromagnetic, electro-chemical, mechanical, thermal and chemical), and the suggested algorithm can be applied analogously.
This paper is organized as follows. In Sect. 2 we present a time-discrete model of an electrical energy storage device which takes energy loss during charging, withdrawal, and self-discharge into account. Based on this model we develop a mixed-integer programming (MIP) formulation for the optimal control of the energy storage. Due to the limited applicability of the MIP to larger instances, we present a rounding-based dynamic programming approach in Sect. 3, which efficiently approximates the problem. Numerical tests are presented in Sect. 4, including a run-time comparison of the considered models and a trade-off analysis of the investments in storage devices. Section 5 concludes the article and gives a brief outlook on future research directions based on our approach.

Modeling of an electrical energy storage
The energy market has varying energy prices due to supply and demand. On account of, e.g., solar or wind energy, prices depend on the weather, i.e., low costs correlate with using more eco-power. Therefore, storing energy instead of always buying exactly the required amount may be economically as well as ecologically reasonable (see, e.g., [10][11][12]). A comprehensive introduction to dynamic energy prices and their impact on industrial processes is given in [13].

Linear programming
Regarding the day-ahead market, trading is only possible at fixed points in time; hence we define the set of discrete trading dates as $T := \{1, \dots, m\}$, $m \in \mathbb{N}$. Let $V_t$ denote the charge level of an electrical storage at time $t \in T$, which is restricted by lower and upper capacity bounds $c, C \in \mathbb{R}_{\ge 0}$, i.e., $c \le V_t \le C$ for all $t \in T$. Furthermore, let the initial and final fill levels be given by $V_0 = V_{\text{init}}$ and $V_m \ge V_{\text{final}}$, respectively (see Fig. 1 for a schematic illustration). The lower and upper bounds on the purchased energy per time step are denoted by $l$ and $u$ ($0 \le l \le u$), i.e., $l \le x_t \le u$ for all $t \in T$. The fill level $V_t$ depends on three quantities: the purchased energy $x_t$, the external energy consumption $Z_t \in \mathbb{R}_{\ge 0}$, and the previous fill level $V_{t-1}$. Regarding the latter, we introduce an energy loss function $g : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$, which is often assumed to be linear, e.g., $g(V) = (1 - \beta) V$ for some given value $\beta \in (0, 1)$.
The efficiency of storing energy in and withdrawing energy from the storage is modeled by the efficiency factors $\eta_{\text{in}}, \eta_{\text{out}} \in [0, 1]$, which are either assumed to be constant factors or to depend on the amount of stored/withdrawn energy, $\eta_{\text{in}}, \eta_{\text{out}} : \mathbb{R} \to [0, 1]$. Consequently, it has to be distinguished whether the consumed energy $Z_t$ is taken from the storage or purchased energy is used directly. Let $y_t \in [0, x_t]$ be the amount of energy stored in time step $t$ and $\zeta_t := Z_t - x_t + y_t$ the amount of energy loaded from the storage. In total, we obtain the fill level $V_t$ by
$$V_t = g(V_{t-1}) + \eta_{\text{in}}\, y_t - \frac{1}{\eta_{\text{out}}}\, \zeta_t .$$
To illustrate this formula we consider the two extreme cases: If the energy consumption $Z_t$ in time step $t$ is taken completely from the storage (since there is no energy input in this time step, i.e., $x_t = y_t = 0$), then the energy level in the storage, $V_t = g(V_{t-1}) - \frac{1}{\eta_{\text{out}}} Z_t$, is reduced by $\frac{1}{\eta_{\text{out}}} Z_t$, taking into account the energy loss when withdrawing energy from the storage. On the other hand, if the energy consumption equals the energy input, $Z_t = x_t$, and nothing is stored ($y_t = 0$), the energy level of the storage, $V_t = g(V_{t-1})$, is unchanged apart from the time-dependent energy loss $g$.
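The fill-level update can be made concrete with a few lines of Python. This is a minimal sketch, assuming the update $V_t = g(V_{t-1}) + \eta_{\text{in}} y_t - \zeta_t / \eta_{\text{out}}$ with the linear loss $g(V) = (1 - \beta) V$ and constant efficiency factors; the function names are hypothetical, and the default parameter values are the ones reported for the numerical tests.

```python
def g(v, beta=0.1):
    """Linear self-discharge: a fraction beta of the stored energy is lost per step."""
    return (1.0 - beta) * v


def next_fill_level(v_prev, x, y, z, eta_in=0.9, eta_out=0.95, beta=0.1):
    """Fill level after one time step (illustrative helper, not from the paper).

    v_prev: previous fill level, x: purchased energy,
    y: part of x fed into the storage, z: energy consumption.
    """
    if not 0.0 <= y <= x:
        raise ValueError("stored energy y must lie in [0, x]")
    zeta = z - x + y  # energy that has to be taken from the storage
    if zeta < 0.0:
        raise ValueError("purchased energy exceeds consumption plus stored energy")
    # Charging loses energy on the way in, withdrawal on the way out.
    return g(v_prev, beta) + eta_in * y - zeta / eta_out
```

With these defaults, covering a consumption of 95 kWh entirely from a storage holding 1000 kWh leaves about 800 kWh (900 kWh after self-discharge, minus 100 kWh withdrawn), whereas purchasing exactly the consumed amount leaves the 900 kWh untouched.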
Forecasting models can provide a prognosis of the energy prices $p_t$, $t \in T$, for the period $T$. For a retrospective analysis, however, we can also consider the true prices, e.g., to evaluate the capacity of the energy storage. In the following we concentrate on the running energy costs and neglect acquisition and other types of fixed costs. Thus, we aim to minimize the total cost of the purchased energy $x = (x_1, \dots, x_m)$, i.e., $\min \sum_{t \in T} p_t x_t$. Together with the aforementioned constraints, this objective yields a linear program (LP), provided that $g$ is linear and the efficiency factors are constant. It is well known that linear optimization problems are efficiently solvable (in polynomial time with interior point methods, cf. [14]). However, energy is often traded in discrete quantities, which makes the energy input $x$ a discrete variable. For example, at the EPEX SPOT market only multiples of 100 kWh can be purchased. We thus obtain a mixed-integer (linear) programming problem (MIP).
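Collecting the objective and the constraints introduced in this section, the continuous relaxation can be sketched as follows. This is a summary under stated assumptions rather than the original numbered formulation: $g$ is taken to be linear, the efficiency factors constant, and the nonnegativity of $\zeta_t$ is added here as an assumption so that no energy is "withdrawn" in negative amounts.

```latex
\begin{align*}
\min_{x,\,y,\,V}\quad & \sum_{t \in T} p_t\, x_t \\
\text{s.t.}\quad
 & V_t = g(V_{t-1}) + \eta_{\mathrm{in}}\, y_t
        - \tfrac{1}{\eta_{\mathrm{out}}}\,\zeta_t, && t \in T,\\
 & \zeta_t = Z_t - x_t + y_t \ge 0, && t \in T,\\
 & 0 \le y_t \le x_t, && t \in T,\\
 & l \le x_t \le u, && t \in T,\\
 & c \le V_t \le C, && t \in T,\\
 & V_0 = V_{\mathrm{init}}, \qquad V_m \ge V_{\mathrm{final}}.
\end{align*}
```

Restricting $x_t$ to multiples of a fixed step size turns this relaxation into the mixed-integer problem.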

Integer programming
Since the energy input can only attain discrete values, we modify equation (1l) in (MIP) to equation (2c):
$$x_t \in \{0, h_x, 2 h_x, \dots\}, \quad l \le x_t \le u, \quad t \in T. \tag{2c}$$
Thereby, $h_x \in \mathbb{R}_{\ge 0}$ denotes the discretization step size of the energy input, i.e., $x$ is restricted to multiples of $h_x$: $x \in \{0, h_x, 2 h_x, \dots\}$. In contrast to LPs, MIPs are in general NP-hard optimization problems, which can only be solved with significant computational effort, e.g., using branch-and-bound-based approaches [15]. Moreover, if the energy loss in the storage depends non-linearly on the fill level, one would obtain a mixed-integer non-linear optimization problem, which is computationally even more demanding.
On that account, we introduce our dynamic programming approach in the following section.

Rounding-based dynamic programming

Dynamic programming
The central idea of dynamic programming is to break down an optimization problem into a sequence of smaller, efficiently solvable subproblems. Dynamic programming relies on Bellman's principle of optimality [16], which states that a solution can only be optimal if its intermediate solutions (up to a certain state/time) are optimal w.r.t. the corresponding subproblems. Knapsack problems [17] and shortest path problems are the most prominent examples of optimization problems satisfying Bellman's principle [18], which does not hold for all optimization problems. The discrete electrical energy storage problem (2a)-(2c), however, satisfies Bellman's principle, since a control $x_t, y_t$, $t = 1, \dots, \tau$, of the storage up to an intermediate time step $\tau \in T$ with fill level $V_\tau$ can only be extended to an optimal solution if there is no other feasible policy reaching this (or a larger) fill level $V_\tau$ at time $\tau$ with lower energy cost. The optimal solution of the overall problem can then be derived from the optimal solutions of these subproblems. Applying Bellman's recursion [16], we determine the cheapest way to reach every feasible fill level at time step $t$ based on the costs at time step $t - 1$. The optimal control $(x^*_1, \dots, x^*_m)$ for an arbitrary final fill level can then be reconstructed by a backtracking procedure.
Adapted to the previously introduced electrical storage problem (2a)-(2c), we initialize the recursion for the total energy costs $z_t(d)$ up to time step $t$ to reach a given storage fill level $d$ at the first time step using $x_1(d) = (d - g(V_{\text{init}}) + \zeta_1 / \eta_{\text{out}}) / \eta_{\text{in}}$, the amount of energy required to reach the level $d$ in the current state. Thereby, we assume that energy is only withdrawn from the storage when it is not needed to reach the desired storage level $d$ in time step $t$, i.e., $\zeta_t = \max\{0, \eta_{\text{out}} (g(V_{t-1}) - d)\}$, since it is always preferable to directly consume energy over its lossy storage. Consequently, in each time step there can be either charging of the storage or withdrawal of energy from the storage, but not both.

Rounding in the state space
Since the computational efficiency of dynamic programming algorithms strongly depends on the size of the state space, a straightforward application of the Bellman recursion to the discrete storage problem would lead to numerical difficulties. Due to the energy loss function $g$, it is very unlikely that different policies end up at the same fill level, so the number of states grows exponentially with an increasing number of time steps. In order to limit the number of states, we discretize the state space, i.e., the fill level of the storage, with step size $h_V \in \mathbb{R}_{\ge 0}$. A similar approach is proposed in [19] for continuous control problems. Let $\lceil V \rceil_h := h \lceil V / h \rceil$ define the ceil function with respect to some step size $h$ and $\lfloor V \rfloor_h := h \lfloor V / h \rfloor$ the corresponding floor function, $V \in \mathbb{R}_{\ge 0}$.
In the rounding-based dynamic programming (RBDP) algorithm (Algorithm 1), we underestimate the fill level of the storage, i.e., we round down in the state space, and consider a correspondingly adapted recursion formula, by which we obtain the optimal control to approximately reach a given fill level $d$ at time step $t$ based on the controls up to the previous time step $t - 1$. Using this recursion formula, we can apply a dynamic programming scheme to compute the optimal control for each energy level of the storage $d = c, c + h_V, \dots, C$, for all time steps $t = 1, \dots, m$, and each energy input $k$, by eliminating dominated states, cf. Algorithm 1. In the following we assume that $l$ and $u$ are multiples of the discretization $h_x$. By doing so, the rounding error is bounded from above by $h_V$ for one single time step.
In order to estimate the total error of our algorithm, we first consider some fill level with rounding error $0 \le \varepsilon_t < h_V$, $t \in T$. If we assume that the energy loss function $g$ is a monotonically increasing, linear function, we can derive an explicit bound on the total rounding error in terms of the iterates of $g$, where $g^k$ denotes the $k$-times iterated function, i.e., $g^0 = \mathrm{id}$ and $g^k = g(g^{k-1})$ with $k \in \mathbb{N}$.
Then, the total rounding error $\varepsilon_{\text{tot}}$ is given by the sum of the per-step rounding errors, each propagated through the remaining applications of $g$. By definition of the function $g$ as $g(V) = (1 - \beta) V$ with $\beta \in (0, 1)$, this sum is bounded by a geometric series in $(1 - \beta)$ (equation (5)). Since the rounding-based dynamic programming algorithm is an exact method on the discretized state space, the difference between the cost of a solution of RBDP and the cost of the optimal solution of (2a)-(2c) is at most $m\, h_V \cdot \max_t \{p_t\}$. The bound in equation (5) can be computed and subtracted from the upper capacity bound $C$ in order to guarantee feasibility of the exact solution. Hence, $h_V$ should be chosen depending on the maximal capacity, i.e., $C - \varepsilon_{\text{tot}} \gg h_V \gg 0$.
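One way to arrive at such a bound is sketched below. It assumes per-step rounding errors $0 \le \varepsilon_t < h_V$, the linear loss $g(V) = (1 - \beta) V$, and that an error made at time step $t$ passes through the $m - t$ remaining applications of $g$; the indexing is an assumption and may differ from the original derivation.

```latex
\begin{align*}
\varepsilon_{\mathrm{tot}}
  &= \sum_{t=1}^{m} g^{\,m-t}(\varepsilon_t)
   = \sum_{t=1}^{m} (1-\beta)^{m-t}\,\varepsilon_t \\
  &< h_V \sum_{k=0}^{m-1} (1-\beta)^{k}
   = h_V\,\frac{1-(1-\beta)^{m}}{\beta}
   \le \frac{h_V}{\beta}.
\end{align*}
```

Under these assumptions the accumulated error in the fill level does not grow with the number of time steps: for $\beta = 0.1$ and $h_V = 1$ kWh it stays below 10 kWh.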

Numerical tests
We implemented the proposed rounding-based dynamic programming algorithm, which can be easily adjusted to different use cases. In a simple setup, we compare our method against both linear and mixed-integer linear programming w.r.t. the objective function value $z^*$ obtained by the respective approach. We further add some experiments that demonstrate the run-time differences of our method compared to integer programming, as well as a trade-off analysis that contrasts the computed energy costs with the storage capacity. The DP is implemented in Python 3.6, the LP in MATLAB and solved with Gurobi, and the MILP is implemented in Julia and solved with Cbc. All experiments were performed on an Intel(R) Celeron(R) N4000 CPU with 1.10 GHz and 8 GB main memory.
In our framework there are several parameters to be adjusted. All experiments were performed with a linear energy loss function; however, different types of functions can be applied. Further, we assume a constant energy consumption over the whole time period to obtain interpretable results reflecting the energy costs. According to the nature of the day-ahead market, we allow energy to be purchased in steps of 100 kWh, whereas the storage is discretized with a step size of 1 kWh, i.e., $V_t \in \mathbb{Z}$ for all $t \in T$. A finer discretization increases the run-time, a coarser one the rounding error. In all our numerical experiments we use a time discretization of 1 h. The storage capacity and the amount of purchasable energy are bounded from below by 0. The maximal fill level is varied in the following experiments, and we restrict the energy that is stored in one time step to at most half of the capacity. The efficiency factors are $\eta_{\text{in}} = 0.9$ and $\eta_{\text{out}} = 0.95$, and the energy loss factor is $\beta = 0.1$.
In Table 1 we compare the results of our approach against linear and mixed-integer linear programming solutions. Linear programming (LP) achieves the best solutions, since it considers a relaxation of the discrete problem. However, these solutions are not feasible energy inputs for the EPEX SPOT market. Our dynamic programming (DP) approach yields only slightly worse solutions compared to the exact optimal solutions of the mixed-integer linear programming (MILP) problem (obtained with coin-or/Cbc [20]). All solutions are computed with initial and final fill level equal to 100 kWh. A visual example for the method comparison is given in Fig. 2. We observe that all three approaches provide qualitatively similar results. While the run-time of the rounding-based dynamic programming algorithm depends not only on the considered time period but also on the storage capacity, its discretization level and the purchasable energy, the run-time of the MILP is only slightly affected by variations of these parameters. However, our approach has the clear advantage over the MILP model that optimization over longer time periods (months/years) or with finer time discretizations (15 min/1 min) is possible. For fixed bounds on the storage size and the purchasable energy, its run-time grows only linearly with an increasing number of time steps, while the run-time of the MILP grows exponentially, see Fig. 3.
We provide a trade-off analysis (Fig. 4) where we consider the solutions of our algorithm applied to time frames of one month and one year for varying storage capacities going from 0 kWh to 5000 kWh and 1000 kWh, respectively, in steps of 10 kWh. Based on historical data, previous months or even years can be optimized for several storages, and their respective investment costs can be viewed relative to the corresponding energy cost savings. This multiobjective perspective allows us to investigate the potential of further investments in storage devices, since it shows the gradual cost reduction induced by the increasing capacity of the storage device. If future energy prices are known, either exactly or through forecasting models, optimizing several days or weeks jointly improves the result (Fig. 5). In Fig. 6 we illustrate the solution obtained by the rounding-based DP when retrospectively optimizing the energy costs over one year, compared to the hourly energy prices. Here, we consider a storage with a maximal fill level of 1000 kWh and a constant energy consumption of 200 kWh per hour. We observe that most energy is purchased in the hours before the two energy price peaks. This demonstrates that our algorithm employs the energy storage in order to bridge expensive time periods.
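The implementation used for the experiments is not reproduced here. The following is only a minimal sketch of a rounding-based DP under the parameter choices listed above, with several assumptions: the lower bounds are $c = l = 0$, the efficiency factors are constant, purchased energy covers the consumption first and only the surplus is stored, the restriction on the energy stored per time step is omitted, and only the cheapest control per rounded fill level is kept. The function name `rbdp`, its signature, and the small tolerance used when rounding down are hypothetical.

```python
import math


def rbdp(prices, demand, cap, x_max, v_init=0.0, v_final=0.0,
         h_x=100, h_v=1.0, eta_in=0.9, eta_out=0.95, beta=0.1):
    """Return (total cost, list of purchases), or (inf, None) if infeasible."""
    tol = 1e-9                         # guards against floating-point noise
    n_states = int(cap // h_v) + 1     # fill levels 0, h_v, 2*h_v, ..., <= cap
    n_inputs = int(x_max // h_x) + 1   # purchases 0, h_x, 2*h_x, ..., <= x_max
    cost = [math.inf] * n_states
    cost[int(math.floor(v_init / h_v + tol))] = 0.0
    parents = []                       # per step: state -> (previous state, purchase)
    for p, z in zip(prices, demand):
        new_cost = [math.inf] * n_states
        parent = [None] * n_states
        for i, c_prev in enumerate(cost):
            if c_prev == math.inf:
                continue               # fill level i*h_v was not reachable
            kept = (1.0 - beta) * i * h_v   # linear self-discharge
            for k in range(n_inputs):
                x = k * h_x
                y = max(0.0, x - z)         # surplus is fed into the storage
                zeta = max(0.0, z - x)      # shortfall is drawn from the storage
                v = kept + eta_in * y - zeta / eta_out
                if v < 0.0:
                    continue           # storage cannot cover the shortfall
                if v > cap:
                    break              # v grows with x, larger x overfills too
                j = min(int(math.floor(v / h_v + tol)), n_states - 1)  # round down
                if c_prev + p * x < new_cost[j]:
                    new_cost[j] = c_prev + p * x
                    parent[j] = (i, x)
        cost = new_cost
        parents.append(parent)
    # Cheapest final state whose (rounded) fill level is at least v_final.
    first = int(math.ceil(v_final / h_v - tol))
    best = min(range(first, n_states), key=lambda j: cost[j], default=None)
    if best is None or cost[best] == math.inf:
        return math.inf, None
    purchases, j = [], best
    for parent in reversed(parents):   # backtracking
        j, x = parent[j]
        purchases.append(x)
    purchases.reverse()
    return cost[best], purchases
```

As a small illustration, for two hours with prices 10 and 50 per kWh, a demand of 100 kWh per hour and a 200 kWh storage, `rbdp([10, 50], [100, 100], cap=200, x_max=300)` buys 300 kWh in the first hour and nothing in the second. The three nested loops make the cost of this sketch proportional to the number of time steps times the number of fill levels times the number of admissible purchases.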

Conclusions and outlook
In this paper, we show that rounding-based dynamic programming is an efficient optimization approach for the optimal control of energy storage devices in the presence of volatile costs. In contrast to mixed-integer (linear) programming models, the run-time of RBDP is linear in the number of time steps, which allows us to optimize over longer time periods. The solution of RBDP is thereby a good approximation of the global optimum obtained by solving the MILP model.
The presented computational experiments use simplified load curves. However, it is possible to integrate more complicated load curves and feeding plants, as well as supply from a company's own renewable energy sources. This could be used, for example, to optimize the energy trading of a medium-sized company with its own photovoltaic system and battery storage. In addition, it could be used to calculate the optimal dimensioning of an energy storage system before the investment.

Figure 1 Schematic visualization of an electrical energy storage with fill level V for discretized time intervals. The energy flow is depicted by arrows of different colors. The energy remaining from previous time intervals (black) stays in the storage, possibly affected by some energy loss. The purchased energy x can either be fed into the storage (green) or directly consumed (blue). The required amount of energy Z can also be (partly) extracted from the electrical storage (red).

Figure 2 Comparison of the different optimization methods for 01/07/2018 with a constant energy consumption of 200 kWh. The horizontal red and black lines represent the bounds on the capacity and the purchased energy, respectively.

Figure 3 Comparison of the run-time for the DP and the MILP. The run-time is measured in seconds for different time periods. We observe a linearly and an exponentially increasing run-time for the DP and the MILP, respectively.

Figure 4

Figure 5 Optimizing a number of days at once rather than in sequence improves the result. In this example, the costs for optimizing both days separately sum up to 6796 €, jointly to 6752 €, assuming a constant energy consumption of 200 kWh. The horizontal red and black lines represent the bounds on the capacity and the purchased energy, respectively.

Figure 6

Table 1 Comparison of different optimization algorithms to minimize the energy costs over one week (06/15/18-06/21/18) for four different storage capacities. Purchasing exactly the required amount of energy in each time step, i.e., without using an electrical storage, costs 13,532 €. Note that the LP solutions are only given as lower bounds and do not correspond to feasible controls.