logo

SCIENCE CHINA Information Sciences, Volume 63 , Issue 11 : 212202(2020) https://doi.org/10.1007/s11432-019-2702-x

Event-based optimization with random packet dropping

More info
  • ReceivedMay 6, 2019
  • AcceptedSep 27, 2019
  • PublishedOct 9, 2020

Abstract

Event-based optimization (EBO) provides a general framework for policy optimization in many discrete event dynamic systems where decision making is triggered by events that represent state transitions with common features. Because the number of events can be defined by the user and usually increases linearly with respect to the system scale, EBO has good potential to address large-scale problems where the state space grows exponentially. However, in many practical systems, sensors are geographically distributed and connected to a central controller through imperfect communication channels. Therefore, events observed by the sensors may not reach the controller. Optimization methods of event-based policies in these cases have not yet been identified. In this paper, we consider this important problem and make three major contributions. First, we formulate a mathematical EBO model in which the communication between sensors and controllers is subject to random packet dropping. Second, we show that this EBO model can be converted to another EBO model with perfect communication. Then, the performance difference equation and the performance derivative equation for event-based policies are straightforward to develop. One gradient-based policy iteration algorithm is developed for problems where the state transition probabilities are explicitly known, while another for problems where they are unknown. Third, the performance of the algorithms and the impact of the packet dropping probability on policy performance are numerically demonstrated on a single-zone occupant level estimation problem in buildings.


References

[1] Cassandras C G, Lafortune S. Introduction to Discrete Event Systems. 2nd ed. New York: Springer, 2008. Google Scholar

[2] Cao X R. Basic Ideas for Event-Based Optimization of Markov Systems. Discrete Event Dyn Syst, 2005, 15: 169-197 CrossRef Google Scholar

[3] Cao X R. Stochastic Learning and Optimization---A Sensitivity-Based Approach. New York: Springer, 2007. Google Scholar

[4] Cassandras C G. The event-driven paradigm for control, communication and optimization. J Control Decision, 2014, 1: 3-17 CrossRef Google Scholar

[5] Ren Z, Krogh B H. State aggregation in Markov decision processes. In: Proceedings of the 41st IEEE Confefence on Decision and Control, Las Vegas, 2002. 3819--3824. Google Scholar

[6] Cao X R, Ren Z, Bhatnagar S. A time aggregation approach to Markov decision processes. Automatica, 2002, 38: 929-943 CrossRef Google Scholar

[7] Li Xia , Qianchuan Zhao , Qing-Shan Jia . A Structure Property of Optimal Policies for Maintenance Problems WithSafety-Critical Components. IEEE Trans Automat Sci Eng, 2008, 5: 519-531 CrossRef Google Scholar

[8] Qing-Shan Jia . A Structural Property of Optimal Policies for Multi-Component Maintenance Problems. IEEE Trans Automat Sci Eng, 2010, 7: 677-680 CrossRef Google Scholar

[9] Bertsekas D P, Tsitsiklis J N. Neuro-Dynamic Programming. Belmont: Athena Scientific, 1996. Google Scholar

[10] Powell W B. Approxiamte Dynamic Programming: Solving the Curse of Dimensionality. New York: Wiley-Interscience, 2007. Google Scholar

[11] Bertsekas D P. Dynamic Programming and Optimal Control: Approximate Dynamic Programming. 4th ed. Nashua: Athena Scientific, 2012. Google Scholar

[12] Sutton R S, Barto A G. Reinforcement Learning: An Introduction. Cambridge: MIT Press, 1998. Google Scholar

[13] Guestrin C, Koller D, Parr R. Efficient Solution Algorithms for Factored MDPs. jair, 2003, 19: 399-468 CrossRef Google Scholar

[14] ${\rm~~\AA}$ström KJ, Bernhardsson B. Comparison of periodic and event based sampling for first-order stochastic systems. In: Proceedings of the 14th IFAC World Congress, Beijing, 1999. 301--306. Google Scholar

[15] Arzén KE. A simple event-based PID controller. In: Proceedings of the 14th IFAC World Congress, Beijing, 1999. 423--428. Google Scholar

[16] Tabuada P. Event-Triggered Real-Time Scheduling of Stabilizing Control Tasks. IEEE Trans Automat Contr, 2007, 52: 1680-1685 CrossRef Google Scholar

[17] Heemels W P M H, Johansson K H, Tabuada P. An introduction to event-triggered and self-triggered control. In: Proceedings of the 51st IEEE Conference on Decision and Control, Muai, 2012. 3270--3285. Google Scholar

[18] Zhang X M, Han Q L, Zhang B L. An Overview and Deep Investigation on Sampled-Data-Based Event-Triggered Control and Filtering for Networked Systems. IEEE Trans Ind Inf, 2017, 13: 4-16 CrossRef Google Scholar

[19] Wu Z G, Xu Y, Lu R. Event-Triggered Control for Consensus of Multiagent Systems With Fixed/Switching Topologies. IEEE Trans Syst Man Cybern Syst, 2018, 48: 1736-1746 CrossRef Google Scholar

[20] Xia L, Jia Q S, Cao X R. A tutorial on event-based optimizationa new optimization framework. Discrete Event Dyn Syst, 2014, 24: 103-132 CrossRef Google Scholar

[21] Xi-Ren Cao , Han-Fu Chen . Perturbation realization, potentials, and sensitivity analysis of Markov processes. IEEE Trans Automat Contr, 1997, 42: 1382-1393 CrossRef Google Scholar

[22] Jia Q S, Wen Z, Xia L. Event-based sensor activation for indoor occupant distribution estimation. In: Proceedings of the 12th International Conference on Control, Automation, Robotics, and Vision, Guangzhou, 2012. 240--245. Google Scholar

[23] Jia Q S, Shen J X, Xu Z B. Simulation-Based Policy Improvement for Energy Management in Commercial Office Buildings. IEEE Trans Smart Grid, 2012, 3: 2211-2223 CrossRef Google Scholar

[24] Jia Q S, Guo Y. Event-based evacuation in outdoor environment. In: Proceedings of the 24th Chinese Control and Decision Conference, Taiyuan, 2012. 33--38. Google Scholar

[25] Wang D X, Cao X R. Event-based optimization for POMDP and its application in portfolio management. In: Proceedings of the 18th IFAC World Congress, Milano, 2011. 3228--3233. Google Scholar

[26] Jia Q S, Wu J. On distributed event-based optimization for shared economy in cyber-physical energy systems. Sci China Inf Sci, 2018, 61: 110203 CrossRef Google Scholar

[27] Guan X, Xu Z, Jia Q S. Cyber-physical model for efficient and secured operation of CPES or energy Internet. Sci China Inf Sci, 2018, 61: 110201 CrossRef Google Scholar

[28] Zhong M, Cassandras C G. Asynchronous Distributed Optimization With Event-Driven Communication. IEEE Trans Automat Contr, 2010, 55: 2735-2750 CrossRef Google Scholar

[29] Jia Q S. Event-based optimization with lagged state information. In: Proceedings of the 31st Chinese Control Conference, Hefei, 2012. 2055--2060. Google Scholar

[30] Cao X R, Wang D X, Qiu L. Partial-Information State-Based Optimization of Partially Observable Markov Decision Processes and the Separation Principle. IEEE Trans Automat Contr, 2014, 59: 921-936 CrossRef Google Scholar

[31] Sinopoli B, Schenato L, Franceschetti M. Kalman Filtering With Intermittent Observations. IEEE Trans Automat Contr, 2004, 49: 1453-1464 CrossRef Google Scholar

[32] Zhang M, Shen C, Wu Z G. Dissipative Filtering for Switched Fuzzy Systems With Missing Measurements.. IEEE Trans Cybern, 2019, : 1-10 CrossRef PubMed Google Scholar

[33] Wang H T, Jia Q S, Lei Y L, et al. Estimation of occupant distribution by detecting the entrance and leaving events of zones in building. In: Proceedings of the 2012 IEEE International Conference on Multisensor Fusion and Integration, Hamburg, 2012. 27--32. Google Scholar

[34] Jia Q S, Wang H, Lei Y. A Decentralized Stay-Time Based Occupant Distribution Estimation Method for Buildings. IEEE Trans Automat Sci Eng, 2015, 12: 1482-1491 CrossRef Google Scholar

  • Figure 1

    Event-based optimization with random packet dropping.

  • Figure 2

    Event-based optimization without packet dropping.

  •   

    Algorithm 1 Gradient-based policy iteration

    Step 1. Choose $d^1~\in~\mathcal{D},~n=1$, $N$ (a fixed large integer), $\gamma_1~=~\gamma_0$, where $\gamma_0$ is a given scalar. Let $\Pi(d)$ be a $V$-by-$A$ matrix, where the $(e,j)$-th element denotes the probability for $d(e)$ to take action $j$. $\Delta>0$ is a small constant.

    Step 2. Use policy $d^n$ to generate a sample path $\{s_k,~k=0,1,\ldots\}$. The sample path should be sufficiently long to guaranteethe estimation accuracy of $g_N^{d^n}$ and $\pi^{d^n}(i|e)$.

    Step 3. Use the sample path in Step 2 to estimate

  • Table 1  

    Table 1Parameter settings in the experiment

    Group $\gamma_0~$ $d^1$ $\alpha$ $\lambda$ $p$
    (1) 0.1 Average, Random, Sub-optimal 0.5 0.9 0.1
    (2) 1.0 Average, Random, Sub-optimal 0.5 0.9 0.1
    (3) 1.0 Average 0,0.1,…,1 0.9 0.1
    (4) 1.0 Average 0.5 0,0.1,…,1 0.1
    (5) 1.0 Average 0.5 0.9 0,0.1,…,1

Copyright 2020  CHINA SCIENCE PUBLISHING & MEDIA LTD.  中国科技出版传媒股份有限公司  版权所有

京ICP备14028887号-23       京公网安备11010102003388号