SCIENTIA SINICA Informationis, Volume 50, Issue 4: 588-602 (2020) https://doi.org/10.1360/N112019-00049

## Differential game learning approach for multiple microsatellites takeover of the attitude movement of failed spacecraft

• Accepted Jun 5, 2019
• Published Apr 8, 2020

### Abstract

Taking over the attitude control of a failed spacecraft that has exhausted its fuel or suffered actuator failures makes it possible to recycle valuable reusable on-board payloads. Microsatellites acting in coordination offer a cost-efficient way to perform this attitude takeover control. Differential games address individual optimal decision problems: each player optimizes its own local performance index function to obtain its control policy, and in doing so the predefined global objective of the game can be achieved. In this paper, the failed spacecraft attitude takeover control problem is transformed into a multi-microsatellite differential game problem. First, the multi-microsatellite differential game model is established, a performance index function is designed for each microsatellite, and the multi-microsatellite differential game problem is stated mathematically. Second, the Hamilton-Jacobi (HJ) equations are given and solved with a single neural network (NN) based policy iteration (PI) algorithm to learn the game equilibrium control strategies of the microsatellites. Finally, numerical simulations are carried out to validate the effectiveness of the multi-microsatellite differential game learning method. The results show that the predefined global objective, taking over the attitude control of the failed spacecraft, can be achieved through the approximate game equilibrium control strategies of the multiple microsatellites.
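The policy iteration idea in the abstract can be illustrated on a much simpler problem than the one treated in the paper. The sketch below is a toy under stated assumptions, not the authors' method: it replaces the spacecraft attitude dynamics with a scalar linear system $\dot{x} = a x + \sum_j b_j u_j$, gives each player the quadratic performance index $\int_0^\infty (q_i x^2 + r_i u_i^2)\,dt$, and replaces the NN with a one-parameter value function $V_i(x) = p_i x^2$. The function names `scalar_game_policy_iteration` and `hj_residuals`, and all numerical values, are hypothetical.

```python
def scalar_game_policy_iteration(a, b, q, r, iterations=60):
    """Policy iteration for an N-player scalar linear-quadratic game (toy model).

    Dynamics:          dx/dt = a*x + sum_j b[j]*u_j
    Cost of player i:  integral of q[i]*x**2 + r[i]*u_i**2
    Value guess:       V_i(x) = p[i]*x**2, with policy u_i = -k[i]*x
    """
    n = len(b)
    k = [0.0] * n  # zero initial policies; admissible only if a < 0
    p = [0.0] * n
    for _ in range(iterations):
        # Closed-loop pole under the current policies of all players.
        a_cl = a - sum(b[j] * k[j] for j in range(n))
        if a_cl >= 0:
            raise ValueError("closed loop is not stable; policies are not admissible")
        # Policy evaluation: solve 2*p_i*a_cl + q_i + r_i*k_i**2 = 0 for p_i.
        p = [(q[i] + r[i] * k[i] ** 2) / (-2.0 * a_cl) for i in range(n)]
        # Policy improvement: u_i = -(1/2)*(1/r_i)*b_i*dV_i/dx = -(b_i*p_i/r_i)*x.
        k = [b[i] * p[i] / r[i] for i in range(n)]
    return p, k


def hj_residuals(a, b, q, r, p):
    """Residuals of the coupled HJ equations (divided by x**2) for given p."""
    n = len(b)
    k = [b[i] * p[i] / r[i] for i in range(n)]
    a_cl = a - sum(b[j] * k[j] for j in range(n))
    return [2.0 * p[i] * a_cl + q[i] + r[i] * k[i] ** 2 for i in range(n)]


# Three identical players, loosely mirroring three microsatellites (assumed values).
a, b, q, r = -1.0, [1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]
p, k = scalar_game_policy_iteration(a, b, q, r)
print("value weights:", p)
print("feedback gains:", k)
print("HJ residuals:", hj_residuals(a, b, q, r, p))
```

Residuals close to zero indicate that the learned weights approximately satisfy the coupled HJ equations of this toy game. Simultaneous policy updates are not guaranteed to converge for arbitrary game parameters, so the stability check on `a_cl` is kept inside the loop.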

### Supplement

Appendix

\begin{align*}
&\Psi_{1}=\begin{bmatrix}-\theta_{1m}^{2} & \cdots & -\theta_{Nm}^{2}\end{bmatrix}, \quad
\Psi_{21}=\begin{bmatrix} \frac{\lambda_{DM11}^{2}}{2\psi_{121}^{2}} & \cdots & \frac{\lambda_{DM1N}^{2}}{2\psi_{12N}^{2}} \\ \vdots & & \vdots \\ \frac{\lambda_{DMN1}^{2}}{2\psi_{N21}^{2}} & \cdots & \frac{\lambda_{DMNN}^{2}}{2\psi_{N2N}^{2}}\end{bmatrix}, \quad
\Psi_{22}=\begin{bmatrix}\textstyle{\sum_{j=1}^{N} \frac{\psi_{12j}^{2}\theta_{1M}^{2}}{2}} & \cdots & \textstyle{\sum_{j=1}^{N} \frac{\psi_{N2j}^{2}\theta_{NM}^{2}}{2}}\end{bmatrix}, \\
&\Psi_{41}=\begin{bmatrix} \textstyle{\sum_{j=1}^{N} \frac{\lambda_{EMj1}^{2}}{32\psi_{j41}^{2}}} & & \\ & \ddots & \\ & & \textstyle{\sum_{j=1}^{N} \frac{\lambda_{EMjN}^{2}}{32\psi_{j4N}^{2}}}\end{bmatrix}, \quad
\Psi_{42}=\begin{bmatrix}\textstyle{\sum_{j=1}^{N} \frac{\psi_{14j}^{2}b_{1M}^{2}}{2}} & \cdots & \textstyle{\sum_{j=1}^{N} \frac{\psi_{N4j}^{2}b_{NM}^{2}}{2}}\end{bmatrix}, \\
&\Psi_{5}=\begin{bmatrix}\textstyle{\sum_{j=1}^{N} \frac{\psi_{15j}^{2}b_{1M}^{2}}{2}} & \cdots & \textstyle{\sum_{j=1}^{N} \frac{\psi_{N5j}^{2}b_{NM}^{2}}{2}}\end{bmatrix}+\begin{bmatrix}\textstyle{\sum_{j=1}^{N} \frac{b_{Dj1}^{2}}{8\psi_{j51}^{2}}} & \cdots & \textstyle{\sum_{j=1}^{N} \frac{b_{DjN}^{2}}{8\psi_{j5N}^{2}}}\end{bmatrix}, \\
&\Psi_{6}=\begin{bmatrix}\textstyle{\sum_{j=1}^{N} \frac{\psi_{16j}^{2}b_{1M}^{2}}{2}} & \cdots & \textstyle{\sum_{j=1}^{N} \frac{\psi_{N6j}^{2}b_{NM}^{2}}{2}}\end{bmatrix}+\begin{bmatrix}\textstyle{\sum_{j=1}^{N} \frac{b_{Ej1}^{2}}{8\psi_{j61}^{2}}} & \cdots & \textstyle{\sum_{j=1}^{N} \frac{b_{EjN}^{2}}{8\psi_{j6N}^{2}}}\end{bmatrix}, \\
&\Psi_{71}=\begin{bmatrix} \frac{\psi_{17}^{2}b_{1M}^{2}}{2} & \cdots & \frac{\psi_{N7}^{2}b_{NM}^{2}}{2}\end{bmatrix}, \quad
\Psi_{72}=\sum_{i=1}^{N} \frac{b_{e_{Hi}}^{2}}{2\psi_{i7}^{2}},
\end{align*}
where $\Psi_{21}$, $\Psi_{41}\in\mathbb{R}^{N\times N}$; $\Psi_{1}$, $\Psi_{22}$, $\Psi_{42}$, $\Psi_{5}$, $\Psi_{6}$, $\Psi_{71}\in\mathbb{R}^{1\times N}$; and $\Psi_{72}\in\mathbb{R}$.

### References

[1] Zhai G, Zhang J R, Zhou Z C. A review of on-orbit life-time extension technologies for GEO satellites. Journal of Astronautics, 2012, 33: 849-859. DOI: 10.3873/j.issn.1000-1328.2012.07.001.

[2] Huang P F, Wang M, Chang H T, et al. Takeover control of attitude maneuver for failed spacecraft. Journal of Astronautics, 2016, 37: 924-935. DOI: 10.3873/j.issn.1000-1328.2016.08.005.

[3] Wang Z, Yuan J, Shi Y. Robust adaptive fault tolerant attitude control for post-capture non-cooperative targets with actuator nonlinearities. Trans Institute Measurement Control, 2018, 40: 2116-2128.

[4] Chang H T, Huang P F, Wang M, et al. Distributed control allocation for cellular space robots in takeover control. Acta Aeronaut et Astronaut Sin, 2016, 37: 2864-2873.

[5] Jaeger T, Mirczak W. Satlets - The building blocks of future satellites - and which mold do you use? In: Proceedings of AIAA SPACE 2013 Conference and Exposition, San Diego, 2013.

[6] Goeller M, Oberlaender J, Uhl K, et al. Modular robots for on-orbit satellite servicing. In: Proceedings of IEEE International Conference on Robotics and Biomimetics, Guangzhou, 2012. 2018-2023.

[7] Xue L, Wang Q L, Sun C Y. Game theoretical approach for the leader selection of the second-order multi-agent system. Control Theory Appl, 2016, 33: 1593-1602.

[8] Lin W. Distributed UAV formation control using differential game approach. Aerospace Sci Tech, 2014, 35: 54-62.

[9] Vamvoudakis K G, Lewis F L, Hudas G R. Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality. Automatica, 2012, 48: 1598-1611.

[10] Abouheaf M I, Lewis F L, Vamvoudakis K G. Multi-agent discrete-time graphical games and reinforcement learning solutions. Automatica, 2014, 50: 3038-3053.

[11] Ru C J, Wei R X, Guo Q, et al. Guidance control of cognitive game for unmanned aerial vehicle autonomous collision avoidance. Control Theory & Applications, 2014, 31: 1555-1560.

[12] Bopardikar S D, Bullo F, Hespanha J P. On discrete-time pursuit-evasion games with sensing limitations. IEEE Trans Robot, 2008, 24: 1429-1439.

[13] Blasch E P, Pham K, Shen D. Orbital satellite pursuit-evasion game-theoretical control. In: Proceedings of the 11th International Conference on Information Science, Signal Processing and their Applications, Montreal, 2012. 1007-1012.

[14] Vamvoudakis K G, Lewis F L. Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations. Automatica, 2011, 47: 1556-1569.

[15] Liu D, Li H, Wang D. Online synchronous approximate optimal learning algorithm for multi-player non-zero-sum games with unknown dynamics. IEEE Trans Syst Man Cybern Syst, 2014, 44: 1015-1027.

[16] Vamvoudakis K G. Non-zero sum Nash Q-learning for unknown deterministic continuous-time linear systems. Automatica, 2015, 61: 274-281.

[17] Schaub H, Junkins J L. Analytical mechanics of space systems. In: Proceedings of AIAA Education Series, Reston, 2003. 107-142.

• Figure 2

(Color online) Value function trajectories of microsatellites

• Figure 3

(Color online) Convergence of critic NN weight vector estimations of microsatellites. (a) Microsatellite 1; (b) Microsatellite 2; (c) Microsatellite 3

• Figure 4

(Color online) Attitude MRP trajectories of the combination

• Figure 5

(Color online) Attitude angular velocity trajectories of the combination

• Figure 6

(Color online) Control torque trajectories of microsatellites. (a) Microsatellite 1; (b) Microsatellite 2; (c) Microsatellite 3

Copyright 2020 CHINA SCIENCE PUBLISHING & MEDIA LTD. (中国科技出版传媒股份有限公司). All rights reserved.