# SCIENTIA SINICA Informationis, Volume 50, Issue 4: 588-602(2020) https://doi.org/10.1360/N112019-00049

## Differential game learning approach for multiple microsatellites takeover of the attitude movement of failed spacecraft

• Accepted Jun 5, 2019
• Published Apr 8, 2020

### Abstract

Taking over the attitude control of a failed spacecraft that has exhausted its fuel or suffered actuator failures makes it possible to reuse its valuable on-board payloads. Microsatellites acting in coordination offer a cost-efficient way to perform this attitude takeover control. Differential games are used to study individual optimal decision problems: each player optimizes its own local performance index function to obtain its control policy, and the predefined global objective of the game can thereby be achieved. In this paper, the attitude takeover control problem for a failed spacecraft is transformed into a multi-microsatellite differential game problem. First, the multi-microsatellite differential game model is established and a performance index function is designed for each microsatellite, which gives a mathematical description of the game. Second, the Hamilton-Jacobi (HJ) equations are derived and solved with a policy iteration (PI) algorithm based on a single neural network (NN) to learn the game equilibrium control strategies of the microsatellites. Finally, numerical simulations are carried out to validate the effectiveness of the multi-microsatellite differential game learning method. The results show that the predefined global objective, takeover of the attitude control of the failed spacecraft, can be realized through the approximate game equilibrium control strategies of the multiple microsatellites.
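To give some intuition for the policy iteration structure mentioned in the abstract, the following is a minimal sketch on a toy problem, not the paper's attitude model or its NN-based algorithm. It assumes a scalar linear system $\dot{x}=ax+\sum_j b_j u_j$ with $a<0$, and a quadratic performance index $J_i=\int_0^\infty (q_i x^2 + r_i u_i^2)\,dt$ for each player. Under these assumptions each value function is exactly $V_i=p_i x^2$, so no neural network is needed. The function names and numerical values are hypothetical.

```python
from typing import List, Tuple


def policy_iteration_scalar_game(
    a: float,
    b: List[float],
    q: List[float],
    r: List[float],
    iterations: int = 200,
    tol: float = 1e-12,
) -> Tuple[List[float], List[float]]:
    """Policy iteration for a toy scalar N-player linear-quadratic game.

    Assumed model (illustrative only): x' = a*x + sum_j b[j]*u_j and
    J_i = integral of (q[i]*x^2 + r[i]*u_i^2) dt, with u_j = -k[j]*x.
    Returns the value coefficients p (V_i = p[i]*x^2) and the gains k.
    """
    n = len(b)
    k = [0.0] * n  # zero gains are admissible only because a < 0 is assumed
    p = [0.0] * n
    for _ in range(iterations):
        a_cl = a - sum(b[j] * k[j] for j in range(n))  # closed-loop pole
        if a_cl >= 0.0:
            raise ValueError("closed loop is unstable; policies not admissible")
        # Policy evaluation: solve 2*a_cl*p_i + q_i + r_i*k_i^2 = 0 for p_i.
        p_new = [(q[i] + r[i] * k[i] ** 2) / (-2.0 * a_cl) for i in range(n)]
        # Policy improvement: u_i = -(1/2) * (b_i / r_i) * dV_i/dx = -k_i * x.
        k = [b[i] * p_new[i] / r[i] for i in range(n)]
        converged = max(abs(p_new[i] - p[i]) for i in range(n)) < tol
        p = p_new
        if converged:
            break
    return p, k


def coupled_riccati_residuals(
    a: float, b: List[float], q: List[float], r: List[float], p: List[float]
) -> List[float]:
    """Residuals of the coupled HJ (here: Riccati) equations at a given p."""
    n = len(b)
    k = [b[j] * p[j] / r[j] for j in range(n)]
    a_cl = a - sum(b[j] * k[j] for j in range(n))
    return [2.0 * p[i] * a_cl + q[i] + r[i] * k[i] ** 2 for i in range(n)]


# Three players, standing in for three microsatellites (values are made up).
a, b = -1.0, [1.0, 0.5, 0.8]
q, r = [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]
p, k = policy_iteration_scalar_game(a, b, q, r)
print("value coefficients:", p)
print("feedback gains:", k)
print("HJ residuals:", coupled_riccati_residuals(a, b, q, r, p))
```

Each pass evaluates all current policies and then improves every player's policy at once. In this toy case the coupled HJ equations reduce to coupled algebraic Riccati equations, and the residual check confirms that the fixed point satisfies them. For general nonlinear dynamics such closed forms are not available, which is where an NN value approximator comes in.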

### Supplement

Appendix

\begin{align*}
&\Psi_{1}=\begin{bmatrix}-\theta_{1m}^{2} & \cdots & -\theta_{Nm}^{2}\end{bmatrix}, \quad
\Psi_{21}=\begin{bmatrix} \frac{\lambda_{DM11}^{2}}{2\psi_{121}^{2}} & \cdots & \frac{\lambda_{DM1N}^{2}}{2\psi_{12N}^{2}} \\ \vdots & & \vdots \\ \frac{\lambda_{DMN1}^{2}}{2\psi_{N21}^{2}} & \cdots & \frac{\lambda_{DMNN}^{2}}{2\psi_{N2N}^{2}}\end{bmatrix}, \quad
\Psi_{22}=\begin{bmatrix}\textstyle{\sum_{j=1}^{N} \frac{\psi_{12j}^{2}\theta_{1M}^{2}}{2}} & \cdots & \textstyle{\sum_{j=1}^{N} \frac{\psi_{N2j}^{2}\theta_{NM}^{2}}{2}}\end{bmatrix}, \\
&\Psi_{41}=\begin{bmatrix} \textstyle{\sum_{j=1}^{N} \frac{\lambda_{EMj1}^{2}}{32\psi_{j41}^{2}}} & & \\ & \ddots & \\ & & \textstyle{\sum_{j=1}^{N} \frac{\lambda_{EMjN}^{2}}{32\psi_{j4N}^{2}}}\end{bmatrix}, \quad
\Psi_{42}=\begin{bmatrix}\textstyle{\sum_{j=1}^{N} \frac{\psi_{14j}^{2}b_{1M}^{2}}{2}} & \cdots & \textstyle{\sum_{j=1}^{N} \frac{\psi_{N4j}^{2}b_{NM}^{2}}{2}}\end{bmatrix}, \\
&\Psi_{5}=\begin{bmatrix}\textstyle{\sum_{j=1}^{N} \frac{\psi_{15j}^{2}b_{1M}^{2}}{2}} & \cdots & \textstyle{\sum_{j=1}^{N} \frac{\psi_{N5j}^{2}b_{NM}^{2}}{2}}\end{bmatrix}+\begin{bmatrix}\textstyle{\sum_{j=1}^{N} \frac{b_{Dj1}^{2}}{8\psi_{j51}^{2}}} & \cdots & \textstyle{\sum_{j=1}^{N} \frac{b_{DjN}^{2}}{8\psi_{j5N}^{2}}}\end{bmatrix}, \\
&\Psi_{6}=\begin{bmatrix}\textstyle{\sum_{j=1}^{N} \frac{\psi_{16j}^{2}b_{1M}^{2}}{2}} & \cdots & \textstyle{\sum_{j=1}^{N} \frac{\psi_{N6j}^{2}b_{NM}^{2}}{2}}\end{bmatrix}+\begin{bmatrix}\textstyle{\sum_{j=1}^{N} \frac{b_{Ej1}^{2}}{8\psi_{j61}^{2}}} & \cdots & \textstyle{\sum_{j=1}^{N} \frac{b_{EjN}^{2}}{8\psi_{j6N}^{2}}}\end{bmatrix}, \\
&\Psi_{71}=\begin{bmatrix} \frac{\psi_{17}^{2}b_{1M}^{2}}{2} & \cdots & \frac{\psi_{N7}^{2}b_{NM}^{2}}{2}\end{bmatrix}, \quad
\Psi_{72}=\sum_{i=1}^{N} \frac{b_{e_{Hi}}^{2}}{2\psi_{i7}^{2}},
\end{align*}
where $\Psi_{21}$, $\Psi_{41}\in\mathbb{R}^{N\times N}$; $\Psi_{1}$, $\Psi_{22}$, $\Psi_{42}$, $\Psi_{5}$, $\Psi_{6}$, $\Psi_{71}\in\mathbb{R}^{1\times N}$; and $\Psi_{72}\in\mathbb{R}$.

• Figure 2

(Color online) Value function trajectories of microsatellites

• Figure 3

(Color online) Convergence of critic NN weight vector estimations of microsatellites. (a) Microsatellite 1; (b) Microsatellite 2; (c) Microsatellite 3

• Figure 4

(Color online) Attitude MRP trajectories of the combination

• Figure 5

(Color online) Attitude angular velocity trajectories of the combination

• Figure 6

(Color online) Control torque trajectories of microsatellites. (a) Microsatellite 1; (b) Microsatellite 2; (c) Microsatellite 3
