
SCIENCE CHINA Information Sciences, Volume 64, Issue 3: 132206 (2021) https://doi.org/10.1007/s11432-019-2825-y

The greedy crowd and smart leaders: a hierarchical strategy selection game with learning protocol

  • Received Oct 22, 2019
  • Accepted Feb 5, 2020
  • Published Feb 7, 2021


Acknowledgment

This work was supported by the Tianjin Natural Science Foundation (Grant Nos. 20JCYBJC01060, 20JCQNJC01450) and the National Natural Science Foundation of China (Grant No. 61973175).


Supplement

Appendix

Proof of Theorem 4

Proof. Let $J_{1k}(s_k) = f^a_k(w_k) - C_1 s_k$ and $J_{2k}(s_k) = f^a_k(w_k) - C_2 s_k$, and let $s_1 = [s_{11}, s_{12}, \dots, s_{1n_c}]$ be a Nash equilibrium of the game with payoffs $J_{1k}$. The Nash-equilibrium condition then gives \begin{equation} 0 = \frac{\partial J_{1k}}{\partial s_{1k}} = \frac{f^a_k{}'(w_{1k})(1-v_{1k})K^a}{\sum_{m\in V^c}s_{1m}} - C_1. \tag{1}\end{equation} Now define $s_2 = [s_{21}, s_{22}, \dots, s_{2n_c}]$ with $s_{2k} = C_1 s_{1k}/C_2$. Every bid is scaled by the same factor $C_1/C_2$, so $v_{1k} = v_{2k}$ and $w_{1k} = w_{2k}$; that is, $s_1$ and $s_2$ give the same allocation of weights, while $\sum_{m\in V^c}s_{2m} = (C_1/C_2)\sum_{m\in V^c}s_{1m}$. Substituting these relations into the derivative of $J_{2k}$ and using (1), we obtain \begin{align} \frac{\partial J_{2k}}{\partial s_{2k}} &= \frac{f^a_k{}'(w_{2k})(1-v_{2k})K^a}{\sum_{m\in V^c}s_{2m}} - C_2 \notag\\ &= \frac{C_2}{C_1}\,\frac{f^a_k{}'(w_{1k})(1-v_{1k})K^a}{\sum_{m\in V^c}s_{1m}} - C_2 = C_2 - C_2 = 0. \tag{2} \end{align} Hence $s_2$ is a Nash equilibrium of the game with payoffs $J_{2k}(\cdot)$. Since $s_1$ and $s_2$ describe the same allocation, every Nash-equilibrium allocation under $J_1$ is also a Nash-equilibrium allocation under $J_2$, and by the same argument with $C_1$ and $C_2$ exchanged the converse holds as well. Therefore the choice between $C_1$ and $C_2$ affects only the scale of the bids $s_k$, not the final allocation of the auction.
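The scaling argument in the proof can be checked numerically. The sketch below is illustrative only and rests on assumptions that this appendix does not state: the proportional allocation rule $v_k = s_k/\sum_m s_m$, $w_k = K^a v_k$ (inferred from the form of the derivative in (1)), and a hypothetical utility $f_k(w) = g_k \log(1+w)$ with made-up gains $g_k$. Under these assumptions the marginal payoff of $J_{2k}$ at the rescaled bids equals $C_2/C_1$ times the marginal payoff of $J_{1k}$ at the original bids, so the two vanish together.

```python
import math


def allocation(bids, capacity):
    """Assumed proportional rule: shares v_k = s_k / sum(s), weights w_k = K * v_k."""
    total = sum(bids)
    shares = [s / total for s in bids]
    weights = [capacity * v for v in shares]
    return shares, weights


def marginal_payoffs(bids, cost, capacity, gains):
    """dJ_k/ds_k for J_k = f_k(w_k) - cost * s_k.

    f_k(w) = gains[k] * log(1 + w) is a hypothetical utility chosen
    for illustration; its derivative is gains[k] / (1 + w).
    """
    total = sum(bids)
    shares, weights = allocation(bids, capacity)
    return [
        g / (1.0 + w) * (1.0 - v) * capacity / total - cost
        for g, v, w in zip(gains, shares, weights)
    ]


# Illustrative values only (not taken from the paper).
C1, C2 = 2.0, 5.0
K = 10.0
gains = [1.0, 2.0, 3.5]

s1 = [0.4, 0.9, 1.3]             # an arbitrary bid vector under cost C1
s2 = [C1 * s / C2 for s in s1]   # rescaled bids s_2k = C1 * s_1k / C2

_, w1 = allocation(s1, K)
_, w2 = allocation(s2, K)
g1 = marginal_payoffs(s1, C1, K, gains)
g2 = marginal_payoffs(s2, C2, K, gains)

# Rescaling every bid by the same factor leaves the allocation unchanged.
same_allocation = all(math.isclose(a, b) for a, b in zip(w1, w2))
# The gradient under C2 is the gradient under C1 multiplied by C2 / C1.
scaled_gradient = all(math.isclose(b, C2 / C1 * a) for a, b in zip(g1, g2))

print(same_allocation, scaled_gradient)
```

Because the check holds for an arbitrary bid vector, it holds in particular at any point where the marginal payoffs under $C_1$ are zero, which is the situation treated in (1) and (2).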

