logo

SCIENCE CHINA Information Sciences, Volume 62, Issue 5: 052103(2019) https://doi.org/10.1007/s11432-018-9609-7

Cumulative activation in social networks

More info
  • ReceivedApr 3, 2018
  • AcceptedSep 11, 2018
  • PublishedApr 3, 2019

Abstract

Most studies on influence maximization focus on one-shot propagation, i.e., the influence is propagated fromseed users only once following a probabilistic diffusion model and users' activation are determined viasingle cascade.In reality it is often the case that a user needs to be cumulatively impactedby receiving enough pieces of information propagated to herbefore she makes the final purchase decision.In this paper we model such cumulative activation as the following process: first multiple pieces of information are propagated independently in the social networkfollowing the classical independent cascade model, then the user will be activated (and adopt the product) if the cumulative pieces of information she receivedreaches her cumulative activation threshold.Two optimization problems are investigated under this framework: seed minimization with cumulative activation (SM-CA), which asks how to select a seed set with minimum size such that the number of cumulatively activenodes reaches a given requirement $\eta$;influence maximization with cumulative activation (IM-CA), which asks how to choose a seed set with fixed budget to maximize the number of cumulatively active nodes.For SM-CA problem, we design a greedy algorithm that yields a bicriteria $O(\ln~n)$-approximation when $\eta=n$, where$n$ is the number of nodes in the network.For both SM-CA problem with $\eta<n$ and IM-CA problem, we prove strong inapproximability results.Despite the hardness results, we propose two efficient heuristic algorithms for SM-CA and IM-CA respectively based on the reverse reachable set approach.Experimental results on different real-world social networks show that our algorithmssignificantly outperform baseline algorithms.


Acknowledgment

This work was supported in part by National Natural Science Foundation of China (Grant Nos. 61433014, 61502449, 61602440), National Basic Research Program of China (973) (Grant No. 2016YFB1000201).


References

[1] Chen W, Lakshmanan L S V, Castillo C. Information and influence propagation in social networks. Morgan & Claypool Publishers, 2013. Google Scholar

[2] Domingos P, Richardson M. Mining the network value of customers. In: Proceeding of KDD. New York: ACM, 2001. 57--66. Google Scholar

[3] Kempe D, Kleinberg J, TardosÉ Maximizing the spread of influence through a social network. In: Proceeding of KDD. New York: ACM, 2003. 137--146. Google Scholar

[4] Centola D, Macy M. Complex Contagions and the Weakness of Long Ties. Am J Sociology, 2007, 113: 702-734 CrossRef Google Scholar

[5] Borgs C, Brautbar M, Chayes J, et al. Maximizing social influence in nearly optimal time. In: Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms, Portland, 2014. 946--957. Google Scholar

[6] Nguyen H T, Thai M T, Dinh N T. Stop-and-stare: optimal sampling algorithms for viral marketing in billion-scale networks. In: Proceeding of SIGMOD. New York: ACM, 2016. 695--710. Google Scholar

[7] Tang Y, Shi Y, Xiao X. Influence maximization in near-linear time: a martingale approach. In: Proceeding of SIGMOD. New York: ACM, 2015. 1539--1554. Google Scholar

[8] Tang Y, Xiao X, Shi Y. Influence maximization: near-optimal time complexity meets practical efficiency. In: Proceeding of SIGMOD. New York: ACM, 2014. 946--957. Google Scholar

[9] M. Richardson, Domingos P. Mining knowledge-sharing sites for viral marketing. In: Proceeding of KDD. New York: ACM, 2002. 61--70. Google Scholar

[10] Chen W, Wang C, Wang Y. Scalable influence maximization for prevalent viral marketing in large-scale social networks. In: Proceeding of KDD. New York: ACM, 2010. 1029--1038. Google Scholar

[11] Chen W, Wang Y, Yang S. Efficient influence maximization in social networks. In: Proceeding of KDD. New York: ACM, 2009. 199--208. Google Scholar

[12] Leskovec J, Krause A, Guestrin C, et al. Cost-effective outbreak detection in networks. In: Proceeding of KDD. New York: ACM, 2007. 420--429. Google Scholar

[13] Kempe D, Kleinberg J, É Tardos. Influential nodes in a diffusion model for social networks. In: ICALP, 2005. 1127--1138. Google Scholar

[14] Chen N. On the Approximability of Influence in Social Networks. SIAM J Discrete Math, 2009, 23: 1400-1415 CrossRef Google Scholar

[15] Long C, Wong R-W. Minimizing seed set for viral marketing. In: Proceedings of IEEE 11th International Conference on Data Mining (ICDM), 2011. 427--436. Google Scholar

[16] Goyal A, Bonchi F, Lakshmanan L V, et al. On minimizing budget and time in influence propagation over social networks. Social Network Analysis and Mining, 2012. 1--14. Google Scholar

[17] Zhang P, Chen W, Sun X, et al. Minimizing seed set selection with probabilistic coverage guarantee in a social network. In: Proceeding of KDD. New York: ACM, 2014. 1306--1315. Google Scholar

[18] He J, Ji S, Beyah R, et al. Minimum-sized influential node set selection for social networks under the independent cascade model. In: Proceeding of MobiHoc. New York: ACM, 2014. 93--102. Google Scholar

[19] Gruhl D, Guha R, Liben-Nowell D, et al. Information diffusion through blogspace. In: Proceeding of WWW. New York: ACM, 2004. 491--501. Google Scholar

[20] Tang J, Sun J, Wang C, et al. Social influence analysis in large-scale networks. In: Proceeding of KDD. New York: ACM, 2009. 807--816. Google Scholar

[21] Nemhauser G L, Wolsey L A, Fisher M L. An analysis of approximations for maximizing submodular set functions-I. Math Programming, 1978, 14: 265-294 CrossRef Google Scholar

[22] Chen W, Li F, Lin T, et al. Combining traditional marketing and viral marketing with amphibious influence maximization. In: Proceeding of EC. New York: ACM, 2015. 779--796. Google Scholar

[23] Feige U. A threshold of ln n for approximating set cover. J ACM, 1998, 45: 634-652 CrossRef Google Scholar

[24] Farajtabar M, Du N, Gomez-Rodriguez M, et al. Shaping social activity by incentivizing users. In: Proceeding of NIPS, 2014. 2474--2482. Google Scholar

[25] Feige U, Peleg D, Kortsarz G. The Dense k -Subgraph Problem. Algorithmica, 2001, 29: 410-421 CrossRef Google Scholar

[26] Bhaskara A, Moses C, Chlamt. Google Scholar

[27] Khot S. Ruling Out PTAS for Graph Min?Bisection, Dense k?§ubgraph, and Bipartite Clique. SIAM J Comput, 2006, 36: 1025-1071 CrossRef Google Scholar

[28] Manurangsi P. Almost-polynomial ratio eth-hardness of approximating densest k-subgraph. In: Proceeding of STOC. New York: ACM, 2017. 954--961. Google Scholar

[29] Hajiaghayi M T, Jain K, Konwar K, et al. The minimum k-colored subgraph problem in haplotyping and dna primer selection. In: Proceeding of IWBRA, 2006. Google Scholar

[30] Barbieri N, Bonchi F, Manco G. Topic-aware social influence propagation models. In: Proceedings of IEEE 12th International Conference on Data Mining (ICDM), 2012. 81--90. Google Scholar

[31] Goyal A, Lu W, Lakshmanan S V L. SIMPATH: An Efficient Algorithm for Influence Maximization under the Linear Threshold Model In: Proceeding of ICDM, 2011. 211--220. Google Scholar

[32] Wang C, Chen W, Wang Y. Scalable influence maximization for independent cascade model in large-scale social networks. Data Min Knowl Disc, 2012, 25: 545-576 CrossRef Google Scholar

[33] Brin S, Page L. The anatomy of a large-scale hypertextual Web search engine. Comput Networks ISDN Syst, 1998, 30: 107-117 CrossRef Google Scholar

  • Figure 1

    (Color online) Illustration of multiple cascades. (a) $T=0$; (b) $T=1$; (c) $T=2$; (d) $T=3$.

  • Figure 2

    Figures for understanding the model. (a) CA & IC; (b) nonsubmoduarity; (c) $\eta<n$.

  • Table 1   Datasets
    Name # Node #Edge Type AOD
    Flixster 29 K 174 K Directed 6.0
    NetPHY 37 K 348 K Undirected 18.8
    DBLP 655 K 2 M Undirected 6.1
  •   

    Algorithm 1 Estimate $f(S)$ by Monte Carlo

    Require:$G=(V,~E),~\{p_{u~v}\}_{(u,~v)\in~E},~\{\tau_u\}_{u\in~V},~U,~S,~R$;

    Output:$\hat{f}(S)$: the estimation of $f(S)$;

    $\hat{f}(S)=0$;

    $\hat{P}_u(S)=0$; $t_u=0$ for all $u~\in~V$;

    for $i=1$ to $R$

    Simulate IC diffusion from seed set $S$;

    if $u$ is activated then

    $t_u=t_u+1$;

    end if

    end for

    for $u\in~U$

    $\hat{P}_u(S)=t_u/R$;

    if $\hat{P}_u(S)\geq\tau_u$ then

    $\hat{f}(S)=\hat{f}(S)+\tau_u$;

    else

    $\hat{f}(S)=\hat{f}(S)+\hat{P}_u(S)$;

    end if

    end for

    return $\hat{f}(S)$.

  •   

    Algorithm 2 Greedy algorithm for SM-CA with $\eta=n$

    Require:$G=(V,~E),~\{p_{uv}\}_{(u,~v)\in~E},\{\tau_u\}_{u\in~V},~U$, $\varepsilon$;

    Output:Seed set $S$

    $S=\emptyset$, $\hat{f}(S)=0$;

    while $\hat{f}(S)<\sum_{u\in~V}\tau_u-\varepsilon$ do

    Choose $v=\arg\max_{u\in~V\setminus~S}~[\hat{f}(S\cup~\{u\})-\hat{f}(S)]$;

    $S=S\cup~\{v\}$;

    end while

    return $S$.

  • Table 2   Running time ($\tau=0.3$, $k=500$) (s)
    sf TIM$^+$ ADG-IM-CA BTG-IM-CA
    Flixster39 87 138
    NetPHY54 112 142
    DBLP 509 8865 8685
  •   

    Algorithm 3 Framework of greedy algorithm for IM-CA problem

    Require:$G=(V,~~E),~~\{p_{uv}\}_{(u,~v)\in~E},~\{\tau_u\}_{u\in~V},~~k,~~\theta$;

    Output:Seed set $S$;

    Set $S=\emptyset$;

    Generate $\theta$ RR sets for each node $u\in~V$: $\{\mathcal~R_u\}_{u\in~V}$;

    Set ${\rm~req}(u)=\tau_u\theta~$ for each node $u\in~V$;

    for $j=1$ to $k$

    $x=$ SS($G,~\{p_{uv}\}_{(u,v)\in~E},~\{{\rm~req}(u)\}_{u\in~V},~~\{\mathcal~R_u\}_{u\in~V}$);

    $S=S\cup~\{x\}$;

    Remove all RR Sets containing $x$;

    for each $u$ in $V$

    ${\rm~rem}(u)$: the number of RR Sets removed from $\mathcal~R_u$;

    ${\rm~req}(u)={\rm~req}(u)-{\rm~rem}(u)$;

    end for

    end for

    return $S$.

Copyright 2020 Science China Press Co., Ltd. 《中国科学》杂志社有限责任公司 版权所有

京ICP备18024590号-1