logo

SCIENCE CHINA Information Sciences, Volume 63, Issue 3: 132101(2020) https://doi.org/10.1007/s11432-019-9900-9

Identifying the vulnerabilities of bitcoin anonymous mechanism based on address clustering

More info
  • ReceivedApr 26, 2019
  • AcceptedMay 21, 2019
  • PublishedFeb 11, 2020

Abstract

The anonymity mechanism of bitcoin is favored by the society, which promotes its usage and development. An adversary should not be able to discover the relation between bitcoin addresses and bitcoin users to ensure effective privacy. However, the relation among bitcoin transactions can be used to analyze the bitcoin privacy information, which seriously jeopardizes the bitcoin anonymity. Herein, we describe the vulnerabilities associated with the anonymity mechanism of bitcoin, including the relation among bitcoin addresses and the relation among bitcoin users. Further, we demonstrate that the existing methods do not guarantee the comprehensiveness, accuracy, and efficiency of the analysis results. We propose a heuristic clustering method to judge the relation among bitcoin addresses and employ the Louvain method to discover the relation among bitcoin users. Subsequently, we construct an address-associated database of historical transactions and implement real-time updates. Extensive experiments are used to demonstrate the comprehensiveness, accuracy, and efficiency of the proposed scheme. Specifically, the proposed scheme reveals the privacy vulnerability associated with the blockchain technology. We expect that our scheme can be applied to improve the blockchain technology.


Acknowledgment

This work was supported by National Cryptography Development Fund (GrantNo.MMJJ20180412).


References

[1] Nakamoto S. Bitcoin: a peer-to-peer electronic cash system. Technical report, 2008. https://bitcoin.org/bitcoin.pdf. Google Scholar

[2] Androulaki E, Karame G O, Roeschlin M, et al. Evaluating user privacy in Bitcoin. In: Proceedings of International Conference on Financial Cryptography and Data Security, Berlin, 2013. 34--51. Google Scholar

[3] Liao K, Zhao Z M, Doupe A, et al. Behind closed doors: measurement and analysis of cryptolocker ransoms in Bitcoin. In: Proceedings of Symposium on Electronic Crime Research, Toronto, 2016. Google Scholar

[4] Meiklejohn S, Pomarole M, Jordan G, et al. A fistful of bitcoins: characterizing payments among men with no names. In: Proceedings of Conference on Internet Measurement Conference, New York, 2013. 127--140. Google Scholar

[5] Kakadiaris I A, Kumar A, Scheirer W J, et al. Identifying Bitcoin users by transaction behavior. In: Proceedings of Biometric and Surveillance Technology for Human and Activity Identification XII, Baltimore, 2015. 945704. Google Scholar

[6] Reid F, Harrigan M. An analysis of anonymity in the Bitcoin system. In: Proceedings of IEEE International Conference on Social Computing (SocialCom), Boston, 2011. 1318--1326. Google Scholar

[7] Ron D, Shamir A. Quantitative analysis of the full Bitcoin transaction graph. In: Proceedings of Financial Cryptography and Data Security, Okinawa, 2013. 6--24. Google Scholar

[8] Spagnuolo M, Maggi F, Zanero S. BitIodine: extracting intelligence from the Bitcoin network. In: Proceedings of International Conference on Financial Cryptography & Data Security, Christ Church, 2014. 457--468. Google Scholar

[9] Zhao C, Guan Y. A graph-based investigation of Bitcoin transactions. In: Proceedings of IFIP International Conference on Digital Forensics, Orlando, 2015. 79--95. Google Scholar

[10] Zheng B K, Zhu L H, Shen M. Scalable and Privacy-Preserving Data Sharing Based on Blockchain. J Comput Sci Technol, 2018, 33: 557-567 CrossRef Google Scholar

[11] Zheng B K, Zhu L H, Shen M, et al. Malicious Bitcoin transaction tracing using incidence relation clustering. In: Proceedings of International Conference on Mobile Networks & Management, Melbourne, 2017. 313--323. Google Scholar

[12] Gao F, Zhu L, Shen M. A Blockchain-Based Privacy-Preserving Payment Mechanism for Vehicle-to-Grid Networks. IEEE Network, 2018, 32: 184-192 CrossRef Google Scholar

[13] Antonopoulos A M. Mastering Bitcoin: Unlocking Digital Crypto-Currencies. Sebastopol: O'Reilly Media, 2014. Google Scholar

[14] Swan M. Blockchain: Blueprint for a New Economy. Sebastopol: O'Reilly Media, 2015. Google Scholar

[15] Saxena A, Misra J, Dhar A. Increasing anonymity in Bitcoin. In: Proceedings of International Conference on Financial Cryptography & Data Security, Christ Church, 2014. 122--139. Google Scholar

[16] Neilson D, Hara S, Mitchell I. Bitcoin forensics: a tutorial. In: Proceedings of International Conference on Global Security, London, 2017. 12--26. Google Scholar

[17] Li H, Zhu L, Shen M. Blockchain-Based Data Preservation System for Medical Data.. J Med Syst, 2018, 42: 141 CrossRef PubMed Google Scholar

[18] Blondel V D, Guillaume J L, Lambiotte R. Fast unfolding of communities in large networks. J Stat Mech, 2008, 2008(10): P10008 CrossRef ADS arXiv Google Scholar

[19] Lewenberg Y, Bachrach Y, Sompolinsky Y, et al. Bitcoin mining pools: a cooperative game theoretic analysis. In: Proceedings of International Conference on Autonomous Agents and Multiagent Systems, Istanbul, 2015. 919--927. Google Scholar

[20] Gach O, Hao J K. Improving the Louvain algorithm for community detection with modularity maximization. In: Proceedings of Artificial Evolution, Bordeaux, 2013. 145--156. Google Scholar

[21] Park H S, Jun C H. A simple and fast algorithm for K-medoids clustering. Expert Syst Appl, 2009, 36: 3336-3341 CrossRef Google Scholar

[22] Xiaojiang Du , Hsiao-Hwa Chen . Security in wireless sensor networks. IEEE Wireless Commun, 2008, 15: 60-66 CrossRef Google Scholar

[23] Zhang C, Zhu L, Xu C. PTBI: An efficient privacy-preserving biometric identification based on perturbed term in the cloud. Inf Sci, 2017, 409-410: 56-67 CrossRef Google Scholar

[24] Du X, Xiao Y, Guizani M. An effective key management scheme for heterogeneous sensor networks. Ad Hoc Networks, 2007, 5: 24-34 CrossRef Google Scholar

[25] Fahad A, Alshatri N, Tari Z. A Survey of Clustering Algorithms for Big Data: Taxonomy and Empirical Analysis. IEEE Trans Emerg Top Comput, 2014, 2: 267-279 CrossRef Google Scholar

[26] Zhang C, Zhu L, Xu C. PPDP: An efficient and privacy-preserving disease prediction scheme in cloud-based e-Healthcare system. Future Generation Comput Syst, 2018, 79: 16-25 CrossRef Google Scholar

[27] Guan Z, Si G, Zhang X. Privacy-Preserving and Efficient Aggregation Based on Blockchain for Power Grid Communications in Smart Communities. IEEE Commun Mag, 2018, 56: 82-88 CrossRef Google Scholar

[28] Guan Z, Zhang Y, Zhu L. EFFECT: an efficient flexible privacy-preserving data aggregation scheme with authentication in smart grid. Sci China Inf Sci, 2019, 62: 32103 CrossRef Google Scholar

[29] Shen M, Ma B, Zhu L. Secure Phrase Search for Intelligent Processing of Encrypted Data in Cloud-Based IoT. IEEE Internet Things J, 2019, 6: 1998-2008 CrossRef Google Scholar

[30] Shen M, Ma B, Zhu L. Cloud-Based Approximate Constrained Shortest Distance Queries Over Encrypted Graphs With Privacy Protection. IEEE TransInformForensic Secur, 2018, 13: 940-953 CrossRef Google Scholar

[31] Shen M, Tang X Y, Zhu L H, et al. Privacy-preserving support vector machine training over blockchain-based encrypted iot data in smart cities. IEEE Int Thing J, 2019. Google Scholar

[32] de Meo P, Ferrara E, Fiumara G, et al. Generalized Louvain method for community detection in large networks. In: Proceedings of International Conference on Intelligent Systems Design and Applications, Córdoba, 2011. 88--93. Google Scholar

  • Figure 1

    Bitcoin transaction. (a) Bitcoin transaction diagram; (b) multi-input transaction diagram; (c) coinbase transaction diagram; (d) change address diagram.

  • Figure 2

    (Color online) System model of transaction analysis.

  • Figure 3

    (Color online) Address transaction relationship diagram.

  • Figure 4

    (Color online) The user incidence relation diagram (different colored addresses belonging to different subcommunities).

  • Figure 5

    (Color online) Different data group results.

  • Figure 6

    (Color online) Different iteration results.

  • Figure 7

    (Color online) Different heuristic method results.

  • Figure 8

    (Color online) Runtime comparison between the Louvain and improved Louvain algorithms.

  • Figure 9

    (Color online) Increase of $\triangle~Q$ with the improved Louvain algorithm.

  •   

    Algorithm 1 Scrapy web spider

    Require:User URL seeds $S=\{s_{1},~s_{2},~\ldots,~s_{n}\}$, $k\leq~n$;

    Output:Bitcoin addresses.

    Initialize $S=\{s_{1},~s_{2},~\ldots,~s_{n}\}$;

    for $k~\in~[1,n]$

    Acquire $s_{k}$ page data;

    if page data contains the addresses then

    Store the addresses;

    end if

    Find the associated URL from the user's URL;

    Save URL of page to $S$;

    end for

    return Addresses.

  •   

    Algorithm 2 Confirmation of the bitcoin address incidence relation

    Require:Inquiry addresses;

    Output:Addresses in the same cluster.

    Initialize address set $A=\{a_{1},~a_{2},~\ldots,~a_{n}\}$, $k\leq~n$;

    Initialize transaction set $T=\{t_{1},~t_{2},~\ldots,~t_{n}\}$;

    Acquire data of the address transaction to save to $A$;

    for $k~\in~[1,n]$

    Acquire $s_{k}$ page data;

    if $t$ is coinbase transaction then

    Addresses outputted to save to $A$;ELSIFone of Output$(t)$ belongs to a known mining pool and number of Output$(t)~\geq~100$

    Addresses outputted to save to $A$;

    else

    Addresses inputted to save to $A$;

    Judge change address to save to $A$;

    end if

    end for

    return Addresses.

  •   

    Algorithm 3 Clustering of the historical transaction data

    Require:Historical transaction data;

    Output:Addresses in the same cluster.

    Initialize transaction data set txLsit;

    Initialize a mapping set of address and cluster labels ${\rm~addr}\_{\rm~label}$;

    Initialize temporary address set tempList and temporary label ${\rm~tempLabel}=0$;

    Take the transaction ${\rm~TX}_{n}$ out of txList;

    for $k~\in~[1,n]$ and ${\rm~tempLabel}=0$

    if ${\rm~TX}_{n}$ is coinbase transaction then

    Extract all output addresses from ${\rm~TX}_{n}$ and add them to the collection tempList;

    end if

    for $k~\in~[1,n]$

    Take an undetected address from tempList;

    if the same address can be found in tempList then

    The label corresponding to this address is written to tempLabel;

    else

    The address is written back to tempList;

    end if

    end for

    Write all the addresses in tempList to ${\rm~addr}\_{\rm~label}$ and assign the same label to those addresses;

    ${\rm~addr}\_{\rm~label}$+;

    end for

    return Addresses.

  •   

    Algorithm 4 $G=(V,~E)$ pretreatment

    Require:$G=(V,~E)$ using adjacency list;

    Output:The result of division of $G~(V,~E)$.

    Initialize $G$;

    Calculate the module value $Q_{1},~Q_{0}~=~Q_{1}$;

    $Q_{2}~=~Q_{1}$;

    for $k~\in~[1,N]$

    if $V_{i}$ = 1 then

    Put vertex $V_{i}$ to the community connected with it;

    else

    Take vertex $V_{i}$ out of original community;

    Put $V_{i}$ to community with largest $\Delta~Q$;

    end if

    end for

    Calculate the module value $Q_{1}$;

    if $Q_{2}>Q_{1}$ then

    Go to step 2;

    end if

    Combine the various communities into one hyperdot.

  •   

    Algorithm 5 $G=(V,~E)$ coarseness inverse operation optimization

    Require:$G=(V,~E)$;

    Output:The result of division of $G~(V,~E)$.

    Initialize $G$;

    Check node $V_{i}$ of every community connecting outside;

    Check node $V_{j}$ with highest degree in every community;

    Adopt K-medoids algorithm;

    Calculate distance value $D_{ij}$ of every $V_{i}$ and every $V_{j}$;

    Calculate minimum value of $D_{ij}$;

    $V_{i}$ save to community of $V_{j}$.

Copyright 2020 Science China Press Co., Ltd. 《中国科学》杂志社有限责任公司 版权所有

京ICP备18024590号-1