References
[1]
Mutlu
O,
Ghose
S,
Gómez-Luna
J.
Processing data where it makes sense: Enabling in-memory computation.
Microprocessors MicroSyst,
2019, 67: 28-41
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=Processing data where it makes sense: Enabling in-memory computation&author=Mutlu O&author=Ghose S&author=Gómez-Luna J&publication_year=2019&journal=Microprocessors MicroSyst&volume=67&pages=28-41
[2]
Mutlu O, Ghose S, Gómez-Luna J, et al. Enabling practical processing in and near memory for data-intensive computing. In: Proceedings of the 56th Annual Design Automation Conference 2019, Las Vegas, 2019. 1--4.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Mutlu O, Ghose S, Gómez-Luna J, et al. Enabling practical processing in and near memory for data-intensive computing. In: Proceedings of the 56th Annual Design Automation Conference 2019, Las Vegas, 2019. 1--4&
[3]
Singh G, Gómez-Luna J, Mariani G, et al. NAPEL: Near-memory computing application performance prediction via ensemble learning. In: Proceedings of the 56th ACM/IEEE Design Automation Conference (DAC), Las Vegas, 2019. 1--6.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Singh G, Gómez-Luna J, Mariani G, et al. NAPEL: Near-memory computing application performance prediction via ensemble learning. In: Proceedings of the 56th ACM/IEEE Design Automation Conference (DAC), Las Vegas, 2019. 1--6&
[4]
Boroumand A, Ghose S, Patel M, et al. CoNDA: efficient cache coherence support for near-data accelerators. In: Proceedings of the 46th International Symposium on Computer Architecture, Phoenix Arizona, 2019. 629--642.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Boroumand A, Ghose S, Patel M, et al. CoNDA: efficient cache coherence support for near-data accelerators. In: Proceedings of the 46th International Symposium on Computer Architecture, Phoenix Arizona, 2019. 629--642&
[5]
Ghose
S,
Boroumand
A,
Kim
J S.
Processing-in-memory: A workload-driven perspective.
IBM J Res Dev,
2019, 63: 3:1-3:19
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=Processing-in-memory: A workload-driven perspective&author=Ghose S&author=Boroumand A&author=Kim J S&publication_year=2019&journal=IBM J Res Dev&volume=63&pages=3:1-3:19
[6]
Song L, Qian X, Li H, et al. Pipelayer: a pipelined reram-based accelerator for deep learning. In: Proceedings of 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA). Austin: IEEE, 2017. 541--552.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Song L, Qian X, Li H, et al. Pipelayer: a pipelined reram-based accelerator for deep learning. In: Proceedings of 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA). Austin: IEEE, 2017. 541--552&
[7]
Farmahini-Farahani A, Ahn J H, Morrow K, et al. NDA: near-DRAM acceleration architecture leveraging commodity DRAM devices and standard memory modules. In: Proceedings of 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA). Burlingame: IEEE, 2015. 283--295.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Farmahini-Farahani A, Ahn J H, Morrow K, et al. NDA: near-DRAM acceleration architecture leveraging commodity DRAM devices and standard memory modules. In: Proceedings of 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA). Burlingame: IEEE, 2015. 283--295&
[8]
Springer R, Lowenthal K D, Rountree B, et al. Minimizing execution time in MPI programs on an energy-constrained, power-scalable cluster. In: Proceedings of the 11th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. New York: ACM, 2006. 230--238.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Springer R, Lowenthal K D, Rountree B, et al. Minimizing execution time in MPI programs on an energy-constrained, power-scalable cluster. In: Proceedings of the 11th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. New York: ACM, 2006. 230--238&
[9]
Xiao
P,
Han
N.
A novel power-conscious scheduling algorithm for data-intensive precedence-constrained applications in cloud environments.
IJHPCN,
2014, 7: 299-306
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=A novel power-conscious scheduling algorithm for data-intensive precedence-constrained applications in cloud environments&author=Xiao P&author=Han N&publication_year=2014&journal=IJHPCN&volume=7&pages=299-306
[10]
Patki T, Lowenthal K D, Rountree B, et al. Exploring hardware overprovisioning in power-constrained, high performance computing. In: Proceedings of the 27th international ACM conference on International conference on supercomputing. Eugene Oregon: ACM, 2013. 173--182.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Patki T, Lowenthal K D, Rountree B, et al. Exploring hardware overprovisioning in power-constrained, high performance computing. In: Proceedings of the 27th international ACM conference on International conference on supercomputing. Eugene Oregon: ACM, 2013. 173--182&
[11]
Pugsley
S H,
Jestes
J,
Balasubramonian
R.
Comparing Implementations of Near-Data Computing with In-Memory MapReduce Workloads.
IEEE Micro,
2014, 34: 44-52
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=Comparing Implementations of Near-Data Computing with In-Memory MapReduce Workloads&author=Pugsley S H&author=Jestes J&author=Balasubramonian R&publication_year=2014&journal=IEEE Micro&volume=34&pages=44-52
[12]
Chi
P,
Li
S,
Xu
C.
PRIME.
SIGARCH Comput Archit News,
2016, 44: 27-39
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=PRIME&author=Chi P&author=Li S&author=Xu C&publication_year=2016&journal=SIGARCH Comput Archit News&volume=44&pages=27-39
[13]
Ahn J, Yoo S, Mutlu O, et al. PIM-enabled instructions: a low-overhead, locality-aware processing-in-memory architecture. In: Proceedings of 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA). Portland: IEEE, 2015. 336--348.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Ahn J, Yoo S, Mutlu O, et al. PIM-enabled instructions: a low-overhead, locality-aware processing-in-memory architecture. In: Proceedings of 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA). Portland: IEEE, 2015. 336--348&
[14]
Gao M, Kozyrakis C. HRL: efficient and flexible reconfigurable logic for near-data processing. In: Proceedings of 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA). Barcelona: IEEE, 2016. 126--137.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Gao M, Kozyrakis C. HRL: efficient and flexible reconfigurable logic for near-data processing. In: Proceedings of 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA). Barcelona: IEEE, 2016. 126--137&
[15]
Zhang D, Jayasena N, Lyashevsky A, et al. TOP-PIM: throughput-oriented programmable processing in memory. In: Proceedings of the 23rd International Symposium on High-performance Parallel and Distributed Computing. Vancouver: ACM, 2014. 85--98.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Zhang D, Jayasena N, Lyashevsky A, et al. TOP-PIM: throughput-oriented programmable processing in memory. In: Proceedings of the 23rd International Symposium on High-performance Parallel and Distributed Computing. Vancouver: ACM, 2014. 85--98&
[16]
Hsieh K, Ebrahimi E, Kim G, et al. Transparent offloading and mapping (TOM): enabling programmer-transparent near-data processing in GPU systems. In: Proceedings of ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), 2016. 44: 204--216.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Hsieh K, Ebrahimi E, Kim G, et al. Transparent offloading and mapping (TOM): enabling programmer-transparent near-data processing in GPU systems. In: Proceedings of ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), 2016. 44: 204--216&
[17]
Xu C, Niu D, Muralimanohar N, et al. Understanding the trade-offs in multi-level cell ReRAM memory design. In: Proceedings of 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC). Ausin: IEEE, 2013. 1--6.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Xu C, Niu D, Muralimanohar N, et al. Understanding the trade-offs in multi-level cell ReRAM memory design. In: Proceedings of 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC). Ausin: IEEE, 2013. 1--6&
[18]
Gokhale
M,
Holmes
B,
Iobst
K.
Processing in memory: the Terasys massively parallel PIM array.
Computer,
1995, 28: 23-31
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=Processing in memory: the Terasys massively parallel PIM array&author=Gokhale M&author=Holmes B&author=Iobst K&publication_year=1995&journal=Computer&volume=28&pages=23-31
[19]
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521(7553): 436-444.
Google Scholar
http://scholar.google.com/scholar_lookup?title=LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521(7553): 436-444&
[20]
Goodfellow I, Bengio Y, Courville A. Deep learning. Cambridge: MIT Press, 2016.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Goodfellow I, Bengio Y, Courville A. Deep learning. Cambridge: MIT Press, 2016&
[21]
Soomro T R. Google Translation service issues: Religious text perspective. Journal of Global Research in Computer Science, 2013, 4(8): 40-43.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Soomro T R. Google Translation service issues: Religious text perspective. Journal of Global Research in Computer Science, 2013, 4(8): 40-43&
[22]
Vazquez-Calvo B, Zhang L T, Pascual M, et al. Fan translation of games, anime, and fanfiction. Language, Learning and Technology, 2019, 23(1):49-71 DOI: 10.125/446722019.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Vazquez-Calvo B, Zhang L T, Pascual M, et al. Fan translation of games, anime, and fanfiction. Language, Learning and Technology, 2019, 23(1):49-71 DOI: 10.125/446722019&
[23]
Li
C,
Qouneh
A,
Li
T.
iSwitch.
SIGARCH Comput Archit News,
2012, 40: 512-523
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=iSwitch&author=Li C&author=Qouneh A&author=Li T&publication_year=2012&journal=SIGARCH Comput Archit News&volume=40&pages=512-523
[24]
Li C, Zhou R, Li T. Enabling distributed generation powered sustainable high-performance data center. In: Proceedings of 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA). Shenzhen: IEEE, 2013. 35--46.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Li C, Zhou R, Li T. Enabling distributed generation powered sustainable high-performance data center. In: Proceedings of 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA). Shenzhen: IEEE, 2013. 35--46&
[25]
Li C, Zhang W, Cho C, et al. SolarCore: Solar energy driven multi-core architecture power management. In: Proceedings of 2011 IEEE 17th International Symposium on High Performance Computer Architecture. San Antonio: IEEE, 2011. 205--216.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Li C, Zhang W, Cho C, et al. SolarCore: Solar energy driven multi-core architecture power management. In: Proceedings of 2011 IEEE 17th International Symposium on High Performance Computer Architecture. San Antonio: IEEE, 2011. 205--216&
[26]
Hadidi R, Asgari B, Mudassar B A, et al. Demystifying the characteristics of 3D-stacked memories: a case study for Hybrid Memory Cube. In: Proceedings of 2017 IEEE International Symposium on Workload Characterization (IISWC). Seattle: IEEE, 2017. 66--75.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Hadidi R, Asgari B, Mudassar B A, et al. Demystifying the characteristics of 3D-stacked memories: a case study for Hybrid Memory Cube. In: Proceedings of 2017 IEEE International Symposium on Workload Characterization (IISWC). Seattle: IEEE, 2017. 66--75&
[27]
Pei
J,
Deng
L,
Song
S.
Towards artificial general intelligence with hybrid Tianjic chip architecture.
Nature,
2019, 572: 106-111
CrossRef
ADS
Google Scholar
http://scholar.google.com/scholar_lookup?title=Towards artificial general intelligence with hybrid Tianjic chip architecture&author=Pei J&author=Deng L&author=Song S&publication_year=2019&journal=Nature&volume=572&pages=106-111
[28]
Farmahini-Farahani
A,
Ho Ahn
J,
Morrow
K.
DRAMA: An Architecture for Accelerated Processing Near Memory.
IEEE Comput Arch Lett,
2015, 14: 26-29
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=DRAMA: An Architecture for Accelerated Processing Near Memory&author=Farmahini-Farahani A&author=Ho Ahn J&author=Morrow K&publication_year=2015&journal=IEEE Comput Arch Lett&volume=14&pages=26-29
[29]
Nair
R,
Antao
S F,
Bertolli
C.
Active Memory Cube: A processing-in-memory architecture for exascale systems.
IBM J Res Dev,
2015, 59: 17:1-17:14
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=Active Memory Cube: A processing-in-memory architecture for exascale systems&author=Nair R&author=Antao S F&author=Bertolli C&publication_year=2015&journal=IBM J Res Dev&volume=59&pages=17:1-17:14
[30]
Vermij E, Hagleitner C, Fiorin L, et al. An architecture for near-data processing systems. In: Proceedings of the ACM International Conference on Computing Frontiers. Como: ACM, 2016. 357--360.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Vermij E, Hagleitner C, Fiorin L, et al. An architecture for near-data processing systems. In: Proceedings of the ACM International Conference on Computing Frontiers. Como: ACM, 2016. 357--360&
[31]
Liu Z, Calciu I, Herlihy M, et al. Concurrent data structures for near-memory computing. In: Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures. Washington: ACM, 2017. 235--245.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Liu Z, Calciu I, Herlihy M, et al. Concurrent data structures for near-memory computing. In: Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures. Washington: ACM, 2017. 235--245&
[32]
Yazdanbakhsh A, Song C, Sacks J, et al. In-DRAM near-data approximate acceleration for GPUs. In: Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques. Limassol Cyprus: ACM, 2018. 34.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Yazdanbakhsh A, Song C, Sacks J, et al. In-DRAM near-data approximate acceleration for GPUs. In: Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques. Limassol Cyprus: ACM, 2018. 34&
[33]
Wang Y, Han Y, Zhang L, et al. ProPRAM: exploiting the transparent logic resources in non-volatile memory for near data computing. In: Proceedings of the 52nd Annual Design Automation Conference. San Francisco: ACM, 2015. 47.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Wang Y, Han Y, Zhang L, et al. ProPRAM: exploiting the transparent logic resources in non-volatile memory for near data computing. In: Proceedings of the 52nd Annual Design Automation Conference. San Francisco: ACM, 2015. 47&
[34]
Lee J H, Sim J, Kim H. SSync: Processing near memory for machine learning workloads with bounded staleness consistency models. In: Proceedings of 2015 International Conference on Parallel Architecture and Compilation (PACT), San Francisco: IEEE, 2015. 241--252.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Lee J H, Sim J, Kim H. SSync: Processing near memory for machine learning workloads with bounded staleness consistency models. In: Proceedings of 2015 International Conference on Parallel Architecture and Compilation (PACT), San Francisco: IEEE, 2015. 241--252&
[35]
Kim D, Kung J, Chai S, et al. Neurocube: a programmable digital neuromorphic architecture with high-density 3D memory. In: Proceedings of 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), Seoul: IEEE, 2016. 380--392.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Kim D, Kung J, Chai S, et al. Neurocube: a programmable digital neuromorphic architecture with high-density 3D memory. In: Proceedings of 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), Seoul: IEEE, 2016. 380--392&
[36]
Aga S, Jayasena N, Ignatowski M. Co-ML: a case for collaborative ML acceleration using near-data processing. In: Proceedings of the International Symposium on Memory Systems. Alexandria: ACM, 2019. 506--517.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Aga S, Jayasena N, Ignatowski M. Co-ML: a case for collaborative ML acceleration using near-data processing. In: Proceedings of the International Symposium on Memory Systems. Alexandria: ACM, 2019. 506--517&
[37]
Ahn
J,
Hong
S,
Yoo
S.
A scalable processing-in-memory accelerator for parallel graph processing.
SIGARCH Comput Archit News,
2016, 43: 105-117
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=A scalable processing-in-memory accelerator for parallel graph processing&author=Ahn J&author=Hong S&author=Yoo S&publication_year=2016&journal=SIGARCH Comput Archit News&volume=43&pages=105-117
[38]
Nai L, Hadidi R, Sim J, et al. Graphpim: enabling instruction-level pim offloading in graph computing frameworks. In: Proceedings of 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA). Austin: IEEE, 2017. 457--468.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Nai L, Hadidi R, Sim J, et al. Graphpim: enabling instruction-level pim offloading in graph computing frameworks. In: Proceedings of 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA). Austin: IEEE, 2017. 457--468&
[39]
Jang J, Heo J, Lee Y, et al. Charon: specialized near-memory processing architecture for clearing dead objects in memory. In: Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture. Columbus: ACM, 2019. 726--739.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Jang J, Heo J, Lee Y, et al. Charon: specialized near-memory processing architecture for clearing dead objects in memory. In: Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture. Columbus: ACM, 2019. 726--739&
[40]
Shafiee
A,
Nag
A,
Muralimanohar
N.
ISAAC.
SIGARCH Comput Archit News,
2016, 44: 14-26
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=ISAAC&author=Shafiee A&author=Nag A&author=Muralimanohar N&publication_year=2016&journal=SIGARCH Comput Archit News&volume=44&pages=14-26
[41]
Chen
Y H,
Krishna
T,
Emer
J S.
Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks.
IEEE J Solid-State Circuits,
2017, 52: 127-138
CrossRef
ADS
Google Scholar
http://scholar.google.com/scholar_lookup?title=Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks&author=Chen Y H&author=Krishna T&author=Emer J S&publication_year=2017&journal=IEEE J Solid-State Circuits&volume=52&pages=127-138
[42]
Hu M, Strachan J P, Li Z, et al. Dot-product engine for neuromorphic computing: programming 1T1M crossbar to accelerate matrix-vector multiplication. In: Proceedings of the 53rd annual design automation conference. Austin: ACM, 2016. 19.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Hu M, Strachan J P, Li Z, et al. Dot-product engine for neuromorphic computing: programming 1T1M crossbar to accelerate matrix-vector multiplication. In: Proceedings of the 53rd annual design automation conference. Austin: ACM, 2016. 19&
[43]
Cheng M, Xia L, Zhu Z, et al. Time: a training-in-memory architecture for memristor-based deep neural networks. In: Proceedings of the 54th Annual Design Automation Conference. Austin: ACM, 2017. 26.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Cheng M, Xia L, Zhu Z, et al. Time: a training-in-memory architecture for memristor-based deep neural networks. In: Proceedings of the 54th Annual Design Automation Conference. Austin: ACM, 2017. 26&
[44]
Mao H, Song M, Li T, et al. LerGAN: a zero-free, low data movement and PIM-based GAN architecture. In: Proceedings of 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). Fukuoka: IEEE, 2018. 669--681.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Mao H, Song M, Li T, et al. LerGAN: a zero-free, low data movement and PIM-based GAN architecture. In: Proceedings of 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). Fukuoka: IEEE, 2018. 669--681&
[45]
Ambrogio
S,
Narayanan
P,
Tsai
H.
Equivalent-accuracy accelerated neural-network training using analogue memory.
Nature,
2018, 558: 60-67
CrossRef
ADS
Google Scholar
http://scholar.google.com/scholar_lookup?title=Equivalent-accuracy accelerated neural-network training using analogue memory&author=Ambrogio S&author=Narayanan P&author=Tsai H&publication_year=2018&journal=Nature&volume=558&pages=60-67
[46]
Feinberg B, Vengalam U K R, Whitehair N, et al. Enabling scientific computing on memristive accelerators. In: Proceedings of 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA). Los Angeles: IEEE, 2018. 367--382.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Feinberg B, Vengalam U K R, Whitehair N, et al. Enabling scientific computing on memristive accelerators. In: Proceedings of 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA). Los Angeles: IEEE, 2018. 367--382&
[47]
Song L, Zhuo Y, Qian X, et al. GraphR: accelerating graph processing using ReRAM. In: Proceedings of 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA). Vienna: IEEE, 2018. 531--543.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Song L, Zhuo Y, Qian X, et al. GraphR: accelerating graph processing using ReRAM. In: Proceedings of 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA). Vienna: IEEE, 2018. 531--543&
[48]
Li S, Xu C, Zou Q, et al. Pinatubo: a processing-in-memory architecture for bulk bitwise operations in emerging non-volatile memories. In: Proceedings of the 53rd Annual Design Automation Conference. Austin: ACM, 2016. 173.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Li S, Xu C, Zou Q, et al. Pinatubo: a processing-in-memory architecture for bulk bitwise operations in emerging non-volatile memories. In: Proceedings of the 53rd Annual Design Automation Conference. Austin: ACM, 2016. 173&
[49]
Xie L, Nguyen H A D, Yu J, et al. Scouting logic: a novel memristor-based logic design for resistive computing. In: Proceedings of 2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI). Bochum: IEEE, 2017. 176--181.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Xie L, Nguyen H A D, Yu J, et al. Scouting logic: a novel memristor-based logic design for resistive computing. In: Proceedings of 2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI). Bochum: IEEE, 2017. 176--181&
[50]
Abu Lebdeh
M,
Abunahla
H,
Mohammad
B.
An Efficient Heterogeneous Memristive xnor for In-Memory Computing.
IEEE Trans Circuits Syst I,
2017, 64: 2427-2437
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=An Efficient Heterogeneous Memristive xnor for In-Memory Computing&author=Abu Lebdeh M&author=Abunahla H&author=Mohammad B&publication_year=2017&journal=IEEE Trans Circuits Syst I&volume=64&pages=2427-2437
[51]
Imani M, Kim Y, Rosing T. Mpim: multi-purpose in-memory processing using configurable resistive memory. In: Proceedings of 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC). Makuhari Messe: IEEE, 2017. 757--763.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Imani M, Kim Y, Rosing T. Mpim: multi-purpose in-memory processing using configurable resistive memory. In: Proceedings of 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC). Makuhari Messe: IEEE, 2017. 757--763&
[52]
Sim J, Kim M, Kim Y, et al. MAPIM: mat parallelism for high performance processing in non-volatile memory architecture. In: Proceedings of the 20th International Symposium on Quality Electronic Design (ISQED). Santa Clara: IEEE, 2019. 145--150.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Sim J, Kim M, Kim Y, et al. MAPIM: mat parallelism for high performance processing in non-volatile memory architecture. In: Proceedings of the 20th International Symposium on Quality Electronic Design (ISQED). Santa Clara: IEEE, 2019. 145--150&
[53]
Imani
M,
Peroni
D,
Rosing
T.
Nvalt: Nonvolatile Approximate Lookup Table for GPU Acceleration.
IEEE Embedded Syst Lett,
2018, 10: 14-17
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=Nvalt: Nonvolatile Approximate Lookup Table for GPU Acceleration&author=Imani M&author=Peroni D&author=Rosing T&publication_year=2018&journal=IEEE Embedded Syst Lett&volume=10&pages=14-17
[54]
Imani M, Gupta S, Arredondo A, et al. Efficient query processing in crossbar memory. In: Proceedings of 2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED). Taipei: IEEE, 2017. 1--6.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Imani M, Gupta S, Arredondo A, et al. Efficient query processing in crossbar memory. In: Proceedings of 2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED). Taipei: IEEE, 2017. 1--6&
[55]
Yantir
H E,
Eltawil
A M,
Kurdahi
F J.
Approximate Memristive In-memory Computing.
ACM Trans Embed Comput Syst,
2017, 16: 1-18
CrossRef
Google Scholar
http://scholar.google.com/scholar_lookup?title=Approximate Memristive In-memory Computing&author=Yantir H E&author=Eltawil A M&author=Kurdahi F J&publication_year=2017&journal=ACM Trans Embed Comput Syst&volume=16&pages=1-18
[56]
Sun Y, Wang Y, Yang H. Energy-efficient SQL query exploiting RRAM-based process-in-memory structure. In: Proceedings of 2017 IEEE 6th Non-Volatile Memory Systems and Applications Symposium (NVMSA). Hsinchu: IEEE, 2017. 1--6.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Sun Y, Wang Y, Yang H. Energy-efficient SQL query exploiting RRAM-based process-in-memory structure. In: Proceedings of 2017 IEEE 6th Non-Volatile Memory Systems and Applications Symposium (NVMSA). Hsinchu: IEEE, 2017. 1--6&
[57]
Kim H, Kim H, Yalamanchili S, et al. Understanding energy aspects of processing-near-memory for HPC workloads. In: Proceedings of the 2015 International Symposium on Memory Systems. Washington: ACM, 2015. 276--282.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Kim H, Kim H, Yalamanchili S, et al. Understanding energy aspects of processing-near-memory for HPC workloads. In: Proceedings of the 2015 International Symposium on Memory Systems. Washington: ACM, 2015. 276--282&
[58]
Mao H, Zhang X, Sun G, et al. Protect non-volatile memory from wear-out attack based on timing difference of row buffer hit/miss. In: Proceedings of Design, Automation & Test in Europe Conference & Exhibition (DATE). Lausanne: IEEE, 2017. 1623--1626.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Mao H, Zhang X, Sun G, et al. Protect non-volatile memory from wear-out attack based on timing difference of row buffer hit/miss. In: Proceedings of Design, Automation & Test in Europe Conference & Exhibition (DATE). Lausanne: IEEE, 2017. 1623--1626&
[59]
Qureshi M K, Karidis J, Franceschini M, et al. Enhancing lifetime and security of PCM-based main memory with start-gap wear leveling. In: Proceedings of 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). New York: IEEE, 2009. 14--23.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Qureshi M K, Karidis J, Franceschini M, et al. Enhancing lifetime and security of PCM-based main memory with start-gap wear leveling. In: Proceedings of 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). New York: IEEE, 2009. 14--23&
[60]
Cai Y, Lin Y, Xia L, et al. Long live time: improving lifetime for training-in-memory engines by structured gradient sparsification. In: Proceedings of 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC). San Francisco: IEEE, 2018. 1--6.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Cai Y, Lin Y, Xia L, et al. Long live time: improving lifetime for training-in-memory engines by structured gradient sparsification. In: Proceedings of 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC). San Francisco: IEEE, 2018. 1--6&
[61]
Xia
L,
Huangfu
W,
Tang
T.
Stuck-at Fault Tolerance in RRAM Computing Systems.
IEEE J Emerg Sel Top Circuits Syst,
2018, 8: 102-115
CrossRef
ADS
Google Scholar
http://scholar.google.com/scholar_lookup?title=Stuck-at Fault Tolerance in RRAM Computing Systems&author=Xia L&author=Huangfu W&author=Tang T&publication_year=2018&journal=IEEE J Emerg Sel Top Circuits Syst&volume=8&pages=102-115
[62]
Xia L, Liu M, Ning X, et al. Fault-tolerant training with on-line fault detection for RRAM-based neural computing systems. In: Proceedings of the 54th Annual Design Automation Conference 2017. Austin: ACM, 2017. 1--6.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Xia L, Liu M, Ning X, et al. Fault-tolerant training with on-line fault detection for RRAM-based neural computing systems. In: Proceedings of the 54th Annual Design Automation Conference 2017. Austin: ACM, 2017. 1--6&
[63]
Liu C, Hu M, Strachan J P, et al. Rescuing memristor-based neuromorphic design with high defects. In: Proceedings of 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC). Austin: IEEE, 2017. 1--6.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Liu C, Hu M, Strachan J P, et al. Rescuing memristor-based neuromorphic design with high defects. In: Proceedings of 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC). Austin: IEEE, 2017. 1--6&
[64]
Ni L, Wang Y, Yu H, et al. An energy-efficient matrix multiplication accelerator by distributed in-memory computing on binary RRAM crossbar. In: Proceedings of 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC). Macao: IEEE, 2016. 280--285.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Ni L, Wang Y, Yu H, et al. An energy-efficient matrix multiplication accelerator by distributed in-memory computing on binary RRAM crossbar. In: Proceedings of 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC). Macao: IEEE, 2016. 280--285&
[65]
Wang Y, Tang T, Xia L, et al. Energy efficient RRAM spiking neural network for real time classification. In: Proceedings of the 25th edition on Great Lakes Symposium on VLSI. Pittsburgh: IEEE, 2015. 189--194.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Wang Y, Tang T, Xia L, et al. Energy efficient RRAM spiking neural network for real time classification. In: Proceedings of the 25th edition on Great Lakes Symposium on VLSI. Pittsburgh: IEEE, 2015. 189--194&
[66]
Narayanan S, Shafiee A, Balasubramonian R. INXS: bridging the throughput and energy gap for spiking neural networks. In: Proceedings of 2017 International Joint Conference on Neural Networks (IJCNN). Anchorage: IEEE, 2017. 2451--2459.
Google Scholar
http://scholar.google.com/scholar_lookup?title=Narayanan S, Shafiee A, Balasubramonian R. INXS: bridging the throughput and energy gap for spiking neural networks. In: Proceedings of 2017 International Joint Conference on Neural Networks (IJCNN). Anchorage: IEEE, 2017. 2451--2459&
[67]
Ankit A, Sengupta A, Panda P, et al. Resparc: a reconfigurable and energy-efficient architecture with memristive crossbars for deep spiking neural networks. In: Proceedings of the 54th Annual Design Automation Conference 2017. Austin: ACM, 2017. 1--6.
[68]
Xia L, Tang T, Huangfu W, et al. Switched by input: power efficient structure for RRAM-based convolutional neural network. In: Proceedings of the 53rd Annual Design Automation Conference. Austin: ACM, 2016. 1--6.
[69]
Mao H, Shu J. 3D Memristor Array Based Neural Network Processing in Memory Architecture. Journal of Computer Research and Development, 2019, 56(6): 1149-1160. doi: 10.7544/issn1000-1239.2019.20190099.
[70]
Ji Y, Zhang Y, Xie X, et al. FPSA: a full system stack solution for reconfigurable ReRAM-based NN accelerator architecture. In: Proceedings of the 24th International Conference on Architectural Support for Programming Languages and Operating Systems, Providence, 2019. 733--747.
[71]
Witten I H, Frank E. Data mining. SIGMOD Rec, 2002, 31: 76-77.
[72]
Sharif Razavian A, Azizpour H, Sullivan J, et al. CNN features off-the-shelf: an astounding baseline for recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Columbus: IEEE, 2014. 806--813.
[73]
Manning C D, Surdeanu M, Bauer J, et al. The Stanford CoreNLP natural language processing toolkit. In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations. Baltimore, 2014. 55--66.
[74]
Schmidhuber J. Deep learning in neural networks: An overview. Neural Networks, 2015, 61: 85-117.
[75]
Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets. In: Proceedings of Advances in Neural Information Processing Systems, 2014. 2672--2680.
[76]
Kaelbling L P, Littman M L, Moore A W. Reinforcement Learning: A Survey. J Artif Intell Res, 1996, 4: 237-285.
[77]
Low Y, Gonzalez J E, et al. Graphlab: A new framework for parallel machine learning. arXiv, 2014.
[78]
Low Y, Gonzalez J, Kyrola A, et al. Distributed graphlab: A framework for machine learning in the cloud. arXiv, 2012.
[79]
LeBeane M, Song S, Panda R, et al. Data partitioning strategies for graph workloads on heterogeneous clusters. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. Austin: IEEE, 2015. 1--12.
[80]
Gonzalez J E, Low Y, Gu H, et al. Powergraph: distributed graph-parallel computation on natural graphs. In: Presented as Part of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12), Hollywood, 2012. 17--30.
[81]
Chen R, Shi J, Chen Y, et al. Powerlyra: Differentiated graph computation and partitioning on skewed graphs. ACM Transactions on Parallel Computing (TOPC), 2019, 5(3): 1-39.
[82]
Vigna G, Kemmerer R A. NetSTAT: a network-based intrusion detection approach. In: Proceedings of the 14th Annual Computer Security Applications Conference (Cat. No. 98EX217). Scottsdale: IEEE, 1998. 25--34.
[83]
Agichtein E, Castillo C, Donato D, et al. Finding high-quality content in social media. In: Proceedings of the 2008 International Conference on Web Search and Data Mining. Palo Alto: ACM, 2008. 183--194.
[84]
Page L, Brin S, Motwani R, et al. The Pagerank Citation Ranking: Bringing Order to the Web. Stanford InfoLab, 1999.
[85]
Biemann C. Chinese whispers: an efficient graph clustering algorithm and its application to natural language processing problems. In: Proceedings of the 1st Workshop on Graph Based Methods for Natural Language Processing, 2006. 73--80.
[86]
Chesler E J, Lu L, Shou S. Complex trait analysis of gene expression uncovers polygenic and pleiotropic networks that modulate nervous system function. Nat Genet, 2005, 37: 233-242.
[87]
Linden G, Smith B, York J. Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput, 2003, 7: 76-80.
[88]
Shendure J, Ji H. Next-generation DNA sequencing. Nat Biotechnol, 2008, 26: 1135-1145.
[89]
Shendure J, Balasubramanian S, Church G M. DNA sequencing at 40: past, present and future. Nature, 2017, 550: 345-353.
[90]
Altschul S. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res, 1997, 25: 3389-3402.
[91]
Jha M, Malhotra R, Acharya R. A Generalized Lattice Based Probabilistic Approach for Metagenomic Clustering. IEEE/ACM Trans Comput Biol Bioinf, 2017, 14: 749-761.
[92]
Altschul S F, Gish W, Miller W. Basic local alignment search tool. J Mol Biol, 1990, 215: 403-410.
[93]
Ning Z. SSAHA: A Fast Search Method for Large DNA Databases. Genome Res, 2001, 11: 1725-1729.
[94]
Lancaster J, Buhler J, Chamberlain R D. Acceleration of ungapped extension in Mercury BLAST. Microprocessors MicroSyst, 2009, 33: 281-289.
[95]
Ling C, Benkrid K. Design and implementation of a CUDA-compatible GPU-based core for gapped BLAST algorithm. Procedia Comput Sci, 2010, 1: 495-504.