142 results on '"*PHASE change memory"'
Search Results
2. ML-HW Co-Design of Noise-Robust TinyML Models and Always-On Analog Compute-in-Memory Edge Accelerator.
- Author
- Zhou, Chuteng, Redondo, Fernando Garcia, Buchel, Julian, Boybat, Irem, Comas, Xavier Timoneda, Nandakumar, S. R., Das, Shidhartha, Sebastian, Abu, Le Gallo, Manuel, and Whatmough, Paul N.
- Subjects
- *PHASE change memory, *NONVOLATILE memory, *DATA conversion, *PARTICIPATORY design, *INTERNET of things
- Abstract
Always-on TinyML perception tasks in Internet of Things applications require very high energy efficiency. Analog compute-in-memory (CiM) using nonvolatile memory (NVM) promises high energy efficiency and self-contained on-chip model storage. However, analog CiM introduces new practical challenges, including conductance drift, read/write noise, fixed analog-to-digital (ADC) converter gain, etc. These must be addressed to achieve models that can be deployed on analog CiM with acceptable accuracy loss. This article describes AnalogNets: TinyML models for the popular always-on tasks of keyword spotting (KWS) and visual wake word (VWW). The model architectures are specifically designed for analog CiM, and we detail a comprehensive training methodology, to retain accuracy in the face of analog nonidealities, and low-precision data converters at inference time. We also describe AON-CiM, a programmable, minimal-area phase-change memory (PCM) analog CiM accelerator, with a layer-serial approach to remove the cost of complex interconnects associated with a fully pipelined design. We evaluate the AnalogNets on a calibrated simulator, as well as real hardware, and find that accuracy degradation is limited to 0.8%/1.2% after 24 h of PCM drift (8 bits) for KWS/VWW. AnalogNets running on the 14-nm AON-CiM accelerator demonstrate 8.55/26.55/56.67 and 4.34/12.64/25.2 TOPS/W for KWS and VWWs with 8-/6-/4-bit activations, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
3. Multi-Dimensional Randomized Response.
- Author
- Domingo-Ferrer, Josep and Soria-Comas, Jordi
- Subjects
- *PHASE change memory
- Abstract
In our data world, a host of not necessarily trusted controllers gather data on individual subjects. To preserve her privacy and, more generally, her informational self-determination, the individual has to be empowered by giving her agency on her own data. Maximum agency is afforded by local anonymization, that allows each individual to anonymize her own data before handing them to the data controller. Randomized response (RR) is a local anonymization approach able to yield multi-dimensional full sets of anonymized microdata that are valid for exploratory analysis and machine learning. This is so because an unbiased estimate of the distribution of the true data of individuals can be obtained from their pooled randomized data. Furthermore, RR offers rigorous privacy guarantees. The main weakness of RR is the curse of dimensionality when applied to several attributes: as the number of attributes grows, the accuracy of the estimated true data distribution quickly degrades. We propose several complementary approaches to mitigate the dimensionality problem. First, we present two basic protocols, separate RR on each attribute and joint RR for all attributes, and discuss their limitations. Then we introduce an algorithm to form clusters of attributes so that attributes in different clusters can be viewed as independent and joint RR can be performed within each cluster. After that, we introduce an adjustment algorithm for the randomized data set that repairs some of the accuracy loss due to assuming independence between attributes when using RR separately on each attribute or due to assuming independence between clusters in cluster-wise RR. We also present empirical work to illustrate the proposed methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
4. SWEL-COFAE: Wear Leveling and Adaptive Encoding Assisted Compression of Frequent Words in Non-Volatile Main Memories.
- Author
- Nath, Arijit and Kapoor, Hemangee K.
- Subjects
- *PHASE change memory, *DYNAMIC random access memory, *RANDOM access memory, *ENCODING, *MEMORY
- Abstract
Emerging Non-Volatile memories such as Phase Change Memory (PCM) and Resistive RAM are projected as potential replacements of the traditional DRAM-based main memories. However, limited write endurance and high write energy limit their chances of adoption as a mainstream main memory standard. Therefore, developing solutions that enhance the lifetime of these memories while offering a decent system performance has a great impact in building future large capacity and energy-efficient main memories. In this paper, we propose a word-level compression scheme called COMF to reduce bitflips in PCMs by removing the most repeated words from the cache blocks before writing into memory. COMF is augmented with an adaptive granularity-based encoding technique to form COFAE, which reduces the bitflips to a further extent. We also propose SWEL-COFAE, an intra-line stride-based wear leveling technique to improve lifetime by balancing the bitflip pressure within the cells of the memory lines. Experimental results show that the proposed technique improves lifetime by 101% and reduces bitflips and energy by 59% and 61% respectively over baseline. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
5. Multilayered Sb-Rich GeSbTe Phase-Change Memory for Best Endurance and Reduced Variability.
- Author
- Lama, Giusy, Bernard, Mathieu, Bourgeois, Guillaume, Garrione, Julien, Meli, Valentina, Castellani, Niccolo, Sabbione, Chiara, Prazakova, Lucie, Fernandez Rodas, Diana-Stephany, Nolot, Emmanuel, Cyrille, Marie Claire, Andrieu, Francois, and Navarro, Gabriele
- Subjects
- *PHASE change memory, *PHASE change materials, *REFERENCE sources
- Abstract
Sb-rich GeSbTe-based phase-change memories (PCMs) were studied in the past years for their high switching speed to target storage class memory (SCM) applications. In this work, we show the advantages of an engineered multilayered Sb-rich GeSbTe stack compared with standard bulk reference materials. The studied multilayer-based PCM devices feature a lower programming current with respect to the equivalent bulk ones, preserving a high programming speed. Furthermore, multilayered Sb-rich GeSbTe brings better endurance performances for a wide programming current range and extremely reduced cycle-to-cycle (C2C) and device-to-device (D2D) variability along cycling verified in 4 kb PCM arrays. These results confirm improved yield and reliability obtained, thanks to multilayered PCM solution. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
6. Leveraging Write Heterogeneity of Phase Change Memory on Supporting Self-Balancing Binary Tree.
- Author
- Chang, Che-Wei, Wu, Chun-Feng, Chang, Yuan-Hao, Yang, Ming-Chang, and Chang, Chieh-Fu
- Subjects
- *PHASE change memory, *DYNAMIC random access memory, *NONVOLATILE memory, *HETEROGENEITY, *RANDOM access memory, *ENERGY consumption
- Abstract
With the increasing demand of massive/big data applications, nonvolatile memory (NVM), such as phase-change memory (PCM), has become a promising candidate to replace DRAM because of its low leakage power, nonvolatility, and high density. However, most of the existing memory read/write intensive algorithms and data structures are not aware of the PCM write heterogeneity in terms of both energy consumption and latency. In particular, self-balancing binary search trees, which are widely used to manage massive data in the big-data era, were designed without the consideration of PCM characteristics. Thus, the multiple rotations of the tree balancing process would degrade the memory performance. This work explores the relations among nodes and analyzes tree operations, and the node indexing and address mapping are redesigned to reduce the tree management overhead on single-level cell (SLC) PCM by decreasing the number of bit flips of tree rotations. When multilevel cell (MLC) PCM is included, our address mapping algorithm is developed to reduce the total energy consumption and latency with considerations of the heterogeneous write operations of different cell states. Experimental results show that our solution significantly outperforms the original implementation of a self-balancing binary search tree when the amount of data is large. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
7. An Adaptive Memory-Side Encryption Method for Improving Security and Lifetime of PCM-Based Main Memory.
- Author
- Soltani, Morteza, Kamal, Mehdi, Afzali-Kusha, Ali, and Pedram, Massoud
- Subjects
- *PHASE change memory, *RECURRENT neural networks, *MEMORY, *RANDOM access memory
- Abstract
In this article, we present a main memory system for improving the lifetime and security of phase-change main memories. Storing encrypted data increases the bit-flip rates in memory cells, which adversely affects the lifetime of the phase-change memory cells. Thus, to improve the lifetime and security, the proposed system reduces the bit-flip rates by introducing two techniques. The first technique is a memory-side encryption which provides security against DIMM stealing attacks. To prevent unauthorized accesses, in this technique, the encrypted data are not saved in the main memory. As the second technique, we suggest an adaptive partial encryption approach, which makes use of behavior tracking of the application in the CPU side to minimize the latency overhead of the first technique. Additionally, it prevents the loss of data against application-based attacks. This technique uses a recurrent neural network (RNN) to do sequence classification and detect malicious applications. In addition, an auxiliary method, called periodic encryption (PE), which overcomes the security loss in some applications induced by the low accuracy of the employed neural network, is presented. The efficacy of the proposed method is evaluated using gem5 simulator and some benchmarks. Compared to DEUCE and Crypto-Comp methods, the results for the lifetime evaluation show an average bit-flip rate reduction of 11%. In addition, the security improvements against the DIMM stealing and application-based attacks are about 100% and 92.5%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
8. CEnT: An Efficient Architecture to Eliminate Intra-Array Write Disturbance in PCM.
- Author
- Imran, Muhammad, Kwon, Taehyun, Touba, Nur A., and Yang, Joon-Sung
- Subjects
- *PHASE change memory, *DYNAMIC random access memory, *ENERGY consumption, *RANDOM access memory, *PHASE change materials
- Abstract
Phase Change Memory (PCM), with its better scaling potential compared to DRAM, is seen as a promising candidate to replace or complement DRAM. The heat generated from a RESET programming pulse to a PCM cell can disturb the neighboring cells which are not being programmed. Write disturbance (WD) poses a critical reliability challenge in high-density PCM memory with scaling below 20nm process technology node. Increasing the intra-cell space can eliminate the WD, however, it reduces the storage density which counteracts the benefits of scalability in PCM. At architectural level, a verify and correct (VnC) technique can be used to address this problem. However, this leads to an increased number of write operations, thus degrading performance, energy efficiency and memory lifetime. Due to its dependence on the type of programming operation and the state of the neighboring cell, WD is a data-dependent problem. Exploiting this property, encoding techniques have been proposed to reduce the frequency of WD-vulnerable data patterns. These techniques, however, do not eliminate the WD in an array and ultimately rely on the VnC method to ensure reliable memory operation. This article introduces a novel architecture, based on encoding and multi-level programming characteristics of PCM, to eliminate the intra-array WD in PCM. By eliminating WD and hence the need for a VnC operation, the proposed architecture improves performance, energy efficiency and memory lifetime. Our evaluation of the proposed architecture shows an average reduction of 57 percent in the number of writes (to service one write request) over the existing state-of-the-art intra-array WD-mitigation technique. Depending on the PCM write bandwidth, the proposed architecture can reduce the write service time by up to 27 percent, on average, compared to the existing best-performing technique. This leads to an average improvement of 15 percent in IPC. Additionally, by eliminating the overhead of a verify operation, the write energy efficiency is also improved by 8 percent over the previous art. Finally, with an average reduction of 26 percent in bit flips, the proposed method also improves the memory lifetime. The proposed method is also proven to be effective when considering WD both within the word-lines and across the bit-lines. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
9. Endurance-Limited Memories: Capacity and Codes.
- Author
- Chee, Yeow Meng, Horovitz, Michal, Vardy, Alexander, Vu, Van Khu, and Yaakobi, Eitan
- Subjects
- *PHASE change memory, *NONVOLATILE random-access memory
- Abstract
Resistive memories, such as phase change memories and resistive random access memories have attracted significant attention in recent years due to their better scalability, speed, rewritability, and yet non-volatility. However, their limited endurance is still a major drawback that has to be improved before they can be widely adapted in large-scale systems. In this work, in order to reduce the wear out of the cells, we propose a new coding scheme, called endurance-limited memories (ELM) codes, that increases the endurance of these memories by limiting the number of cell programming operations. Namely, an $\ell$-change $t$-write ELM code is a coding scheme that allows to write $t$ messages into some $n$ binary cells while guaranteeing that each cell is programmed at most $\ell$ times. In case $\ell = 1$, these codes coincide with the well-studied write-once memory (WOM) codes. We study some models of these codes which depend upon whether the encoder knows on each write the number of times each cell was programmed, knows only the memory state, or even does not know anything. For the decoder, we consider these similar three cases. We fully characterize the capacity regions and the maximum sum-rates of three models where the encoder knows on each write the number of times each cell was programmed. In particular, it is shown that in these models the maximum sum-rate is $\log \sum_{i=0}^{\ell} \binom{t}{i}$. We also study and expose the capacity regions of the models where the decoder is informed with the number of times each cell was programmed. Finally we present the most practical model where the encoder read the memory before encoding new data and the decoder has no information about the previous states of the memory. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
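Note on record 9: the maximum sum-rate quoted in the abstract, $\log \sum_{i=0}^{\ell} \binom{t}{i}$, can be evaluated directly. The sketch below is illustrative only. The function name `elm_max_sum_rate` is hypothetical, and a base-2 logarithm (bits per cell) is assumed because the abstract does not state the base.

```python
from math import comb, log2


def elm_max_sum_rate(t: int, ell: int) -> float:
    """Evaluate log2(sum_{i=0}^{ell} C(t, i)) for t writes and at most ell
    programming operations per cell. Base 2 is an assumption."""
    if t < 0 or ell < 0:
        raise ValueError("t and ell must be non-negative")
    # math.comb(t, i) is 0 when i > t, so ell > t needs no special case.
    return log2(sum(comb(t, i) for i in range(ell + 1)))


if __name__ == "__main__":
    for t, ell in [(3, 1), (4, 2), (4, 4)]:
        print(f"t={t}, ell={ell}: {elm_max_sum_rate(t, ell):.4f}")
```

With $\ell = 1$ the expression reduces to $\log(t + 1)$, and once $\ell \ge t$ the sum equals $2^{t}$, so the value is $t$ and the endurance limit no longer restricts the rate.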
10. CMOS-Compatible Low-Power Gated Diode Synaptic Device for Hardware-Based Neural Network.
- Author
- Park, Min-Kyu, Yoo, Ho-Nam, Hwang, Joon, Woo, Sung Yun, Kwon, Dongseok, Seo, Young-Tak, Lee, Jong-Ho, and Bae, Jong-Ho
- Subjects
- *PHASE change memory, *DIODES, *INFERIOR colliculus
- Abstract
A gated diode with a charge trap insulator stack (Al2O3/Si3N4/SiO2) is proposed as a synaptic device and its potentiation and depression operations have been demonstrated. Using the band-to-band tunneling current, the gated diode operates with low current (in nanoampere range) and is suitable for low-power hardware-based neural networks. Since the proposed device has merits on simple and compact structure (half of a MOSFET) and compatibility with conventional CMOS technology, integration with CMOS peripheral circuits including neuron circuits and driving IC is possible. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
11. A Drift-Resilient Hardware Implementation of Neural Accelerators Based on Phase Change Memory Devices.
- Author
- Munoz-Martin, Irene, Bianchi, Stefano, Melnic, Octavian, Bonfanti, Andrea Giovanni, and Ielmini, Daniele
- Subjects
- *PHASE change memory, *COMPUTER storage devices, *ARTIFICIAL neural networks, *DEEP learning, *BIOLOGICAL neural networks, *PHASE change materials
- Abstract
Memory devices, such as the phase change memory (PCM), have recently shown significant breakthroughs in terms of compactness, 3-D stacking capability, and speed up for deep learning neural accelerators. However, PCM is affected by the conductance drift, which prevents a precise definition of the synaptic weights in artificial neural networks. Here, we propose an efficient system-level methodology to develop drift-resilient multilayer perceptron (MLP) networks. The procedure guarantees high testing accuracy under conductance drift of the devices and enables the use of only positive weights. We validate the methodology using MNIST, rand-MNIST, and Fashion-MNIST datasets, thus offering a roadmap for the implementation of integrated nonvolatile memory-based neural networks. We finally analyze the proposed architecture in terms of throughput and energy efficiency. This work highlights the relevance of robust PCM-based design of neural networks for improving the computational capability and optimizing energetic efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
12. Thermoelectric Effects on Amorphization Process of Blade-Type Phase Change Random Access Memory.
- Author
- Lian, Xiaojuan, Fu, Jinke, Gao, Zhixuan, and Wang, Lei
- Subjects
- *PHASE change memory, *THERMOELECTRIC effects, *THERMOELECTRIC apparatus & appliances, *AMORPHIZATION, *RANDOM access memory, *THERMOELECTRIC generators, *EBULLITION
- Abstract
Commercial prospect of phase-change random access memory (PCRAM), considered as an encouraging option for future “universal” memory, is however challenged by its large programming current that severely impairs device scalability. The recent advent of the blade-type PCRAM configuration can effectively address this issue by shrinking the heated region. However, the thermoelectric (TE) effects that play a key role in write performances of conventional Lance-type PCRAM have not been systematically studied for this blade-type case to date. The influence of TE effects on the amorphization process of the blade-type PCRAM cell is therefore investigated here through the establishment of a comprehensive 3-D electrothermal and phase-transformation model. Two main typical TE effects, i.e., Thomson heat inside the Ge2Sb2Te5(GST) bulk region and Peltier heat at the GST-TiN heater interface, have been mimicked and discussed in detail. Simulation results show that both TE effects depend on the polarities of the programming currents, and the calculated Thomson heat is more than 20 times as much as Peltier heat. Accordingly, a novel skutterudite-based heater that exhibits more pronounced TE effects than TiN heater has been proposed. The practicality of reducing programming current by 13.8% and peak power by 17.9% is demonstrated using the improved version of the PCRAM cell. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
13. Fully On-Chip MAC at 14 nm Enabled by Accurate Row-Wise Programming of PCM-Based Weights and Parallel Vector-Transport in Duration-Format.
- Author
- Narayanan, P., Ambrogio, S., Okazaki, A., Hosokawa, K., Tsai, H., Nomura, A., Yasuda, T., Mackin, C., Lewis, S. C., Friz, A., Ishii, M., Kohda, Y., Mori, H., Spoon, K., Khaddam-Aljameh, R., Saulnier, N., Bergendahl, M., Demarest, J., Brew, K. W., and Chan, V.
- Subjects
- *PHASE change memory, *DEEP learning, *ARTIFICIAL intelligence, *PHASE change materials
- Abstract
Hardware acceleration of deep learning using analog non-volatile memory (NVM) requires large arrays with high device yield, high accuracy Multiply-ACcumulate (MAC) operations, and routing frameworks for implementing arbitrary deep neural network (DNN) topologies. In this article, we present a 14-nm test-chip for Analog AI inference—it contains multiple arrays of phase change memory (PCM)-devices, each array capable of storing 512 × 512 unique DNN weights and executing massively parallel MAC operations at the location of the data. DNN excitations are transported across the chip using a duration representation on a parallel and reconfigurable 2-D mesh. To accurately transfer inference models to the chip, we describe a closed-loop tuning (CLT) algorithm that programs the four PCM conductances in each weight, achieving <3% average weight-error. A row-wise programming scheme and associated circuitry allow us to execute CLT on up to 512 weights concurrently. We show that the test chip can achieve near-software-equivalent accuracy on two different DNNs. We demonstrate tile-to-tile transport with a fully-on-chip two-layer network for MNIST (accuracy degradation ~0.6%) and show resilience to error propagation across long sequences (up to 10 000 characters) with a recurrent long short-term memory (LSTM) network, implementing off-chip activation and vector-vector operations to generate recurrent inputs used in the next on-chip MAC. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
14. PCM-Based Analog Compute-In-Memory: Impact of Device Non-Idealities on Inference Accuracy.
- Author
- Sun, X., Khwa, W. S., Chen, Y. S., Lee, C. H., Lee, H. Y., Yu, S. M., Naous, R., Wu, J. Y., Chen, T. C., Bao, X., Chang, M. F., Diaz, C. H., Wong, H.-S. P., and Akarvardar, K.
- Subjects
- *PHASE change memory, *PHASE change materials
- Abstract
The impact of phase change memory (PCM) device non-idealities on the deep neural network (DNN) inference accuracy is systematically investigated. Based on the experimental PCM data, statistical models of device non-idealities were extracted and incorporated into our PyTorch-based simulation framework for evaluations on the CIFAR-10 dataset. Our specific results include: 1) nonlinear $I$–$V$ could incur a significant accuracy degradation, but it can be eliminated depending on how the input activations are encoded (e.g., no degradation with pulse-encoding schemes); 2) resistance variation and read noise induce a relatively mild accuracy degradation (< 1% with experimentally fit model), which can be further mitigated through variation-aware training (VAT); 3) maximizing accuracy over a given operating temperature range is attained through a “temperature-specific weight remapping” method developed in this work, accuracy variance of < 3% is demonstrated over a temperature range of $T \pm 15\,^{\circ}\text{C}$; and 4) resistance drift leads to a significant accuracy degradation over time and is the most challenging non-ideality to address by algorithmic means alone (drift coefficient < 0.015 is needed to achieve < 3% degradation in ten years). A “weight transfusion” (WT) method has been proposed to effectively recover the inference accuracy by incrementally activating additional pre-trained neurons over time. The main overhead is the additional area to store pre-trained weights beforehand, which is likely affordable given the high density of MLC PCM. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
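Note on record 14: the drift coefficient quoted in the abstract can be made concrete with the empirical power law commonly used for PCM conductance drift, $G(t) = G_0\,(t/t_0)^{-\nu}$. This is a rough sketch, not the authors' model. The power law itself, the reference time $t_0 = 1$ s, and the function name `drifted_conductance` are assumptions that do not appear in the record.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def drifted_conductance(g0: float, t: float, nu: float, t0: float = 1.0) -> float:
    """Conductance after time t (seconds) under the assumed power law
    G(t) = g0 * (t / t0) ** (-nu), where nu is the drift coefficient."""
    if t <= 0 or t0 <= 0:
        raise ValueError("t and t0 must be positive")
    return g0 * (t / t0) ** (-nu)


if __name__ == "__main__":
    ten_years = 10 * SECONDS_PER_YEAR
    # Fraction of the initial conductance remaining after ten years,
    # for the coefficient quoted in the abstract and two larger values.
    for nu in (0.015, 0.03, 0.06):
        remaining = drifted_conductance(1.0, ten_years, nu)
        print(f"nu={nu}: {remaining:.3f} of initial conductance")
```

Under these assumptions, $\nu = 0.015$ leaves roughly three quarters of the initial conductance after ten years. The abstract's < 3% figure refers to inference accuracy, not to conductance, so the two numbers are not directly comparable.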
15. A CASTLE With TOWERs for Reliable, Secure Phase-Change Memory.
- Author
- Longofono, Stephen, Kline, Donald, Melhem, Rami, and Jones, Alex K.
- Subjects
- *PHASE change memory, *CASTLES, *DATA encryption, *ENERGY density, *CLOUD computing
- Abstract
The use of hardware encryption and new memory technologies such as phase change memory (PCM) are gaining popularity in a variety of server applications such as cloud systems. While PCM provides energy and density advantages over conventional DRAM memory, it faces endurance challenges. Such challenges are exacerbated when employing memory encryption as the stored data is essentially randomized, losing data similarity and reducing or eliminating the effectiveness of energy and endurance oriented encoding techniques. This results in increasing dynamic energy consumption and accelerated wear out. In this article, we propose CASTLE, a technique for in-memory encryption to leverage this encryption process to improve reliability in the presence of endurance faults. We also propose TOWERs for CASTLE that improve reliability as well as energy for encrypted data through a novel application of compression and encoding. CASTLE and TOWERs are compatible with error-correction codes (ECC) and error correction pointers (ECP), the standard for mitigating endurance faults in PCM. When combining CASTLE and TOWERS, we achieve an average lifetime improvement of over 45× compared to SECDED ECC, 7.1× compared to SECRET, and 3.6× compared to the leading partition-and-flip fault-tolerance approach (AEGIS) for the same area overhead. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
16. Reliability Enhanced Heterogeneous Phase Change Memory Architecture for Performance and Energy Efficiency.
- Author
- Kwon, Taehyun, Imran, Muhammad, and Yang, Joon-Sung
- Subjects
- *PHASE change memory, *ENERGY consumption, *FLASH memory, *VIDEO coding, *PHASE change materials
- Abstract
Next-generation memories have been actively researched to replace the existing memories like DRAM and flash in deep sub-micron process technology. Unlike the conventional charge-based memories, next-generation memories utilize the resistive properties of different materials to store and read a data. Among the next-generation memories, Phase Change Memory (PCM) is seen as a good choice for future memory systems, given its good read performance, process compatibility and scaling potential. To enhance the storage density, multi-level cell (MLC) operation is seemed promising which can store more than one bit in each PCM cell. However, MLC operation significantly degrades the reliability of PCM, thus requiring a strong Error Correction Code (ECC) to guarantee correct memory operation. The use of heavyweight ECC comes at cost of significant degradations in storage density, performance and energy efficiency. In this article, we propose a heterogeneous PCM architecture which uses both multi-level cell and single-level cell (SLC) together for a single word line. With highly-reliable SLC cells, the overall array reliability is enhanced. To improve the reliability further, a dynamic self-encoding/decoding scheme is performed before the data is written to the PCM cells. The dynamic scheme automatically determines the locations of MLC and SLC cells and sets the corresponding resistance levels to be programmed. Since the proposed encoding/decoding scheme does not require any additional stages or storages for encoding and decoding, the overhead is negligible. The improved reliability allows to use lighter ECC scheme which in turn helps to improve performance and energy efficiency of the MLC PCM. The experimental results show that the reliability is improved by approximately $10^{6}$ times compared to the conventional 4LC and more than $10^{3}$ times compared to the existing encoding methods. The performance improvement is 21.5 percent over the conventional 4LC and is more than 4.1 percent higher than the prior encoding techniques. The proposed method is 30.3 percent more energy efficient than the conventional 4LC and this is similar or higher than other energy efficiency improvement methods. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
17. Noise-Resilient DNN: Tolerating Noise in PCM-Based AI Accelerators via Noise-Aware Training.
- Author
- Kariyappa, Sanjay, Tsai, Hsinyu, Spoon, Katie, Ambrogio, Stefano, Narayanan, Pritish, Mackin, Charles, Chen, An, Qureshi, Moinuddin, and Burr, Geoffrey W.
- Subjects
- *PHASE change memory, *ARTIFICIAL intelligence, *NOISE, *PHASE change materials, *RECURRENT neural networks
- Abstract
Phase change memory (PCM)-based “Analog-AI” accelerators are gaining importance for inference in edge applications due to the energy efficiency offered by in-memory computing. Nevertheless, noise sources inherent to PCM devices cause inaccuracies in the deep neural network (DNN) weight values. Such inaccuracies can lead to severe degradation in model accuracy. To address this, we propose two techniques to improve noise resiliency of DNNs: 1) drift regularization (DR) and 2) multiplicative noise training (MNT). We evaluate convolutional networks trained on image classification and recurrent neural networks trained on language modeling and show that our techniques improve model accuracy by up to 12% over one month. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
18. Table of Contents.
- Subjects
- *DEEP learning, *WIDE gap semiconductors, *SUPERVISED learning, *INTEGRATED circuit design, *SEMICONDUCTOR manufacturing, *PHASE change memory
- Published
- 2023
- Full Text
- View/download PDF
19. Radiation Effects in Advanced and Emerging Nonvolatile Memories.
- Author
- Marinella, Matthew J.
- Subjects
- *PHASE change memory, *NONVOLATILE memory, *NONVOLATILE random-access memory, *FLASH memory, *RANDOM access memory
- Abstract
Despite hitting major roadblocks in 2-D scaling, NAND flash continues to scale in the vertical direction and dominate the commercial nonvolatile memory market. However, several emerging nonvolatile technologies are under development by major commercial foundries or are already in small volume production, motivated by storage-class memory and embedded application drivers. These include spin-transfer torque magnetic random access memory (STT-MRAM), resistive random access memory (ReRAM), phase change random access memory (PCRAM), and conductive bridge random access memory (CBRAM). Emerging memories have improved resilience to radiation effects compared to flash, which is based on storing charge, and hence may offer an expanded selection from which radiation-tolerant system designers can choose from in the future. This review discusses the material and device physics, fabrication, operational principles, and commercial status of scaled 2-D flash, 3-D flash, and emerging memory technologies. Radiation effects relevant to each of these memories are described, including the physics of and errors caused by total ionizing dose, displacement damage, and single-event effects, with an eye toward the future role of emerging technologies in radiation environments. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
20. Modeling and Simulations of the Integrated Device of Phase Change Memory and Ovonic Threshold Switch Selector With a Confined Structure.
- Author
- Chen, Ziqi, Tong, Hao, Cai, Wang, Wang, Lun, and Miao, Xiangshui
- Subjects
- *PHASE change memory, *PHASE change materials, *SIMULATION methods & models
- Abstract
We present a finite-element model for the confined-structure device integrating a phase change memory (PCM) and an ovonic threshold switch (OTS) selector. In this model, the threshold switching (TS) characteristics of the PCM and OTS were described by an embedded numerical model to simulate the operation of the integrated device. Both the SET and RESET processes have been well implemented in the integrated device by simulating. The electronic properties of the integrated device with various OTS material parameters have been investigated by simulating. Based on the simulated results, a moderate set-pulse has been obtained by optimizing only the OTS conductivity at a high-conductivity state. Further simulations for multilevel storage have been carried out in the integrated device based on the optimized OTS. The results indicate the confined-structure device with a larger length–diameter ratio will result in a more flexible operation window for multilevel storage. Particularly, when the length–diameter ratio of the confined-structure is 2:1 in the integrated device, five levels of device resistance could be obtained in the simulations of multilevel storage by applying multiple set-pulse or reset-pulse. This could guide further studies on the multilevel storage. [ABSTRACT FROM AUTHOR]
- Published
- 2021
21. Influence of Cu Doping in Si–Te-Based Chalcogenide Glasses and Thin Films: Electrical Switching, Morphological and Raman Studies.
- Author
-
Roy, Diptoshi, Tanujit, B., Jagannatha, K. B., Asokan, S., and Das, Chandasree
- Subjects
- *
CHALCOGENIDE glass , *THIN films , *PHASE change memory , *THRESHOLD voltage , *RAMAN lasers - Abstract
To understand the electrical switching behavior of the Si15Te85−xCux (1 ≤ x ≤ 10) series, I–V characterization has been performed on bulk as well as amorphous thin films of the as-prepared samples. Both the bulk glasses and amorphous thin films are found to manifest memory-type switching behavior. The threshold voltages of thin-film devices are found to be much lower than the bulk counterparts and hence could find application for phase change memory (PCM). The composition analyses of both have divulged the existence of intermediate phase (IP) in the composition range of 2 ≤ x ≤ 6. To examine the probability of the given glass for PCM application, Set–Reset studies have been performed on the glasses with a triangular pulse of 6 mA for set operation and rectangular pulse of 12 mA for the reset operation. The study has revealed a continuous repetition of a few Set–Reset cycles by the Si–Te–Cu series. Raman studies carried out on the bulk glasses report the occurrence of blue shift over the composition in a regular manner. Further, SEM studies have been carried out on Si–Te–Cu samples to understand the morphological changes that would have occurred during switching. Additionally, thickness dependence of threshold voltage of representative Si15Te80Cu5 and Si15Te76Cu9 glasses has been carried out to reveal the relationship between threshold voltage and thickness. [ABSTRACT FROM AUTHOR]
- Published
- 2021
22. Self-Referenced Single-Ended Resistance Monitoring Write Termination Scheme for STT-RAM Write Energy Reduction.
- Author
-
Choi, Sara, Ahn, Hong Keun, Song, Byungkyu, Kang, Seung H., and Jung, Seong-Ook
- Subjects
- *
PHASE change memory , *RANDOM access memory , *DETECTOR circuits - Abstract
Essential design requirements for a sense amplifier (SA) used in the resistance monitoring write termination (RM-WT) scheme are suggested to reduce the write energy of spin-transfer-torque random access memory (STT-RAM) while achieving a write pass yield comparable to that of a conventional write operation. In addition, a self-referenced single-ended RM-WT (SS-RM-WT) scheme is proposed. To reduce the offset voltage, a single-ended sensing circuit (SE-SC) is used in the SA. A data-aware input voltage-transfer method is also adopted in the SE-SC to maximize the input voltage difference. By adopting a capacitor between the output of the SE-SC and the input of an inverter generating a logical output used for the write termination, the conflict between maintaining and changing the output of the SE-SC is resolved. The simulation results using the industry-compatible 65-nm technology HSPICE model parameters show that the proposed SS-RM-WT scheme achieves a 44% write energy saving on average without increasing the write error rate. Area overhead is only 11.8% for a 256-kb STT-RAM array, whereas that of the previous self-referenced RM-WT schemes is up to 42.5%. [ABSTRACT FROM AUTHOR]
- Published
- 2021
23. Pattern-Aware Encoding for MLC PCM Storage Density, Energy Efficiency, and Performance Enhancement.
- Author
-
Kwon, Taehyun, Imran, Muhammad, and Yang, Joon-Sung
- Subjects
- *
PHASE change memory , *ENERGY consumption , *DYNAMIC random access memory , *SOFT errors , *ERROR correction (Information theory) , *ENCODING , *ERROR rates - Abstract
With the scaling limitations and increasing leakage power of the existing charge-based memories, next-generation memory technologies to overcome the issues are in development. Among the various emerging memories, phase change memory (PCM) is considered as a promising candidate due to its scalability potential and negligible leakage power. For enhanced storage density, the multilevel cell (MLC) operation has been proposed for PCM. This, however, comes at the cost of poor reliability, write energy increase and performance degradation. Unlike DRAM, the MLC PCM has a much higher soft error rate due to the resistance drift phenomenon. Error correction code (ECC) schemes can be utilized to improve the MLC PCM reliability; however, this would lead to a lower storage density and an increase in write energy and latency. The iterative programming required for the MLC PCM also degrades its energy efficiency and performance. This paper introduces a simple yet effective encoding scheme to mitigate the problems of the MLC PCM. By using a simple XOR-based encoding, the proposed architecture minimizes the most drift-prone state in the data. The method divides the original data into several encoding blocks and analyzes initial pattern frequencies for each 2-bit pattern. Based on the initial pattern frequencies, the inputs for the XOR encoding are selected that result in minimal frequency of the drift-prone state. This considerably enhances the MLC PCM reliability, leading to a high storage density with a reduced ECC overhead. The energy efficiency and performance are also improved due to reduction in iterative current pulses and ECC overhead. The simulation results show a reduction of about $10^{5}\times$ in soft error rate. The improvements in energy efficiency and performance over the conventional 4-level cell (4LC) PCM are 11.5% and 31.9%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2020
24. Global Clean Page First Replacement and Index-Aware Multistream Prefetcher in Hybrid Memory Architecture.
- Author
-
Lin, Ing-Chao, Chang, Da-Wei, Chen, Wei-Jun, Ke, Jian-Ting, and Huang, Po-Han
- Subjects
- *
DYNAMIC random access memory , *PHASE change memory , *NONVOLATILE memory - Abstract
As cloud computing and big data applications become more popular, the demand for large capacity memory and data preservation in memory increases. Therefore, nonvolatile memory (NVM) with high capacity is being actively developed. A hybrid memory that comprises both NVM and DRAM and provides both high access speed and nonvolatility has become a major trend. However, compared to DRAM, NVM in the hybrid memory typically suffers from a shorter lifetime and higher latency. To improve the lifetime and address the latency issues associated with hybrid memory, we propose a global clean page first replacement (GCPF) to reduce the write operations to NVM. We also propose an index-aware multistream prefetcher (IAMSP) that considers the indexes of prefetch candidates individually so as to prefetch pages from NVM more accurately. Benchmarks with a large memory footprint are used to evaluate the proposed schemes. The experimental results show that GCPF enhances lifetime by 56.8% as compared to LRU, on average. When applying prefetching schemes on GCPF, the lifetime is insignificantly degraded. In addition, IAMSP reduces DRAM misses by 42.0% as compared to LRU, while a modern prefetcher that can change the prefetch degree dynamically only reduces DRAM misses by 38.0%, on average. When applying both GCPF and IAMSP, the average access latency can be reduced by 28.8% as compared to LRU. [ABSTRACT FROM AUTHOR]
- Published
- 2020
25. RowHammer: A Retrospective.
- Author
-
Mutlu, Onur and Kim, Jeremie S.
- Subjects
- *
DYNAMIC random access memory , *PHASE change memory , *FLASH memory , *SCIENTIFIC literature - Abstract
This retrospective paper describes the RowHammer problem in dynamic random access memory (DRAM), which was initially introduced by Kim et al. at the ISCA 2014 Conference. RowHammer is a prime (and perhaps the first) example of how a circuit-level failure mechanism can cause a practical and widespread system security vulnerability. It is the phenomenon that repeatedly accessing a row in a modern DRAM chip causes bit flips in physically adjacent rows at consistently predictable bit locations. RowHammer is caused by a hardware failure mechanism called DRAM disturbance errors, which is a manifestation of circuit-level cell-to-cell interference in a scaled memory technology. Researchers from Google Project Zero demonstrated in 2015 that this hardware failure mechanism can be effectively exploited by user-level programs to gain kernel privileges on real systems. Many other follow-up works demonstrated other practical attacks exploiting RowHammer. In this paper, we comprehensively survey the scientific literature on RowHammer-based attacks as well as mitigation techniques to prevent RowHammer. We also discuss what other related vulnerabilities may be lurking in DRAM and other types of memories, e.g., NAND flash memory or phase change memory, that can potentially threaten the foundations of secure systems, as the memory technologies scale to higher densities. We conclude by describing and advocating a principled approach to memory reliability and security research that can enable us to better anticipate and prevent such vulnerabilities. [ABSTRACT FROM AUTHOR]
- Published
- 2020
26. Signal Integrity Design and Analysis of 3-D X-Point Memory Considering Crosstalk and IR Drop for Higher Performance Computing.
- Author
-
Son, Kyungjune, Cho, Kyungjun, Kim, Subin, Park, Shinyoung, Jung, Daniel H., Park, Junyong, Park, Gapyeol, Kim, Seongguk, Shin, Taein, Kim, Youngwoo, and Kim, Joungho
- Subjects
- *
HIGH performance computing , *CROSSTALK , *INFRARED absorption , *PHASE change memory , *ANALOG-to-digital converters , *MEMORY - Abstract
In this article, we, for the first time, used signal integrity (SI) to design and analyze 3-D X-Point memory, including a phase-change memory (PCM) cell, ovonic threshold switch (OTS) selector, interconnection lines, and peripheral circuits. With the narrow space and the long interconnection lines that come with 20-nm process technology, crosstalk and IR drop can degrade the voltage margin of the memory cell and affect the memory operation. For SI analysis considering crosstalk and IR drop, the unit size of the memory array tile was considered in designing the interconnection lines. Crosstalk and IR drop are analyzed using full 3-D electromagnetic and circuit simulations. To cover practical conditions, the PCM cell and OTS selector are modeled as behavior models using Verilog-A modules, respectively. Also, the word lines (WLs) and bit lines (BLs) of 3-D X-Point memory are modeled to resistance and capacitance by ANSYS Q3D extractor. The core peripheral circuits, such as decoder, sense amplifier, and analog-to-digital converter, are included in the circuit simulation. To verify the proposed design and analysis, a transient simulation was conducted considering crosstalk and IR drop of 3-D X-Point memory. A tradeoff relationship between crosstalk and IR drop in the interconnection designs was verified. Additionally, to suppress crosstalk and reduce IR drop, the new design of the interconnection lines considering the tradeoff between SI issues is proposed. The newly proposed interconnection design shows 30% improvement in the voltage margin considering the IR drop issues and under 10% enhancement of crosstalk noise. It is expected that the SI analysis and design methodologies could be widely applied in other new memory developments. [ABSTRACT FROM AUTHOR]
- Published
- 2020
27. Interfacial Resistance Characterization for Blade-Type Phase Change Random Access Memory.
- Author
-
Wen, Jing and Wang, Lei
- Subjects
- *
PHASE change memory , *INTERFACIAL resistance , *INTERFACIAL bonding , *RANDOM access memory - Abstract
The blade-type phase-change random access memory (PCRAM) has recently attained considerable interest due to its potential for providing low programming current, while its interfacial resistance (IR) characteristics that play an important role in temperature and programming current for conventional PCRAMs is yet to be deeply studied. To achieve this, a completely 3-D electro-thermal and phase-transformation model with inclusions of thermal boundary resistance (TBR) and electrical IR (EIR) at different layered interfaces were developed to assess the influence of the IRs on phase-transformation kinetics. It was found that the TBR at the chalcogenide/insulation interface as well as interfacial size dominates the resulting programming current that is almost independent of the TBR and EIR at chalcogenide/heater interface. In this case, an optimized blade-type device having platinum silicide heater and superlattice insulated encapsulation was proposed, allowing for a 38% reduction on “SET” current and a 40% reduction on “RESET” current, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2020
28. A Compact Phase Change Memory Model With Dynamic State Variables.
- Author
-
Hu, Huifang, Liu, Dayong, Chen, Xuhui, Dong, Deqi, Cui, Xiaole, Liu, Ming, Lin, Xinnan, Zhang, Lining, and Chan, Mansun
- Subjects
- *
PHASE change memory , *PHASE change materials , *DYNAMIC models - Abstract
A SPICE model for phase change memories (PCM) without relying on macro modules is developed in this work. The crystal fraction, physical geometry, and the conduction path of the amorphous region are treated as dynamic state variables to keep track of the memory cell status during SET and RESET. The memory cell resistance is calculated based on a detailed 3-D resistance model to capture its transitional behavior during switching. The detailed physical formulation correctly reproduced a recent observation of oscillation during the SET operation. The model has been implemented in SPICE, and the convergence of the model is demonstrated by simulations of a complete PCM array. The use of dynamic state variables also significantly reduces the number of internal nodes to one, which helps convergence and reduces the simulation time. [ABSTRACT FROM AUTHOR]
- Published
- 2020
29. Integration and Boost of a Read-Modify-Write Module in Phase Change Memory System.
- Author
-
Lee, Hyokeun, Kim, Moonsoo, Kim, Hyunchul, Kim, Hyun, and Lee, Hyuk-Jae
- Subjects
- *
PHASE change memory , *DYNAMIC random access memory , *FLASH memory , *COMPUTER storage devices , *RANDOM access memory , *PHASE change materials - Abstract
Phase-change memory (PCM) is a non-volatile memory device with favorable characteristics such as persistence, byte-addressability, and lower latency when compared to flash memory. However, it comprises memory cells that have limited lifetime and higher access latency than DRAM. The row buffer size of a PCM is preferred to be larger than 128B to fill the latency gap between two memories and to reduce the metadata overhead incurred by wear leveling. As the cache line size in a general-purpose processor is 64B, a read-modify-write (RMW) module is required to be placed between the processor and the PCM, which in turn induces a performance degradation. To reduce such an overhead and enhance the reliability of a device, this paper presents a new RMW architecture. The proposed model introduces a DRAM cache in the RMW module, which minimizes redundant read operations for write operations by pre-fetching the entire transaction unit instead of merely caching the 64B requested data. Furthermore, a typeless merge operation is performed with the proposed cache by gathering multiple commands accessing consecutive addresses, irrespective of whether they are READ or WRITE. Simulation results indicate that the proposed method enhances the speed by 3.2 times and the reliability by 49 percent as compared to the baseline model. [ABSTRACT FROM AUTHOR]
- Published
- 2019
30. DC-PCM: Mitigating PCM Write Disturbance with Low Performance Overhead by Using Detection Cells.
- Author
-
Choi, Jungwhan, Jang, Jaemin, and Kim, Lee-Sup
- Subjects
- *
PHASE change memory , *PULSE-code modulation , *RANDOM access memory , *CELLS , *PHASE change materials , *DYNAMIC random access memory - Abstract
As DRAM scaling becomes ever more difficult, Phase Change Memory (PCM) is attracting attention as a new memory or storage class memory. Unfortunately, PCM cell data can be changed by frequently writing ‘0’ to adjacent cells. This phenomenon is called Write Disturbance (WD). To mitigate WD errors with low performance overhead, we propose a Detection Cell PCM (DC-PCM). In the DC-PCM, additional cells called Detection Cells (DC) are allocated to a memory-line to pre-detect WD errors. For pre-detection, we propose schemes that give DCs higher WD-vulnerability than normal cells. However, additional time is needed to verify DCs. To hide the time needed to perform the verifications during a WRITE, DC-PCM enables the local word-lines of DCs to operate independently (Decoupled Word-line), and verifies different directions in parallel (Parallel DC-Verification). After verification, the DC-PCM increases the WD-vulnerability of the DCs, or restores the memory-line data (DC-Correction). In our simulation, DC-PCMs showed performance comparable to a WD-free PCM for all workloads. [ABSTRACT FROM AUTHOR]
- Published
- 2019
31. Next-Generation Ultrahigh-Density 3-D Vertical Resistive Switching Memory (VRSM)—Part II: Design Guidelines for Device, Array, and Architecture.
- Author
-
Jiang, Zizhen, Qin, Shengjun, Li, Haitong, Fujii, Shosuke, Lee, Dongjin, Wong, Simon, and Wong, H.-S. Philip
- Subjects
- *
NONVOLATILE random-access memory , *ARCHITECTURE , *PHASE change memory , *EXERCISE tolerance , *NEXT generation networks - Abstract
Using the reduced resistor network developed in Part I of this two-part article, we present practical design guidelines from device to architecture levels to achieve ultrahigh-density 3-D vertical resistive switching memory (VRSM). We first design both hexagon and comb arrays using 7-nm FinFET as pillar driving transistors (pillar drivers). Small-footprint pillar drivers are necessary for a high pillar areal density competitive to 3-D NAND. We then organize the arrays into an architecture using the compact staircase and highly conductive wordplane connection (WPC) to maximize array efficiency and chip density. We investigate the memory and selector requirements, tolerance of parasitic resistances, latency, and energy consumption for both hexagon and comb architectures. The results indicate that the hexagon array with large low-resistance state (LRS) and nonlinearity (NL) is required for ultradense 3-D VRSM. Compared to the comb array, the hexagon array benefits from a continuous WP pattern and yields a better tolerance of parasitic resistances and a smaller latency. The energy consumptions of both architectures are similar. Compared to the most advanced 3-D NAND, 3-D VRSM has higher chip density and shows better potential for future ultradense storage. [ABSTRACT FROM AUTHOR]
- Published
- 2019
32. A Work Efficient Parallel Algorithm for Exact Euclidean Distance Transform.
- Author
-
Manduhu, Manduhu and Jones, Mark W.
- Subjects
- *
EUCLIDEAN algorithm , *PARALLEL programming , *PARALLEL algorithms , *EUCLIDEAN distance , *PHASE change memory - Abstract
A fully-parallelized work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image with the size of $n\times n$. Unlike existing PRAM (Parallel Random Access Machine) and other algorithms, this algorithm is suitable for implementation on modern SIMD (Single Instruction Multiple Data) architectures such as GPUs. As a fundamental operation of 2D EDT, 1D EDT is efficiently parallelized first. Specifically, the GPU algorithm for the 1D EDT, which uses CUDA (Compute Unified Device Architecture) binary functions, such as ballot(), ffs(), clz(), and shfl(), runs in $O(\log_{32} n)$ time and performs $O(n)$ work. Using the 1D EDT as a fundamental operation, the fully-parallelized work-time optimal 2D EDT algorithm is designed. This algorithm consists of three steps. Step 1 of the algorithm runs in $O(\log_{32} n)$ time and performs $O(N)$ ($N = n^{2}$) of total work on GPU. Step 2 performs $O(N)$ of total work and has an expected time complexity of $O(\log n)$ on GPU. Step 3 runs in $O(\log_{32} n)$ time and performs $O(N)$ of total work on GPU. As far as we know, this algorithm is the first fully-parallelized and realized work-time optimal algorithm for GPUs. The experimental results show that this algorithm outperforms the prior state-of-the-art GPU algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2019
33. Differential Spin Hall Effect-Based Nonvolatile Static Random Access Memory for Energy-Efficient and Fast Data Restoration Application.
- Author
-
Shreya, Sonal and Kaushik, Brajesh Kumar
- Subjects
- *
NONVOLATILE random-access memory , *STATIC random access memory , *SPIN Hall effect , *PHASE change memory , *MAGNETIC tunnelling , *HALL effect , *OPERATIONS research - Abstract
Spintronics, being a burgeoning area of research, aims to incorporate magnetic tunnel junction (MTJ), as a basic storage building block, to various electronic applications. In continuation of many device structure using MTJ such as spin-transfer torque (STT)-MTJ, spin Hall effect (SHE)-MTJ, spin–orbit torque (SOT)-MTJ, domain wall-based MTJ, and complementary polarizer MTJ, this paper presents a differential spin Hall effect (DSHE). The working and operational analysis of the DSHE-based memory element is presented. It provides 50% improved write energy and more than 1.5 times faster read as compared to a single-ended SHE-MTJ. The device structure is well suited for various differential circuit applications, for example, nonvolatile static random access memory (NVSRAM), nonvolatile flip flop (NVFF), magnetic full adder, and nonvolatile differential sense latch, with a fast and energy-efficient operation. In addition, DSHE-MTJ application for NVSRAM (named as DSNVM) is proposed. Performance of DSNVM is compared with the STT+SHE-based NVSRAM (named as SHENVM). DSNVM shows improved performance in terms of area overhead, restoration delay, and energy. DSNVM provides 40% faster restoration and 16.7% lesser energy as compared to SHENVM. Furthermore, a computational investigation for cell stability is depicted using butterfly curve and N-curve methods. Usually, write noise margins deteriorate in NVSRAMs due to the constituent NV cell. However, read as well as write noise margin is improved in DSNVM. [ABSTRACT FROM AUTHOR]
- Published
- 2019
34. Improving the Lifetime of Non-Volatile Cache by Write Restriction.
- Author
-
Agarwal, Sukarn and Kapoor, Hemangee K.
- Subjects
- *
CACHE memory , *PHASE change memory - Abstract
The attractive features such as low static power and high density exhibited by the Non-Volatile Memory (NVM) technologies makes them a promising candidate in the memory hierarchy, including caches. However, the limited write endurance with the write variations governed by the access patterns and the applied replacement policies reduce the chance of NVMs as a successor of SRAM. These write variations are of concern as they not only breakdown the NVM cells but also reduce the effective lifetime. This paper proposes efficient techniques to mitigate the intra-set write variation to improve the lifetime of the NVM cache. Our first two techniques partition the cache into windows of equal size and distribute the writes uniformly across the cache set by employing the window as write-restricted or read-only. The selection of the window in these techniques is by rotation or with the help of counters. In our third technique, different cache ways are employed as a write-restricted over the period of execution to distribute the writes uniformly. Experimental results using full system simulation show the significant reduction in intra-set write variation along with improvement in the cache lifetime. [ABSTRACT FROM AUTHOR]
- Published
- 2019
35. Energy Bandpass Filtering in Superlattice Phase Change Memories.
- Author
-
Bahl, Jyotsna, Priyadarshi, Pankaj, and Muralidharan, Bhaskaran
- Subjects
- *
PHASE change memory , *BANDPASS filters , *SUPERLATTICES , *GERMANIUM telluride , *GREEN'S functions , *RESONANT states , *PHASE change materials - Abstract
We propose energy bandpass filtering employed using the idea of antireflection heterostructures as means to reduce the energy requirements of a superlattice phase change memory (PCM) based on germanium telluride (GeTe) and Sb2Te3 heterostructures. Different configurations of GeTe/Sb2Te3 superlattices are studied using the nonequilibrium Green’s function approach. Our electronic transport simulations calculate the coupling parameter for the high-resistance covalent state, to 97% that of the stable low-resistance resonant state, maintaining the ON/OFF ratio of 100 for a reliable read operation. By examining various configurations of the superlattice structures, we conclude that the inclusion of antireflection units on both sides of the superlattice increases the overall ON/OFF ratio by an order of magnitude which can further help in scaling down of the memory device. It is also observed that the device with such antireflection units exhibits 32% lesser RESET voltage than the most common PCM superlattice configurations. Moreover, we also find that the ON/OFF ratio in these devices is also resilient to the variations in the periodicity of the superlattice. [ABSTRACT FROM AUTHOR]
- Published
- 2019
36. Sparse-Insertion Write Cache to Mitigate Write Disturbance Errors in Phase Change Memory.
- Author
-
Jang, Jaemin, Shin, Wongyu, Choi, Jungwhan, Kim, Yongju, and Kim, Lee-Sup
- Subjects
- *
PHASE change memory , *DYNAMIC random access memory , *COMPUTER systems - Abstract
As the number of datasets processed in computing systems has increased in recent years, there is growing demand for high capacity main memory subsystems. However, further increases in the capacity of conventional DRAM-based main memory systems have stalled due to scaling limitations. Recent studies have shown that PCM, which can provide greater capacity than DRAM, is emerging as a candidate for high capacity memory. However, PCM suffers from problems related to the thermal mechanisms employed for storing data. The Write Disturbance (WD) phenomenon occurs when the thermal mechanisms of the PCM severely damage the data reliability of proximate cells. WD in PCM has become more significant below 20 nm. In this paper, we propose Sparse-Insertion Write Cache (SIWC), a practical, low-cost approach to mitigate WD errors. In PCM, repeated writes gradually degrade the validity of data in neighboring cells. SIWC uses a private write cache for the PCM write data to prevent repeated writes to the same address. The sparse-insertion technique can reduce cache eviction and minimize increases in the total write count. Our experimental results show that SIWC effectively reduces repeated writes and reduces the number of WD-vulnerable addresses across a wide range of applications. [ABSTRACT FROM AUTHOR]
- Published
- 2019
37. Exploring Cycle-to-Cycle and Device-to-Device Variation Tolerance in MLC Storage-Based Neural Network Training.
- Author
-
Lee, Jung-Hoon, Lim, Dong-Hyeok, Jeong, Hongsik, Ma, Huimin, and Shi, Luping
- Subjects
- *
MULTILAYER perceptrons , *PHASE change memory , *ARTIFICIAL neural networks , *RANDOM access memory - Abstract
A multilevel cell (MLC) memristor that provides high-density on-chip memory has become a promising solution for energy-efficient artificial neural networks (ANNs). However, MLC storage that stores multiple bits per cell is prone to device variation. In this paper, the device variation tolerance of ANN training is investigated based on our cell-specific variation modeling method, which focuses on characterizing realistic cell-level variation. The parameters of cycle-to-cycle variation (CCV) and device-to-device variation (DDV) are extracted separately from the experimental data of a 39-nm, 1-Gb phase-change random access memory (PCRAM) array. A quantized neural network designed for low bit-width (≥6-bit) training is used for simulations to demonstrate the potential of MLC storage. Our results demonstrate that training is more vulnerable to DDV than CCV, and CCV can even compensate for accuracy degradation caused by severe DDV. As a result, for a multilayer perceptron (MLP) on Modified National Institute of Standards and Technology (MNIST) database, 95% accuracy can be achieved with three MLC PCRAM devices per weight, which is a 40% reduction in the number of cells compared with using conventional single-level cells (SLCs). If the size of DDV is reduced by half, then only two cells, that is 60% fewer cells than using SLC, are needed. [ABSTRACT FROM AUTHOR]
- Published
- 2019
38. Adaptive Quantization as a Device-Algorithm Co-Design Approach to Improve the Performance of In-Memory Unsupervised Learning With SNNs.
- Author
-
Shi, Yuhan, Huang, Zhisheng, Oh, Sangheon, Kaslan, Nathan, Song, Jungwoo, and Kuzum, Duygu
- Subjects
- *
NONVOLATILE memory , *PHASE change memory , *ARTIFICIAL neural networks , *PULSE modulation , *ALGORITHMS - Abstract
Off-chip memory access is the primary bottleneck toward accelerating neural network operations and reducing energy consumption. In-memory training and computation using emerging nonvolatile memories (eNVMs) have been proposed to address this problem. However, a small number of conductance states limit in-memory online learning performance. Here, we introduce a device-algorithm co-design approach and its application to phase change memory (PCM) for improving learning accuracy. We present an adaptive quantization method, which compensates the accuracy loss due to limited conductance levels and enables high-accuracy unsupervised learning with low-precision eNVM devices. We develop a spiking neural network framework for NeuroSim platform to compare online learning performance of PCM arrays for analog and digital implementations and benchmark the tradeoffs in energy consumption, latency, and area. [ABSTRACT FROM AUTHOR]
- Published
- 2019
39. Coding for Write Latency Reduction in a Multi-Level Cell (MLC) Phase Change Memory (PCM).
- Author
-
Namba, Kazuteru and Lombardi, Fabrizio
- Subjects
- *
PHASE change memory , *NONVOLATILE random-access memory , *X-ray diffraction , *HEAT storage , *NUMERICAL analysis - Abstract
This paper presents a new write latency reduction scheme for a Phase Change Memory (PCM) made of Multi-Level Cells (MLCs). This scheme improves over an existing scheme found in the technical literature and known as CABS. The proposed scheme is based on the utilization of a new coding arrangement for the selection of candidate codewords. The code relies on the two-step feature found in the write operation of a MLC PCM and avoids the symbol that incurs in the largest latency at a higher rate than CABS. A detailed simulation based evaluation and comparison are also pursued; the proposed scheme accomplishes improvements in write latency (for parallel writing) as well as coding rate (16/17 for the proposed scheme versus 16/18 for CABS for 16 symbols or 32-bit word). As the proposed scheme utilizes novel selection criteria for the candidates, the design of the required circuitry (encoder and decoder) has also been changed with respect to CABS; in terms of hardware, the areas of the encoder and decoder for the proposed scheme are reduced by 73 and 56 percent respectively compared with CABS. [ABSTRACT FROM AUTHOR]
- Published
- 2019
40. A Study on OTS-PCM Pillar Cell for 3-D Stackable Memory.
- Author
-
Chien, Wei-Chih, Yeh, Chiao-Wen, Bruce, Robert L., Cheng, Huai-Yu, Kuo, I. T., Yang, Chih-Hsiang, Ray, A., Miyazoe, Hiroyuki, Kim, W., Carta, Fabio, Lai, Erh-Kun, BrightSky, Matthew J., and Lung, Hsiang-Lan
- Subjects
- *
PHASE change memory , *COMPUTER performance , *NAND gates , *DYNAMIC random access memory , *METAL oxide semiconductor field-effect transistors - Abstract
High endurance ovonic threshold switch (OTS, here, TeAsGeSiSe-based) is integrated with phase change memory (PCM, here, doped Ge2Sb2Te5) to form a 3-D stackable pillar-type device. With the help of an etch buffer layer and a damage-free pillar reactive-ion etching process, we successfully demonstrate one-selector (OTS)/one-resistor (PCM) (1S1R OTS-PCM) pillar device without OTS/PCM composition modification. High temperature 400 °C annealing tests show this 1S1R OTS-PCM pillar device is back end of line compatible. We report the fundamental behavior of the OTS and the operation scheme of the 1S1R OTS-PCM device. The new Vth read scheme is proposed and excellent electrical performance is demonstrated. It provides the fast turn ON/OFF speed which enables 10-ns fast RESET speed. Program endurance greater than 10^9 cycles is achieved, and read endurance is higher than 10^11 cycles. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
41. Energy Management of Applications With Varying Resource Usage on Smartphones.
- Author
-
Mukherjee, Anway and Chantem, Thidapat
- Subjects
- *
SMARTPHONES , *ENERGY management , *ENERGY consumption of computers , *APPLICATION software , *PHASE change memory , *NEXUS 6 (Smartphone) , *RUN time systems (Computer science) - Abstract
The split-screen mode in smartphones allows for the simultaneous side-by-side execution of multiple applications, which permits multitasking and improves users’ experience. However, such technology results in simultaneously running multiple foreground processes, which increases the power consumption of a smartphone and reduces its battery lifetime. We present an integrated system-level resource management framework that aims to minimize the total energy consumption of a smartphone with negligible impact on the quality of service (QoS) of applications whose resource usage characteristics are not precisely known offline or vary over time. Our proposed solution: 1) leverages applications’ offline profiles to detect instantaneous phase changes (i.e., dynamic changes in resource usage patterns) of the workload of a given application at runtime and 2) adaptively adjusts both voltage and frequency settings of the processor and memory bandwidth to achieve the most energy-efficient configuration subject to QoS constraints. Our approach is also able to progressively reduce the energy consumption of newly installed real-world applications for which there exists no prior resource usage data. Experiments on a Nexus 6 smartphone show that our approach achieves an average energy reduction of 23% (19%) and up to 31% (27%) compared to existing work (and default Android governor) for different combinations of real-world applications running side-by-side in split-screen mode. For applications with no prior resource usage data, the proposed framework saves up to 22% (18%) of energy within at most 14 s when compared to existing work (and default Android governor). [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
42. Comprehensive Phase-Change Memory Compact Model for Circuit Simulation.
- Author
-
Pigot, Corentin, Bocquet, Marc, Gilibert, Fabien, Reyboz, Marina, Cueto, Olga, Marca, Vincenzo Della, Zuliani, Paola, and Portal, Jean-Michel
- Subjects
- *
PHASE change memory , *SIMULATION methods & models , *INTEGRATED circuits , *ELECTRIC potential , *MEMRISTORS - Abstract
In this paper, a new continuous multilevel compact model for phase-change memory (PCM) is proposed. It is based on the modified rate equations with the introduction of a variable related to material melting. The model is evaluated using a large set of dynamic measurements and shows a good accuracy with a single model card. All fitting parameters are discussed, and their impacts are detailed. Full circuit simulation is performed. Good convergence and fast simulation time suggest that this new compact model can be exploited for PCM circuit design. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
43. DyPhase: A Dynamic Phase Change Memory Architecture With Symmetric Write Latency and Restorable Endurance.
- Author
-
Thakkar, Ishan G. and Pasricha, Sudeep
- Subjects
- *
PHASE change memory , *DYNAMIC random access memory , *ON-chip charge pumps , *CHALCOGENIDES , *COMPUTER storage devices - Abstract
A major challenge for the widespread adoption of phase change memory (PCM) as main memory is its asymmetric write latency. Generally, for a PCM, the latency of a SET operation (i.e., an operation that writes “1”) is 2–5 times longer than the latency of a RESET operation (i.e., an operation that writes “0”). For this reason, the average write latency of a PCM system is limited by the high-latency SET operations. This paper presents a novel PCM architecture called DyPhase, which uses partial-SET operations instead of the conventional SET operations to introduce a symmetry in write latency, thereby increasing write performance and throughput. However, use of partial-SET decreases data retention time. As a remedy to this problem, DyPhase employs novel distributed refresh operations in PCM that leverage the available power budget to periodically rewrite the stored data with minimal performance overhead. Unfortunately, the use of periodic refresh operations increases the write rate of the memory, which in turn accelerates memory degradation and decreases its lifetime. DyPhase overcomes this shortcoming by utilizing a proactive in-situ self-annealing (PISA) technique that periodically heals degraded memory cells, resulting in decelerated degradation and increased memory lifetime. Experiments with PARSEC benchmarks indicate that our DyPhase architecture-based hybrid dynamic random access memory (DRAM)–PCM memory system, when enabled with PISA, yields orders of magnitude higher lifetime, 8.3% less CPI, and 44.3% less EDP on average over other hybrid DRAM–PCM memory systems that utilize PCM architectures from prior works. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
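The DyPhase abstract above explains that a slow SET operation limits write latency. A minimal sketch of that effect follows, assuming cells in a line are written in parallel, so the slowest cell sets the line latency, and that unchanged cells are skipped. Both are assumptions of this sketch, not statements from the abstract. The latency values are hypothetical placeholders chosen inside the 2–5x SET/RESET range the abstract quotes, and the function name is illustrative only.

```python
def line_write_latency(old_bits, new_bits, t_reset, t_set):
    """Latency of writing one line when all cells are written in parallel.

    Assumptions (not from the abstract): unchanged cells are skipped, and the
    line completes when its slowest cell completes. Writing "1" is a SET,
    writing "0" is a RESET, as defined in the abstract.
    """
    latency = 0.0
    for old, new in zip(old_bits, new_bits):
        if old == new:
            continue  # no write needed for this cell
        latency = max(latency, t_set if new == 1 else t_reset)
    return latency


# Hypothetical timings in nanoseconds: SET is 3x RESET (within the quoted 2-5x).
T_RESET, T_SET = 50.0, 150.0

old = [0, 1, 1, 0]
new = [1, 0, 1, 0]  # one SET and one RESET are required

conventional = line_write_latency(old, new, T_RESET, T_SET)
# With a partial-SET that is as fast as RESET, the asymmetry disappears.
symmetric = line_write_latency(old, new, T_RESET, T_RESET)

print(conventional, symmetric)  # 150.0 50.0
```

In this toy model a single SET anywhere in the line makes the whole write as slow as a SET, which is why equalizing the two latencies helps even when most cells only need a RESET.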
44. Novel Magnetic Tunneling Junction Memory Cell With Negative Capacitance-Amplified Voltage-Controlled Magnetic Anisotropy Effect.
- Author
-
Lang Zeng, Tianqi Gao, Deming Zhang, Shouzhong Peng, Lezhi Wang, Fanghui Gong, Xiaowan Qin, Mingzhi Long, Youguang Zhang, Wang, Kang L., and Weisheng Zhao
- Subjects
- *
SPIN transfer torque , *SPIN-polarized currents , *RANDOM access memory , *COMPUTER storage devices , *PHASE change memory - Abstract
The high current density required by magnetic tunneling junction (MTJ) switching driven by the spin transfer torque (STT) effect leads to large power consumption and severe reliability issues, hindering the timetable for STT magnetic random access memory to mass market. By utilizing the voltage-controlled magnetic anisotropy (VCMA) effect, the MTJ can be switched by the voltage effect and is postulated to achieve ultralow power (fJ). However, the VCMA coefficient measured in experiments cannot meet the requirement for MTJ with dimensions below 100 nm, and an external in-plane magnetic field is usually required for precessional VCMA switching. In this paper, a novel approach for the amplification of the VCMA effect, which borrows ideas from negative capacitance, is proposed. The feasibility of the proposal is proved by physical simulation and in-depth analysis. Owing to the amplified VCMA effect, the external magnetic field can be eliminated. A three-terminal novel MTJ memory cell is designed with which both low power and high speed can be achieved. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
45. Device and Circuit Interaction Analysis of Stochastic Behaviors in Cross-Point RRAM Arrays.
- Author
-
Haitong Li, Peng Huang, Gao, Bin, Xiaoyan Liu, Jinfeng Kang, and Philip Wong, H.-S
- Subjects
- *
RANDOM access memory , *COMPUTER storage devices , *PHASE change memory , *NONVOLATILE random-access memory , *FLASH memory - Abstract
Stochastic behaviors of resistive random access memory (RRAM) play an important role in the design of cross-point memory arrays. A Monte Carlo (MC) compact model of oxide RRAM is developed and calibrated with experiments on various device stack configurations. With MC SPICE simulations, we show that an increase in array size and interconnect wire resistance will statistically deteriorate write functionality. Write failure probability (WFP) has an exponential dependence on device uniformity and supply voltage (VDD), and the array bias scheme is a key knob. Lowering array VDD leads to higher effective energy consumption (EEC) due to the increase in WFP when the variation statistics are included in the analysis. Random access simulations indicate that data sparsity statistically benefits write functionality and energy consumption. Finally, we show that a pseudo-subarray topology with uniformly distributed preforming cells in the pristine high-resistance state is able to reduce both WFP and EEC, enabling higher net capacity for memory circuits due to improved variation tolerance. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
46. Durable and Energy Efficient In-Memory Frequent-Pattern Mining.
- Author
-
Liu, Duo, Lin, Yi, Huang, Po-Chun, Zhu, Xiao, and Liang
- Subjects
- *
DATA mining , *PATTERNS (Mathematics) , *NONVOLATILE memory , *ENERGY consumption of computers , *PHASE change memory - Abstract
It is a significant problem to efficiently identify the frequently occurring patterns in a given dataset, so as to unveil the trends hidden behind the dataset. This paper is motivated by the serious demands of a high-performance in-memory frequent-pattern mining strategy, with joint optimization over the mining performance and system durability. While the widely used frequent-pattern tree (FP-tree) serves as an efficient approach for frequent-pattern mining, its construction procedure often makes it unfriendly for nonvolatile memories (NVMs). In particular, the incremental construction of FP-tree could generate many unnecessary writes to the NVM and greatly degrade the energy efficiency, because NVM writes typically take more time and energy than reads. To overcome the drawbacks of FP-tree on NVMs, this paper proposes evergreen FP-tree (EvFP-tree), which includes a lazy counter and a minimum-bit-altered (MBA) encoding scheme to make FP-tree friendly for NVMs. The basic idea of the lazy counter is to greatly eliminate the redundant writes generated in FP-tree construction. On the other hand, the MBA encoding scheme is to complement existing wear-leveling techniques to evenly write each memory cell to extend the NVM lifetime. As verified by experiments, EvFP-tree greatly enhances the mining performance and system lifetime by 40.28% and 87.20% on average, respectively. And EvFP-tree reduces the energy consumption by 50.30% on average. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
47. Non-volatile translation layer for PCM+NAND in wearable devices.
- Author
-
Kwon, Se Jin
- Subjects
- *
RANDOM access memory , *FLASH memory , *PHASE change memory , *COMPUTER programming , *PERSONAL computers - Abstract
Recently, there have been approaches of using phase change memory (PCM) for the wearable devices. PCM can prolong the lifetime of the wearable devices, because it can endure approximately 10^8 writes per cell. Unfortunately, because previous well-known software translation algorithms were designed to use DRAM as the main memory, they execute frequent write operations on the PCM. As a solution, this paper proposes a software layer called "load-balancing flash translation layer (Load-FTL)" that enhances the performance of the PCM-based wearable devices by efficiently identifying the hot data and managing them in the PCM. Furthermore, Load-FTL prolongs the durability of the PCM using a window-based wear-leveling algorithm. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
48. Dual-Layer Dielectric Stack for Thermally Isolated Low-Energy Phase-Change Memory.
- Author
-
Fong, Scott W., Neumann, Christopher M., Yalon, Eilam, Rojo, Miguel Munoz, Pop, Eric, and Wong, H.-S Philip
- Subjects
- *
PHASE change memory , *THERMAL conductivity measurement , *ISOLATED systems (Thermodynamics) , *SILICA-alumina catalysts , *DIELECTRIC devices testing - Abstract
High reset energy is an ongoing issue for phase-change memory (PCM) devices. Prior work demonstrates that smaller PCM switching volume and thermal isolation can reduce the reset energy. In this paper, we fabricate and measure a planar confined PCM device with a multilayer dual-layer stack (DLS) of SiO2/Al2O3 insulator. Devices with contact area of 500 × 20 nm and lengths of 2 μm show exceptionally low reset energies of 18.25 ± 15.8 pJ and low reset current densities of 0.94 ± 0.51 MA/cm^2. Implementing the DLS enables a 60% reduction in reset energy compared with SiO2-isolated devices. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
49. Phase-Change Memory—Towards a Storage-Class Memory.
- Author
-
Fong, Scott W., Neumann, Christopher M., and Wong, H.-S Philip
- Subjects
- *
PHASE change memory , *NAND gates , *DYNAMIC random access memory , *CHALCOGENIDES , *FLASH memory - Abstract
Phase-change memory (PCM) has undergone significant academic and industrial research in the last 15 years. After much development, it is now poised to enter the market as a storage-class memory (SCM), with performance and cost between that of NAND flash and DRAM. In this paper, we review the history of phase-transforming chalcogenides leading up to our current understanding of PCM as either a storage-type SCM, with high-density and better than NAND flash endurance, write speeds, and retention, or a memory-type SCM, with fast read/write times to function as a nonvolatile DRAM. Several of the key findings from the community relating to device dimensional scaling, cell design, thermal engineering, material exploration, and storing multiple levels per cell will be discussed. These areas have dramatically impacted the course of development and understanding of PCM. We will highlight the performance gains attained and the future prospects, which will help drive PCM to be as ubiquitous as NAND flash in the upcoming decade. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
50. HL-PCM: MLC PCM Main Memory with Accelerated Read.
- Author
-
Arjomand, Mohammad, Jadidi, Amin, Kandemir, Mahmut T., Sivasubramaniam, Anand, and Das, Chita R.
- Subjects
- *
PHASE change memory , *COMPUTER storage devices , *COMPUTER performance , *MICROPROCESSOR performance , *READ-only memory - Abstract
Multi-Level Cell Phase Change Memory (MLC PCM) is a promising candidate technology for DRAM replacement in main memory of modern computers. Despite its high density and low power advantages, this technology seriously suffers from slow read and write operations. While prior works extensively studied the problem of slow write, this paper targets the high read latency problem in MLC PCM and introduces an architecture mechanism to overcome it. To this end, we rely on the fact that reading different bits from an MLC cell takes different latencies, i.e., for a 2-bit MLC, reading its Most-Significant Bit (MSB) is fast, while reading its Least-Significant Bits (LSBs) is slower. We then propose Half-Line PCM (HL-PCM), a novel memory architecture that leverages this non-uniformity in reading MLC PCM's content to send a requested memory block to the processor in different cycles: it sends half of a memory block to the processor ahead of the other half. If the processor requested a word belonging to the first half, it can resume its execution on receiving the first half, while the other half has not yet been sent and is scheduled to be received by the memory controller later. HL-PCM is easy and simple to implement, i.e., it needs minor modifications at memory controller, the search/evict policies at last level cache, as well as data layout in main memory. Our experimental results show that the proposed design improves the average memory access latency by 33–43 percent and program's execution time by 23 percent, on average, while incurring negligible overhead at memory controller and PCM DIMM, in a 16-core chip multiprocessor (CMP) running memory-intensive benchmarks. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF