MnnFast
- Source : ISCA
- Publication Year : 2019
- Publisher : ACM, 2019
Abstract
- Memory-augmented neural networks are attracting growing attention because they can perform inference using the previous history stored in memory. Among these networks, memory networks in particular are known for their strong reasoning power and their ability to learn from a larger number of inputs than other networks. As input datasets rapidly grow in size, the need for large-scale memory networks continues to rise. Such large-scale memory networks provide excellent reasoning power; however, current computer infrastructure cannot deliver scalable performance due to its limited system architecture. In this paper, we propose MnnFast, a novel system architecture for large-scale memory networks that achieves fast and scalable reasoning performance. We identify the performance problems of the current architecture by conducting an extensive performance bottleneck analysis. Our in-depth analysis indicates that the current architecture suffers from three major performance problems: high memory bandwidth consumption, heavy computation, and cache contention. To overcome these problems, we propose three novel optimizations. First, to reduce memory bandwidth consumption, we propose a new column-based algorithm with streaming, which minimizes the size of data spills and hides most of the off-chip memory access overhead. Second, to decrease the high computational overhead, we propose a zero-skipping optimization that bypasses a large amount of output computation. Lastly, to eliminate cache contention, we propose a dedicated embedding cache that efficiently caches the embedding matrix. Our evaluations show that MnnFast is significantly effective across various types of hardware: CPU, GPU, and FPGA. MnnFast improves overall throughput by up to 5.38×, 4.34×, and 2.01× on CPU, GPU, and FPGA, respectively. In addition, compared to CPU-based MnnFast, our FPGA-based MnnFast achieves 6.54× higher energy efficiency.
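- To make the zero-skipping idea above concrete, the following is a minimal NumPy sketch of one memory-network hop that drops output-side work for memory slots whose softmax weight is negligible. The shapes, the `skip_threshold` value, and the `memnet_inference` function are illustrative assumptions for this sketch, not the paper's implementation (which also applies column-based streaming and a dedicated embedding cache in hardware).

```python
# Illustrative sketch only: a single memory-network hop with zero-skipping.
# All names, shapes, and the threshold are assumptions made for this example.
import numpy as np

def memnet_inference(query, mem_in, mem_out, skip_threshold=1e-3):
    """One hop of a memory network.

    query   : (d,)    embedded question vector
    mem_in  : (n, d)  input (addressing) memory embeddings
    mem_out : (n, d)  output memory embeddings
    """
    # Attention scores and softmax over all n memory slots.
    scores = mem_in @ query                   # (n,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax, (n,)

    # Zero-skipping: most softmax weights are near zero, so the matching rows
    # of mem_out contribute almost nothing to the result and can be skipped.
    keep = weights > skip_threshold           # boolean mask, (n,)
    response = weights[keep] @ mem_out[keep]  # (d,) weighted sum over kept rows
    return response

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 50_000, 256                        # a large memory, small embedding dim
    q = rng.standard_normal(d)
    m_in = rng.standard_normal((n, d))
    m_out = rng.standard_normal((n, d))
    print(memnet_inference(q, m_in, m_out).shape)   # (256,)
```

  Because softmax concentrates most of the weight on a few relevant memory slots, skipping the near-zero rows eliminates most of the weighted-sum work, which is the intuition behind the zero-skipping optimization described in the abstract.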
- Subjects :
- 010302 applied physics
- Artificial neural network
- Computer science
- business.industry
- Memory bandwidth
- Throughput
- 02 engineering and technology
- 01 natural sciences
- 020202 computer hardware & architecture
- High memory
- Embedded system
- 0103 physical sciences
- Scalability
- 0202 electrical engineering, electronic engineering, information engineering
- Systems architecture
- Overhead (computing)
- Cache
- business
Details
- Database : OpenAIRE
- Journal : Proceedings of the 46th International Symposium on Computer Architecture
- Accession number : edsair.doi...........2b1cfad05965c7b7d52b76adb99cb0f2
- Full Text : https://doi.org/10.1145/3307650.3322214