Author: "Xinhai Chen" / Topic: 0202 electrical engineering, electronic engineering, information engineering - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Xinhai Chen"' showing total 9 results


1. Configurable Multi-directional Systolic Array Architecture for Convolutional Neural Networks

Author: Yang Guo, Yaohua Wang, Rui Xu, Sheng Ma, and Xinhai Chen
Subjects: 010302 applied physics, Speedup, business.industry, Dataflow, Computer science, Systolic array, 02 engineering and technology, Energy consumption, 01 natural sciences, Convolutional neural network, 020202 computer hardware & architecture, Convolution, Transmission (telecommunications), Hardware and Architecture, 0103 physical sciences, 0202 electrical engineering, electronic engineering, information engineering, Hardware acceleration, business, Software, Computer hardware, Information Systems
Abstract: The systolic array architecture is one of the most popular choices for convolutional neural network hardware accelerators. Its biggest advantage is a simple and efficient design principle. Without complicated control and dataflow, hardware accelerators built on a systolic array can compute traditional convolution very efficiently. However, this advantage also brings new challenges. When computing special types of convolution, such as small-scale convolution or depthwise convolution, the processing element (PE) utilization rate of the array decreases sharply. The main reason is that the simple architecture design limits the flexibility of the systolic array. In this article, we design a configurable multi-directional systolic array (CMSA) to address these issues. First, we add a data path to the systolic array, which allows users to split the array through configuration to speed up the calculation of small-scale convolution. Second, we redesign the PE unit so that the array has multiple data transmission modes and dataflow strategies, which allows users to switch the dataflow of the PE array to speed up the calculation of depthwise convolution. In addition, unlike other works, we make only a few modifications to the existing systolic array architecture. This avoids additional hardware overhead, so the design can be easily deployed in application scenarios that require small systolic arrays, such as mobile terminals. Based on our evaluation, CMSA can increase the PE utilization rate by up to 1.6 times compared to the typical systolic array when running the last layers of ResNet-18. When running depthwise convolution in MobileNet, CMSA can increase the utilization rate by up to 14.8 times. At the same time, CMSA and traditional systolic arrays are similar in area and energy consumption.
Published: 2021

2. OHTMA: an optimized heuristic topology-aware mapping algorithm on the Tianhe-3 exascale supercomputer prototype

Author: Chunye Gong, Bo Yang, Xu Han, Shengguo Li, Jie Liu, Gan Xinbiao, Yi-shui Li, and Xinhai Chen
Subjects: 020203 distributed computing, Computer Networks and Communications, Computer science, Heuristic (computer science), Locality, Process (computing), 020207 software engineering, Topology (electrical circuits), 02 engineering and technology, Supercomputer architecture, Parallel computing, Supercomputer, Hardware and Architecture, Signal Processing, Metric (mathematics), 0202 electrical engineering, electronic engineering, information engineering, Electrical and Electronic Engineering, Greedy algorithm
Abstract: With the rapid increase in the size of applications and the complexity of supercomputer architectures, topology-aware process mapping becomes increasingly important. High communication cost has become a dominant constraint on the performance of applications running on supercomputers. To avoid poor mapping strategies that lead to poor communication performance, we propose an optimized heuristic topology-aware mapping algorithm (OHTMA). The algorithm attempts to minimize the hop-byte metric, which we use to measure the mapping results. OHTMA incorporates a new greedy heuristic method and pair-exchange-based optimization. It reduces the number of long-distance communications and effectively enhances the locality of communication. Experimental results on the Tianhe-3 exascale supercomputer prototype indicate that OHTMA can significantly reduce communication costs.
Published: 2020
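
The hop-byte metric named in the abstract of record 2 is commonly defined as the sum, over all communicating process pairs, of the bytes exchanged multiplied by the number of network hops between the nodes the two processes are placed on. The sketch below is only an illustration of that definition plus a generic pair-exchange refinement; it is not OHTMA itself. The 2D-mesh topology, the Manhattan-distance `hops` function, and the names `hop_bytes`, `pair_exchange`, `traffic`, and `naive` are all assumptions made for the example.

```python
from itertools import combinations


def hops(a, b):
    """Hop count between two nodes, ASSUMING a 2D mesh (Manhattan distance)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def hop_bytes(traffic, mapping):
    """Sum of (bytes exchanged x hops travelled) over all communicating pairs."""
    return sum(vol * hops(mapping[p], mapping[q])
               for (p, q), vol in traffic.items())


def pair_exchange(traffic, mapping):
    """Generic refinement: swap two processes' nodes whenever hop-bytes drops."""
    mapping = dict(mapping)
    best = hop_bytes(traffic, mapping)
    improved = True
    while improved:
        improved = False
        for p, q in combinations(sorted(mapping), 2):
            mapping[p], mapping[q] = mapping[q], mapping[p]
            cost = hop_bytes(traffic, mapping)
            if cost < best:
                best = cost
                improved = True
            else:
                # The swap did not help, so undo it.
                mapping[p], mapping[q] = mapping[q], mapping[p]
    return mapping, best


# Hypothetical traffic matrix: (process, process) -> bytes exchanged.
traffic = {(0, 1): 100, (2, 3): 100, (0, 2): 1}
# Hypothetical initial placement of four processes on a 2x2 mesh.
naive = {0: (0, 0), 1: (1, 1), 2: (0, 1), 3: (1, 0)}

refined, refined_cost = pair_exchange(traffic, naive)
print(hop_bytes(traffic, naive), refined_cost)
```

In this toy case the heavy-traffic pairs start two hops apart, and the exchange pass moves them onto adjacent nodes. A real mapper would use the machine's actual topology and a measured communication pattern.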

3. Word-level BERT-CNN-RNN Model for Chinese Punctuation Restoration

Author: Xinhai Chen, Zhe Zhang, Lihua Chi, and Jie Liu
Subjects: Artificial neural network, Grammar, Computer science, Speech recognition, media_common.quotation_subject, 02 engineering and technology, Semantics, Punctuation, Convolutional neural network, 030507 speech-language pathology & audiology, 03 medical and health sciences, Recurrent neural network, Kernel (image processing), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Chinese language, 0305 other medical science, Word (computer architecture), media_common
Abstract: Punctuation restoration in speech recognition has a wide range of application scenarios. Despite the widespread success of neural network methods at performing punctuation restoration for English, there have been only limited attempts at Chinese punctuation restoration. Due to the differences between Chinese and English in terms of grammar and basic semantic units, existing methods for English are not suitable for Chinese punctuation restoration. To tackle this problem, we propose a hybrid model combining the kernel of Bidirectional Encoder Representations from Transformers (BERT), a Convolutional Neural Network (CNN), and a Recurrent Neural Network (RNN). This model employs a flexible structure and a special CNN design that can extract word-level features for the Chinese language. We compared the performance of the hybrid model with five widely used punctuation restoration models on a public dataset. Experimental results demonstrate that our hybrid model is simple and efficient. It outperforms the other models and achieves an accuracy of 69.1%.
Published: 2020

4. A Novel Message Dissemination Scheme Based on Pub/Sub Model and Vehicular Fog Computing

Author: Ze Wang, Bingyi Liu, and Xinhai Chen
Subjects: 050210 logistics & transportation, Vehicular communication systems, business.industry, Computer science, 05 social sciences, Message passing, 020206 networking & telecommunications, Cloud computing, 02 engineering and technology, computer.software_genre, Middleware (distributed applications), Models of communication, 0502 economics and business, Scalability, 0202 electrical engineering, electronic engineering, information engineering, business, computer, Intelligent transportation system, Information exchange, Computer network
Abstract: Modern vehicular communication systems are facing significant challenges, since the rapid development of intelligent transportation systems and ever-growing vehicular applications have triggered intense information exchange among vehicles, infrastructure, and pedestrians. Without the support of powerful communication, many vehicular services will remain at the conceptual stage and cannot be put into practice. To provide low-latency services to vehicles on roads, vehicular fog computing (VFC) has recently been introduced as a promising solution, which utilizes the large number of vehicles on roads as communication and computation resources and extends cloud computing services to the edge network. On the other hand, the publish/subscribe (pub/sub) paradigm provides loosely coupled and scalable communication, which can facilitate flexible and dynamic vehicular network services. Motivated by the merits of these two research fields, in this paper we propose a novel joint design of a pub/sub communication model based on the VFC architecture, which employs fog nodes as the data platform for message aggregation. Specifically, we describe a method to measure vehicle mobility and construct a stable VFC on roads. Then, a message dissemination approach is designed based on our communication model. Finally, the experimental results confirm the efficiency of our proposed model in various urban scenarios.
Published: 2020
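
The loose coupling that the abstract of record 4 attributes to the pub/sub paradigm can be illustrated with a minimal in-process broker: publishers and subscribers share only a topic name and never reference each other. This is a generic sketch, not the paper's VFC design; the `Broker` class, its methods, and the topic strings are hypothetical names chosen for the example.

```python
from collections import defaultdict


class Broker:
    """Minimal topic-based publish/subscribe broker (illustrative only)."""

    def __init__(self):
        # topic name -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to receive every message published on topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver message to all current subscribers; return how many got it."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = Broker()
received = []

# A hypothetical vehicle subscribes to hazard notices for one road segment.
broker.subscribe("road/A1/hazard", received.append)

delivered = broker.publish("road/A1/hazard", "ice ahead")
ignored = broker.publish("road/B7/hazard", "lane closed")
print(delivered, ignored, received)
```

In the scenario the abstract describes, the broker role would sit on fog nodes rather than in a single process, which this sketch does not model.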

5. An Airfoil Mesh Quality Criterion using Deep Neural Networks

Author: Bo Chen, Xinhai Chen, Yufei Pang, Jie Liu, and Chunye Gong
Subjects: Pointwise, 0209 industrial biotechnology, Gambit, Computer science, business.industry, Deep learning, media_common.quotation_subject, 02 engineering and technology, Computational fluid dynamics, 020901 industrial engineering & automation, Software, Computer engineering, Offline learning, 0202 electrical engineering, electronic engineering, information engineering, Preprocessor, 020201 artificial intelligence & image processing, Quality (business), Artificial intelligence, business, media_common
Abstract: The quality of the mesh is one of the most critical aspects of solving partial differential equations (PDEs) in applications of Computational Fluid Dynamics. Many geometry criteria have been proposed and are widely used in commercial preprocessing software such as ICEM CFD, PointWise, and Gambit. However, these traditional geometry criteria fail to recognize some quality features that seriously affect the accuracy of numerical calculations, such as the density and distribution of mesh elements. These quality features are usually evaluated manually, which requires extensive engineering experience and heavily increases the pre-processing cost. In this paper, we introduce a deep learning model to solve these issues by offline learning. The proposed model is small and fast and can be embedded in pre-processing software. Experimental results show that the derived model is capable of performing the quality evaluation task and achieves an accuracy of 93.8%.
Published: 2020

6. A mesh quality discrimination method based on convolutional neural network

Author: Jie Liu, Chunye Gong, Zhenpeng Xu, Xinhai Chen, and Lihua Chi
Subjects: 0209 industrial biotechnology, Computer science, business.industry, media_common.quotation_subject, Mesh networking, Process (computing), 02 engineering and technology, Aerodynamics, Computational fluid dynamics, Convolutional neural network, Convolution, 020901 industrial engineering & automation, Computer engineering, 0202 electrical engineering, electronic engineering, information engineering, Key (cryptography), 020201 artificial intelligence & image processing, Quality (business), business, media_common
Abstract: With the rapid development of high-performance computing, computational fluid dynamics (CFD) has become an important part of hydrodynamics and aerodynamics. Mesh quality is the key factor that affects the accuracy and efficiency of CFD numerical calculation. However, the current process of mesh quality discrimination is very time-consuming. The manual effort needed for this process takes up a large proportion of the whole numerical calculation process. A large number of artificial intelligence algorithms have been put forward to replace humans in efficiently completing all kinds of tedious tasks. In this paper, we propose a convolutional neural network (CNN) based mesh quality discrimination method, MeshNet. MeshNet uses a residual neural network structure to learn mesh features and automatically judge the mesh quality. The experimental results show that the proposed network can greatly reduce labor time and achieves an accuracy of 94.41% for mesh quality discrimination.
Published: 2020

7. TAMM: A New Topology-Aware Mapping Method for Parallel Applications on the Tianhe-2A Supercomputer

Author: Shengguo Li, Qinglin Wang, Chi Lihua, Peizhen Xie, Jie Liu, and Xinhai Chen
Subjects: 020203 distributed computing, Computer science, Iterative method, Locality, 020207 software engineering, Topology (electrical circuits), 02 engineering and technology, Supercomputer, Network topology, Topology, Bottleneck, 0202 electrical engineering, electronic engineering, information engineering, Key (cryptography), Overhead (computing)
Abstract: With the increasing size of high-performance computing systems, the expensive communication overhead between processors has become a key factor leading to performance bottlenecks. However, default process-to-processor mapping strategies do not take into account the topology of the interconnection network, and thus the distance spanned by communication messages may be particularly long. In order to enhance communication locality, we propose a new topology-aware mapping method called TAMM. By generating an accurate description of the communication pattern and network topology, TAMM employs a two-step optimization strategy to obtain an efficient mapping solution for various parallel applications. This strategy first extracts an appropriate subset of all idle computing resources on the underlying system and then constructs an optimized one-to-one mapping with a refined iterative algorithm. Experimental results demonstrate that TAMM can effectively improve communication performance on the Tianhe-2A supercomputer.
Published: 2018

8. Erasure code of small file in a distributed file system

Author: Peizhen Xie, Jie Liu, and Xinhai Chen
Subjects: Name server, 020203 distributed computing, Distributed database, Computer science, business.industry, Byte, 02 engineering and technology, computer.software_genre, 020202 computer hardware & architecture, Server, Data file, Scalability, Computer data storage, Data_FILES, 0202 electrical engineering, electronic engineering, information engineering, Operating system, Distributed File System, Erasure code, business, File storage, computer, Block (data storage)
Abstract: With the development of Internet applications, the data storage of many applications has taken on new characteristics. Distributed file systems often hold a huge number of pictures, and the size of these pictures is usually no more than 1M bytes. To meet the demands of small-file storage systems that contain so many pictures, we provide a distributed file system designed to store large numbers of small files. In our system, small files (the actual data files) are merged into large files (64M by default), which we call blocks, and we use a two-index approach to reduce the pressure on the nameserver, which maintains the correspondence between blocks and dataservers. The results show that the memory used for block information in the nameserver is less than 2G when the system capacity is 1PB, which indicates good scalability. Meanwhile, to reduce the cost of storage, we introduce erasure coding, an alternative that offers the same data protection while significantly reducing storage consumption. After grouping, the cost of storage dropped by 25% compared with keeping 2 replicas.
Published: 2017
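
The figures quoted in the abstract of record 8 can be checked with some back-of-the-envelope arithmetic. The sketch below makes several assumptions that the abstract does not state: binary units for "64M", "2G", and "1PB", and a hypothetical erasure code with k = 2 data fragments and m = 1 parity fragment, chosen only because its 1.5x storage overhead reproduces the reported 25% saving over 2 replicas. The scheme actually used in the paper may differ.

```python
# ASSUMPTION: binary units; the abstract only writes "64M", "2G", and "1PB".
MiB, GiB, PiB = 2**20, 2**30, 2**50


def block_count(capacity_bytes, block_bytes=64 * MiB):
    """Number of fixed-size blocks needed to cover the capacity (rounded up)."""
    return -(-capacity_bytes // block_bytes)


def erasure_overhead(k, m):
    """Raw bytes stored per byte of user data with k data + m parity fragments."""
    return (k + m) / k


# Blocks the nameserver must track at 1 PB capacity with 64M blocks.
blocks = block_count(1 * PiB)

# Upper bound on per-block metadata if all block info fits in 2G of memory.
budget_per_block = 2 * GiB / blocks

# Storage saving of a HYPOTHETICAL (k=2, m=1) code relative to 2 full replicas.
replica_overhead = 2.0
saving = 1 - erasure_overhead(2, 1) / replica_overhead

print(blocks, budget_per_block, saving)
```

Under these assumptions the nameserver tracks about 16.8 million blocks, leaving at most 128 bytes of metadata per block. Like 2 replicas, a (2, 1) code survives the loss of any one fragment.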

9. An efficient SIMD compression format for sparse matrix-vector multiplication

Author: Xinhai Chen, Jie Liu, Chunye Gong, Lihua Chi, and Peizhen Xie
Subjects: 020203 distributed computing, Computer Networks and Communications, Computer science, Sparse matrix-vector multiplication, 010103 numerical & computational mathematics, 02 engineering and technology, Parallel computing, 01 natural sciences, Computer Science Applications, Theoretical Computer Science, Computational Theory and Mathematics, Compression (functional analysis), 0202 electrical engineering, electronic engineering, information engineering, SIMD, 0101 mathematics, Software
Published: 2018
