9 results on '"Hu, Jianhao"'
Search Results
2. High Throughput and Hardware Efficient Hybrid LDPC Decoder Using Bit-Serial Stochastic Updating
- Author
-
Hu, Shuai, Han, Kaining, Zhu, Yubin, Shen, Guodong, Wang, Fujie, and Hu, Jianhao
- Abstract
Hybrid low-density parity-check (LDPC) decoding combines the conventional belief-propagation (BP) algorithm with stochastic decoding to achieve high performance and low complexity simultaneously. However, lossy and inefficient stochastic-to-binary (S2B) conversion introduces extra performance degradation and decoding latency. In this paper, a bit-serial stochastic updating based hybrid decoding (BSSU-HD) scheme is proposed, which employs fully correlated stochastic (FCS) check nodes (CNs) and probability-tracer-assisted variable nodes (VNs) to accomplish accurate and efficient S2B conversion. Two strategies, random source selection and tracing speed switching, are proposed to further improve performance and convergence. A BSSU LDPC decoder for IEEE 802.3an is designed in a 65-nm CMOS process, which occupies 4.6 mm² silicon area and achieves a throughput of 200.8 Gb/s at $E_{b}/N_{0} = 4.4$
- Published
- 2023
3. A Nonlinear Function Logic Computing Architecture With Low Switching Activity
- Author
-
Zhu, Yubin, Zhang, Yanyan, Han, Kaining, and Hu, Jianhao
- Abstract
Nonlinear functions are widely used in modern digital signal processing systems and are usually calculated by polynomial approximation, the CORDIC algorithm, or look-up tables. Because their logic computing architectures are quite complex, these methods suffer from high switching activity of logic elements, resulting in substantial dynamic power consumption. Reducing the switching activity is believed to be an efficient way to save dynamic power. In this brief, we first analyze and prove the low switching activity feature of the conventional unary number representation. Afterward, a novel multi-hot unary number representation method and the corresponding logic computing architecture with low switching activity are proposed for nonlinear functions, which significantly reduces switching activity and power consumption. Moreover, the proposed nonlinear function logic computing architecture is extended to single-input multiple-output nonlinear function calculation, which shares the multi-hot encoder to further reduce the hardware cost. According to the post-synthesis power evaluation using PrimePower tools, the proposed architecture achieves significant switching activity and dynamic power reduction compared to conventional computing architectures.
- Published
- 2023
4. Symbol detection based on temporal convolutional network in optical communications
- Author
-
Luo, Yingzhe and Hu, Jianhao
- Abstract
Deep learning (DL) is one of the fastest-developing areas in artificial intelligence and has recently been studied and applied in computer vision, automatic driving, automatic speech recognition, and communication. This paper uses the DL method to design a symbol detection algorithm for the receiver in optical communication systems. The proposed DL-based method is implemented by a non-causal temporal convolutional network (ncTCN), which is a convolutional neural network suited to sequence processing. Meanwhile, we adopt three methods to realize the training process for multiple signal-to-noise ratios of the AWGN channel. Furthermore, we apply two nonlinear activation functions to improve the noise robustness of the proposed ncTCN. Without loss of generality, we apply the ncTCN-based receiver to a 16-ary quadrature amplitude modulation optical communication system in the simulation experiment. According to the experimental results, the proposed method can obtain some bit error rate performance gain compared to some conventional receivers.
- Published
- 2022
5. Low-Cost Implementation Techniques for Interleave Division Multiple Access
- Author
-
Hu, Yang, Liang, Chulong, Hu, Jianhao, and Ping, Li
- Abstract
We consider a low-cost code shift division multiple-access (CSDMA) scheme, in which user-specific shifting is used to replace user-specific interleaving in interleave division multiple access (IDMA). We also outline a low-cost Gaussian approximation-based linear minimum mean square error message passing detection technique for CSDMA. We show that CSDMA can offer almost the same performance as the original IDMA in low-density parity-check or turbo coded systems, but with considerably lower implementation cost.
- Published
- 2018
6. A pseudo-random sequence generation scheme based on RNS and permutation polynomials
- Author
-
Ma, Shang, Liu, Jianfeng, Yang, Zeguo, Zhang, Yan, and Hu, Jianhao
- Abstract
Long-period pseudo-random sequences play an important role in modern information processing systems. Based on the residue number system (RNS) and permutation polynomials over finite fields, a pseudo-random sequence generation scheme is proposed in this paper. It extends several short-period random sequences to a long-period pseudo-random sequence by using RNS. The short-period random sequences are generated in parallel by the iterations of permutation polynomials over finite fields. Due to the small dynamic range of each iterative calculation, the bit width in hardware implementation is reduced. As a result, we can use a full look-up table (LUT) architecture to achieve high-speed sequence output. The methods to find proper permutation polynomials to generate long-period sequences and the optimization algorithm of Chinese remainder theorem (CRT) mapping are also proposed in this paper. The period of the generated pseudo-random sequence can easily exceed $2^{100}$ on commonly used field programmable gate array (FPGA) chips. Meanwhile, this scheme has extensive freedom in choosing permutation polynomials. For example, 10905 permutation polynomials meet the long-period requirement over the finite field $F_{q}$ with $q \not\equiv 1 \pmod{3}$ and $q \leq 503$. The hardware implementation architecture is simple and multiplier-free. Using a Xilinx XC7020 FPGA chip, we implement a sequence generator with a period over $2^{50}$, which costs only 20 18-kb BRAMs (block RAMs) and a small amount of logic. The speed can reach 449.236 Mbps. The National Institute of Standards and Technology (NIST) test results show that the sequence has good random properties.
- Published
- 2018
7. Hardware Efficient Massive MIMO Detector Based on the Monte Carlo Tree Search Method
- Author
-
Chen, Jienan, Fei, Chao, Lu, Hao, Sobelman, Gerald E., and Hu, Jianhao
- Abstract
A Monte Carlo tree search (MCTS)-based large-scale multiple-input, multiple-output (MIMO) detector is proposed. We describe how the MCTS algorithm, which has been successfully used in decision-making and game-playing problems, can be applied to MIMO detection. In particular, we discuss how the tree policy, default policy, simulation, and backpropagation steps of MCTS can be adapted to MIMO detection. We also describe some optimizations that reduce both the bit error rate and the computational complexity. The proposed MCTS MIMO detector exhibits performance that is comparable to existing methods while having a lower computational load. The design has been implemented in a 65-nm CMOS technology. For a $64 \times 8$ [...], and it exhibits higher hardware efficiency than previous MIMO detector designs in the literature.
- Published
- 2017
8. Prediction of low-LET ion induced single event upset cross sections for advanced SRAM
- Author
-
Zhou, Wanting, Hu, Jianhao, and Li, Lei
- Abstract
This paper describes a simple circuit-level simulation-based approach to predict single event upset cross section induced by low-linear energy transfer (LET) ions for advanced bulk static random access memory (SRAM). A basic Simulation Program with Integrated Circuit Emphasis (SPICE) model with effective collection depth considered is developed for performing single event analysis quickly and efficiently. Through this circuit-level simulation model, radiation effects can be shown as the SPICE-simulated curve of LETs versus the corresponding affected distances, which are used for upset cross-section prediction. Furthermore, a fine-grain geometric model for cross-section prediction with fine sensitivity coefficient considered is utilized in the prediction. The calculated results based on this method are in good agreement with experimentally measured results reported for six-transistor SRAM fabricated in 90 nm and 65 nm process technologies.
- Published
- 2013
9. A $2^{n}$ scaling scheme for signed RNS integers and its VLSI implementation
- Author
-
Ma, Shang, Hu, JianHao, Ye, YanLong, Zhang, Lin, and Ling, Xiang
- Abstract
Highly efficient implementation of scaling in the residue number system (RNS) is one of the critical issues for the application of RNS in digital signal processing (DSP) systems. In this paper, an efficient scaling algorithm for signed integers in RNS is first proposed by introducing a correction constant in the scaling procedure for negative integers. Based on the proposed scaling algorithm, an efficient RNS $2^{n}$ scaling implementation method is presented, in which the Chinese remainder theorem (CRT) and a redundant modulus are used to perform the base extension to obtain the least significant $n$ bits of RNS integers. With the redundant modulus, RNS sign detection can be achieved by parity detection. An approach to update the residue digit of the redundant channel is also proposed. Meanwhile, this paper provides a method of computing the correction constant of the redundant channel in the scaling of negative integers. The analysis results indicate that the complexity of the proposed scaling algorithm grows linearly with the word length of the RNS dynamic range without using a look-up table (LUT). Furthermore, the proposed algorithm is applied to $2^{n}$ scaling for a specific moduli set. The synthesis results show that, compared to the existing cascading $2^{n}$ scaling method for very large scale integration (VLSI) implementation under the same restriction, the critical path of the proposed algorithm is shortened by 12% and the area and power consumption performance is improved by about 35%. In addition, the VLSI layout indicates that the parallel structure is simpler.
- Published
- 2010
Catalog
Discovery Service for Jio Institute Digital Library