Author: "Ma, Xindian" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Ma, Xindian"' showing total 10 results

1. CrossQuant: A Post-Training Quantization Method with Smaller Quantization Kernel for Precise Large Language Model Compression

Author: Liu, Wenyuan, Ma, Xindian, Zhang, Peng, and Wang, Yan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Post-Training Quantization (PTQ) is an effective technique for compressing Large Language Models (LLMs). While many studies focus on quantizing both weights and activations, it is still a challenge to maintain the accuracy of LLM after activating quantization. To investigate the primary cause, we extend the concept of kernel from linear algebra to quantization functions to define a new term, "quantization kernel", which refers to the set of elements in activations that are quantized to zero. Through quantitative analysis of the quantization kernel, we find that these elements are crucial for maintaining the accuracy of quantized LLMs. With the decrease of quantization kernel, the precision of quantized LLMs increases. If the quantization kernel proportion is kept below 19% for OPT models and below 1% for LLaMA models, the precision loss from quantizing activations to INT8 becomes negligible. Motivated by the goal of developing a quantization method with small quantization kernel, we propose CrossQuant: a simple yet effective method for quantizing activations. CrossQuant cross-quantizes elements using row and column-wise absolute maximum vectors, achieving a quantization kernel of approximately 16% for OPT models and less than 0.1% for LLaMA models. Experimental results on LLMs (LLaMA, OPT) ranging from 6.7B to 70B parameters demonstrate that CrossQuant improves or maintains perplexity and accuracy in language modeling, zero-shot, and few-shot tasks.
Published: 2024

2. 3D-RPE: Enhancing Long-Context Modeling Through 3D Rotary Position Encoding

Author: Ma, Xindian, Liu, Wenyuan, Zhang, Peng, and Xu, Nan
Subjects: Computer Science - Computation and Language
Abstract: Inspired by the Bloch Sphere representation, we propose a novel rotary position encoding on a three-dimensional sphere, named 3D Rotary Position Encoding (3D-RPE). 3D-RPE is an advanced version of the widely used 2D Rotary Position Encoding (RoPE), with two major advantages for modeling long contexts: controllable long-term decay and improved position resolution. For controllable long-term decay, 3D-RPE allows for the regulation of long-term decay within the chunk size, ensuring the modeling of relative positional information between tokens at a distant relative position. For enhanced position resolution, 3D-RPE can mitigate the degradation of position resolution caused by position interpolation on RoPE. We have conducted experiments on long-context Natural Language Understanding (NLU) and long-sequence Language Modeling (LM) tasks. From the experimental results, 3D-RPE achieved performance improvements over RoPE, especially in long-context NLU tasks.
Published: 2024

3. TensorCoder: Dimension-Wise Attention via Tensor Representation for Natural Language Modeling

Author: Zhang, Shuai, Zhang, Peng, Ma, Xindian, Wei, Junqiu, Wang, Ningning, and Liu, Qun
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Transformer has been widely-used in many Natural Language Processing (NLP) tasks and the scaled dot-product attention between tokens is a core module of Transformer. This attention is a token-wise design and its complexity is quadratic to the length of sequence, limiting its application potential for long sequence tasks. In this paper, we propose a dimension-wise attention mechanism based on which a novel language modeling approach (namely TensorCoder) can be developed. The dimension-wise attention can reduce the attention complexity from the original $O(N^2d)$ to $O(Nd^2)$, where $N$ is the length of the sequence and $d$ is the dimensionality of head. We verify TensorCoder on two tasks including masked language modeling and neural machine translation. Compared with the original Transformer, TensorCoder not only greatly reduces the calculation of the original model but also obtains improved performance on masked language modeling task (in PTB dataset) and comparable performance on machine translation tasks.
Published: 2020

4. A Tensorized Transformer for Language Modeling

Author: Ma, Xindian, Zhang, Peng, Zhang, Shuai, Duan, Nan, Hou, Yuexian, Song, Dawei, and Zhou, Ming
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Latest development of neural models has connected the encoder and decoder through a self-attention mechanism. In particular, Transformer, which is solely based on self-attention, has led to breakthroughs in Natural Language Processing (NLP) tasks. However, the multi-head attention mechanism, as a key component of Transformer, limits the effective deployment of the model to a resource-limited setting. In this paper, based on the ideas of tensor decomposition and parameters sharing, we propose a novel self-attention model (namely Multi-linear attention) with Block-Term Tensor Decomposition (BTD). We test and verify the proposed attention method on three language modeling tasks (i.e., PTB, WikiText-103 and One-billion) and a neural machine translation task (i.e., WMT-2016 English-German). Multi-linear attention can not only largely compress the model parameters but also obtain performance improvements, compared with a number of language modeling approaches, such as Transformer, Transformer-XL, and Transformer with tensor train decomposition.
Comment: Accepted by NeurIPS 2019
Published: 2019

5. A Generalized Language Model in Tensor Space

Author: Zhang, Lipeng, Zhang, Peng, Ma, Xindian, Gu, Shuqin, Su, Zhan, and Song, Dawei
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: In the literature, tensors have been effectively used for capturing the context information in language models. However, the existing methods usually adopt relatively-low order tensors, which have limited expressive power in modeling language. Developing a higher-order tensor representation is challenging, in terms of deriving an effective solution and showing its generality. In this paper, we propose a language model named Tensor Space Language Model (TSLM), by utilizing tensor networks and tensor decomposition. In TSLM, we build a high-dimensional semantic space constructed by the tensor product of word vectors. Theoretically, we prove that such tensor representation is a generalization of the n-gram language model. We further show that this high-order tensor representation can be decomposed to a recursive calculation of conditional probability for language modeling. The experimental results on Penn Tree Bank (PTB) dataset and WikiText benchmark demonstrate the effectiveness of TSLM.
Published: 2019

6. Intracellular mRNA phase separation induced by cationic polymers for tumor immunotherapy

Author: Xing, Zhen, Xue, Jing, Ma, Xindian, Han, Congwei, Wang, Zhenzhen, Luo, Shunhuang, Wang, Chunming, Dong, Lei, and Zhang, Junfeng
Published: 2022

7. Link Prediction with Attention-Based Semantic Influence of Multiple Neighbors

Author: Song, Meixian, Wang, Bo, Ma, Xindian, Hu, Qinghua, Wang, Xin, Hou, Yuexian, and Song, Dawei
Editors: Gedeon, Tom, Wong, Kok Wai, and Lee, Minho
Editorial Board Members: Barbosa, Simone Diniz Junqueira, Filipe, Joaquim, Ghosh, Ashish, Kotenko, Igor, and Zhou, Lizhu
Published: 2019

8. Additional file 1 of Intracellular mRNA phase separation induced by cationic polymers for tumor immunotherapy

Author: Xing, Zhen, Xue, Jing, Ma, Xindian, Han, Congwei, Wang, Zhenzhen, Luo, Shunhuang, Wang, Chunming, Dong, Lei, and Zhang, Junfeng
Abstract: Additional file 1: Figure S1. Characterization of cDex and DETA-Dex. Figure S2. Kinetic analysis of the RNA droplets induced by the cationic polymers. Figure S3. Quantification of the transcription levels of common markers by RNA-seq. Figure S4. Gene Set Enrichment Analysis. Figure S5. Evaluation of antitumor activity of the cationic polymers in the BALB/c mouse model and BALB/c nude mouse model. Figure S6. Examples of the gating strategies for intracellular staining flow cytometry analysis. Figure S7. Evaluation of the antitumor activity of the cationic polymer combined with an anti-PD-1 antibody. Table S1. GPC analysis of the cationic polymers. Table S2. Dextran standards for GPC analysis. Table S3. qPCR primers and probes. Table S4. Flow cytometry antibodies.
Published: 2022

9. A Generalized Language Model in Tensor Space

Author: Zhang, Lipeng, Zhang, Peng, Ma, Xindian, Gu, Shuqin, Su, Zhan, and Song, Dawei
Published: 2019

10. A survey of quantum language models

Author: Song, Dawei, Zhang, Peng, and Ma, Xindian
Published: 2018
