Author: "Duan, Sufeng" - Searchworks@Jio Institute Digital Library Search Results

Your search for "Duan, Sufeng" returned 21 results

1. McEval: Massively Multilingual Code Evaluation

Author: Chai, Linzheng, Liu, Shukai, Yang, Jian, Yin, Yuwei, Jin, Ke, Liu, Jiaheng, Sun, Tao, Zhang, Ge, Ren, Changyu, Guo, Hongcheng, Wang, Zekun, Wang, Boyang, Wu, Xianjie, Wang, Bing, Li, Tongliang, Yang, Liqun, Duan, Sufeng, and Li, Zhoujun
Subjects: Computer Science - Programming Languages
Abstract: Code large language models (LLMs) have shown remarkable advances in code understanding, completion, and generation tasks. Programming benchmarks, composed of a selection of code challenges and corresponding test cases, serve as a standard to evaluate the capability of different LLMs in such tasks. However, most existing benchmarks primarily focus on Python and are still restricted to a limited number of languages, where other languages are translated from the Python samples (e.g. MultiPL-E), degrading the data diversity. To further facilitate the research of code LLMs, we propose a massively multilingual code benchmark covering 40 programming languages (McEval) with 16K test samples, which substantially pushes the limits of code LLMs in multilingual scenarios. The benchmark contains challenging code completion, understanding, and generation evaluation tasks with the finely curated massively multilingual instruction corpora McEval-Instruct. In addition, we introduce an effective multilingual coder, mCoder, trained on McEval-Instruct to support multilingual programming language generation. Extensive experimental results on McEval show that there is still a difficult journey between open-source models and closed-source LLMs (e.g. GPT-series models) in numerous languages. The instruction corpora, evaluation benchmark, and leaderboard are available at https://mceval.github.io/. Comment: 22 pages
Published: 2024

2. Improving Non-autoregressive Machine Translation with Error Exposure and Consistency Regularization

Author: Chen, Xinran, Duan, Sufeng, and Liu, Gongshen
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Being one of the IR-NAT (Iterative-refinement-based NAT) frameworks, the Conditional Masked Language Model (CMLM) adopts the mask-predict paradigm to re-predict the masked low-confidence tokens. However, CMLM suffers from the data distribution discrepancy between training and inference, where the observed tokens are generated differently in the two cases. In this paper, we address this problem with the training approaches of error exposure and consistency regularization (EECR). We construct the mixed sequences based on model prediction during training, and propose to optimize over the masked tokens under imperfect observation conditions. We also design a consistency learning method to constrain the data distribution for the masked tokens under different observing situations to narrow down the gap between training and inference. The experiments on five translation benchmarks obtain an average improvement of 0.68 and 0.40 BLEU scores compared to the base models, respectively, and our CMLMC-EECR achieves the best performance, with translation quality comparable to the Transformer. The experimental results demonstrate the effectiveness of our method.
Published: 2024

3. Multi-grained Evidence Inference for Multi-choice Reading Comprehension

Author: Zhao, Yilin, Zhao, Hai, and Duan, Sufeng
Subjects: Computer Science - Computation and Language
Abstract: Multi-choice Machine Reading Comprehension (MRC) is a major and challenging task for machines to answer questions according to provided options. Answers in multi-choice MRC cannot be directly extracted from the given passages, and essentially require machines capable of reasoning from accurately extracted evidence. However, the critical evidence may be as simple as just one word or phrase, while it is hidden in the given redundant, noisy passage with multiple linguistic hierarchies from phrase, fragment, and sentence up to the entire passage. We thus propose a novel general-purpose model enhancement which integrates multi-grained evidence comprehensively, named Multi-grained evidence inferencer (Mugen), to make up for the inability. Mugen extracts three different granularities of evidence: coarse-, middle- and fine-grained evidence, and integrates evidence with the original passages, achieving significant and consistent performance improvement on four multi-choice MRC benchmarks. Comment: Accepted by TASLP 2023, vol. 31, pp. 3896-3907
Published: 2023
Full Text: View/download PDF

4. SDPSAT: Syntactic Dependency Parsing Structure-Guided Semi-Autoregressive Machine Translation

Author: Chen, Xinran, Zhao, Yuran, Guo, Jianming, Duan, Sufeng, and Liu, Gongshen
Editorial Board Member: Filipe, Joaquim, Ghosh, Ashish, Prates, Raquel Oliveira, and Zhou, Lizhu
Editor: Luo, Biao, Cheng, Long, Wu, Zheng-Guang, Li, Hongyi, and Li, Chaojie
Published: 2024
Full Text: View/download PDF

5. SDPSAT: Syntactic Dependency Parsing Structure-Guided Semi-Autoregressive Machine Translation

Author: Chen, Xinran, primary, Zhao, Yuran, additional, Guo, Jianming, additional, Duan, Sufeng, additional, and Liu, Gongshen, additional
Published: 2023
Full Text: View/download PDF

6. To Understand Representation of Layer-aware Sequence Encoders as Multi-order-graph

Author: Duan, Sufeng and Zhao, Hai
Subjects: Computer Science - Computation and Language
Abstract: In this paper, we propose an explanation of representation for self-attention network (SAN) based neural sequence encoders, which regards the information captured by the model and the encoding of the model as graph structures and the generation of these graph structures, respectively. The proposed explanation applies to existing works on SAN-based models and can explain the relationship among the ability to capture structural or linguistic information, the depth of the model, and the length of the sentence, and can also be extended to other models such as recurrent neural network based models. We also propose a revisited multigraph called Multi-order-Graph (MoG) based on our explanation to model the graph structures in the SAN-based model as subgraphs in MoG and convert the encoding of the SAN-based model to the generation of MoG. Based on our explanation, we further introduce a Graph-Transformer by enhancing the ability to capture multiple subgraphs of different orders and focusing on subgraphs of high orders. Experimental results on multiple neural machine translation tasks show that the Graph-Transformer can yield effective performance improvement. Comment: arXiv admin note: text overlap with arXiv:2009.07489
Published: 2021

7. SG-Net: Syntax Guided Transformer for Language Representation

Author: Zhang, Zhuosheng, Wu, Yuwei, Zhou, Junru, Duan, Sufeng, Zhao, Hai, and Wang, Rui
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Information Retrieval
Abstract: Understanding human language is one of the key themes of artificial intelligence. For language representation, the capacity to effectively model linguistic knowledge from detail-riddled and lengthy texts and to get rid of the noise is essential to improve its performance. Traditional attentive models attend to all words without explicit constraint, which results in inaccurate concentration on some dispensable words. In this work, we propose using syntax to guide the text modeling by incorporating explicit syntactic constraints into attention mechanisms for better linguistically motivated word representations. In detail, for the self-attention network (SAN) sponsored Transformer-based encoder, we introduce a syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention. The syntax-guided network (SG-Net) is then composed of this extra SDOI-SAN and the SAN from the original Transformer encoder through a dual contextual architecture for better linguistics-inspired representation. The proposed SG-Net is applied to typical Transformer encoders. Extensive experiments on popular benchmark tasks, including machine reading comprehension, natural language inference, and neural machine translation, show the effectiveness of the proposed SG-Net design. Comment: The early version accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). Journal extension of arXiv:1908.05147 (AAAI 2020)
Published: 2020

8. Graph-to-Sequence Neural Machine Translation

Author: Duan, Sufeng, Zhao, Hai, and Wang, Rui
Subjects: Computer Science - Computation and Language
Abstract: Neural machine translation (NMT) usually works in a seq2seq learning way by viewing either the source or target sentence as a linear sequence of words, which can be regarded as a special case of a graph, taking words in the sequence as nodes and relationships between words as edges. Given that current NMT models more or less capture graph information among the sequence in a latent way, we present a graph-to-sequence model facilitating explicit graph information capturing. In detail, we propose a graph-based SAN-based NMT model called Graph-Transformer by capturing information of subgraphs of different orders in every layer. Subgraphs are put into different groups according to their orders, and each group of subgraphs reflects a different level of dependency between words. For fusing subgraph representations, we empirically explore three methods which weight different groups of subgraphs of different orders. Results of experiments on WMT14 English-German and IWSLT14 German-English show that our method can effectively boost the Transformer with an improvement of 1.1 BLEU points on the WMT14 English-German dataset and 1.0 BLEU points on the IWSLT14 German-English dataset.
Published: 2020

9. Capsule-Transformer for Neural Machine Translation

Author: Duan, Sufeng, Cao, Juncheng, and Zhao, Hai
Subjects: Computer Science - Computation and Language
Abstract: Transformer hugely benefits from its key design of the multi-head self-attention network (SAN), which extracts information from various perspectives through transforming the given input into different subspaces. However, its simple linear transformation aggregation strategy may still potentially fail to fully capture deeper contextualized information. In this paper, we thus propose the capsule-Transformer, which extends the linear transformation into a more general capsule routing algorithm by taking SAN as a special case of capsule network. As a result, the capsule-Transformer is capable of obtaining a better attention distribution representation of the input sequence via information aggregation among different heads and words. Specifically, we see groups of attention weights in SAN as low layer capsules. By applying the iterative capsule routing algorithm they can be further aggregated into high layer capsules which contain deeper contextualized information. Experimental results on the widely-used machine translation datasets show our proposed capsule-Transformer significantly outperforms a strong Transformer baseline.
Published: 2020

10. Syntax-aware Data Augmentation for Neural Machine Translation

Author: Duan, Sufeng, Zhao, Hai, Zhang, Dongdong, and Wang, Rui
Subjects: Computer Science - Computation and Language
Abstract: Data augmentation is an effective performance enhancement in neural machine translation (NMT) by generating additional bilingual data. In this paper, we propose a novel data augmentation enhancement strategy for neural machine translation. Different from existing data augmentation methods which simply choose words with the same probability across different sentences for modification, we set a sentence-specific probability for word selection by considering the words' roles in the sentence. We use the dependency parse tree of the input sentence as an effective clue to determine the selection probability for every word in each sentence. Our proposed method is evaluated on the WMT14 English-to-German dataset and the IWSLT14 German-to-English dataset. The results of extensive experiments show our proposed syntax-aware data augmentation method may effectively boost existing sentence-independent methods for significant translation performance improvement.
Published: 2020

11. Attention Is All You Need for Chinese Word Segmentation

Author: Duan, Sufeng and Zhao, Hai
Subjects: Computer Science - Computation and Language
Abstract: Taking the greedy decoding algorithm as it should be, this work focuses on further strengthening the model itself for Chinese word segmentation (CWS), which results in an even faster and more accurate CWS model. Our model consists of an attention-only stacked encoder and a light enough decoder for the greedy segmentation plus two highway connections for smoother training, in which the encoder is composed of a newly proposed Transformer variant, Gaussian-masked Directional (GD) Transformer, and a biaffine attention scorer. With the effective encoder design, our model only needs to take unigram features for scoring. Our model is evaluated on SIGHAN Bakeoff benchmark datasets. The experimental results show that with the highest segmentation speed, the proposed model achieves new state-of-the-art or comparable performance against strong baselines in terms of strict closed test setting. Comment: 11 pages, to appear in EMNLP 2020 as a long paper
Published: 2019

12. SG-Net: Syntax-Guided Machine Reading Comprehension

Author: Zhang, Zhuosheng, Wu, Yuwei, Zhou, Junru, Duan, Sufeng, Zhao, Hai, and Wang, Rui
Subjects: Computer Science - Computation and Language
Abstract: For machine reading comprehension, the capacity to effectively model linguistic knowledge from detail-riddled and lengthy passages and to get rid of the noise is essential to improve its performance. Traditional attentive models attend to all words without explicit constraint, which results in inaccurate concentration on some dispensable words. In this work, we propose using syntax to guide the text modeling by incorporating explicit syntactic constraints into the attention mechanism for better linguistically motivated word representations. In detail, for the self-attention network (SAN) sponsored Transformer-based encoder, we introduce a syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention. The syntax-guided network (SG-Net) is then composed of this extra SDOI-SAN and the SAN from the original Transformer encoder through a dual contextual architecture for better linguistics-inspired representation. To verify its effectiveness, the proposed SG-Net is applied to the typical pre-trained language model BERT, which is itself based on a Transformer encoder. Extensive experiments on popular benchmarks including SQuAD 2.0 and RACE show that the proposed SG-Net design helps achieve substantial performance improvement over strong baselines. Comment: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-2020)
Published: 2019

13. Fast Neural Chinese Word Segmentation for Long Sentences

Author: Duan, Sufeng, Li, Jiangtong, and Zhao, Hai
Subjects: Computer Science - Computation and Language
Abstract: Rapidly developed neural models have achieved performance in Chinese word segmentation (CWS) that is competitive with their traditional counterparts. However, most methods suffer from computational inefficiency, especially for long sentences, because of the increasing model complexity and slower decoders. This paper presents a simple neural segmenter which directly labels the gap existence between adjacent characters to alleviate this drawback. Our segmenter is fully end-to-end and capable of performing segmentation very fast. We also show a performance difference with different tag sets. The experiments show that our segmenter can provide performance comparable with the state of the art.
Published: 2018

14. Syntax-Aware Data Augmentation for Neural Machine Translation

Author: Duan, Sufeng, primary, Zhao, Hai, additional, and Zhang, Dongdong, additional
Published: 2023
Full Text: View/download PDF

15. Encoder and Decoder, Not One Less for Pre-trained Language Model Sponsored NMT

Author: Duan, Sufeng, primary and Zhao, Hai, additional
Published: 2023
Full Text: View/download PDF

16. Multi-grained Evidence Inference for Multi-choice Reading Comprehension

Author: Zhao, Yilin, primary, Zhao, Hai, additional, and Duan, Sufeng, additional
Published: 2023
Full Text: View/download PDF

17. SG-Net: Syntax Guided Transformer for Language Representation

Author: Zhang, Zhuosheng, primary, Wu, Yuwei, additional, Zhou, Junru, additional, Duan, Sufeng, additional, Zhao, Hai, additional, and Wang, Rui, additional
Published: 2022
Full Text: View/download PDF

18. SG-Net: Syntax-Guided Machine Reading Comprehension

Author: Zhang, Zhuosheng, primary, Wu, Yuwei, additional, Zhou, Junru, additional, Duan, Sufeng, additional, Zhao, Hai, additional, and Wang, Rui, additional
Published: 2020
Full Text: View/download PDF

19. Attention Is All You Need for Chinese Word Segmentation

Author: Duan, Sufeng, primary and Zhao, Hai, additional
Published: 2020
Full Text: View/download PDF

20. Syntax-aware Transformer Encoder for Neural Machine Translation

Author: Duan, Sufeng, primary, Zhao, Hai, additional, Zhou, Junru, additional, and Wang, Rui, additional
Published: 2019
Full Text: View/download PDF

21. A fault tolerating multiple connection system for portable smart device

Author: Duan, Sufeng, primary and Dong, Yu, additional
Published: 2016
Full Text: View/download PDF
