Author: "Zhu, Hengshu" / Topic: computer science - computation and language - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Zhu, Hengshu"' showing total 10 results

Start Over Author "Zhu, Hengshu" Topic computer science - computation and language

10 results on '"Zhu, Hengshu"'

1. Mixture of In-Context Experts Enhance LLMs' Long Context Awareness

Author: Lin, Hongzhan, Lv, Ang, Chen, Yuhan, Zhu, Chen, Song, Yang, Zhu, Hengshu, and Yan, Rui
Subjects: Computer Science - Computation and Language
Abstract: Many studies have revealed that large language models (LLMs) exhibit uneven awareness of different contextual positions. Their limited context awareness can lead to overlooking critical information and subsequent task failures. While several approaches have been proposed to enhance LLMs' context awareness, achieving both effectiveness and efficiency remains challenging. In this paper, for LLMs utilizing RoPE as position embeddings, we introduce a novel method called "Mixture of In-Context Experts" (MoICE) to address this challenge. MoICE comprises two key components: a router integrated into each attention head within LLMs and a lightweight router-only training optimization strategy: (1) MoICE views each RoPE angle as an `in-context' expert, demonstrated to be capable of directing the attention of a head to specific contextual positions. Consequently, each attention head flexibly processes tokens using multiple RoPE angles dynamically selected by the router to attend to the needed positions. This approach mitigates the risk of overlooking essential contextual information. (2) The router-only training strategy entails freezing LLM parameters and exclusively updating routers for only a few steps. When applied to open-source LLMs including Llama and Mistral, MoICE surpasses prior methods across multiple tasks on long context understanding and generation, all while maintaining commendable inference efficiency., Comment: Accepted by Neurips2024
Published: 2024

2. Towards Efficient Resume Understanding: A Multi-Granularity Multi-Modal Pre-Training Approach

Author: Jiang, Feihu, Qin, Chuan, Zhang, Jingshuai, Yao, Kaichun, Chen, Xi, Shen, Dazhong, Zhu, Chen, Zhu, Hengshu, and Xiong, Hui
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: In the contemporary era of widespread online recruitment, resume understanding has been widely acknowledged as a fundamental and crucial task, which aims to extract structured information from resume documents automatically. Compared to the traditional rule-based approaches, the utilization of recently proposed pre-trained document understanding models can greatly enhance the effectiveness of resume understanding. The present approaches have, however, disregarded the hierarchical relations within the structured information presented in resumes, and have difficulty parsing resumes in an efficient manner. To this end, in this paper, we propose a novel model, namely ERU, to achieve efficient resume understanding. Specifically, we first introduce a layout-aware multi-modal fusion transformer for encoding the segments in the resume with integrated textual, visual, and layout information. Then, we design three self-supervised tasks to pre-train this module via a large number of unlabeled resumes. Next, we fine-tune the model with a multi-granularity sequence labeling task to extract structured information from resumes. Finally, extensive experiments on a real-world dataset clearly demonstrate the effectiveness of ERU., Comment: ICME 2024 Accepted
Published: 2024

3. Enhancing Question Answering for Enterprise Knowledge Bases using Large Language Models

Author: Jiang, Feihu, Qin, Chuan, Yao, Kaichun, Fang, Chuyu, Zhuang, Fuzhen, Zhu, Hengshu, and Xiong, Hui
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Information Retrieval
Abstract: Efficient knowledge management plays a pivotal role in augmenting both the operational efficiency and the innovative capacity of businesses and organizations. By indexing knowledge through vectorization, a variety of knowledge retrieval methods have emerged, significantly enhancing the efficacy of knowledge management systems. Recently, the rapid advancements in generative natural language processing technologies paved the way for generating precise and coherent answers after retrieving relevant documents tailored to user queries. However, for enterprise knowledge bases, assembling extensive training data from scratch for knowledge retrieval and generation is a formidable challenge due to the privacy and security policies of private data, frequently entailing substantial costs. To address the challenge above, in this paper, we propose EKRG, a novel Retrieval-Generation framework based on large language models (LLMs), expertly designed to enable question-answering for Enterprise Knowledge bases with limited annotation costs. Specifically, for the retrieval process, we first introduce an instruction-tuning method using an LLM to generate sufficient document-question pairs for training a knowledge retriever. This method, through carefully designed instructions, efficiently generates diverse questions for enterprise knowledge bases, encompassing both fact-oriented and solution-oriented knowledge. Additionally, we develop a relevance-aware teacher-student learning strategy to further enhance the efficiency of the training process. For the generation process, we propose a novel chain of thought (CoT) based fine-tuning method to empower the LLM-based generator to adeptly respond to user questions using retrieved documents. Finally, extensive experiments on real-world datasets have demonstrated the effectiveness of our proposed framework., Comment: DASFAA 2024 Accepted
Published: 2024

4. Make Large Language Model a Better Ranker

Author: Chao, Wen-Shuo, Zheng, Zhi, Zhu, Hengshu, and Liu, Hao
Subjects: Computer Science - Information Retrieval, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Large Language Models (LLMs) demonstrate robust capabilities across various fields, leading to a paradigm shift in LLM-enhanced Recommender System (RS). Research to date focuses on point-wise and pair-wise recommendation paradigms, which are inefficient for LLM-based recommenders due to high computational costs. However, existing list-wise approaches also fall short in ranking tasks due to misalignment between ranking objectives and next-token prediction. Moreover, these LLM-based methods struggle to effectively address the order relation among candidates, particularly given the scale of ratings. To address these challenges, this paper introduces the large language model framework with Aligned Listwise Ranking Objectives (ALRO). ALRO is designed to bridge the gap between the capabilities of LLMs and the nuanced requirements of ranking tasks. Specifically, ALRO employs explicit feedback in a listwise manner by introducing soft lambda loss, a customized adaptation of lambda loss designed for optimizing order relations. This mechanism provides more accurate optimization goals, enhancing the ranking process. Additionally, ALRO incorporates a permutation-sensitive learning mechanism that addresses position bias, a prevalent issue in generative models, without imposing additional computational burdens during inference. Our evaluative studies reveal that ALRO outperforms both existing embedding-based recommendation methods and LLM-based recommendation baselines., Comment: 12 pages, 5 figures
Published: 2024

5. KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph

Author: Jiang, Jinhao, Zhou, Kun, Zhao, Wayne Xin, Song, Yang, Zhu, Chen, Zhu, Hengshu, and Wen, Ji-Rong
Subjects: Computer Science - Computation and Language
Abstract: In this paper, we aim to improve the reasoning ability of large language models (LLMs) over knowledge graphs (KGs) to answer complex questions. Inspired by existing methods that design the interaction strategy between LLMs and KG, we propose an autonomous LLM-based agent framework, called KG-Agent, which enables a small LLM to actively make decisions until finishing the reasoning process over KGs. In KG-Agent, we integrate the LLM, multifunctional toolbox, KG-based executor, and knowledge memory, and develop an iteration mechanism that autonomously selects the tool then updates the memory for reasoning over KG. To guarantee the effectiveness, we leverage program language to formulate the multi-hop reasoning process over the KG, and synthesize a code-based instruction dataset to fine-tune the base LLM. Extensive experiments demonstrate that only using 10K samples for tuning LLaMA-7B can outperform state-of-the-art methods using larger LLMs or more data, on both in-domain and out-domain datasets. Our code and data will be publicly released., Comment: work in progress; efficient 7B LLM-based agent
Published: 2024

6. Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations

Author: Wu, Likang, Qiu, Zhaopeng, Zheng, Zhi, Zhu, Hengshu, and Chen, Enhong
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Information Retrieval
Abstract: Large Language Models (LLMs) have revolutionized natural language processing tasks, demonstrating their exceptional capabilities in various domains. However, their potential for behavior graph understanding in job recommendations remains largely unexplored. This paper focuses on unveiling the capability of large language models in understanding behavior graphs and leveraging this understanding to enhance recommendations in online recruitment, including the promotion of out-of-distribution (OOD) application. We present a novel framework that harnesses the rich contextual information and semantic representations provided by large language models to analyze behavior graphs and uncover underlying patterns and relationships. Specifically, we propose a meta-path prompt constructor that leverages LLM recommender to understand behavior graphs for the first time and design a corresponding path augmentation module to alleviate the prompt bias introduced by path-based sequence input. By leveraging this capability, our framework enables personalized and accurate job recommendations for individual users. We evaluate the effectiveness of our approach on a comprehensive dataset and demonstrate its ability to improve the relevance and quality of recommended quality. This research not only sheds light on the untapped potential of large language models but also provides valuable insights for developing advanced recommendation systems in the recruitment market. The findings contribute to the growing field of natural language processing and offer practical implications for enhancing job search experiences. We release the code at https://github.com/WLiK/GLRec.
Published: 2023

7. Generative Job Recommendations with Large Language Model

Author: Zheng, Zhi, Qiu, Zhaopeng, Hu, Xiao, Wu, Likang, Zhu, Hengshu, and Xiong, Hui
Subjects: Computer Science - Information Retrieval, Computer Science - Computation and Language
Abstract: The rapid development of online recruitment services has encouraged the utilization of recommender systems to streamline the job seeking process. Predominantly, current job recommendations deploy either collaborative filtering or person-job matching strategies. However, these models tend to operate as "black-box" systems and lack the capacity to offer explainable guidance to job seekers. Moreover, conventional matching-based recommendation methods are limited to retrieving and ranking existing jobs in the database, restricting their potential as comprehensive career AI advisors. To this end, here we present GIRL (GeneratIve job Recommendation based on Large language models), a novel approach inspired by recent advancements in the field of Large Language Models (LLMs). We initially employ a Supervised Fine-Tuning (SFT) strategy to instruct the LLM-based generator in crafting suitable Job Descriptions (JDs) based on the Curriculum Vitae (CV) of a job seeker. Moreover, we propose to train a model which can evaluate the matching degree between CVs and JDs as a reward model, and we use Proximal Policy Optimization (PPO)-based Reinforcement Learning (RL) method to further fine-tine the generator. This aligns the generator with recruiter feedback, tailoring the output to better meet employer preferences. In particular, GIRL serves as a job seeker-centric generative model, providing job suggestions without the need of a candidate set. This capability also enhances the performance of existing job recommendation models by supplementing job seeking features with generated content. With extensive experiments on a large-scale real-world dataset, we demonstrate the substantial effectiveness of our approach. We believe that GIRL introduces a paradigm-shifting approach to job recommendation systems, fostering a more personalized and comprehensive job-seeking experience.
Published: 2023

8. Towards Robust Knowledge Graph Embedding via Multi-task Reinforcement Learning

Author: Zhang, Zhao, Zhuang, Fuzhen, Zhu, Hengshu, Li, Chao, Xiong, Hui, He, Qing, and Xu, Yongjun
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Nowadays, Knowledge graphs (KGs) have been playing a pivotal role in AI-related applications. Despite the large sizes, existing KGs are far from complete and comprehensive. In order to continuously enrich KGs, automatic knowledge construction and update mechanisms are usually utilized, which inevitably bring in plenty of noise. However, most existing knowledge graph embedding (KGE) methods assume that all the triple facts in KGs are correct, and project both entities and relations into a low-dimensional space without considering noise and knowledge conflicts. This will lead to low-quality and unreliable representations of KGs. To this end, in this paper, we propose a general multi-task reinforcement learning framework, which can greatly alleviate the noisy data problem. In our framework, we exploit reinforcement learning for choosing high-quality knowledge triples while filtering out the noisy ones. Also, in order to take full advantage of the correlations among semantically similar relations, the triple selection processes of similar relations are trained in a collective way with multi-task learning. Moreover, we extend popular KGE models TransE, DistMult, ConvE and RotatE with the proposed framework. Finally, the experimental validation shows that our approach is able to enhance existing KGE models and can provide more robust representations of KGs in noisy scenarios., Comment: Accepted to TKDE
Published: 2021

9. Promotion of Answer Value Measurement with Domain Effects in Community Question Answering Systems

Author: Jin, Binbin, Chen, Enhong, Zhao, Hongke, Huang, Zhenya, Liu, Qi, Zhu, Hengshu, and Yu, Shui
Subjects: Computer Science - Computation and Language, Computer Science - Information Retrieval
Abstract: In the area of community question answering (CQA), answer selection and answer ranking are two tasks which are applied to help users quickly access valuable answers. Existing solutions mainly exploit the syntactic or semantic correlation between a question and its related answers (Q&A), where the multi-facet domain effects in CQA are still underexplored. In this paper, we propose a unified model, Enhanced Attentive Recurrent Neural Network (EARNN), for both answer selection and answer ranking tasks by taking full advantages of both Q&A semantics and multi-facet domain effects (i.e., topic effects and timeliness). Specifically, we develop a serialized LSTM to learn the unified representations of Q&A, where two attention mechanisms at either sentence-level or word-level are designed for capturing the deep effects of topics. Meanwhile, the emphasis of Q&A can be automatically distinguished. Furthermore, we design a time-sensitive ranking function to model the timeliness in CQA. To effectively train EARNN, a question-dependent pairwise learning strategy is also developed. Finally, we conduct extensive experiments on a real-world dataset from Quora. Experimental results validate the effectiveness and interpretability of our proposed EARNN model., Comment: IEEE Transactions on Systems, Man, and Cybernetics: Systems
Published: 2019
Full Text: View/download PDF

10. Enhancing Person-Job Fit for Talent Recruitment: An Ability-aware Neural Network Approach

Author: Qin, Chuan, Zhu, Hengshu, Xu, Tong, Zhu, Chen, Jiang, Liang, Chen, Enhong, and Xiong, Hui
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: The wide spread use of online recruitment services has led to information explosion in the job market. As a result, the recruiters have to seek the intelligent ways for Person Job Fit, which is the bridge for adapting the right job seekers to the right positions. Existing studies on Person Job Fit have a focus on measuring the matching degree between the talent qualification and the job requirements mainly based on the manual inspection of human resource experts despite of the subjective, incomplete, and inefficient nature of the human judgement. To this end, in this paper, we propose a novel end to end Ability aware Person Job Fit Neural Network model, which has a goal of reducing the dependence on manual labour and can provide better interpretation about the fitting results. The key idea is to exploit the rich information available at abundant historical job application data. Specifically, we propose a word level semantic representation for both job requirements and job seekers' experiences based on Recurrent Neural Network. Along this line, four hierarchical ability aware attention strategies are designed to measure the different importance of job requirements for semantic representation, as well as measuring the different contribution of each job experience to a specific ability requirement. Finally, extensive experiments on a large scale real world data set clearly validate the effectiveness and interpretability of the APJFNN framework compared with several baselines., Comment: This is an extended version of our SIGIR18 paper
Published: 2018
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

10 results on '"Zhu, Hengshu"'

1. Mixture of In-Context Experts Enhance LLMs' Long Context Awareness

2. Towards Efficient Resume Understanding: A Multi-Granularity Multi-Modal Pre-Training Approach

3. Enhancing Question Answering for Enterprise Knowledge Bases using Large Language Models

4. Make Large Language Model a Better Ranker

5. KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph

6. Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations

7. Generative Job Recommendations with Large Language Model

8. Towards Robust Knowledge Graph Embedding via Multi-task Reinforcement Learning

9. Promotion of Answer Value Measurement with Domain Effects in Community Question Answering Systems

10. Enhancing Person-Job Fit for Talent Recruitment: An Ability-aware Neural Network Approach

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Database

10 results on '"Zhu, Hengshu"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources