Author: "Ma, Ziqiao" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Ma, Ziqiao"' showing total 25 results

1. Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities

Author: Zhang, Zheyuan, Hu, Fengyuan, Lee, Jayjun, Shi, Freda, Kordjamshidi, Parisa, Chai, Joyce, and Ma, Ziqiao
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: Spatial expressions in situated communication can be ambiguous, as their meanings vary depending on the frames of reference (FoR) adopted by speakers and listeners. While spatial language understanding and reasoning by vision-language models (VLMs) have gained increasing attention, potential ambiguities in these models are still under-explored. To address this issue, we present the COnsistent Multilingual Frame Of Reference Test (COMFORT), an evaluation protocol to systematically assess the spatial reasoning capabilities of VLMs. We evaluate nine state-of-the-art VLMs using COMFORT. Despite showing some alignment with English conventions in resolving ambiguities, our experiments reveal significant shortcomings of VLMs: notably, the models (1) exhibit poor robustness and consistency, (2) lack the flexibility to accommodate multiple FoRs, and (3) fail to adhere to language-specific or culture-specific conventions in cross-lingual tests, as English tends to dominate other languages. With a growing effort to align vision-language models with human cognitive intuitions, we call for more attention to the ambiguous nature and cross-cultural diversity of spatial reasoning., Comment: Accepted to Pluralistic Alignment @ NeurIPS 2024 | Project page: https://spatial-comfort.github.io/
Published: 2024

2. Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models

Author: Zhang, Yue, Ma, Ziqiao, Li, Jialu, Qiao, Yanyuan, Wang, Zun, Chai, Joyce, Wu, Qi, Bansal, Mohit, and Kordjamshidi, Parisa
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: Vision-and-Language Navigation (VLN) has gained increasing attention over recent years and many approaches have emerged to advance their development. The remarkable achievements of foundation models have shaped the challenges and proposed methods for VLN research. In this survey, we provide a top-down review that adopts a principled framework for embodied planning and reasoning, and emphasizes the current methods and future opportunities leveraging foundation models to address VLN challenges. We hope our in-depth discussions could provide valuable resources and insights: on one hand, to milestone the progress and explore opportunities and potential roles for foundation models in this field, and on the other, to organize different challenges and solutions in VLN to foundation model researchers., Comment: Authors contributed equally to this work, and supervisors contributed equal advising to this work
Published: 2024

3. Multi-Object Hallucination in Vision-Language Models

Author: Chen, Xuweiyi, Ma, Ziqiao, Zhang, Xuejun, Xu, Sihan, Qian, Shengyi, Yang, Jianing, Fouhey, David F., and Chai, Joyce
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Large vision language models (LVLMs) often suffer from object hallucination, producing objects not present in the given images. While current benchmarks for object hallucination primarily concentrate on the presence of a single object class rather than individual entities, this work systematically investigates multi-object hallucination, examining how models misperceive (e.g., invent nonexistent objects or become distracted) when tasked with focusing on multiple objects simultaneously. We introduce Recognition-based Object Probing Evaluation (ROPE), an automated evaluation protocol that considers the distribution of object classes within a single image during testing and uses visual referring prompts to eliminate ambiguity. With comprehensive empirical studies and analysis of potential factors leading to multi-object hallucination, we found that (1) LVLMs suffer more hallucinations when focusing on multiple objects compared to a single object. (2) The tested object class distribution affects hallucination behaviors, indicating that LVLMs may follow shortcuts and spurious correlations. (3) Hallucinatory behaviors are influenced by data-specific factors, salience and frequency, and model intrinsic behaviors. We hope to enable LVLMs to recognize and reason about multiple objects that often occur in realistic visual scenes, provide insights, and quantify our progress towards mitigating the issues., Comment: Accepted to ALVR @ ACL 2024 | Project page: https://multi-object-hallucination.github.io/
Published: 2024

4. Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions

Author: Shen, Hua, Knearem, Tiffany, Ghosh, Reshmi, Alkiek, Kenan, Krishna, Kundan, Liu, Yachuan, Ma, Ziqiao, Petridis, Savvas, Peng, Yi-Hao, Qiwei, Li, Rakshit, Sushrita, Si, Chenglei, Xie, Yutong, Bigham, Jeffrey P., Bentley, Frank, Chai, Joyce, Lipton, Zachary, Mei, Qiaozhu, Mihalcea, Rada, Terry, Michael, Yang, Diyi, Morris, Meredith Ringel, Resnick, Paul, and Jurgens, David
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Recent advancements in general-purpose AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment. However, the lack of clarified definitions and scopes of human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve this alignment. In particular, ML- and philosophy-oriented alignment research often views AI alignment as a static, unidirectional process (i.e., aiming to ensure that AI systems' objectives match humans) rather than an ongoing, mutual alignment problem. This perspective largely neglects the long-term interaction and dynamic changes of alignment. To understand these gaps, we introduce a systematic review of over 400 papers published between 2019 and January 2024, spanning multiple domains such as Human-Computer Interaction (HCI), Natural Language Processing (NLP), Machine Learning (ML). We characterize, define and scope human-AI alignment. From this, we present a conceptual framework of "Bidirectional Human-AI Alignment" to organize the literature from a human-centered perspective. This framework encompasses both 1) conventional studies of aligning AI to humans that ensures AI produces the intended outcomes determined by humans, and 2) a proposed concept of aligning humans to AI, which aims to help individuals and society adjust to AI advancements both cognitively and behaviorally. Additionally, we articulate the key findings derived from literature analysis, including literature gaps and trends, human values, and interaction techniques. To pave the way for future studies, we envision three key challenges and give recommendations for future research., Comment: proposing "bidirectional human-AI alignment" framework after a systematic review of over 400 alignment papers
Published: 2024

5. DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences

Author: Huang, Yidong, Sansom, Jacob, Ma, Ziqiao, Gervits, Felix, and Chai, Joyce
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Recent advancements in foundation models (FMs) have unlocked new prospects in autonomous driving, yet the experimental settings of these studies are preliminary, over-simplified, and fail to capture the complexity of real-world driving scenarios in human environments. It remains under-explored whether FM agents can handle long-horizon navigation tasks with free-form dialogue and deal with unexpected situations caused by environmental dynamics or task changes. To explore the capabilities and boundaries of FMs faced with the challenges above, we introduce DriVLMe, a video-language-model-based agent to facilitate natural and effective communication between humans and autonomous vehicles that perceive the environment and navigate. We develop DriVLMe from both embodied experiences in a simulated environment and social experiences from real human dialogue. While DriVLMe demonstrates competitive performance in both open-loop benchmarks and closed-loop human studies, we reveal several limitations and challenges, including unacceptable inference time, imbalanced training data, limited visual understanding, challenges with multi-turn interactions, simplified language generation from robotic experiences, and difficulties in handling on-the-fly unexpected situations like environmental dynamics and task changes., Comment: 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Published: 2024

6. Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations

Author: Ma, Ziqiao, Wang, Zekun, and Chai, Joyce
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Humans are efficient language learners and inherently social creatures. Our language development is largely shaped by our social interactions, for example, the demonstration and feedback from caregivers. Contrary to human language learning, recent advancements in large language models have primarily adopted a non-interactive training paradigm, and refined pre-trained models through feedback afterward. In this work, we aim to examine how corrective feedback from interactions influences neural language acquisition from the ground up through systematically controlled experiments, assessing whether it contributes to learning efficiency in language models. We introduce a trial-and-demonstration (TnD) learning framework that incorporates three components: student trials, teacher demonstrations, and a reward conditioned on language competence at various developmental stages. Our experiments reveal that the TnD approach accelerates word acquisition for student models of equal and smaller numbers of parameters, and we highlight the significance of both trials and demonstrations. We further show that the teacher's choices of words influence students' word-specific learning efficiency, and a practice-makes-perfect effect is evident by a strong correlation between the frequency of words in trials and their respective learning curves. Our findings suggest that interactive language learning, with teacher demonstrations and student trials, can facilitate efficient word learning in language models.
Published: 2024

7. GROUNDHOG: Grounding Large Language Models to Holistic Segmentation

Author: Zhang, Yichi, Ma, Ziqiao, Gao, Xiaofeng, Shakiah, Suhaila, Gao, Qiaozi, and Chai, Joyce
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Most multimodal large language models (MLLMs) learn language-to-object grounding through causal language modeling where grounded objects are captured by bounding boxes as sequences of location tokens. This paradigm lacks pixel-level representations that are important for fine-grained visual understanding and diagnosis. In this work, we introduce GROUNDHOG, an MLLM developed by grounding Large Language Models to holistic segmentation. GROUNDHOG incorporates a masked feature extractor and converts extracted features into visual entity tokens for the MLLM backbone, which then connects groundable phrases to unified grounding masks by retrieving and merging the entity masks. To train GROUNDHOG, we carefully curated M3G2, a grounded visual instruction tuning dataset with Multi-Modal Multi-Grained Grounding, by harvesting a collection of segmentation-grounded datasets with rich annotations. Our experimental results show that GROUNDHOG achieves superior performance on various language grounding tasks without task-specific fine-tuning, and significantly reduces object hallucination. GROUNDHOG also demonstrates better grounding towards complex forms of visual input and provides easy-to-understand diagnosis in failure cases., Comment: Accepted to CVPR 2024. Website: https://groundhog-mllm.github.io/
Published: 2024

8. Inversion-Free Image Editing with Natural Language

Author: Xu, Sihan, Huang, Yidong, Pan, Jiayi, Ma, Ziqiao, and Chai, Joyce
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Despite recent advances in inversion-based editing, text-guided image manipulation remains challenging for diffusion models. The primary bottlenecks include 1) the time-consuming nature of the inversion process; 2) the struggle to balance consistency with accuracy; 3) the lack of compatibility with efficient consistency sampling methods used in consistency models. To address the above issues, we start by asking ourselves if the inversion process can be eliminated for editing. We show that when the initial sample is known, a special variance schedule reduces the denoising step to the same form as the multi-step consistency sampling. We name this Denoising Diffusion Consistent Model (DDCM), and note that it implies a virtual inversion strategy without explicit inversion in sampling. We further unify the attention control mechanisms in a tuning-free framework for text-guided editing. Combining them, we present inversion-free editing (InfEdit), which allows for consistent and faithful editing for both rigid and non-rigid semantic changes, catering to intricate modifications without compromising on the image's integrity and explicit inversion. Through extensive experiments, InfEdit shows strong performance in various editing tasks and also maintains a seamless workflow (less than 3 seconds on one single A40), demonstrating the potential for real-time applications. Project Page: https://sled-group.github.io/InfEdit/, Comment: Project Page: https://sled-group.github.io/InfEdit/
Published: 2023

9. Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models

Author: Ma, Ziqiao, Sansom, Jacob, Peng, Run, and Chai, Joyce
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Large Language Models (LLMs) have generated considerable interest and debate regarding their potential emergence of Theory of Mind (ToM). Several recent inquiries reveal a lack of robust ToM in these models and pose a pressing demand to develop new benchmarks, as current ones primarily focus on different aspects of ToM and are prone to shortcuts and data leakage. In this position paper, we seek to answer two road-blocking questions: (1) How can we taxonomize a holistic landscape of machine ToM? (2) What is a more effective evaluation protocol for machine ToM? Following psychological studies, we taxonomize machine ToM into 7 mental state categories and delineate existing benchmarks to identify under-explored aspects of ToM. We argue for a holistic and situated evaluation of ToM to break ToM into individual components and treat LLMs as an agent who is physically situated in environments and socially situated in interactions with humans. Such situated evaluation provides a more comprehensive assessment of mental states and potentially mitigates the risk of shortcuts and data leakage. We further present a pilot study in a grid world setup as a proof of concept. We hope this position paper can facilitate future research to integrate ToM with LLMs and offer an intuitive means for researchers to better position their work in the landscape of ToM. Project page: https://github.com/Mars-tin/awesome-theory-of-mind, Comment: Theme Track, Findings of EMNLP 2023
Published: 2023

10. CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation

Author: Xu, Sihan, Ma, Ziqiao, Huang, Yidong, Lee, Honglak, and Chai, Joyce
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Diffusion models (DMs) have enabled breakthroughs in image synthesis tasks but lack an intuitive interface for consistent image-to-image (I2I) translation. Various methods have been explored to address this issue, including mask-based methods, attention-based methods, and image-conditioning. However, it remains a critical challenge to enable unpaired I2I translation with pre-trained DMs while maintaining satisfying consistency. This paper introduces Cyclenet, a novel but simple method that incorporates cycle consistency into DMs to regularize image manipulation. We validate Cyclenet on unpaired I2I tasks of different granularities. Besides the scene and object level translation, we additionally contribute a multi-domain I2I translation dataset to study the physical state changes of objects. Our empirical studies show that Cyclenet is superior in translation consistency and quality, and can generate high-quality images for out-of-domain distributions with a simple change of the textual prompt. Cyclenet is a practical framework, which is robust even with very limited training data (around 2k) and requires minimal computational resources (1 GPU) to train. Project homepage: https://cyclenetweb.github.io/, Comment: NeurIPS 2023
Published: 2023

11. World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models

Author: Ma, Ziqiao, Pan, Jiayi, and Chai, Joyce
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: The ability to connect language units to their referents in the physical world, referred to as grounding, is crucial to learning and understanding grounded meanings of words. While humans demonstrate fast mapping in new word learning, it remains unclear whether modern vision-language models can truly represent language with their grounded meanings and how grounding may further bootstrap new word learning. To this end, we introduce Grounded Open Vocabulary Acquisition (GOVA) to examine grounding and bootstrapping in open-world language learning. As an initial attempt, we propose object-oriented BERT (OctoBERT), a novel visually-grounded language model by pre-training on image-text pairs highlighting grounding as an objective. Through extensive experiments and analysis, we demonstrate that OctoBERT is a more coherent and fast grounded word learner, and that the grounding ability acquired during pre-training helps the model to learn unseen words more rapidly and robustly. Our code is available at https://github.com/sled-group/world-to-words, Comment: ACL 2023
Published: 2023

12. NLP Reproducibility For All: Understanding Experiences of Beginners

Author: Storks, Shane, Yu, Keunwoo Peter, Ma, Ziqiao, and Chai, Joyce
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: As natural language processing (NLP) has recently seen an unprecedented level of excitement, and more people are eager to enter the field, it is unclear whether current research reproducibility efforts are sufficient for this group of beginners to apply the latest developments. To understand their needs, we conducted a study with 93 students in an introductory NLP course, where students reproduced the results of recent NLP papers. Surprisingly, we find that their programming skill and comprehension of research papers have a limited impact on their effort spent completing the exercise. Instead, we find accessibility efforts by research authors to be the key to success, including complete documentation, better coding practice, and easier access to data files. Going forward, we recommend that NLP researchers pay close attention to these simple aspects of open-sourcing their work, and use insights from beginners' feedback to provide actionable ideas on how to better support them., Comment: ACL 2023 Theme Track
Published: 2023

13. Towards Collaborative Plan Acquisition through Theory of Mind Modeling in Situated Dialogue

Author: Bara, Cristian-Paul, Ma, Ziqiao, Yu, Yingzhuo, Shah, Julie, and Chai, Joyce
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Collaborative tasks often begin with partial task knowledge and incomplete initial plans from each partner. To complete these tasks, agents need to engage in situated communication with their partners and coordinate their partial plans towards a complete plan to achieve a joint task goal. While such collaboration seems effortless in a human-human team, it is highly challenging for human-AI collaboration. To address this limitation, this paper takes a step towards collaborative plan acquisition, where humans and agents strive to learn and communicate with each other to acquire a complete plan for joint tasks. Specifically, we formulate a novel problem for agents to predict the missing task knowledge for themselves and for their partners based on rich perceptual and dialogue history. We extend a situated dialogue benchmark for symmetric collaborative tasks in a 3D blocks world and investigate computational strategies for plan acquisition. Our empirical results suggest that predicting the partner's missing knowledge is a more viable approach than predicting one's own. We show that explicit modeling of the partner's dialogue moves and mental states produces improved and more stable results than without. These results provide insight for future AI agents that can predict what knowledge their partner is missing and, therefore, can proactively communicate such information to help their partner acquire such missing knowledge toward a common understanding of joint tasks.
Published: 2023

14. DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents

Author: Ma, Ziqiao, VanDerPloeg, Ben, Bara, Cristian-Paul, Huang, Yidong, Kim, Eui-In, Gervits, Felix, Marge, Matthew, and Chai, Joyce
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Abstract: In the real world, autonomous driving agents navigate in highly dynamic environments full of unexpected situations where pre-trained models are unreliable. In these situations, what is immediately available to vehicles is often only human operators. Empowering autonomous driving agents with the ability to navigate in a continuous and dynamic environment and to communicate with humans through sensorimotor-grounded dialogue becomes critical. To this end, we introduce Dialogue On the ROad To Handle Irregular Events (DOROTHIE), a novel interactive simulation platform that enables the creation of unexpected situations on the fly to support empirical studies on situated communication with autonomous driving agents. Based on this platform, we created the Situated Dialogue Navigation (SDN), a navigation benchmark of 183 trials with a total of 8415 utterances, around 18.7 hours of control streams, and 2.9 hours of trimmed audio. SDN is developed to evaluate the agent's ability to predict dialogue moves from humans as well as generate its own dialogue moves and physical navigation actions. We further developed a transformer-based baseline model for these SDN tasks. Our empirical results indicate that language guided-navigation in a highly dynamic environment is an extremely difficult task for end-to-end models. These results will provide insight towards future work on robust autonomous driving agents. The DOROTHIE platform, SDN benchmark, and code for the baseline model are available at https://github.com/sled-group/DOROTHIE., Comment: Findings of EMNLP, 2022
Published: 2022

15. DANLI: Deliberative Agent for Following Natural Language Instructions

Author: Zhang, Yichi, Yang, Jianing, Pan, Jiayi, Storks, Shane, Devraj, Nikhil, Ma, Ziqiao, Yu, Keunwoo Peter, Bao, Yuwei, and Chai, Joyce
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Robotics
Abstract: Recent years have seen an increasing amount of work on embodied AI agents that can perform tasks by following human language instructions. However, most of these agents are reactive, meaning that they simply learn and imitate behaviors encountered in the training data. These reactive agents are insufficient for long-horizon complex tasks. To address this limitation, we propose a neuro-symbolic deliberative agent that, while following language instructions, proactively applies reasoning and planning based on its neural and symbolic representations acquired from past experience (e.g., natural language and egocentric vision). We show that our deliberative agent achieves greater than 70% improvement over reactive baselines on the challenging TEACh benchmark. Moreover, the underlying reasoning and planning processes, together with our modular framework, offer impressive transparency and explainability to the behaviors of the agent. This enables an in-depth understanding of the agent's capabilities, which shed light on challenges and opportunities for future embodied agents for instruction following. The code is available at https://github.com/sled-group/DANLI., Comment: Accepted in EMNLP 2022
Published: 2022

16. Partition-Based Active Learning for Graph Neural Networks

Author: Ma, Jiaqi, Ma, Ziqiao, Chai, Joyce, and Mei, Qiaozhu
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: We study the problem of semi-supervised learning with Graph Neural Networks (GNNs) in an active learning setup. We propose GraphPart, a novel partition-based active learning approach for GNNs. GraphPart first splits the graph into disjoint partitions and then selects representative nodes within each partition to query. The proposed method is motivated by a novel analysis of the classification error under realistic smoothness assumptions over the graph and the node features. Extensive experiments on multiple benchmark datasets demonstrate that the proposed method outperforms existing active learning methods for GNNs under a wide range of annotation budget constraints. In addition, the proposed method does not introduce additional hyperparameters, which is crucial for model training, especially in the active learning setting where a labeled validation set may not be available., Comment: Accepted to Transactions on Machine Learning Research (TMLR). Code available at: https://github.com/Mars-tin/GraphPart
Published: 2022

17. COVID-19 Epidemic Information Needs and Information Seeking Behavior of Overseas Chinese Students

Author: Wang, Lin, Ma, Ziqiao, and Jiang, Yuwei
Editors: Goos, Gerhard (Founding Editor), Hartmanis, Juris (Founding Editor), Bertino, Elisa (Editorial Board Member), Gao, Wen (Editorial Board Member), Steffen, Bernhard (Editorial Board Member), Woeginger, Gerhard (Editorial Board Member), Yung, Moti (Editorial Board Member), Toeppe, Katharina (editor), Yan, Hui (editor), and Chu, Samuel Kai Wah (editor)
Published: 2021

18. COVID-19 Epidemic Information Needs and Information Seeking Behavior of Overseas Chinese Students

Author: Wang, Lin, primary, Ma, Ziqiao, additional, and Jiang, Yuwei, additional
Published: 2021

19. Online Health Information Seeking Behavior of Users with Health Anxiety: An Empirical Study based on Eye-movement Experiment (Preprint)

Author: Wang, Lin, primary, Yang, Yutong, additional, and Ma, Ziqiao, additional
Published: 2023

20. Towards Collaborative Plan Acquisition through Theory of Mind Modeling in Situated Dialogue

Author: Bara, Cristian-Paul, primary, Ma, Ziqiao, additional, Yu, Yingzhuo, additional, Shah, Julie, additional, and Chai, Joyce, additional
Published: 2023

21. NLP Reproducibility For All: Understanding Experiences of Beginners

Author: Storks, Shane, primary, Yu, Keunwoo, additional, Ma, Ziqiao, additional, and Chai, Joyce, additional
Published: 2023

22. World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models

Author: Ma, Ziqiao, primary, Pan, Jiayi, additional, and Chai, Joyce, additional
Published: 2023

23. Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models

Author: Ma, Ziqiao, primary, Sansom, Jacob, additional, Peng, Run, additional, and Chai, Joyce, additional
Published: 2023

24. DANLI: Deliberative Agent for Following Natural Language Instructions

Author: Zhang, Yichi, primary, Yang, Jianing, additional, Pan, Jiayi, additional, Storks, Shane, additional, Devraj, Nikhil, additional, Ma, Ziqiao, additional, Yu, Keunwoo, additional, Bao, Yuwei, additional, and Chai, Joyce, additional
Published: 2022

25. DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents

Author: Ma, Ziqiao, primary, VanDerPloeg, Benjamin, additional, Bara, Cristian-Paul, additional, Huang, Yidong, additional, Kim, Eui-In, additional, Gervits, Felix, additional, Marge, Matthew, additional, and Chai, Joyce, additional
Published: 2022
