Author: "Cheng, Zhenglin" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Cheng, Zhenglin"' showing total 10 results

Start Over Author "Cheng, Zhenglin"

10 results on '"Cheng, Zhenglin"'

1. GMem: A Modular Approach for Ultra-Efficient Generative Models

Author: Tang, Yi, Sun, Peng, Cheng, Zhenglin, and Lin, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Recent studies indicate that the denoising process in deep generative diffusion models implicitly learns and memorizes semantic information from the data distribution. These findings suggest that capturing more complex data distributions requires larger neural networks, leading to a substantial increase in computational demands, which in turn become the primary bottleneck in both training and inference of diffusion models. To this end, we introduce GMem: A Modular Approach for Ultra-Efficient Generative Models. Our approach GMem decouples the memory capacity from model and implements it as a separate, immutable memory set that preserves the essential semantic information in the data. The results are significant: GMem enhances both training, sampling efficiency, and diversity generation. This design on one hand reduces the reliance on network for memorize complex data distribution and thus enhancing both training and sampling efficiency. On ImageNet at $256 \times 256$ resolution, GMem achieves a $50\times$ training speedup compared to SiT, reaching FID $=7.66$ in fewer than $28$ epochs ($\sim 4$ hours training time), while SiT requires $1400$ epochs. Without classifier-free guidance, GMem achieves state-of-the-art (SoTA) performance FID $=1.53$ in $160$ epochs with only $\sim 20$ hours of training, outperforming LightningDiT which requires $800$ epochs and $\sim 95$ hours to attain FID $=2.17$., Comment: 9 pages, 5 figures, 3 tables
Published: 2024

2. Baichuan-Omni Technical Report

Author: Li, Yadong, Sun, Haoze, Lin, Mingan, Li, Tianpeng, Dong, Guosheng, Zhang, Tao, Ding, Bowen, Song, Wei, Cheng, Zhenglin, Huo, Yuqi, Chen, Song, Li, Xu, Pan, Da, Zhang, Shusen, Wu, Xin, Liang, Zheng, Liu, Jun, Lu, Keer, Zhao, Yaqi, Shen, Yanjun, Yang, Fan, Yu, Kaicheng, Lin, Tao, Xu, Jianhua, Zhou, Zenan, and Chen, Weipeng
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: The salient multimodal capabilities and interactive experience of GPT-4o highlight its critical role in practical applications, yet it lacks a high-performing open-source counterpart. In this paper, we introduce Baichuan-omni, the first open-source 7B Multimodal Large Language Model (MLLM) adept at concurrently processing and analyzing modalities of image, video, audio, and text, while delivering an advanced multimodal interactive experience and strong performance. We propose an effective multimodal training schema starting with 7B model and proceeding through two stages of multimodal alignment and multitask fine-tuning across audio, image, video, and text modal. This approach equips the language model with the ability to handle visual and audio data effectively. Demonstrating strong performance across various omni-modal and multimodal benchmarks, we aim for this contribution to serve as a competitive baseline for the open-source community in advancing multimodal understanding and real-time interaction.
Published: 2024

3. Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model

Author: Zhang, Wenqi, Cheng, Zhenglin, He, Yuanyu, Wang, Mengna, Shen, Yongliang, Tan, Zeqi, Hou, Guiyang, He, Mingqian, Ma, Yanna, Lu, Weiming, and Zhuang, Yueting
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Although most current large multimodal models (LMMs) can already understand photos of natural scenes and portraits, their understanding of abstract images, e.g., charts, maps, or layouts, and visual reasoning capabilities remains quite rudimentary. They often struggle with simple daily tasks, such as reading time from a clock, understanding a flowchart, or planning a route using a road map. In light of this, we design a multi-modal self-instruct, utilizing large language models and their code capabilities to synthesize massive abstract images and visual reasoning instructions across daily scenarios. Our strategy effortlessly creates a multimodal benchmark with 11,193 instructions for eight visual scenarios: charts, tables, simulated maps, dashboards, flowcharts, relation graphs, floor plans, and visual puzzles. \textbf{This benchmark, constructed with simple lines and geometric elements, exposes the shortcomings of most advanced LMMs} like Claude-3.5-Sonnet and GPT-4o in abstract image understanding, spatial relations reasoning, and visual element induction. Besides, to verify the quality of our synthetic data, we fine-tune an LMM using 62,476 synthetic chart, table and road map instructions. The results demonstrate improved chart understanding and map navigation performance, and also demonstrate potential benefits for other visual reasoning tasks. Our code is available at: \url{https://github.com/zwq2018/Multi-modal-Self-instruct}., Comment: The paper is accepted by EMNLP-24. Code: https://github.com/zwq2018/Multi-modal-Self-instruct dataset: https://huggingface.co/datasets/zwq2018/Multi-modal-Self-instruct Leaderboard: https://multi-modal-self-instruct.github.io/
Published: 2024

4. Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models

Author: Guo, Yongxin, Cheng, Zhenglin, Tang, Xiaoying, Tu, Zhaopeng, and Lin, Tao
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The Sparse Mixture of Experts (SMoE) has been widely employed to enhance the efficiency of training and inference for Transformer-based foundational models, yielding promising results. However, the performance of SMoE heavily depends on the choice of hyper-parameters, such as the number of experts and the number of experts to be activated (referred to as top-k), resulting in significant computational overhead due to the extensive model training by searching over various hyper-parameter configurations. As a remedy, we introduce the Dynamic Mixture of Experts (DynMoE) technique. DynMoE incorporates (1) a novel gating method that enables each token to automatically determine the number of experts to activate. (2) An adaptive process automatically adjusts the number of experts during training. Extensive numerical results across Vision, Language, and Vision-Language tasks demonstrate the effectiveness of our approach to achieve competitive performance compared to GMoE for vision and language tasks, and MoE-LLaVA for vision-language tasks, while maintaining efficiency by activating fewer parameters. Our code is available at https://github.com/LINs-lab/DynMoE.
Published: 2024

5. Effect of inlet and outlet positions on heat dissipation performance of lithium-ion battery cold plates: An analysis based on topology optimization

Author: Zhan, Sen, Cheng, Zhenglin, Yin, Yanli, Yu, Cheng, and Zhao, Chen
Published: 2023
Full Text: View/download PDF

6. Mercury accumulation in soil from atmospheric deposition in temperate steppe of Inner Mongolia, China

Author: Cheng, Zhenglin, Tang, Yi, Li, Engui, Wu, Qingru, Wang, Long, Liu, Kaiyun, Wang, Shuxiao, Huang, Yongmei, and Duan, Lei
Published: 2020
Full Text: View/download PDF

7. Sulfur-modified rice husk biochar: A green method for the remediation of mercury contaminated soil

Author: O'Connor, David, Peng, Tianyue, Li, Guanghe, Wang, Shuxiao, Duan, Lei, Mulder, Jan, Cornelissen, Gerard, Cheng, Zhenglin, Yang, Shengmao, and Hou, Deyi
Published: 2018
Full Text: View/download PDF

8. Is surface water acidification a serious regional issue in China?

Author: Yu, Qian, Zhang, Ting, Cheng, Zhenglin, Zhao, Bin, Mulder, Jan, Larssen, Thorjørn, Wang, Shuxiao, and Duan, Lei
Published: 2017
Full Text: View/download PDF

9. Effect of Inlet and Outlet Positions on Heat Dissipation Performance of Lithium-Ion Battery Cold Plates: An Analysis Based on Topology Optimization

Author: Zhan, Sen, primary, Cheng, Zhenglin, additional, Yin, Yanli, additional, Yu, Cheng, additional, and Zhao, Chen, additional
Published: 2023
Full Text: View/download PDF

10. Soil Nitrogen and Sulfur Leaching in a Subtropical Forest at a Transition State under Decreasing Atmospheric Deposition

Author: Ke, Piaopiao, primary, Si, Gaoyue, additional, Luo, Yao, additional, Cheng, Zhenglin, additional, Yu, Qian, additional, and Duan, Lei, additional
Published: 2021
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

10 results on '"Cheng, Zhenglin"'

1. GMem: A Modular Approach for Ultra-Efficient Generative Models

2. Baichuan-Omni Technical Report

3. Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model

4. Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models

5. Effect of inlet and outlet positions on heat dissipation performance of lithium-ion battery cold plates: An analysis based on topology optimization

6. Mercury accumulation in soil from atmospheric deposition in temperate steppe of Inner Mongolia, China

7. Sulfur-modified rice husk biochar: A green method for the remediation of mercury contaminated soil

8. Is surface water acidification a serious regional issue in China?

9. Effect of Inlet and Outlet Positions on Heat Dissipation Performance of Lithium-Ion Battery Cold Plates: An Analysis Based on Topology Optimization

10. Soil Nitrogen and Sulfur Leaching in a Subtropical Forest at a Transition State under Decreasing Atmospheric Deposition

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Region

Database

Publisher

10 results on '"Cheng, Zhenglin"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources