1. Med42 -- Evaluating Fine-Tuning Strategies for Medical LLMs: Full-Parameter vs. Parameter-Efficient Approaches
- Author
Christophe, Clément, Kanithi, Praveen K, Munjal, Prateek, Raha, Tathagata, Hayat, Nasir, Rajan, Ronnie, Al-Mahrooqi, Ahmed, Gupta, Avani, Salman, Muhammad Umar, Gosal, Gurpreet, Kanakiya, Bhargav, Chen, Charles, Vassilieva, Natalia, Amor, Boulbaba Ben, Pimentel, Marco AF, and Khan, Shadab
- Abstract
This study presents a comprehensive analysis and comparison of two predominant fine-tuning methodologies, full-parameter fine-tuning and parameter-efficient tuning, within the context of medical Large Language Models (LLMs). We developed and refined a series of LLMs, based on the Llama-2 architecture, specifically designed to enhance medical knowledge retrieval, reasoning, and question-answering capabilities. Our experiments systematically evaluate the effectiveness of these tuning strategies across well-known medical benchmarks. Notably, our medical LLM Med42 achieved an accuracy of 72% on the US Medical Licensing Examination (USMLE) datasets, setting a new performance standard for openly available medical LLMs. Through this comparative analysis, we aim to identify the most effective and efficient method for fine-tuning LLMs in the medical domain, thereby contributing significantly to the advancement of AI-driven healthcare applications.
- Comment
Published at AAAI 2024 Spring Symposium - Clinical Foundation Models
- Published
- 2024
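To make the contrast described in the abstract concrete, the following is a minimal sketch (not the authors' code) of how full-parameter fine-tuning differs from a parameter-efficient approach such as LoRA for a Llama-2-style causal language model, using the Hugging Face transformers and peft libraries. The base checkpoint name, LoRA rank, and target modules are illustrative assumptions, not values taken from the paper.

```python
# Sketch: full-parameter vs. parameter-efficient (LoRA) fine-tuning setup.
# All hyperparameters below are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# Full-parameter fine-tuning: every weight in the base model receives
# gradient updates, so the trainable parameter count equals the model size.
full_ft_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Full fine-tuning trains {full_ft_params:,} parameters")

# Parameter-efficient fine-tuning: the base weights are frozen and only
# small low-rank adapter matrices injected into the attention projections
# are trained.
lora_config = LoraConfig(
    r=16,                                  # adapter rank (assumed)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # assumed injection points
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()    # trainable params are a small fraction of the total
```

In either setup the resulting model would then be trained on the medical instruction data and evaluated on benchmarks such as USMLE-style question answering; the sketch only illustrates how the two strategies differ in which parameters are updated.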