Your search for Author "Panda, Rameswar" returned 235 results.


Search Results

1. Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

2. Scaling Granite Code Models to 128K Context

3. The infrastructure powering IBM's Gen AI model development

4. Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks

5. Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts

6. Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

7. Granite Code Models: A Family of Open Foundation Models for Code Intelligence

8. Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models

9. Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization

11. Scattered Mixture-of-Experts Implementation

12. Data Engineering for Scaling Language Models to 128K Context

13. API Pack: A Massive Multi-Programming Language Dataset for API Call Generation

14. Diversity Measurement and Subset Selection for Instruction Tuning Datasets

15. Gated Linear Attention Transformers with Hardware-Efficient Training

16. Learning Human Action Recognition Representations Without Real Humans

17. LangNav: Language as a Perceptual Representation for Navigation

18. Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models

20. Going Beyond Nouns With Vision & Language Models Using Synthetic Data

21. Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models

22. MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge

23. Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning

24. Learning to Grow Pretrained Models for Efficient Transformer Training

25. Energy Transformer

27. Synthetic Pre-Training Tasks for Neural Machine Translation

28. CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning

29. Teaching Structured Vision & Language Concepts to Vision & Language Models

30. ConStruct-VL: Data-Free Continual Structured VL Concepts Learning

31. Semi-Supervised Domain Adaptation with Auto-Encoder via Simultaneous Learning

32. FETA: Towards Specializing Foundation Models for Expert Task Applications

33. VALHALLA: Visual Hallucination for Machine Translation

34. Task2Sim: Towards Effective Pre-training and Transfer from Synthetic Data

35. Cascaded Multilingual Audio-Visual Learning from Videos

36. Selective Regression Under Fairness Criteria

37. Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

38. Dynamic Network Quantization for Efficient Video Inference

39. Can An Image Classifier Suffice For Action Recognition?

40. IA-RED²: Interpretability-Aware Redundancy Reduction for Vision Transformers

41. Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data

42. RegionViT: Regional-to-Local Attention for Vision Transformers

43. AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition

44. Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos

45. Detector-Free Weakly Supervised Grounding by Separation

46. CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification

47. A Broad Study on the Transferability of Visual Representations with Contrastive Learning

48. Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths

49. VA-RED²: Video Adaptive Redundancy Reduction

50. AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition
