Search keyword "Electrical Engineering and Systems Science - Audio and Speech Processing": 41,389 results in total

Search Results

1. Parallel Stacked Aggregated Network for Voice Authentication in IoT-Enabled Smart Devices

2. Scaling Transformers for Low-Bitrate High-Quality Speech Coding

3. Musical composition and 2D cellular automata based on music intervals

4. Zero-shot Musical Stem Retrieval with Joint-Embedding Predictive Architectures

5. Noro: A Noise-Robust One-shot Voice Conversion System with Hidden Speaker Representation Capabilities

6. Memristive Nanowire Network for Energy Efficient Audio Classification: Pre-Processing-Free Reservoir Computing with Reduced Latency

7. Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook

8. Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis

9. V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow

10. A Voice-based Triage for Type 2 Diabetes using a Conversational Virtual Assistant in the Home Environment

11. AudioSetCaps: An Enriched Audio-Caption Dataset using Automated Generation Pipeline with Large Audio and Language Models

12. CoDiff-VC: A Codec-Assisted Diffusion Model for Zero-shot Voice Conversion

13. Parameter-Efficient Transfer Learning for Music Foundation Models

14. MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models

15. SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation

16. Fusion of Discrete Representations and Self-Augmented Representations for Multilingual Automatic Speech Recognition

17. Music2Fail: Transfer Music to Failed Recorder Style

18. TS3-Codec: Transformer-Based Simple Streaming Single Codec

19. GaussianSpeech: Audio-Driven Gaussian Avatars

20. Novel Class Discovery for Open Set Raga Classification

21. Multiple Choice Learning for Efficient Speech Separation with Many Speakers

22. Continuous Autoregressive Models with Noise Augmentation Avoid Error Accumulation

23. AMPS: ASR with Multimodal Paraphrase Supervision

24. Continual Learning in Machine Speech Chain Using Gradient Episodic Memory

25. Wearable intelligent throat enables natural speech in stroke patients with dysarthria

26. Towards Improved Objective Perceptual Audio Quality Assessment -- Part 1: A Novel Data-Driven Cognitive Model

27. How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario

28. Typical vs. Atypical Disfluency Classification: Introducing the IIITH-TISA Corpus and Temporal Context-Based Feature Representations

29. JPPO: Joint Power and Prompt Optimization for Accelerated Large Language Model Services

30. Speech Separation using Neural Audio Codecs with Embedding Loss

31. Disentangled-Transformer: An Explainable End-to-End Automatic Speech Recognition Model with Speech Content-Context Separation

32. Video-Guided Foley Sound Generation with Multimodal Controls

33. Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis

34. Scaling Speech-Text Pre-training with Synthetic Interleaved Data

35. Towards Maximum Likelihood Training for Transducer-based Streaming Speech Recognition

36. Comparative Analysis of ASR Methods for Speech Deepfake Detection

37. SKQVC: One-Shot Voice Conversion by K-Means Quantization with Self-Supervised Speech Representations

38. k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation Learning

39. A Cross-Corpus Speech Emotion Recognition Method Based on Supervised Contrastive Learning

40. Sonic: Shifting Focus to Global Audio Perception in Portrait Animation

41. The SVASR System for Text-dependent Speaker Verification (TdSV) AAIC Challenge 2024

42. A Training-Free Approach for Music Style Transfer with Latent Diffusion Models

43. DiM-Gestor: Co-Speech Gesture Generation with Adaptive Layer Normalization Mamba-2

44. State-Space Large Audio Language Models

45. Hindi audio-video-Deepfake (HAV-DF): A Hindi language-based Audio-video Deepfake Dataset

46. Comparison of Tiny Machine Learning Techniques for Embedded Acoustic Emission Analysis

47. Gotta Hear Them All: Sound Source Aware Vision to Audio Generation

48. Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network

49. Open-Amp: Synthetic Data Framework for Audio Effect Foundation Models

50. DAIRHuM: A Platform for Directly Aligning AI Representations with Human Musical Judgments applied to Carnatic Music
