410 results for "Takaaki Hori"
Search Results
1. Variable Attention Masking for Configurable Transformer Transducer Speech Recognition.
2. Low-Latency Online Streaming VideoQA Using Audio-Visual Transformers.
3. Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy.
4. Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR.
5. Sequence Transduction with Graph-Based Supervision.
6. Audio-Visual Scene-Aware Dialog and Reasoning Using Audio-Visual Transformers with Joint Student-Teacher Learning.
7. Momentum Pseudo-Labeling: Semi-Supervised ASR With Continuously Improving Pseudo-Labels.
8. End-to-End Speech Recognition: A Survey.
9. Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition.
10. Advanced Long-Context End-to-End Speech Recognition Using Context-Expanded Transformers.
11. Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition.
12. Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers.
13. Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training.
14. Capturing Multi-Resolution Context by Dilated Self-Attention.
15. Semi-Supervised Speech Recognition Via Graph-Based Temporal Classification.
16. All-in-One Transformer: Unifying Speech Recognition, Audio Tagging, and Event Detection.
17. Transformer-Based Long-Context End-to-End Speech Recognition.
18. Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory for End-to-End ASR.
19. Streaming Automatic Speech Recognition with the Transformer Model.
20. Unidirectional Neural Network Architectures for End-to-End Automatic Speech Recognition.
21. Joint Student-Teacher Learning for Audio-Visual Scene-Aware Dialog.
22. Analysis of Multilingual Sequence-to-Sequence Speech Recognition Systems.
23. Semi-Supervised Sequence-to-Sequence ASR Using Unpaired Speech and Text.
24. Vectorized Beam Search for CTC-Attention-Based Speech Recognition.
25. End-to-End Multilingual Multi-Speaker Speech Recognition.
26. CNN-based Multichannel End-to-End Speech Recognition for Everyday Home Environments.
27. A Comparative Study on Transformer vs RNN in Speech Applications.
28. Streaming End-to-End Speech Recognition with Joint CTC-Attention Based Models.
29. Cycle-Consistency Training for End-to-End Speech Recognition.
30. Triggered Attention for End-to-End Speech Recognition.
31. Language Model Integration Based on Memory Control for Sequence-to-Sequence Speech Recognition.
32. End-to-End Audio Visual Scene-Aware Dialog Using Multimodal Attention-Based Video Features.
33. Promising Accurate Prefix Boosting for Sequence-to-Sequence ASR.
34. Stream Attention-Based Multi-Array End-to-End Speech Recognition.
35. Multi-Stream End-to-End Speech Recognition.
36. ESPnet: End-to-End Speech Processing Toolkit.
37. Multimodal Attention for Fusion of Audio and Spatiotemporal Features for Video Description.
38. A Purely End-to-End System for Multi-Speaker Speech Recognition.
39. An End-to-End Language-Tracking Speech Recognizer for Mixed-Language Speech.
40. Speaker Adaptation for Multichannel End-to-End Speech Recognition.
41. End-to-End Multi-Speaker Speech Recognition.
42. Back-Translation-Style Data Augmentation for End-to-End ASR.
43. Multilingual Sequence-to-Sequence Speech Recognition: Architecture, Transfer Learning, and Language Modeling.
44. End-to-End Speech Recognition with Word-Based RNN Language Models.
Discovery Service for Jio Institute Digital Library