110 results for "Takaaki Hori"
Search Results
2. Low-Latency Online Streaming VideoQA Using Audio-Visual Transformers.
3. Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy.
4. Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR.
5. Sequence Transduction with Graph-Based Supervision.
6. Audio-Visual Scene-Aware Dialog and Reasoning Using Audio-Visual Transformers with Joint Student-Teacher Learning.
7. Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition.
8. Advanced Long-Context End-to-End Speech Recognition Using Context-Expanded Transformers.
9. Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition.
10. Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers.
11. Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training.
12. Capturing Multi-Resolution Context by Dilated Self-Attention.
13. Semi-Supervised Speech Recognition Via Graph-Based Temporal Classification.
14. All-in-One Transformer: Unifying Speech Recognition, Audio Tagging, and Event Detection.
15. Transformer-Based Long-Context End-to-End Speech Recognition.
16. Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory for End-to-End ASR.
17. Streaming Automatic Speech Recognition with the Transformer Model.
18. Unidirectional Neural Network Architectures for End-to-End Automatic Speech Recognition.
19. Joint Student-Teacher Learning for Audio-Visual Scene-Aware Dialog.
20. Analysis of Multilingual Sequence-to-Sequence Speech Recognition Systems.
21. Semi-Supervised Sequence-to-Sequence ASR Using Unpaired Speech and Text.
22. Vectorized Beam Search for CTC-Attention-Based Speech Recognition.
23. End-to-End Multilingual Multi-Speaker Speech Recognition.
24. CNN-based Multichannel End-to-End Speech Recognition for Everyday Home Environments.
25. A Comparative Study on Transformer vs RNN in Speech Applications.
26. Streaming End-to-End Speech Recognition with Joint CTC-Attention Based Models.
27. Cycle-consistency Training for End-to-end Speech Recognition.
28. Triggered Attention for End-to-end Speech Recognition.
29. Language Model Integration Based on Memory Control for Sequence to Sequence Speech Recognition.
30. End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features.
31. Promising Accurate Prefix Boosting for Sequence-to-sequence ASR.
32. Stream Attention-based Multi-array End-to-end Speech Recognition.
33. ESPnet: End-to-End Speech Processing Toolkit.
34. Multimodal Attention for Fusion of Audio and Spatiotemporal Features for Video Description.
35. A Purely End-to-End System for Multi-speaker Speech Recognition.
36. An End-to-End Language-Tracking Speech Recognizer for Mixed-Language Speech.
37. Speaker Adaptation for Multichannel End-to-End Speech Recognition.
38. End-to-End Multi-Speaker Speech Recognition.
39. Back-Translation-Style Data Augmentation for end-to-end ASR.
40. Multilingual Sequence-to-Sequence Speech Recognition: Architecture, Transfer Learning, and Language Modeling.
41. End-to-end Speech Recognition with Word-Based RNN Language Models.
42. Attention-Based Multimodal Fusion for Video Description.
43. Advances in Joint CTC-Attention Based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM.
44. Early and late integration of audio features for automatic video description.
45. Language independent end-to-end architecture for joint language identification and speech recognition.
46. Multi-level language modeling and decoding for open vocabulary end-to-end speech recognition.
47. Joint CTC/attention decoding for end-to-end speech recognition.
48. Joint CTC-attention based end-to-end speech recognition using multi-task learning.
49. Student-teacher network learning with enhanced features.
50. BLSTM-HMM hybrid system combined with sound activity detection network for polyphonic Sound Event Detection.
Discovery Service for Jio Institute Digital Library