Search

Your search for "Zisserman A" returned 57 results in total.

Search Constraints

Author: "Zisserman A"; Publisher: arxiv

Search Results

1. Multi-Modal Classifiers for Open-Vocabulary Object Detection

2. TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement

3. Epic-Sounds: A Large-scale Dataset of Actions That Sound

4. Verbs in Action: Improving verb understanding in video-language models

5. VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge

6. Zorro: the masked multimodal transformer

7. Three ways to improve feature alignment for open vocabulary detection

8. AutoAD: Movie Description in Context

9. SpineNetV2: Automated detection, labelling and radiological grading of clinical MR scans

10. Weakly-supervised Fingerspelling Recognition in British Sign Language Videos

11. VoxSRC 2021: The Third VoxCeleb Speaker Recognition Challenge

12. Turbo Training with Token Dropout

13. Is an Object-Centric Video Representation Beneficial for Transfer?

14. Flamingo: a Visual Language Model for Few-Shot Learning

15. HiP: Hierarchical Perceiver

16. Omnimatte: Associating Objects and Their Effects in Video

17. Visual Keyword Spotting with Attention

18. Open-Set Recognition: a Good Closed-Set Classifier is All You Need?

19. Comment on Stochastic Polyak Step-Size: Performance of ALI-G

20. Perceiver IO: A General Architecture for Structured Inputs & Outputs

21. Self-supervised Video Object Segmentation by Motion Grouping

22. BBC-Oxford British Sign Language Dataset

23. NeRF in detail: Learning to sample for view synthesis

24. The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020)

25. Self-Supervised MultiModal Versatile Networks

26. Self-supervised Co-training for Video Representation Learning

27. QuerYD: A video dataset with high-quality text and audio narrations

28. Speech2Action: Cross-modal Supervision for Action Recognition

29. Inducing Predictive Uncertainty Estimation for Face Recognition

30. CrossTransformers: spatially-aware few-shot transfer

31. The AVA-Kinetics Localized Human Actions Video Dataset

32. Seeing wake words: Audio-visual Keyword Spotting

33. VoxSRC 2019: The first VoxCeleb Speaker Recognition Challenge

34. Semi-Supervised Learning with Scarce Annotations

35. Sim2real transfer learning for 3D human pose estimation: motion to the rescue

36. Object Discovery with a Copy-Pasting GAN

37. Unsupervised Learning of Object Keypoints for Perception and Control

38. My lips are concealed: Audio-visual speech enhancement through obstructions

39. Count, Crop and Recognise: Fine-Grained Recognition in the Wild

40. Exploiting temporal context for 3D human pose estimation in the wild

41. VoxCeleb2: Deep Speaker Recognition

42. Learning to Navigate in Cities Without a Map

43. Comparator Networks

44. Self-supervised learning of a facial attribute embedding from video

45. LRS3-TED: a large-scale dataset for visual speech recognition

46. The Visual Centrifuge: Model-Free Layered Video Representations

47. Seeing Voices and Hearing Faces: Cross-modal biometric matching

48. VoxCeleb: a large-scale speaker identification dataset

49. The Kinetics Human Action Video Dataset

50. You said that?
