- Most and Least Retrievable Images in Visual-Language Query Systems. Liuwan Zhu, Rui Ning, Jiang Li, Chunsheng Xin, Hongyi Wu. 1-18
- Sports Video Analysis on Large-Scale Data. Dekun Wu, He Zhao 0004, Xingce Bao, Richard P. Wildes. 19-36
- Grounding Visual Representations with Texts for Domain Generalization. Seonwoo Min, Nokyung Park, Siwon Kim, Seunghyun Park, Jinkyu Kim. 37-53
- Bridging the Visual Semantic Gap in VLN via Semantically Richer Instructions. Joaquín Ossandón, Benjamín Earle, Álvaro Soto. 54-69
- StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation. Adyasha Maharana, Darryl Hannan, Mohit Bansal. 70-87
- VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance. Katherine Crowson, Stella Biderman, Daniel Kornis, Dashiell Stander, Eric Hallahan, Louis Castricato, Edward Raff. 88-105
- Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation. Xian Liu, Yinghao Xu, Qianyi Wu, Hang Zhou, Wayne Wu, Bolei Zhou. 106-125
- End-to-End Active Speaker Detection. Juan León Alcázar, Moritz Cordes, Chen Zhao 0002, Bernard Ghanem. 126-143
- Emotion Recognition for Multiple Context Awareness. Dingkang Yang, Shuai Huang, Shunli Wang, Yang Liu, Peng Zhai, Liuzhen Su, Mingcheng Li, Lihua Zhang. 144-162
- Adaptive Fine-Grained Sketch-Based Image Retrieval. Ayan Kumar Bhunia, Aneeshan Sain, Parth Hiren Shah, Animesh Gupta, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song. 163-181
- Quantized GAN for Complex Music Generation from Dance Videos. Ye Zhu, Kyle Olszewski, Yu Wu 0011, Panos Achlioptas, Menglei Chai, Yan Yan 0002, Sergey Tulyakov. 182-199
- Uncertainty-Aware Multi-modal Learning via Cross-Modal Random Network Prediction. Hu Wang, Jianpeng Zhang, Yuanhong Chen, Congbo Ma, Jodie Avery, Louise Hull, Gustavo Carneiro. 200-217
- Localizing Visual Sounds the Easy Way. Shentong Mo, Pedro Morgado 0001. 218-234
- Learning Visual Styles from Audio-Visual Associations. Tingle Li, Yichen Liu, Andrew Owens, Hang Zhao. 235-252
- Remote Respiration Monitoring of Moving Person Using Radio Signals. Jae-Ho Choi, Ki-Bong Kang, Kyung Tae Kim. 253-270
- Camera Pose Estimation and Localization with Active Audio Sensing. Karren Yang, Michael Firman, Eric Brachmann, Clément Godard. 271-291
- PACS: A Dataset for Physical Audiovisual CommonSense Reasoning. Samuel Yu, Peter Wu, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency. 292-309
- VoViT: Low Latency Graph-Based Audio-Visual Voice Separation Transformer. Juan F. Montesinos, Venkatesh S. Kadandale, Gloria Haro. 310-326
- Telepresence Video Quality Assessment. Zhenqiang Ying, Deepti Ghadiyaram, Alan C. Bovik. 327-347
- MultiMAE: Multi-modal Multi-task Masked Autoencoders. Roman Bachmann 0001, David Mizrahi, Andrei Atanov, Amir Zamir. 348-367
- AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation. Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey. 368-385
- Audio-Visual Segmentation. Jinxing Zhou, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang 0052, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang 0001, Yiran Zhong. 386-403
- Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression. Yeying Jin, Wenhan Yang, Robby T. Tan. 404-421
- Relationformer: A Unified Framework for Image-to-Graph Generation. Suprosanna Shit, Rajat Koner, Bastian Wittmann, Johannes C. Paetzold, Ivan Ezhov, Hongwei Li 0004, JiaZhen Pan, Sahand Sharifzadeh, Georgios Kaissis, Volker Tresp, Bjoern H. Menze. 422-439
- GAMa: Cross-View Video Geo-Localization. Shruti Vyas, Chen Chen 0001, Mubarak Shah. 440-456
- Revisiting a kNN-Based Image Classification System with High-Capacity Storage. Kengo Nakata, Youyang Ng, Daisuke Miyashita, Asuka Maki, Yu-Chieh Lin, Jun Deguchi. 457-474
- Geometric Representation Learning for Document Image Rectification. Hao Feng, Wengang Zhou, Jiajun Deng, Yuechen Wang, Houqiang Li. 475-492
- S2-VER: Semi-supervised Visual Emotion Recognition. Guoli Jia, Jufeng Yang. 493-509
- Image Coding for Machines with Omnipotent Feature Learning. Ruoyu Feng, Xin Jin 0014, Zongyu Guo, Runsen Feng, Yixin Gao, Tianyu He, Zhizheng Zhang 0004, Simeng Sun, Zhibo Chen 0001. 510-528
- Feature Representation Learning for Unsupervised Cross-Domain Image Retrieval. Conghui Hu, Gim Hee Lee. 529-544
- Fashionformer: A Simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition. Shilin Xu, Xiangtai Li, Jingbo Wang 0001, Guangliang Cheng, Yunhai Tong, Dacheng Tao. 545-563
- Semantic-Guided Multi-mask Image Harmonization. Xuqian Ren, Yifan Liu 0001. 564-579
- Learning an Isometric Surface Parameterization for Texture Unwrapping. Sagnik Das, Ke Ma 0001, Zhixin Shu, Dimitris Samaras. 580-597
- Towards Regression-Free Neural Networks for Diverse Compute Platforms. Rahul Duggal, Hao Zhou, Shuo Yang, Jun Fang, Yuanjun Xiong, Wei Xia. 598-614
- Relationship Spatialization for Depth Estimation. Xiaoyu Xu, Jiayan Qiu, Xinchao Wang, Zhou Wang. 615-637
- Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models. Chenfeng Xu, Shijia Yang, Tomer Galanti, Bichen Wu, Xiangyu Yue, Bohan Zhai, Wei Zhan, Peter Vajda, Kurt Keutzer, Masayoshi Tomizuka. 638-656
- FAR: Fourier Aerial Video Recognition. Divya Kothandaraman, Tianrui Guan, Xijun Wang, Shuowen Hu, Ming C. Lin, Dinesh Manocha. 657-676
- Translating a Visual LEGO Manual to a Machine-Executable Plan. Ruocheng Wang, Yunzhi Zhang, Jiayuan Mao, Chin-Yi Cheng, Jiajun Wu 0001. 677-694
- Fabric Material Recovery from Video Using Multi-scale Geometric Auto-Encoder. Junbang Liang, Ming C. Lin. 695-714
- MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment. Jie Ren, Wenteng Liang, Ran Yan, Luo Mai, Shiwen Liu, Xiao Liu. 715-731
- The One Where They Reconstructed 3D Humans and Environments in TV Shows. Georgios Pavlakos, Ethan Weber, Matthew Tancik, Angjoo Kanazawa. 732-749