- Pixel screening based intermediate correction for blind deblurring. Meina Zhang, Yingying Fang, Guoxi Ni, Tieyong Zeng. 1-9
- Clipped Hyperbolic Classifiers Are Super-Hyperbolic Classifiers. Yunhui Guo, Xudong Wang, Yubei Chen, Stella X. Yu. 1-10
- When Does Contrastive Visual Representation Learning Work? Elijah Cole, Xuan Yang, Kimberly Wilber, Oisin Mac Aodha, Serge J. Belongie. 1-10
- Large-Scale Pre-training for Person Re-identification with Noisy Labels. Dengpan Fu, Dongdong Chen 0001, Hao Yang, Jianmin Bao, Lu Yuan, Lei Zhang 0001, Houqiang Li, Fang Wen, Dong Chen 0003. 1-11
- CO-SNE: Dimensionality Reduction and Visualization for Hyperbolic Data. Yunhui Guo, Haoran Guo, Stella X. Yu. 11-20
- Efficient Deep Embedded Subspace Clustering. Jinyu Cai, Jicong Fan, Wenzhong Guo, Shiping Wang, Yunhe Zhang, Zhao Zhang 0001. 21-30
- Noise Is Also Useful: Negative Correlation-Steered Latent Contrastive Learning. Jiexi Yan, Lei Luo 0001, Chenghao Xu, Cheng Deng, Heng Huang. 31-40
- Active Learning for Open-set Annotation. Kun-Peng Ning, Xun Zhao, Yu Li 0003, Sheng-Jun Huang. 41-49
- Understanding and Increasing Efficiency of Frank-Wolfe Adversarial Training. Theodoros Tsiligkaridis, Jay Roberts. 50-59
- Robust Optimization as Data Augmentation for Large-scale Graphs. Kezhi Kong, Guohao Li, Mucong Ding, Zuxuan Wu, Chen Zhu, Bernard Ghanem, Gavin Taylor, Tom Goldstein. 60-69
- A Re-Balancing Strategy for Class-Imbalanced Classification Based on Instance Difficulty. Sihao Yu, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Zizhen Wang, Xueqi Cheng. 70-79
- The Devil is in the Margin: Margin-based Label Smoothing for Network Calibration. Bingyuan Liu, Ismail Ben Ayed, Adrian Galdran, Jose Dolz. 80-88
- Towards Better Plasticity-Stability Trade-off in Incremental Learning: A Simple Linear Connector. Guoliang Lin, Hanlu Chu, Hanjiang Lai. 89-98
- GCR: Gradient Coreset based Replay Buffer Selection for Continual Learning. Rishabh Tiwari, KrishnaTeja Killamsetty, Rishabh K. Iyer, Pradeep Shenoy. 99-108
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning. Qingsen Yan, Dong Gong, Yuhang Liu, Anton van den Hengel, Javen Qinfeng Shi. 109-118
- A variational Bayesian method for similarity learning in non-rigid image registration. Daniel Grzech, Mohammad Farid Azampour, Ben Glocker, Julia A. Schnabel, Nassir Navab, Bernhard Kainz, Loïc Le Folgoc. 119-128
- Learning to Learn by Jointly Optimizing Neural Architecture and Weights. Yadong Ding, Yu Wu 0011, Chengyue Huang, Siliang Tang, Yi Yang 0001, Longhui Wei, Yueting Zhuang, Qi Tian 0001. 129-138
- Learning to Prompt for Continual Learning. Zifeng Wang 0002, Zizhao Zhang, Chen-Yu Lee, Han Zhang 0010, Ruoxi Sun, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer G. Dy, Tomas Pfister. 139-149
- Meta-attention for ViT-backed Continual Learning. Mengqi Xue, Haofei Zhang, Jie Song, Mingli Song. 150-159
- Multi-Frame Self-Supervised Depth with Transformers. Vitor Guizilini, Rares Ambrus, Dian Chen 0005, Sergey Zakharov, Adrien Gaidon. 160-170
- Continual Learning with Lifelong Vision Transformer. Zhen Wang 0030, Liu Liu 0014, Yiqun Duan, Yajing Kong, Dacheng Tao. 171-181
- Rethinking Bayesian Deep Learning Methods for Semi-Supervised Volumetric Medical Image Segmentation. Jianfeng Wang, Thomas Lukasiewicz. 182-190
- Revisiting Random Channel Pruning for Neural Network Compression. Yawei Li, Kamil Adamczewski, Wen Li 0001, Shuhang Gu, Radu Timofte, Luc Van Gool. 191-201
- Deep Safe Multi-view Clustering: Reducing the Risk of Clustering Performance Degradation Caused by View Increase. Huayi Tang, Yong Liu 0018. 202-211
- Hypergraph-Induced Semantic Tuplet Loss for Deep Metric Learning. Jongin Lim, Sangdoo Yun, Seulki Park, Jin Young Choi 0002. 212-222
- Towards Robust and Reproducible Active Learning using Neural Networks. Prateek Munjal, Nasir Hayat, Munawar Hayat, Jamshid Sourati, Shadab Khan. 223-232
- Non-Iterative Recovery from Nonlinear Observations using Generative Models. Jiulong Liu, Zhaoqiang Liu. 233-243
- Gaussian Process Modeling of Approximate Inference Errors for Variational Autoencoders. Minyoung Kim. 244-253
- Robust Combination of Distributed Gradients Under Adversarial Perturbations. Kwang In Kim. 254-263
- Do learned representations respect causal relationships? Lan Wang, Vishnu Naresh Boddeti. 264-274
- How Much More Data Do I Need? Estimating Requirements for Downstream Tasks. Rafid Mahmood, James Lucas, David Acuna, Daiqing Li, Jonah Philion, Jose M. Alvarez, Zhiding Yu, Sanja Fidler, Marc T. Law. 275-284
- Pushing the Envelope of Gradient Boosting Forests via Globally-Optimized Oblique Trees. Magzhan Gabidolla, Miguel Á. Carreira-Perpiñán. 285-294
- Contrastive Test-Time Adaptation. Dian Chen 0001, Dequan Wang, Trevor Darrell, Sayna Ebrahimi. 295-305
- AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation. Paritosh Mittal, Yen-Chi Cheng, Maneesh Singh 0001, Shubham Tulsiani. 306-315
- Selective-Supervised Contrastive Learning with Noisy Labels. Shikun Li, Xiaobo Xia, Shiming Ge, Tongliang Liu. 316-325
- RecDis-SNN: Rectifying Membrane Potential Distribution for Directly Training Spiking Neural Networks. Yufei Guo, Xinyi Tong, Yuanpei Chen, Liwen Zhang, Xiaode Liu, Zhe Ma, Xuhui Huang. 326-335
- Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction. M. Saquib Sarfraz, Marios Koulakis, Constantin Seibold, Rainer Stiefelhagen. 336-345
- Scalable Penalized Regression for Noise Detection in Learning with Noisy Labels. Yikai Wang, Xinwei Sun 0001, Yanwei Fu. 346-355
- Nested Hyperbolic Spaces for Dimensionality Reduction and Hyperbolic NN Design. Xiran Fan, Chun-Hao Yang, Baba C. Vemuri. 356-365
- Learning Structured Gaussians to Approximate Deep Ensembles. Ivor J. A. Simpson, Sara Vicente, Neill D. F. Campbell. 366-374
- Out-of-distribution Generalization with Causal Invariant Transformations. Ruoyu Wang 0016, Mingyang Yi, Zhitang Chen, Shengyu Zhu 0001. 375-385
- Split Hierarchical Variational Compression. Tom Ryder, Chen Zhang, Ning Kang 0001, Shifeng Zhang. 386-395
- Implicit Feature Decoupling with Depthwise Quantization. Iordanis Fostiropoulos, Barry W. Boehm. 396-405
- Understanding Uncertainty Maps in Vision with Statistical Testing. Jurijs Nazarovs, Zhichun Huang, Songwong Tasneeyapant, Rudrasis Chakraborty, Vikas Singh. 406-416
- A Hybrid Quantum-Classical Algorithm for Robust Fitting. Anh-Dzung Doan, Michele Sasdelli, David Suter, Tat-Jun Chin. 417-427
- A Scalable Combinatorial Solver for Elastic Geometrically Consistent 3D Shape Matching. Paul Roetzer, Paul Swoboda, Daniel Cremers, Florian Bernard. 428-438
- FastDOG: Fast Discrete Optimization on GPU. Ahmed Abbas, Paul Swoboda. 439-449
- Data-Free Network Compression via Parametric Non-uniform Mixed Precision Quantization. Vladimir Chikin, Mikhail Antiukh. 450-459
- AdaSTE: An Adaptive Straight-Through Estimator to Train Binary Neural Networks. Huu Le, Rasmus Kjær Høier, Che-Tsung Lin, Christopher Zach. 460-469
- Training Quantised Neural Networks with STE Variants: the Additive Noise Annealing Algorithm. Matteo Spallanzani, Gian Paolo Leonardi, Luca Benini. 470-479
- GLASS: Geometric Latent Augmentation for Shape Spaces. Sanjeev Muralikrishnan, Siddhartha Chaudhuri, Noam Aigerman, Vladimir G. Kim, Matthew Fisher, Niloy J. Mitra. 470-479
- AME: Attention and Memory Enhancement in Hyper-Parameter Optimization. Nuo Xu, Jianlong Chang, Xing Nie, Chunlei Huo, Shiming Xiang, Chunhong Pan. 480-489
- Efficient Maximal Coding Rate Reduction by Variational Forms. Christina Baek, Ziyang Wu, Kwan Ho Ryan Chan, Tianjiao Ding, Yi Ma 0001, Benjamin D. Haeffele. 490-498
- A Unified Framework for Implicit Sinkhorn Differentiation. Marvin Eisenberger, Aysim Toker, Laura Leal-Taixé, Florian Bernard, Daniel Cremers. 499-508
- Computing Wasserstein-$p$ Distance Between Images with Linear Cost. Yidong Chen, Chen Li, Zhonghua Lu. 509-518
- An Iterative Quantum Approach for Transformation Estimation from Point Sets. Natacha Luete Meli, Florian Mannel, Jan Lellmann. 519-527
- BoosterNet: Improving Domain Generalization of Deep Neural Nets using Culpability-Ranked Features. Nourhan Bayasi, Ghassan Hamarneh, Rafeef Garbi. 528-538
- Pooling Revisited: Your Receptive Field is Suboptimal. Dong-Hwan Jang, Sanghyeok Chu, Joonhyuk Kim, Bohyung Han. 539-548
- Why Discard if You can Recycle?: A Recycling Max Pooling Module for 3D Point Cloud Analysis. Jiajing Chen, Burak Kakillioglu, Huantao Ren, Senem Velipasalar. 549-557
- Online Convolutional Reparameterization. Mu Hu, Junyi Feng, Jiashen Hua, Baisheng Lai, Jianqiang Huang, Xiaojin Gong, Xiansheng Hua 0001. 558-567
- RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality. Xiaohan Ding, Honghao Chen, Xiangyu Zhang 0005, Jungong Han, Guiguang Ding. 568-577
- DyRep: Bootstrapping Training with Dynamic Re-parameterization. Tao Huang 0020, Shan You, Bohan Zhang, Yuxuan Du, Fei Wang 0032, Chen Qian 0006, Chang Xu 0002. 578-587
- Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free. Tianlong Chen, Zhenyu Zhang, Yihua Zhang, Shiyu Chang, Sijia Liu 0001, Zhangyang Wang. 588-599
- Condensing CNNs with Partial Differential Equations. Anil Kag, Venkatesh Saligrama. 600-609
- Deep Equilibrium Optical Flow Estimation. Shaojie Bai, Zhengyang Geng, Yash Savani, J. Zico Kolter. 610-620
- Frame Averaging for Equivariant Shape Space Learning. Matan Atzmon, Koki Nagano, Sanja Fidler, Sameh Khamis, Yaron Lipman. 621-631
- Dual-Generator Face Reenactment. Gee-Sern Hsu, Chun-Hung Tsai, Hung-Yi Wu. 632-640
- Convolution of Convolution: Let Kernels Spatially Collaborate. Rongzhen Zhao, Jian Li, Zhenzhi Wu. 641-650
- SASIC: Stereo Image Compression with Latent Shifts and Stereo Attention. Matthias Wödlinger, Jan Kotera, Jan Xu, Robert Sablatnig. 651-660
- RADU: Ray-Aligned Depth Update Convolutions for ToF Data Denoising. Michael Schelling, Pedro Hermosilla, Timo Ropinski. 661-670
- Co-domain Symmetry for Complex-Valued Deep Learning. Utkarsh Singhal, Yifei Xing, Stella X. Yu. 671-680
- Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention. Tong Yu, Ruslan Khalitov, Lei Cheng, Zhirong Yang. 681-690
- Compressing Models with Few Samples: Mimicking then Replacing. Huanyu Wang, Junjie Liu, Xin Ma, Yang Yong, Zhenhua Chai, Jianxin Wu 0001. 691-700
- Total Variation Optimization Layers for Computer Vision. Raymond A. Yeh, Yuan-Ting Hu, Zhongzheng Ren, Alexander G. Schwing. 701-711
- AIM: an Auto-Augmenter for Images and Meshes. Vinit Veerendraveer Singh, Chandra Kambhamettu. 712-721
- Recurrent Variational Network: A Deep Learning Inverse Problem Solver applied to the task of Accelerated MRI Reconstruction. George Yiasemis, Jan-Jakob Sonke, Clarisa Sánchez, Jonas Teuwen. 722-731
- Deep orientation-aware functional maps: Tackling symmetry issues in Shape Matching. Nicolas Donati, Etienne Corman, Maks Ovsjanikov. 732-741
- Weakly-supervised Metric Learning with Cross-Module Communications for the Classification of Anterior Chamber Angle Images. Jingqi Huang, Yue Ning 0001, Dong Nie, Linan Guan, Xiping Jia. 742-752
- Delving into the Estimation Shift of Batch Normalization in a Network. Lei Huang, Yi Zhou, Tian Wang, Jie Luo, Xianglong Liu. 753-762
- Generalizing Interactive Backpropagating Refinement for Dense Prediction Networks. Fanqing Lin, Brian Price, Tony R. Martinez. 763-772
- Brain-inspired Multilayer Perceptron with Spiking Neurons. Wenshuo Li, Hanting Chen, Jianyuan Guo, Ziyang Zhang, Yunhe Wang 0001. 773-783
- Smooth Maximum Unit: Smooth Activation Function for Deep Networks using Smoothing Maximum Technique. Koushik Biswas, Sandeep Kumar 0002, Shilpak Banerjee, Ashish Kumar Pandey. 784-793
- Revisiting Weakly Supervised Pre-Training of Visual Perception Models. Mannat Singh, Laura Gustafson, Aaron Adcock, Vinicius de Freitas Reis, Bugra Gedik, Raj Prateek Kosaraju, Dhruv Mahajan 0001, Ross B. Girshick, Piotr Dollár, Laurens van der Maaten. 794-804
- On the Integration of Self-Attention and Convolution. Xuran Pan, Chunjiang Ge, Rui Lu, Shiji Song, Guanfu Chen, Zeyi Huang, Gao Huang. 805-815
- Hire-MLP: Vision MLP via Hierarchical Rearrangement. Jianyuan Guo, Yehui Tang, Kai Han 0002, Xinghao Chen 0001, Han Wu, Chao Xu 0006, Chang Xu 0002, Yunhe Wang 0001. 816-826
- Stable Long-Term Recurrent Video Super-Resolution. Benjamin Naoto Chiche, Arnaud Woiselle, Joana Frontera-Pons, Jean-Luc Starck. 827-836
- Single-Domain Generalized Object Detection in Urban Scene via Cyclic-Disentangled Self-Distillation. Aming Wu, Cheng Deng. 837-846
- Progressive End-to-End Object Detection in Crowded Scenes. Anlin Zheng, Yuang Zhang, Xiangyu Zhang, Xiaojuan Qi, Jian Sun. 847-856
- Zero-Shot Text-Guided Object Generation with Dream Fields. Ajay Jain, Ben Mildenhall, Jonathan T. Barron, Pieter Abbeel, Ben Poole. 857-866
- ISNet: Shape Matters for Infrared Small Target Detection. Mingjin Zhang, Rui Zhang, Yuxiang Yang, Haichen Bai, Jing Zhang, Jie Guo. 867-876
- Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving. Yi-nan Chen, Hang Dai, Yong Ding 0003. 877-887
- CLRNet: Cross Layer Refinement Network for Lane Detection. Tu Zheng, Yifei Huang, Yang Liu, Wenjian Tang, Zheng Yang, Deng Cai 0001, Xiaofei He 0001. 888-897
- CAT-Det: Contrastively Augmented Transformer for Multimodal 3D Object Detection. Yanan Zhang, Jiaxin Chen, Di Huang 0001. 898-907
- Modality-Agnostic Learning for Radar-Lidar Fusion in Vehicle Detection. Yu-Jhe Li, Jinhyung Park, Matthew O'Toole, Kris Kitani. 908-917
- Group Contextualization for Video Recognition. Yanbin Hao, Hao Zhang, Chong-Wah Ngo, Xiangnan He 0001. 918-928
- Learning Transferable Human-Object Interaction Detector with Natural Language Supervision. Suchen Wang, Yueqi Duan, Henghui Ding, Yap-Peng Tan, Kim-Hui Yap, Junsong Yuan. 929-938
- Accelerating DETR Convergence via Semantic-Aligned Matching. Gongjie Zhang, Zhipeng Luo, Yingchen Yu, Kaiwen Cui, Shijian Lu. 939-948
- Efficient Video Instance Segmentation via Tracklet Query and Proposal. Jialian Wu, Sudhir Yarram, Hui Liang 0003, Tian Lan, Junsong Yuan, Jayan Eledath, Gérard G. Medioni. 949-958
- Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation. Zhaozheng Chen, Tan Wang, Xiongwei Wu, Xian-Sheng Hua 0001, Hanwang Zhang, Qianru Sun. 959-968
- Democracy Does Matter: Comprehensive Feature Mining for Co-Salient Object Detection. Siyue Yu, Jimin Xiao, Bingfeng Zhang, Eng Gee Lim. 969-978
- C²AM: Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation. Jinheng Xie, Jianfeng Xiang, Junliang Chen, Xianxu Hou, Xiaodong Zhao, LinLin Shen. 979-988
- Sketching without Worrying: Noise-Tolerant Sketch-Based Image Retrieval. Ayan Kumar Bhunia, Subhadeep Koley, Abdullah Faiz Ur Rahman Khilji, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song. 989-998
- AutoLoss-Zero: Searching Loss Functions from Scratch for Generic Tasks. Hao Li, Tianwen Fu, Jifeng Dai, Hongsheng Li, Gao Huang, Xizhou Zhu. 999-1008
- Consistency Learning via Decoding Path Augmentation for Transformers in Human Object Interaction Detection. Jihwan Park, Seungjun Lee, Hwan Heo, Hyeong Kyu Choi, Hyunwoo J. Kim. 1009-1018
- A Proposal-based Paradigm for Self-supervised Sound Source Localization in Videos. Hanyu Xuan, Zhiliang Wu, Jian Yang, Yan Yan 0002, Xavier Alameda-Pineda. 1019-1028
- SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Normalization. Canjie Luo, Lianwen Jin, Jingdong Chen. 1029-1038
- Towards End-to-End Unified Scene Text Detection and Layout Analysis. Shangbang Long, Siyang Qin, Dmitry Panteleev, Alessandro Bissacco, Yasuhisa Fujii, Michalis Raptis. 1039-1049
- Clothes-Changing Person Re-identification with RGB Modality Only. Xinqian Gu, Hong Chang, Bingpeng Ma, Shutao Bai, Shiguang Shan, Xilin Chen 0001. 1050-1059
- MonoJSG: Joint Semantic and Geometric Cost Volume for Monocular 3D Object Detection. Qing Lian, Peiliang Li 0001, Xiaozhi Chen. 1060-1069
- Homography Loss for Monocular 3D Object Detection. Jiaqi Gu, Bojian Wu, Lubin Fan, Jianqiang Huang, Shen Cao, Zhiyu Xiang, Xian-Sheng Hua 0001. 1070-1079
- TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers. Xuyang Bai, Zeyu Hu, Xinge Zhu, Qingqiu Huang, Yilun Chen, Hongbo Fu, Chiew-Lan Tai. 1080-1089
- TWIST: Two-Way Inter-label Self-Training for Semi-supervised 3D Instance Segmentation. Ruihang Chu, Xiaoqing Ye, Zhengzhe Liu, Xiao Tan 0001, Xiaojuan Qi, Chi-Wing Fu, Jiaya Jia. 1090-1099
- RBGNet: Ray-based Grouping for 3D Object Detection. Haiyang Wang, Shaoshuai Shi, Ze Yang 0003, Rongyao Fang, Qi Qian, Hongsheng Li 0001, Bernt Schiele, Liwei Wang 0001. 1100-1109
- Voxel Field Fusion for 3D Object Detection. Yanwei Li, Xiaojuan Qi, Yukang Chen, Liwei Wang 0009, Zeming Li, Jian Sun, Jiaya Jia. 1110-1119
- Learning to Detect Mobile Objects from LiDAR Scans Without Labels. Yurong You, Katie Luo, Cheng Perng Phoo, Wei-Lun Chao, Wen Sun, Bharath Hariharan, Mark E. Campbell, Kilian Q. Weinberger. 1120-1130
- OccAM's Laser: Occlusion-based Attribution Maps for 3D Object Detectors on LiDAR Data. David Schinagl, Georg Krispel, Horst Possegger, Peter M. Roth, Horst Bischof. 1131-1140
- Confidence Propagation Cluster: Unleash Full Potential of Object Detectors. Yichun Shen, Wanli Jiang, Zhen Xu, Rundong Li, Junghyun Kwon. 1141-1151
- TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization. Sijie Zhu, Mubarak Shah, Chen Chen 0001. 1152-1161
- A Voxel Graph CNN for Object Classification with Event Cameras. Yongjian Deng, Hao Chen, Hai Liu 0004, Youfu Li. 1162-1171
- OSKDet: Orientation-sensitive Keypoint Localization for Rotated Object Detection. Dongchen Lu, Dongmei Li, Yali Li 0001, Shengjin Wang. 1172-1182
- Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes. Yang You, Zelin Ye, Yujing Lou, Chengkun Li, Yong-Lu Li, Lizhuang Ma, Weiming Wang, Cewu Lu. 1183-1192
- Category Contrast for Unsupervised Domain Adaptation in Visual Tasks. Jiaxing Huang 0001, Dayan Guan, Aoran Xiao, Shijian Lu, Ling Shao 0001. 1193-1204
- Scaling Vision Transformers. Xiaohua Zhai, Alexander Kolesnikov 0003, Neil Houlsby, Lucas Beyer. 1204-1213
- Amodal Segmentation through Out-of-Task and Out-of-Distribution Generalization with a Bayesian Model. Yihong Sun, Adam Kortylewski, Alan L. Yuille. 1205-1214
- GANSeg: Learning to Segment by Unsupervised Hierarchical Image Generation. Xingzhe He, Bastian Wandt, Helge Rhodin. 1215-1225
- Segment-Fusion: Hierarchical Context Fusion for Robust 3D Semantic Segmentation. Anirud Thyagharajan, Benjamin Ummenhofer, Prashant Laddha, Om Ji Omer, Sreenivas Subramoney. 1226-1235
- Deep Hierarchical Semantic Segmentation. Liulei Li, Tianfei Zhou, Wenguan Wang, Jianwu Li, Yi Yang 0001. 1236-1247
- Semantic Segmentation by Early Region Proxy. Yifan Zhang, Bo Pang, Cewu Lu. 1248-1258
- Panoptic, Instance and Semantic Relations: A Relational Context Encoder to Enhance Panoptic Segmentation. Shubhankar Borse, Hyojin Park, Hong Cai, Debasmit Das, Risheek Garrepalli, Fatih Porikli. 1259-1269
- Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers. Zhiqi Li, Wenhai Wang, Enze Xie, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo, Tong Lu. 1270-1279
- Masked-attention Mask Transformer for Universal Image Segmentation. Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar. 1280-1289
- FocalClick: Towards Practical Interactive Image Segmentation. Xi Chen, Zhiyan Zhao, Yilei Zhang, Manni Duan, Donglian Qi, Hengshuang Zhao. 1290-1299
- High Quality Segmentation for Ultra High-resolution Images. Tiancheng Shen, Yuechen Zhang, Lu Qi, Jason Kuen, Xingyu Xie, Jianlong Wu, Zhe Lin, Jiaya Jia. 1300-1309
- Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks. Wenwen Pan, Haonan Shi, Zhou Zhao, Jieming Zhu, Xiuqiang He 0001, Zhigeng Pan, Lianli Gao, Jun Yu, Fei Wu, Qi Tian 0001. 1310-1321
- Recurrent Dynamic Embedding for Video Object Segmentation. Mingxing Li, Li Hu, Zhiwei Xiong, Bang Zhang, Pan Pan, Dong Liu 0002. 1322-1331
- Accelerating Video Object Segmentation with Compressed Video. Kai Xu, Angela Yao. 1332-1341
- Per-Clip Video Object Segmentation. KwanYong Park, Sanghyun Woo, Seoung Wug Oh, In-So Kweon, Joon-Young Lee. 1342-1351
- SWEM: Towards Real-Time Video Object Segmentation with Sequential Weighted Expectation-Maximization. Zhihui Lin, Tianyu Yang, Maomao Li, Ziyu Wang, Chun Yuan, Wenhao Jiang, Wei Liu. 1352-1362
- Neural Recognition of Dashed Curves with Gestalt Law of Continuity. Hanyuan Liu, Chengze Li, Xueting Liu, Tien-Tsin Wong. 1363-1372
- CVNet: Contour Vibration Network for Building Extraction. Ziqiang Xu, Chunyan Xu, Zhen Cui, Xiangwei Zheng, Jian Yang. 1373-1381
- A Keypoint-based Global Association Network for Lane Detection. Jinsheng Wang, Yinchao Ma, Shaofei Huang, Tianrui Hui, Fei Wang, Chen Qian, Tianzhu Zhang. 1382-1391
- EDTER: Edge Detection with Transformer. Mengyang Pu, Yaping Huang, Yuming Liu, Qingji Guan, Haibin Ling. 1392-1402
- Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction. Yining Hong, Kaichun Mo, Li Yi, Leonidas J. Guibas, Antonio Torralba 0001, Joshua B. Tenenbaum, Chuang Gan. 1403-1413
- Coherent Point Drift Revisited for Non-rigid Shape Matching and Registration. Aoxiang Fan, Jiayi Ma 0001, Xin Tian 0006, Xiaoguang Mei, Wei Lin. 1414-1424
- CodedVTR: Codebook-based Sparse Voxel Transformer with Geometric Guidance. Tianchen Zhao, Niansong Zhang, Xuefei Ning, He Wang, Li Yi, Yu Wang. 1425-1434
- FLOAT: Factorized Learning of Object Attributes for Improved Multi-object Multi-part Scene Parsing. Rishubh Singh, Pranav Gupta, Pradeep Shenoy, Ravikiran Sarvadevabhatla. 1435-1445
- Rotationally Equivariant 3D Object Detection. Hong-Xing Yu, Jiajun Wu 0001, Li Yi. 1446-1454
- AUV-Net: Learning Aligned UV Maps for Texture Transfer and Synthesis. Zhiqin Chen, Kangxue Yin, Sanja Fidler. 1455-1464
- Learning to Estimate Robust 3D Human Mesh from In-the-Wild Crowded Scenes. Hongsuk Choi, Gyeongsik Moon, Joonkyu Park, Kyoung Mu Lee. 1465-1474
- Human Mesh Recovery from Multiple Shots. Georgios Pavlakos, Jitendra Malik, Angjoo Kanazawa. 1475-1485
- HandOccNet: Occlusion-Robust 3D Hand Mesh Estimation Network. Joonkyu Park, Yeonguk Oh, Gyeongsik Moon, Hongsuk Choi, Kyoung Mu Lee. 1486-1495
- Photorealistic Monocular 3D Reconstruction of Humans Wearing Clothing. Thiemo Alldieck, Mihai Zanfir, Cristian Sminchisescu. 1496-1505
- Disentangled3D: Learning a 3D Generative Model with Disentangled Geometry and Appearance from Monocular Images. Ayush Tewari, Mallikarjun B. R. 0001, Xingang Pan, Ohad Fried, Maneesh Agrawala, Christian Theobalt. 1506-1515
- NeuralHDHair: Automatic High-fidelity Hair Modeling from a Single Image Using Implicit Neural Representations. Keyu Wu, Yifan Ye, Lingchen Yang, Hongbo Fu, Kun Zhou 0001, Youyi Zheng. 1516-1525
- Topologically-Aware Deformation Fields for Single-View 3D Reconstruction. Shivam Duggal, Deepak Pathak. 1526-1536
- Generating Diverse 3D Reconstructions from a Single Occluded Face Image. Rahul Dey, Vishnu Naresh Boddeti. 1537-1547
- LOLNeRF: Learn from One Look. Daniel Rebain, Mark J. Matthews, Kwang Moo Yi, Dmitry Lagun, Andrea Tagliasacchi. 1548-1557
- Learning Local Displacements for Point Cloud Completion. Yida Wang, David Joseph Tan, Nassir Navab, Federico Tombari. 1558-1567
- Exploiting Pseudo Labels in a Self-Supervised Learning Framework for Improved Monocular Depth Estimation. Andra Petrovai, Sergiu Nedevschi. 1568-1578
- Dimension Embeddings for Monocular 3D Object Detection. Yunpeng Zhang, Wenzhao Zheng, Zheng Zhu, Guan Huang, Dalong Du, Jie Zhou 0001, Jiwen Lu. 1579-1588
- Understanding 3D Object Articulation in Internet Videos. Shengyi Qian 0001, Linyi Jin, Chris Rockwell, Siyi Chen, David F. Fouhey. 1589-1599
- P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior. Vaishakh Patil, Christos Sakaridis, Alexander Liniger, Luc Van Gool. 1600-1611
- Neural Face Identification in a 2D Wireframe Projection of a Manifold Object. Kehan Wang, Jia Zheng, Zihan Zhou 0001. 1612-1621
- PanopticDepth: A Unified Framework for Depth-aware Panoptic Segmentation. Naiyu Gao, Fei He, Jian Jia, Yanhu Shan, Haoyang Zhang, Xin Zhao 0012, Kaiqi Huang. 1622-1632
- Stability-driven Contact Reconstruction From Monocular Color Images. Zimeng Zhao, Binghui Zuo, Wei Xie, Yangang Wang. 1633-1643
- LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network. Zhigang Jiang, Zhongzheng Xiang, Jinhua Xu, Ming Zhao. 1644-1653
- Collaborative Learning for Hand and Object Reconstruction with Attention-guided Graph Convolution. Tze Ho Elden Tse, Kwang In Kim, Ales Leonardis, Hyung Jin Chang. 1654-1664
- RM-Depth: Unsupervised Learning of Recurrent Monocular Depth in Dynamic Scenes. Tak-Wai Hui. 1665-1674
- Exploring Geometric Consistency for Monocular 3D Object Detection. Qing Lian, Botao Ye, Ruijia Xu, Weilong Yao, Tong Zhang. 1675-1684
- Learning 3D Object Shape and Layout without 3D Supervision. Georgia Gkioxari, Nikhila Ravi, Justin Johnson 0001. 1685-1694
- Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on Dataset Mixtures with Uncalibrated Stereo Data. Nikolay Patakin, Anna Vorontsova, Mikhail Artemyev, Anton Konushin 0002. 1695-1704
- Occluded Human Mesh Recovery. Rawal Khirodkar, Shashank Tripathi, Kris Kitani. 1705-1715
- LAKe-Net: Topology-Aware Point Cloud Completion by Localizing Aligned Keypoints. Junshu Tang, Zhijun Gong, Ran Yi, Yuan Xie 0006, Lizhuang Ma. 1716-1725
- OcclusionFusion: Occlusion-aware Motion Estimation for Real-time Dynamic 3D Reconstruction. Wenbin Lin, Chengwei Zheng, Jun-Hai Yong, Feng Xu. 1726-1735
- Depth Estimation by Combining Binocular Stereo and Monocular Structured-Light. Yuhua Xu 0006, Xiaoli Yang, Yushan Yu, Wei Jia, ZhaoBi Chu, Yulan Guo. 1736-1745
- Learning from Pixel-Level Noisy Label: A New Perspective for Light Field Saliency Detection. Mingtao Feng, Kendong Liu, Liang Zhang, Hongshan Yu, Yaonan Wang, Ajmal Mian. 1746-1756
- HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening. Wele Gedara Chaminda Bandara, Vishal M. Patel. 1757-1767
- Revisiting Near/Remote Sensing with Geospatial Attention. Scott Workman, Muhammad Usman Rafique, Hunter Blanton, Nathan Jacobs. 1768-1777
- Memory-augmented Deep Conditional Unfolding Network for Pansharpening. Gang Yang, Man Zhou, Keyu Yan, Aiping Liu, Xueyang Fu, Fan Wang. 1778-1787
- Mutual Information-driven Pan-sharpening. Man Zhou, Keyu Yan, Jie Huang, Zihe Yang, Xueyang Fu, Feng Zhao 0004. 1788-1798
- Sparse and Complete Latent Organization for Geospatial Semantic Segmentation. Fengyu Yang, Chenyang Ma. 1799-1808
- The Probabilistic Normal Epipolar Constraint for Frame-To-Frame Rotation Optimization under Uncertain Feature Positions. Dominik Muhle, Lukas Koestler, Nikolaus Demmel, Florian Bernard, Daniel Cremers. 1809-1818
- Oriented RepPoints for Aerial Object Detection. Wentong Li, Yijie Chen, Kaixuan Hu, Jianke Zhu. 1819-1828
- Using 3D Topological Connectivity for Ghost Particle Reduction in Flow Reconstruction. Christina Tsalicoglou, Thomas Rösgen. 1829-1837
- Self-Supervised Super-Resolution for Multi-Exposure Push-Frame Satellites. Ngoc-Long Nguyen, Jérémy Anger, Axel Davy, Pablo Arias, Gabriele Facciolo. 1848-1858
- MISF: Multi-level Interactive Siamese Filtering for High-Fidelity Image Inpainting. Xiaoguang Li, Qing Guo 0005, Di Lin, Ping Li 0016, Wei Feng 0005, Song Wang. 1859-1868
- Iterative Deep Homography Estimation. Si-Yuan Cao, Jianxin Hu, Ze-Hua Sheng, Hui-Liang Shen. 1869-1878
- GCFSR: a Generative and Controllable Face Super Resolution Method Without Facial and GAN Priors. Jingwen He, Wu Shi, Kai Chen, Lean Fu, Chao Dong. 1879-1888
- Deep Color Consistent Network for Low-Light Image Enhancement. Zhao Zhang 0001, Huan Zheng, Richang Hong, Mingliang Xu, Shuicheng Yan, Meng Wang. 1889-1898
- LAR-SR: A Local Autoregressive Model for Image Super-Resolution. Baisong Guo, Xiaoyun Zhang, Haoning Wu, Yu Wang, Ya Zhang, Yan-Feng Wang. 1899-1908
- Multi-Scale Memory-Based Video Deblurring. Bo Ji, Angela Yao. 1909-1918
- Local Texture Estimator for Implicit Representation Function. Jaewon Lee, Kyong Hwan Jin. 1919-1928
- ChiTransformer: Towards Reliable Stereo from Cues. Qing Su, Shihao Ji. 1929-1939
- PolyWorld: Polygonal Building Extraction with Graph Neural Networks in Satellite Images. Stefano Zorzi, Shabab Bazrafkan, Stefan Habenschuss, Friedrich Fraundorfer. 1938-1947
- BNUDC: A Two-Branched Deep Neural Network for Restoring Images from Under-Display Cameras. Jaihyun Koh, Jangho Lee, Sungroh Yoon. 1940-1949
- ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image Prior. Metin Ersin Arican, Ozgur Kara, Gustav Bredell, Ender Konukoglu. 1950-1958
- IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation. Lingtong Kong, Boyuan Jiang, Donghao Luo, Wenqing Chu, Xiaoming Huang, Ying Tai, Chengjie Wang, Jie Yang. 1959-1968
- Learning Graph Regularisation for Guided Super-Resolution. Riccardo de Lutio, Alexander Becker, Stefano D'Aronco, Stefania Russo, Jan D. Wegner, Konrad Schindler. 1969-1978
- Self-supervised Deep Image Restoration via Adaptive Stochastic Gradient Langevin Dynamics. Weixi Wang, Ji Li, Hui Ji. 1979-1988
- Self-Supervised Arbitrary-Scale Point Clouds Upsampling via Implicit Neural Representation. Wenbo Zhao, Xianming Liu, Zhiwei Zhong, Junjun Jiang, Wei Gao 0003, Ge Li 0002, Xiangyang Ji. 1989-1997
- Noise Distribution Adaptive Self-Supervised Image Denoising using Tweedie Distribution and Score Matching. Kwanyoung Kim, Taesung Kwon, Jong Chul Ye. 1998-2006
- Unpaired Deep Image Deraining Using Dual Contrastive Learning. Xiang Chen 0015, Jinshan Pan, Kui Jiang, Yufeng Li, Yufeng Huang, Caihua Kong, Longgang Dai, Zhentao Fan. 2007-2016
- Blind2Unblind: Self-Supervised Image Denoising with Visible Blind Spots. Zejin Wang, Jiazheng Liu, Guoqing Li, Hua Han 0001. 2017-2026
- Self-augmented Unpaired Image Dehazing via Density and Depth Decomposition. Yang Yang, Chaoyue Wang, Risheng Liu, Lin Zhang, Xiaojie Guo, Dacheng Tao. 2027-2036
- VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution. Zeyuan Chen, Yinbo Chen, Jingwen Liu, Xingqian Xu, Vidit Goel, Zhangyang Wang, Humphrey Shi, Xiaolong Wang. 2037-2047
- Fast Algorithm for Low-rank Tensor Completion in Delay-embedded Space. Ryuki Yamamoto, Hidekata Hontani, Akira Imakura, Tatsuya Yokota. 2048-2056
- Exploring and Evaluating Image Restoration Potential in Dynamic Scenes. Cheng Zhang, Shaolin Su, Yu Zhu, Qingsen Yan, Jinqiu Sun, Yanning Zhang. 2057-2066
- GIQE: Generic Image Quality Enhancement via Nth Order Iterative Degradation. Pranjay Shyam, Kyung Soo Kim, Kuk-Jin Yoon. 2067-2077
- Does text attract attention on e-commerce images: A novel saliency prediction dataset and method. Lai Jiang, Yifei Li, Shengxi Li, Mai Xu, Se Lei, Yichen Guo, Bo Huang. 2078-2087
- IDR: Self-Supervised Image Denoising via Iterative Data Refinement. Yi Zhang, Dasong Li, Ka Lung Law, Xiaogang Wang, Hongwei Qin, Hongsheng Li. 2088-2097
- ABPN: Adaptive Blend Pyramid Network for Real-Time Local Retouching of Ultra High-Resolution Photo. Biwen Lei, Xiefan Guo, Hongyu Yang, Miaomiao Cui, Xuansong Xie, Di Huang 0001. 2098-2107
- Texture-based Error Analysis for Image Super-Resolution. Salma Abdel Magid, Zudi Lin, Donglai Wei 0001, Yulun Zhang, Jinjin Gu, Hanspeter Pfister. 2108-2117
- Blind Image Super-resolution with Elaborate Degradation Modeling on Noise and Kernel. Zongsheng Yue, Qian Zhao, Jianwen Xie, Lei Zhang, Deyu Meng, Kwan-Yee K. Wong. 2118-2128
- KNN Local Attention for Image Restoration. Hunsang Lee, Hyesong Choi, Kwanghoon Sohn, Dongbo Min. 2129-2139
- Can You Spot the Chameleon? Adversarially Camouflaging Images from Co-Salient Object Detection. Ruijun Gao, Qing Guo 0005, Felix Juefei-Xu, Hongkai Yu, Huazhu Fu, Wei Feng, Yang Liu 0003, Song Wang. 2140-2149
- Zoom In and Out: A Mixed-scale Triplet Network for Camouflaged Object Detection. Youwei Pang, Xiaoqi Zhao, Tian-Zhu Xiang, Lihe Zhang, Huchuan Lu. 2150-2160
- Self-Supervised Keypoint Discovery in Behavioral Videos. Jennifer J. Sun, Serim Ryou, Roni H. Goldshmid, Brandon Weissbourd, John O. Dabiri, David J. Anderson, Ann Kennedy, Yisong Yue, Pietro Perona. 2161-2170
- Learning to Align Sequential Actions in the Wild. Weizhe Liu, Bugra Tekin, Huseyin Coskun, Vibhav Vineet, Pascal Fua, Marc Pollefeys. 2171-2181
- Dynamic 3D Gaze from Afar: Deep Gaze Estimation from Temporal Eye-Head-Body Coordination. Soma Nonaka, Shohei Nobuhara, Ko Nishino. 2182-2191
- End-to-End Human-Gaze-Target Detection with Transformers. Danyang Tu, Xiongkuo Min, Huiyu Duan, Guodong Guo, Guangtao Zhai, Wei Shen. 2192-2200
- Automatic Synthesis of Diverse Weak Supervision Sources for Behavior Analysis. Albert Tseng, Jennifer J. Sun, Yisong Yue. 2201-2210
- MUSE-VAE: Multi-Scale VAE for Environment-Aware Long Term Trajectory Prediction. Mihee Lee, Samuel S. Sohn, Seonghyeon Moon, Sejong Yoon, Mubbasir Kapadia, Vladimir Pavlovic. 2211-2220
- Graph-based Spatial Transformer with Memory Replay for Multi-future Pedestrian Trajectory PredictionLihuan Li, Maurice Pagnucco, Yang Song 0001. 2221-2231 [doi]
- End-to-End Trajectory Distribution Prediction Based on Occupancy Grid MapsKe Guo, Wenxi Liu, Jia Pan. 2232-2241 [doi]
- Learning Affordance Grounding from Exocentric ImagesHongchen Luo, Wei Zhai, Jing Zhang 0037, Yang Cao 0010, Dacheng Tao. 2242-2251 [doi]
- 3D Scene Painting via Semantic Image SynthesisJaebong Jeong, Janghun Jo, Sunghyun Cho, Jaesik Park. 2252-2262 [doi]
- Learning Invisible Markers for Hidden Codes in Offline-to-online PhotographyJun Jia, Zhongpai Gao, Dandan Zhu, Xiongkuo Min, Guangtao Zhai, Xiaokang Yang. 2263-2272 [doi]
- ETHSeg: An Amodel Instance Segmentation Network and a Real-world Dataset for X-Ray Waste InspectionLingteng Qiu, Zhangyang Xiong, Xuhao Wang, Kenkun Liu, Yihan Li, Guanying Chen, Xiaoguang Han 0001, Shuguang Cui. 2273-2282 [doi]
- Doodle It Yourself: Class Incremental Learning by Drawing a Few SketchesAyan Kumar Bhunia, Viswanatha Reddy Gajjala, Subhadeep Koley, Rohit Kundu, Aneeshan Sain, Tao Xiang, Yi-Zhe Song. 2283-2292 [doi]
- Image Disentanglement Autoencoder for Steganography without EmbeddingXiyao Liu 0001, Ziping Ma 0002, Junxing Ma, Jian Zhang, Gerald Schaefer, Hui Fang 0003. 2293-2302 [doi]
- Adaptive Hierarchical Representation Learning for Long-Tailed Object DetectionBanghuai Li. 2303-2312 [doi]
- Semiconductor Defect Detection by Hybrid Classical-Quantum Deep LearningYuanFu Yang, Min Sun. 2313-2322 [doi]
- Density-preserving Deep Point Cloud CompressionYun He, Xinlin Ren, Danhang Tang, Yinda Zhang 0001, Xiangyang Xue, Yanwei Fu. 2323-2332 [doi]
- Graph-context Attention Networks for Size-varied Deep Graph MatchingZheheng Jiang, Hossein Rahmani, Plamen P. Angelov, Sue Black 0002, Bryan M. Williams 0001. 2333-2342 [doi]
- TransWeather: Transformer-based Restoration of Images Degraded by Adverse Weather ConditionsJeya Maria Jose Valanarasu, Rajeev Yasarla, Vishal M. Patel. 2343-2353 [doi]
- ObjectFormer for Image Manipulation Detection and LocalizationJunke Wang, Zuxuan Wu, Jingjing Chen, Xintong Han, Abhinav Shrivastava, Ser-Nam Lim, Yu-Gang Jiang. 2354-2363 [doi]
- Sequential Voting with Relational Box Fields for Active Object DetectionQichen Fu, Xingyu Liu, Kris M. Kitani. 2364-2373 [doi]
- Efficient Classification of Very Large Images with Tiny ObjectsFanjie Kong, Ricardo Henao. 2374-2384 [doi]
- Partially Does It: Towards Scene-Level FG-SBIR with Partial InputPinaki Nath Chowdhury, Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Aneeshan Sain, Tao Xiang, Yi-Zhe Song. 2385-2395 [doi]
- Long-term Visual Map Sparsification with Heterogeneous GNNMing-Fang Chang, Yipu Zhao, Rajvi Shah, Jakob J. Engel, Michael Kaess, Simon Lucey. 2396-2405 [doi]
- Connecting the Complementary-view Videos: Joint Camera Identification and Subject AssociationRuize Han, Yiyang Gan, Jiacheng Li, Feifan Wang, Wei Feng, Song Wang. 2406-2415 [doi]
- DiffusionCLIP: Text-Guided Diffusion Models for Robust Image ManipulationGwanghyun Kim, Taesung Kwon, Jong Chul Ye. 2416-2425 [doi]
- Aesthetic Text Logo Synthesis via Content-aware Layout InferringYizhi Wang, Guo Pu, Wenhan Luo, Yexin Wang, Pengfei Xiong, Hongwen Kang, Zhouhui Lian. 2426-2435 [doi]
- Rethinking Image Cropping: Exploring Diverse Compositions from Global ViewsGengyun Jia, Huaibo Huang, Chaoyou Fu, Ran He. 2436-2445 [doi]
- Defensive Patches for Robust Recognition in the Physical WorldJiakai Wang, Zixin Yin, Pengfei Hu, Aishan Liu, Renshuai Tao, Haotong Qin, Xianglong Liu, Dacheng Tao. 2446-2455 [doi]
- Semi-supervised Video Paragraph Grounding with Contrastive EncoderXun Jiang, Xing Xu, Jingran Zhang, Fumin Shen, Zuo Cao, Heng Tao Shen. 2456-2465 [doi]
- Meta Distribution Alignment for Generalizable Person Re-IdentificationHao Ni, Jingkuan Song, Xiaopeng Luo, Feng Zheng, Wen Li, Heng Tao Shen. 2477-2486 [doi]
- FvOR: Robust Joint Shape and Pose Optimization for Few-view Object ReconstructionZhenpei Yang, Zhile Ren, Miguel Ángel Bautista 0001, Zaiwei Zhang, Qi Shan, Qixing Huang. 2487-2497 [doi]
- It's About Time: Analog Clock Reading in the WildCharig Yang, Weidi Xie, Andrew Zisserman. 2498-2507 [doi]
- Consistency driven Sequential Transformers Attention Model for Partially Observable ScenesSamrudhdhi B. Rangrej, Chetan L. Srinidhi, James J. Clark. 2508-2517 [doi]
- Smartadapt: Multi-branch Object Detection Framework for Videos on MobilesRan Xu, Fangzhou Mu, Jayoung Lee, Preeti Mukherjee, Somali Chaterji, Saurabh Bagchi, Yin Li 0003. 2518-2528 [doi]
- Generating 3D Bio-Printable Patches Using Wound Segmentation and Reconstruction to Treat Diabetic Foot UlcersHan Joo Chae, Seunghwan Lee, Hyewon Son, Seungyeob Han, Taebin Lim. 2529-2539 [doi]
- Investigating the Impact of Multi-LiDAR Placement on Object Detection for Autonomous DrivingHanjiang Hu, Zuxin Liu, Sharad Chitlangia, Akhil Agnihotri, Ding Zhao. 2540-2549 [doi]
- CMT-DeepLab: Clustering Mask Transformers for Panoptic SegmentationQihang Yu, Huiyu Wang, Dahun Kim, Siyuan Qiao, Maxwell D. Collins, Yukun Zhu, Hartwig Adam, Alan L. Yuille, Liang-Chieh Chen. 2550-2560 [doi]
- Unsupervised Hierarchical Semantic Segmentation with Multiview Cosegmentation and Clustering TransformersTsung-Wei Ke, Jyh-Jing Hwang, Yunhui Guo, Xudong Wang, Stella X. Yu. 2561-2571 [doi]
- Rethinking Semantic Segmentation: A Prototype ViewTianfei Zhou, Wenguan Wang, Ender Konukoglu, Luc Van Gool. 2572-2583 [doi]
- Semantic-Aware Domain Generalized SegmentationDuo Peng, Yinjie Lei, Munawar Hayat, Yulan Guo, Wen Li 0001. 2584-2595 [doi]
- Adaptive Early-Learning Correction for Segmentation from Noisy AnnotationsSheng Liu, Kangning Liu, Weicheng Zhu, Yiqiu Shen, Carlos Fernandez-Granda. 2596-2606 [doi]
- Pointly-Supervised Instance SegmentationBowen Cheng, Omkar Parkhi, Alexander Kirillov. 2607-2616 [doi]
- Joint Forecasting of Panoptic Segmentations with Difference AttentionColin Graber, Cyril Jazra, Wenjie Luo, Liangyan Gui, Alexander G. Schwing. 2617-2626 [doi]
- FocusCut: Diving into a Focus View in Interactive SegmentationZheng Lin 0005, Zheng-Peng Duan, Zhao Zhang, Chun-Le Guo, Ming-Ming Cheng. 2627-2636 [doi]
- Human Instance Matting via Mutual Guidance and Multi-Instance RefinementYanan Sun 0005, Chi-Keung Tang, Yu-Wing Tai. 2637-2646 [doi]
- Deformable Sprites for Unsupervised Video DecompositionVickie Ye, Zhengqi Li, Richard Tucker 0001, Angjoo Kanazawa, Noah Snavely. 2647-2656 [doi]
- Eigencontours: Novel Contour Descriptors Based on Low-Rank ApproximationWonhui Park, Dongkwon Jin, Chang-Su Kim 0001. 2657-2665 [doi]
- Robust and Accurate Superquadric Recovery: a Probabilistic ApproachWeixiao Liu, Yuwei Wu, Sipu Ruan, Gregory S. Chirikjian. 2666-2675 [doi]
- Medial Spectral Coordinates for 3D Shape AnalysisMorteza Rezanejad, Mohammad Khodadad, Hamidreza Mahyar, Herve Lombaert, Michael Gruninger, Dirk B. Walther, Kaleem Siddiqi. 2676-2686 [doi]
- Scribble-Supervised LiDAR Semantic SegmentationOzan Unal, Dengxin Dai, Luc Van Gool. 2687-2697 [doi]
- SoftGroup for 3D Instance Segmentation on Point CloudsThang Vu, Kookhoi Kim, Tung Minh Luu, Thanh Nguyen, Chang D. Yoo. 2698-2707 [doi]
- Accurate 3D Body Shape Regression using Metric and Semantic AttributesVasileios Choutas, Lea Müller, Chun-Hao P. Huang, Siyu Tang 0001, Dimitrios Tzionas, Michael J. Black. 2708-2718 [doi]
- JIFF: Jointly-aligned Implicit Face Function for High Quality Single View Clothed Human ReconstructionYukang Cao, Guanying Chen, Kai Han 0001, Wenqi Yang, Kwan-Yee K. Wong. 2719-2729 [doi]
- Tracking People by Predicting 3D Appearance, Location and PoseJathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik. 2730-2739 [doi]
- ArtiBoost: Boosting Articulated 3D Hand-Object Pose Estimation via Online Exploration and SynthesisLixin Yang, Kailin Li, Xinyu Zhan 0001, Jun Lv, Wenqiang Xu, Jiefeng Li, Cewu Lu. 2740-2750 [doi]
- Interacting Attention Graph for Single Image Two-Hand ReconstructionMengcheng Li, Liang An, Hongwen Zhang, Lianpeng Wu, Feng Chen, Tao Yu 0007, Yebin Liu. 2751-2760 [doi]
- 3D human tongue reconstruction from single "in-the-wild" imagesStylianos Ploumpis, Stylianos Moschoglou, Vasileios Triantafyllou, Stefanos Zafeiriou. 2761-2770 [doi]
- EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose EstimationHansheng Chen, Pichao Wang, Fan Wang, Wei Tian, Lu Xiong, Hao Li 0030. 2771-2780 [doi]
- Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object DetectionZhuoling Li, Zhan Qu, Yang Zhou, Jianzhuang Liu, Haoqian Wang, Lihui Jiang. 2781-2790 [doi]
- OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware FusionYuyan Li, Yuliang Guo, Zhixin Yan, Xinyu Huang 0001, Ye Duan, Liu Ren. 2791-2800 [doi]
- Gated2Gated: Self-Supervised Depth Estimation from Gated ImagesAmanpreet Walia, Stefanie Walz, Mario Bijelic, Fahim Mannan, Frank D. Julca-Aguilar, Michael S. Langer, Werner Ritter, Felix Heide. 2801-2811 [doi]
- IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor ScenesRui Zhu, Zhengqin Li, Janarbek Matai, Fatih Porikli, Manmohan Chandraker. 2812-2821 [doi]
- Egocentric Scene Understanding via Multimodal Spatial RectifierTien Do, Khiem Vuong, Hyun Soo Park. 2822-2831 [doi]
- Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View GeometryGwangbin Bae, Ignas Budvytis, Roberto Cipolla. 2832-2841 [doi]
- The Implicit Values of A Good Hand Shake: Handheld Multi-Frame Neural Depth RefinementIlya Chugunov, Yuxuan Zhang, Zhihao Xia, Xuaner Zhang, Jiawen Chen, Felix Heide. 2842-2852 [doi]
- BANMo: Building Animatable 3D Neural Models from Many Casual VideosGengshan Yang, Minh Vo, Natalia Neverova, Deva Ramanan, Andrea Vedaldi, Hanbyul Joo. 2853-2863 [doi]
- Self-supervised Video TransformerKanchana Ranasinghe, Muzammal Naseer, Salman Khan 0001, Fahad Shahbaz Khan, Michael S. Ryoo. 2864-2874 [doi]
- Temporally Efficient Vision Transformer for Video Instance SegmentationShusheng Yang, Xinggang Wang, Yu Li 0003, Yuxin Fang, Jiemin Fang, Wenyu Liu 0001, Xun Zhao, Ying Shan. 2875-2885 [doi]
- VISOLO: Grid-Based Space-Time Aggregation for Efficient Online Video Instance SegmentationSu Ho Han, Sukjun Hwang, Seoung Wug Oh, Yeonchool Park, Hyunwoo Kim, Min-Jung Kim, Seon Joo Kim. 2886-2895 [doi]
- Temporal Alignment Networks for Long-term VideoTengda Han, Weidi Xie, Andrew Zisserman. 2896-2906 [doi]
- Revisiting the "Video" in Video-Language UnderstandingShyamal Buch, Cristóbal Eyzaguirre, Adrien Gaidon, Jiajun Wu 0001, Li Fei-Fei 0001, Juan Carlos Niebles. 2907-2917 [doi]
- Invariant Grounding for Video Question AnsweringYicong Li 0004, Xiang Wang, Junbin Xiao, Wei Ji, Tat-Seng Chua. 2918-2927 [doi]
- P3IV: Probabilistic Procedure Planning from Instructional Videos with Weak SupervisionHe Zhao 0004, Isma Hadji, Nikita Dvornik, Konstantinos G. Derpanis, Richard P. Wildes, Allan D. Jepson. 2928-2938 [doi]
- FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality AssessmentJinglin Xu, Yongming Rao, Xumin Yu, Guangyi Chen 0002, Jie Zhou 0001, Jiwen Lu. 2939-2948 [doi]
- Cross-Model Pseudo-Labeling for Semi-Supervised Action RecognitionYinghao Xu, Fangyun Wei, Xiao Sun, Ceyuan Yang, Yujun Shen, Bo Dai, Bolei Zhou, Stephen Lin 0001. 2949-2958 [doi]
- Revisiting Skeleton-based Action RecognitionHaodong Duan, Yue Zhao 0006, Kai Chen 0026, Dahua Lin, Bo Dai. 2959-2968 [doi]
- OpenTAL: Towards Open Set Temporal Action LocalizationWentao Bao, Qi Yu 0001, Yu Kong. 2969-2979 [doi]
- Dual-AI: Dual-path Actor Interaction Learning for Group Activity RecognitionMingfei Han 0002, David Junhao Zhang, Yali Wang 0001, Rui Yan, Lina Yao 0001, Xiaojun Chang, Yu Qiao. 2980-2989 [doi]
- TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation RecognitionHaodong Duan, Nanxuan Zhao, Kai Chen, Dahua Lin. 2990-3000 [doi]
- Revealing Occlusions with 4D Neural FieldsBasile Van Hoorick, Purva Tendulkar, Dídac Surís, Dennis Park, Simon Stent, Carl Vondrick. 3001-3011 [doi]
- HODOR: High-level Object Descriptors for Object Re-segmentation in Video Learned from Static ImagesAli Athar, Jonathon Luiten, Alexander Hermans, Deva Ramanan, Bastian Leibe. 3012-3021 [doi]
- Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence LearningJuncheng Li 0006, Junlin Xie, Long Qian, Linchao Zhu, Siliang Tang, Fei Wu 0001, Yi Yang, Yueting Zhuang, Xin Eric Wang. 3022-3031 [doi]
- UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight DetectionYe Liu, Siyuan Li, Yang Wu, Chang Wen Chen, Ying Shan, Xiaohu Qie. 3032-3041 [doi]
- Future Transformer for Long-term Action AnticipationDayoung Gong, Joonseok Lee, Manjin Kim, Seong Jong Ha, Minsu Cho. 3042-3051 [doi]
- MLP-3D: A MLP-like 3D Architecture with Grouped Time MixingZhaofan Qiu, Ting Yao, Chong-Wah Ngo, Tao Mei 0001. 3052-3062 [doi]
- Learning Pixel-Level Distinctions for Video Highlight DetectionFanyue Wei, Biao Wang, Tiezheng Ge, Yuning Jiang, Wen Li 0001, Lixin Duan. 3063-3072 [doi]
- DR.VIC: Decomposition and Reasoning for Video Individual CountingTao Han, Lei Bai 0001, Junyu Gao 0001, Qi Wang 0009, Wanli Ouyang. 3073-3082 [doi]
- Slot-VPS: Object-centric Representation Learning for Video Panoptic SegmentationYi Zhou, Hui Zhang, Hana Lee, Shuyang Sun, Pingjun Li, Yangguang Zhu, ByungIn Yoo, Xiaojuan Qi, Jae-Joon Han. 3083-3093 [doi]
- Explore Spatio-temporal Aggregation for Insubstantial Object Detection: Benchmark Dataset and BaselineKailai Zhou, Yibo Wang, Tao Lv, Yunqian Li, Linsen Chen, Qiu Shen, Xun Cao. 3094-3105 [doi]
- Video Shadow Detection via Spatio-Temporal Interpolation Consistency TrainingXiao Lu, Yihong Cao, Sheng Liu, Chengjiang Long, Zipei Chen, Xuanyu Zhou, Yimin Yang, Chunxia Xiao. 3106-3115 [doi]
- Coarse-to-Fine Feature Mining for Video Semantic SegmentationGuolei Sun, Yun Liu, Henghui Ding, Thomas Probst, Luc Van Gool. 3116-3127 [doi]
- Tencent-MVSE: A Large-Scale Benchmark Dataset for Multi-Modal Video Similarity EvaluationZhaoyang Zeng, Yongsheng Luo, Zhenhua Liu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen. 3128-3137 [doi]
- Object-Region Video TransformersRoei Herzig, Elad Ben-Avraham, Karttikeya Mangalam, Amir Bar, Gal Chechik, Anna Rohrbach, Trevor Darrell, Amir Globerson. 3138-3149 [doi]
- Colar: Effective and Efficient Online Action Detection by Consulting ExemplarsLe Yang, Junwei Han, Dingwen Zhang. 3150-3159 [doi]
- SimVP: Simpler yet Better Video PredictionZhangyang Gao, Cheng Tan 0012, Lirong Wu, Stan Z. Li. 3160-3170 [doi]
- Imposing Consistency for Optical Flow EstimationJisoo Jeong, Jamie Menjay Lin, Fatih Porikli, Nojun Kwak. 3171-3181 [doi]
- Stand-Alone Inter-Frame Attention in Video ModelsFuchen Long, Zhaofan Qiu, Yingwei Pan, Ting Yao, Jiebo Luo, Tao Mei 0001. 3182-3191 [doi]
- Video Swin TransformerZe Liu, Jia Ning, Yue Cao 0001, Yixuan Wei, Zheng Zhang 0022, Stephen Lin 0001, Han Hu 0004. 3192-3201 [doi]
- Bayesian Nonparametric Submodular Video Partition for Robust Anomaly DetectionHitesh Sapkota, Qi Yu 0001. 3202-3211 [doi]
- Likert Scoring with Grade Decoupling for Long-term Action AssessmentAngchi Xu, Ling-An Zeng, Wei-Shi Zheng. 3222-3231 [doi]
- Complex Video Action Reasoning via Learnable Markov Logic NetworkYang Jin, Linchao Zhu, Yadong Mu. 3232-3241 [doi]
- Learning from Temporal Gradient for Semi-supervised Action RecognitionJunfei Xiao, Longlong Jing, Lin Zhang, Ju He, Qi She, Zongwei Zhou, Alan L. Yuille, Yingwei Li. 3242-3252 [doi]
- Semi-Supervised Video Semantic Segmentation with Inter-Frame Feature ReconstructionJiafan Zhuang, Zilei Wang, Yuan Gao. 3253-3261 [doi]
- Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge PropagationLinjiang Huang, Liang Wang 0001, Hongsheng Li 0001. 3262-3271 [doi]
- Joint Hand Motion and Interaction Hotspots Prediction from Egocentric VideosShaowei Liu, Subarna Tripathi, Somdeb Majumdar, Xiaolong Wang 0004. 3272-3282 [doi]
- Human Hands as Probes for Interactive Object UnderstandingMohit Goyal, Sahil Modi, Rishabh Goyal, Saurabh Gupta. 3283-3293 [doi]
- LD-ConGR: A Large RGB-D Video Dataset for Long-Distance Continuous Gesture RecognitionDan Liu, Libo Zhang, Yanjun Wu. 3294-3302 [doi]
- Object-aware Video-language Pre-training for RetrievalAlex Jinpeng Wang, Yixiao Ge, Guanyu Cai, Rui Yan, Xudong Lin 0003, Ying Shan, Xiaohu Qie, Mike Zheng Shou. 3303-3312 [doi]
- Fast and Unsupervised Action Boundary Detection for Action SegmentationZexing Du, Xue Wang 0006, Guoqing Zhou 0003, Qing Wang 0006. 3313-3322 [doi]
- Multiview Transformers for Video RecognitionShen Yan, Xuehan Xiong, Anurag Arnab, Zhichao Lu, Mi Zhang, Chen Sun 0002, Cordelia Schmid. 3323-3333 [doi]
- Semi-Weakly-Supervised Learning of Complex Actions from Instructional Task VideosYuhan Shen, Ehsan Elhamifar. 3334-3344 [doi]
- Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary DetectionJiaqi Tang, Zhaoyang Liu, Chen Qian 0006, Wayne Wu, Limin Wang 0002. 3345-3354 [doi]
- Comparing Correspondences: Video Prediction with Correspondence-wise LossesDaniel Geng, Max Hamilton, Andrew Owens. 3355-3366 [doi]
- Sound-Guided Semantic Image ManipulationSeung-Hyun Lee, Wonseok Roh, Wonmin Byeon, Sang Ho Yoon, Chan Young Kim, Jinkyu Kim, Sangpil Kim. 3367-3376 [doi]
- Expressive Talking Head Generation with Granular Audio-Visual ControlBorong Liang, Yan Pan, Zhizhi Guo, Hang Zhou, Zhibin Hong, Xiaoguang Han, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang 0001. 3377-3386 [doi]
- Depth-Aware Generative Adversarial Network for Talking Head Video GenerationFa-Ting Hong, Longhao Zhang, Li Shen, Dan Xu 0002. 3387-3396 [doi]
- Learning Motion-Dependent Appearance for High-Fidelity Rendering of Dynamic Humans from a Single CameraJae Shin Yoon, Duygu Ceylan, Tuanfeng Y. Wang, Jingwan Lu, Jimei Yang, Zhixin Shu, Hyun Soo Park. 3397-3407 [doi]
- Audio-driven Neural Gesture Reenactment with Video Motion GraphsYang Zhou, Jimei Yang, Dingzeyu Li, Jun Saito, Deepali Aneja, Evangelos Kalogerakis. 3408-3418 [doi]
- Portrait Eyeglasses and Shadow Removal by Leveraging 3D Synthetic DataJunfeng Lyu, Zhibo Wang, Feng Xu. 3419-3429 [doi]
- Weakly Supervised High-Fidelity Clothing Model GenerationRuili Feng, Cheng Ma, Chengji Shen, Xin Gao, Zhenjiang Liu, Xiaobo Li, Kairi Ou, Deli Zhao, Zheng-Jun Zha. 3430-3439 [doi]
- TemporalUV: Capturing Loose Clothing with Temporally Coherent UV CoordinatesYou Xie, Huiqi Mao, Angela Yao, Nils Thuerey. 3440-3449 [doi]
- Full-Range Virtual Try-On with Recurrent Tri-Level TransformHan Yang, Xinrui Yu, Ziwei Liu 0002. 3450-3459 [doi]
- Style-Based Global Appearance Flow for Virtual Try-OnSen He, Yi-Zhe Song, Tao Xiang. 3460-3469 [doi]
- Dressing in the Wild by Watching Dance VideosXin Dong, Fuwei Zhao, Zhenyu Xie, Xijin Zhang, Daniel K. Du, Min Zheng, Xiang Long, Xiaodan Liang, Jianchao Yang. 3470-3479 [doi]
- A Brand New Dance Partner: Music-Conditioned Pluralistic Dancing Controlled by Multiple Dance GenresJinwoo Kim 0001, Heeseok Oh, Seongjean Kim, Hoseok Tong, Sanghoon Lee 0001. 3480-3490 [doi]
- Unpaired Cartoon Image Synthesis via Gated Cycle MappingYifang Men, Yuan Yao 0013, Miaomiao Cui, Zhouhui Lian, Xuansong Xie, Xian-Sheng Hua 0001. 3491-3500 [doi]
- DLFormer: Discrete Latent Transformer for Video InpaintingJingjing Ren, Qingqing Zheng, Yuanyuan Zhao, Xuemiao Xu, Chen Li. 3501-3510 [doi]
- ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame InterpolationDuolikun Danier, Fan Zhang, David R. Bull. 3511-3521 [doi]
- Video Frame Interpolation with TransformerLiying Lu, Ruizheng Wu, Huaijia Lin, Jiangbo Lu, Jiaya Jia. 3522-3532 [doi]
- Long-term Video Frame Interpolation via Feature PropagationDawit Mureja Argaw, In-So Kweon. 3533-3542 [doi]
- Many-to-many Splatting for Efficient Video Frame InterpolationPing Hu, Simon Niklaus, Stan Sclaroff, Kate Saenko. 3543-3552 [doi]
- Look Outside the Room: Synthesizing A Consistent Long-Term 3D Scene Video from A Single ImageXuanchi Ren, Xiaolong Wang. 3553-3563 [doi]
- Spatial-Temporal Space Hand-in-Hand: Spatial-Temporal Video Super-Resolution via Cycle-Projected Mutual LearningMengshun Hu, Kui Jiang, Liang Liao, Jing Xiao 0004, Junjun Jiang, Zheng Wang 0007. 3564-3573 [doi]
- Playable Environments: Video Manipulation in Space and TimeWilli Menapace, Stéphane Lathuilière, Aliaksandr Siarohin, Christian Theobalt, Sergey Tulyakov, Vladislav Golyanik, Elisa Ricci 0001. 3574-3583 [doi]
- Event-based Video Reconstruction via Potential-assisted Spiking Neural NetworkLin Zhu 0012, Xiao Wang, Yi Chang, Jianing Li, Tiejun Huang 0001, Yonghong Tian 0001. 3584-3594 [doi]
- Modular Action Concept Grounding in Semantic Video PredictionWei Yu, Wenxin Chen, Songheng Yin, Steve Easterbrook, Animesh Garg. 3595-3604 [doi]
- Show Me What and Tell Me How: Video Synthesis via Multimodal ConditioningLigong Han, Jian Ren, Hsin-Ying Lee, Francesco Barbieri, Kyle Olszewski, Shervin Minaee, Dimitris N. Metaxas, Sergey Tulyakov. 3605-3615 [doi]
- StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2Ivan Skorokhodov, Sergey Tulyakov, Mohamed Elhoseiny. 3616-3626 [doi]
- Structure-Aware Motion Transfer with Deformable Anchor ModelJiale Tao, Biao Wang, Borun Xu, Tiezheng Ge, Yuning Jiang, Wen Li 0001, Lixin Duan. 3627-3636 [doi]
- Image Animation with Perturbed MasksYoav Shalev, Lior Wolf. 3637-3646 [doi]
- Thin-Plate Spline Motion Model for Image AnimationJian Zhao, Hui Zhang. 3647-3656 [doi]
- Controllable Animation of Fluid Elements in Still ImagesAniruddha Mahapatra, Kuldeep Kulkarni. 3657-3666 [doi]
- Watch It Move: Unsupervised Discovery of 3D Joints for Re-Posing of Articulated ObjectsAtsuhiro Noguchi, Umar Iqbal, Jonathan Tremblay, Tatsuya Harada, Orazio Gallo. 3667-3677 [doi]
- Geometric Structure Preserving Warp for Natural Image StitchingPeng Du, Jifeng Ning, Jiguang Cui, Shaoli Huang, Xinchao Wang, Jiaxin Wang. 3678-3686 [doi]
- Few-Shot Incremental Learning for Label-to-Image TranslationPei Chen, Yangkang Zhang, Zejian Li, Lingyun Sun. 3687-3697 [doi]
- Exemplar-based Pattern Synthesis with Implicit Periodic Field NetworkHaiwei Chen, Jiayi Liu, Weikai Chen 0001, Shichen Liu, Yajie Zhao. 3698-3707 [doi]
- SIMBAR: Single Image-Based Scene Relighting For Effective Data Augmentation For Automated Driving Vision TasksXianling Zhang, Nathan Tseng, Ameerah Syed, Rohan Bhasin, Nikita Jaipuria. 3708-3718 [doi]
- SoftCollage: A Differentiable Probabilistic Tree Generator for Image CollageJiahao Yu, Li Chen, Mingrui Zhang, Mading Li. 3719-3728 [doi]
- PILC: Practical Image Lossless Compression with an End-to-end GPU Oriented Neural FrameworkNing Kang 0001, Shanzhao Qiu, Shifeng Zhang, Zhenguo Li, Shutao Xia. 3729-3738 [doi]
- Kubric: A scalable dataset generatorKlaus Greff, Francois Belletti, Lucas Beyer, Carl Doersch, Yilun Du, Daniel Duckworth, David J. Fleet, Dan Gnanapragasam, Florian Golemo, Charles Herrmann, Thomas Kipf, Abhijit Kundu, Dmitry Lagun, Issam H. Laradji, Hsueh-Ti Derek Liu, Henning Meyer, Yishu Miao, Derek Nowrouzezahrai, A. Cengiz Öztireli, Etienne Pot, Noha Radwan, Daniel Rebain, Sara Sabour, Mehdi S. M. Sajjadi, Matan Sela, Vincent Sitzmann, Austin Stone, Deqing Sun, Suhani Vora, Ziyu Wang, Tianhao Wu, Kwang Moo Yi, Fangcheng Zhong, Andrea Tagliasacchi. 3739-3751 [doi]
- 360MonoDepth: High-Resolution 360° Monocular Depth EstimationManuel Rey-Area, Mingze Yuan, Christian Richardt. 3752-3762 [doi]
- Pretrain, Self-train, Distill: A simple recipe for Supersizing 3D ReconstructionKalyan Vasudev Alwala, Abhinav Gupta 0001, Shubham Tulsiani. 3763-3772 [doi]
- DGECN: A Depth-Guided Edge Convolutional Network for End-to-End 6D Pose EstimationTuo Cao, Fei Luo 0004, Yanping Fu, Wenxiao Zhang, Shengjie Zheng, Chunxia Xiao. 3773-3782 [doi]
- MonoGround: Detecting Monocular 3D Objects from the GroundZequn Qin, Xi Li 0001. 3783-3792 [doi]
- 3D Shape Reconstruction from 2D Images with Disentangled Attribute FlowXin Wen, Junsheng Zhou, Yu-Shen Liu, Hua Su, Zhen Dong, Zhizhong Han. 3793-3803 [doi]
- Toward Practical Monocular Indoor Depth EstimationCho-Ying Wu, Jialiang Wang, Michael Hall, Ulrich Neumann, Shuochen Su. 3804-3814 [doi]
- Focal Length and Object Pose Estimation via Render and CompareGeorgy Ponimatkin, Yann Labbé, Bryan C. Russell, Mathieu Aubry, Josef Sivic. 3815-3824 [doi]
- CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance FieldsCan Wang, Menglei Chai, Mingming He, Dongdong Chen 0001, Jing Liao 0001. 3825-3834 [doi]
- Registering Explicit to Implicit: Towards High-Fidelity Garment mesh Reconstruction from Single ImagesHeming Zhu, Lingteng Qiu, Yuda Qiu, Xiaoguang Han 0001. 3835-3844 [doi]
- Layered Depth Refinement with Mask GuidanceSoo Ye Kim, Jianming Zhang 0001, Simon Niklaus, Yifei Fan, Simon Chen, Zhe Lin 0001, Munchurl Kim. 3845-3855 [doi]
- HEAT: Holistic Edge Attention Transformer for Structured ReconstructionJiacheng Chen, Yiming Qian, Yasutaka Furukawa. 3856-3865 [doi]
- BARC: Learning to Regress 3D Dog Shape from Images by Exploiting Breed InformationNadine Rüegg, Silvia Zuffi, Konrad Schindler, Michael J. Black. 3866-3874 [doi]
- Time3D: End-to-End Joint Monocular 3D Object Detection and Tracking for Autonomous DrivingPeixuan Li, Jieyu Jin. 3875-3884 [doi]
- What's in your hands? 3D Reconstruction of Generic Objects in HandsYufei Ye, Abhinav Gupta 0001, Shubham Tulsiani. 3885-3895 [doi]
- 3D Moments from Near-Duplicate PhotosQianqian Wang, Zhengqi Li, David Salesin, Noah Snavely, Brian Curless, Janne Kontkanen. 3896-3905 [doi]
- Neural Window Fully-connected CRFs for Monocular Depth EstimationWeihao Yuan, Xiaodong Gu 0004, Zuozhuo Dai, Siyu Zhu, Ping Tan. 3906-3915 [doi]
- PUMP: Pyramidal and Uniqueness Matching Priors for Unsupervised Learning of Local DescriptorsJérôme Revaud, Vincent Leroy 0003, Philippe Weinzaepfel, Boris Chidlovskii. 3916-3926 [doi]
- CroMo: Cross-Modal Learning for Monocular Depth EstimationYannick Verdié, Jifei Song, Barnabé Mas, Benjamin Busam, Ales Leonardis, Steven McDonagh. 3927-3937 [doi]
- $\phi$-SfT: Shape-from-Template with a Physics-Based Deformation ModelNavami Kairanda, Edith Tretschk, Mohamed Elgharib, Christian Theobalt, Vladislav Golyanik. 3938-3948 [doi]
- Human-Aware Object Placement for Visual Environment ReconstructionHongwei Yi, Chun-Hao P. Huang, Dimitrios Tzionas, Muhammed Kocabas, Mohamed Hassan, Siyu Tang 0001, Justus Thies, Michael J. Black. 3949-3960 [doi]
- AutoRF: Learning 3D Object Radiance Fields from Single View ObservationsNorman Müller, Andrea Simonelli, Lorenzo Porzi, Samuel Rota Bulò, Matthias Nießner, Peter Kontschieder. 3961-3970 [doi]
- Pix2NeRF: Unsupervised Conditional $\pi$-GAN for Single Image to Neural Radiance Fields TranslationShengqu Cai, Anton Obukhov, Dengxin Dai, Luc Van Gool. 3971-3980 [doi]
- MonoScene: Monocular 3D Semantic Scene CompletionAnh-Quan Cao, Raoul de Charette. 3981-3991 [doi]
- GenDR: A Generalized Differentiable RendererFelix Petersen, Bastian Goldluecke, Christian Borgelt, Oliver Deussen. 3992-4001 [doi]
- MonoDTR: Monocular 3D Object Detection with Depth-Aware TransformerKuan-Chih Huang, Tsung-Han Wu, Hung-Ting Su, Winston H. Hsu. 4002-4011 [doi]
- ROCA: Robust CAD Model Retrieval and Alignment from a Single ImageCan Gümeli, Angela Dai, Matthias Nießner. 4012-4021 [doi]
- HP-Capsule: Unsupervised Face Part Discovery by Hierarchical Parsing Capsule NetworkChang Yu, Xiangyu Zhu, Xiaomei Zhang, Zidu Wang, Zhaoxiang Zhang, Zhen Lei 0001. 4022-4031 [doi]
- Killing Two Birds with One Stone: Efficient and Robust Training of Face Recognition CNNs by Partial FCXiang An, Jiankang Deng, Jia Guo, Ziyong Feng, Xuhan Zhu, Jing Yang 0038, Tongliang Liu. 4032-4041 [doi]
- Sparse Local Patch Transformer for Robust Face Alignment and Landmarks Inherent Relation LearningJiahao Xia, Weiwei Qu, Wenjian Huang, Jianguo Zhang, Xi Wang, Min Xu. 4042-4051 [doi]
- Enhancing Face Recognition with Self-Supervised 3D ReconstructionMingjie He, Jie Zhang 0071, Shiguang Shan, Xilin Chen 0001. 4052-4061 [doi]
- Learning to Learn across Diverse Data Biases in Deep Face RecognitionChang Liu 0022, Xiang Yu 0002, Yi-Hsuan Tsai, Masoud Faraki, Ramin Moslemi, Manmohan Chandraker, Yun Fu 0001. 4062-4072 [doi]
- An Efficient Training Approach for Very Large Scale Face RecognitionKai Wang, Shuo Wang, Panpan Zhang, Zhipeng Zhou, Zheng Zhu, Xiaobo Wang, Xiaojiang Peng, Baigui Sun, Hao Li, Yang You. 4073-4082 [doi]
- MogFace: Towards a Deeper Appreciation on Face DetectionYang Liu, Fei Wang, Jiankang Deng, Zhipeng Zhou, Baigui Sun, Hao Li. 4083-4092 [doi]
- Exploring Frequency Adversarial Attacks for Face Forgery DetectionShuai Jia, Chao Ma 0004, Taiping Yao, Bangjie Yin, Shouhong Ding, Xiaokang Yang. 4093-4102 [doi]
- End-to-End Reconstruction-Classification Learning for Face Forgery DetectionJunyi Cao, Chao Ma 0004, Taiping Yao, Shen Chen, Shouhong Ding, Xiaokang Yang. 4103-4112 [doi]
- Domain Generalization via Shuffled Style Assembly for Face Anti-SpoofingZhuo Wang, Zezheng Wang, Zitong Yu, Weihong Deng, Jiahong Li, Tingting Gao, Zhongyuan Wang. 4113-4123 [doi]
- Privacy-preserving Online AutoML for Domain-Specific Face DetectionChenqian Yan, Yuge Zhang, Quanlu Zhang, Yaming Yang 0001, Xinyang Jiang, YuQing Yang, Baoyuan Wang. 4124-4134 [doi]
- Simulated Adversarial Testing of Face Recognition ModelsNataniel Ruiz, Adam Kortylewski, Weichao Qiu, Cihang Xie, Sarah Adel Bargal, Alan L. Yuille, Stan Sclaroff. 4135-4145 [doi]
- Decoupled Multi-task Learning with Cyclical Self-Regulation for Face ParsingQingping Zheng, Jiankang Deng, Zheng Zhu, Ying Li, Stefanos Zafeiriou. 4146-4155 [doi]
- Towards Semi-Supervised Deep Facial Expression Recognition with An Adaptive Confidence MarginHangyu Li, Nannan Wang 0001, Xi Yang 0011, Xiaoyu Wang, Xinbo Gao 0001. 4156-4165 [doi]
- Towards Accurate Facial Landmark Detection via Cascaded TransformersHui Li, Zidong Guo, Seon-Min Rhee, Seungju Han, Jae-Joon Han. 4166-4175 [doi]
- PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference TransformerZitong Yu, Yuming Shen, Jingang Shi, Hengshuang Zhao, Philip H. S. Torr, Guoying Zhao. 4176-4186 [doi]
- GazeOnce: Real-Time Multi-Person Gaze EstimationMingfang Zhang, Yunfei Liu, Feng Lu 0005. 4187-4196 [doi]
- Generalizing Gaze Estimation with Rotation ConsistencyYiwei Bao, Yunfei Liu, Haofei Wang, Feng Lu. 4197-4206 [doi]
- Face Relighting with Geometrically Consistent ShadowsAndrew Hou, Michel Sarkis, Ning Bi, Yiying Tong, Xiaoming Liu 0002. 4207-4216 [doi]
- HairMapper: Removing Hair from Portraits Using GANsYiqian Wu, Yong-Liang Yang, Xiaogang Jin 0001. 4217-4226 [doi]
- Learning to Restore 3D Face from In-the-Wild Degraded ImagesZhenyu Zhang 0005, Yanhao Ge, Ying Tai, Xiaoming Huang, Chengjie Wang, Hao Tang, Dongjin Huang, Zhifeng Xie. 4227-4237 [doi]
- Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-LabelsYuchao Wang, Haochen Wang, Yujun Shen, Jingjing Fei, Wei Li, Guoqiang Jin, Liwei Wu, Rui Zhao 0018, Xinyi Le. 4238-4247 [doi]
- Perturbed and Strict Mean Teachers for Semi-supervised Semantic SegmentationYuyuan Liu, Yu Tian, Yuanhong Chen, Fengbei Liu, Vasileios Belagiannis, Gustavo Carneiro. 4248-4257 [doi]
- ST++: Make Self-training Work Better for Semi-supervised Semantic SegmentationLihe Yang, Wei Zhuo, Lei Qi 0001, Yinghuan Shi, Yang Gao 0001. 4258-4267 [doi]
- Beyond Semantic to Instance Segmentation: Weakly-Supervised Instance Segmentation via Semantic Knowledge Transfer and Self-RefinementBeomyoung Kim, Youngjoon Yoo, Chaeeun Rhee, Junmo Kim. 4268-4277 [doi]
- Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic SegmentationQi Chen, Lingxiao Yang, Jianhuang Lai, Xiaohua Xie. 4278-4288 [doi]
- Regional Semantic Contrast and Aggregation for Weakly Supervised Semantic SegmentationTianfei Zhou, Meijie Zhang, Fang Zhao, Jianwu Li. 4289-4299 [doi]
- Multi-class Token Transformer for Weakly Supervised Semantic SegmentationLian Xu, Wanli Ouyang, Mohammed Bennamoun, Farid Boussaïd, Dan Xu 0002. 4300-4309 [doi]
- Weakly Supervised Semantic Segmentation by Pixel-to-Prototype ContrastYe Du, Zehua Fu, Qingjie Liu, Yunhong Wang. 4310-4319 [doi]
- Threshold Matters in WSSS: Manipulating the Activation for the Robust and Accurate Segmentation Model Against ThresholdsMinhyun Lee, Dongseob Kim, Hyunjung Shim. 4320-4329 [doi]
- Novel Class Discovery in Semantic SegmentationYuyang Zhao, Zhun Zhong, Nicu Sebe, Gim Hee Lee. 4330-4339 [doi]
- Pin the Memory: Learning to Generalize Semantic SegmentationJin Kim, Jiyoung Lee, Jungin Park, Dongbo Min, Kwanghoon Sohn. 4340-4350 [doi]
- ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-high Resolution SegmentationShaohua Guo, Liang Liu, Zhenye Gan, Yabiao Wang, Wuhao Zhang, Chengjie Wang, Guannan Jiang, Wei Zhang, Ran Yi, Lizhuang Ma, Ke Xu. 4351-4360 [doi]
- Incremental Learning in Semantic Segmentation from Image LabelsFabio Cermelli, Dario Fontanel, Antonio Tavera, Marco Ciccone, Barbara Caputo. 4361-4371 [doi]
- Instance Segmentation with Mask-supervised Polygonal Boundary TransformersJustin Lazarow, Weijian Xu, Zhuowen Tu. 4372-4381 [doi]
- SharpContour: A Contour-based Boundary Refinement Approach for Efficient and Accurate Instance SegmentationChenming Zhu, Xuanye Zhang, Yanran Li, Liangdong Qiu, Kai Han 0001, Xiaoguang Han 0001. 4382-4391 [doi]
- Sparse Object-level Supervision for Instance Segmentation with Pixel EmbeddingsAdrian Wolny, Qin Yu 0005, Constantin Pape, Anna Kreshuk. 4392-4401 [doi]
- Mask Transfiner for High-Quality Instance SegmentationLei Ke, Martin Danelljan, Xia Li, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu. 4402-4411 [doi]
- Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise AffinityWeiyao Wang 0001, Matt Feiszli, Heng Wang, Jitendra Malik, Du Tran. 4412-4422 [doi]
- Sparse Instance Activation for Real-Time Instance SegmentationTianheng Cheng, Xinggang Wang, Shaoyu Chen, Wenqiang Zhang, Qian Zhang, Chang Huang, Zhaoxiang Zhang, Wenyu Liu 0001. 4423-4432 [doi]
- E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance SegmentationTao Zhang, Shiqing Wei, Shunping Ji. 4433-4442 [doi]
- Hyperbolic Image SegmentationMina Ghadimi Atigh, Julian Schoep, Erman Acar, Nanne van Noord, Pascal Mettes. 4443-4452 [doi]
- SeeThroughNet: Resurrection of Auxiliary Loss by Preserving Class Probability InformationDasol Han, Jaewook Yoo, Dokwan Oh. 4453-4462 [doi]
- CDGNet: Class Distribution Guided Network for Human ParsingKunliang Liu, Ouk Choi, Jianming Wang, Wonjun Hwang. 4463-4472 [doi]
- CLIMS: Cross Language Image Matching for Weakly Supervised Semantic SegmentationJinheng Xie, Xianxu Hou, Kai Ye 0004, LinLin Shen. 4473-4482 [doi]
- Sparse Non-local CRFOlga Veksler, Yuri Boykov. 4483-4493 [doi]
- Detecting Camouflaged Object in Frequency DomainYijie Zhong, Bo Li, Lv Tang, Senyun Kuang, Shuang Wu 0001, Shouhong Ding. 4494-4503 [doi]
- Progressive Minimal Path Method with Embedded CNNWei Liao. 4504-4512 [doi]
- Open-Set Text Recognition via Character-Context DecouplingChang Liu, Chun Yang, Xu-Cheng Yin. 4513-4522 [doi]
- Neural Collaborative Graph Machines for Table Structure RecognitionHao Liu 0003, Xin Li, Bing Liu, Deqiang Jiang, Yinsong Liu, Bo Ren 0002. 4523-4532 [doi]
- Revisiting Document Image Dewarping by Grid RegularizationXiangwei Jiang, Rujiao Long, Nan Xue 0001, Zhibo Yang, Cong Yao, Gui-Song Xia. 4533-4542 [doi]
- Syntax-Aware Network for Handwritten Mathematical Expression RecognitionYe Yuan, Xiao Liu, Wondimu Dikubab, Hui Liu, Zhilong Ji, Zhongqin Wu, Xiang Bai. 4543-4552 [doi]
- Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text DetectionJingqun Tang, Wenqing Zhang, Hongye Liu, Mingkun Yang, Bo Jiang, Guanglong Hu, Xiang Bai. 4553-4562 [doi]
- Fourier Document Restoration for Robust Document Dewarping and RecognitionChuhui Xue, Zichen Tian, Fangneng Zhan, Shijian Lu, Song Bai. 4563-4572 [doi]
- XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document UnderstandingZhangxuan Gu, Changhua Meng, Ke Wang, Jun Lan, Weiqiang Wang, Ming Gu, Liqing Zhang 0001. 4573-4582 [doi]
- SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text RecognitionMingxin Huang, Yuliang Liu, Zhenghao Peng, Chongyu Liu, Dahua Lin, Shenggao Zhu, Nicholas Yuan, Kai Ding, Lianwen Jin. 4583-4593 [doi]
- Towards Weakly-Supervised Text Spotting using a Multi-Task TransformerYair Kittenplon, Inbal Lavi, Sharon Fogel, Yarin Bar, R. Manmatha, Pietro Perona. 4594-4603 [doi]
- TableFormer: Table Structure Understanding with TransformersAhmed S. Nassar, Nikolaos Livathinos, Maksym Lysak, Peter W. J. Staar. 4604-4613 [doi]
- Knowledge Mining with Scene Text for Fine-Grained RecognitionHao Wang, Junchao Liao, Tianheng Cheng, Zewen Gao, Hao Liu, Bo Ren, Xiang Bai, Wenyu Liu 0001. 4614-4623 [doi]
- PubTables-1M: Towards comprehensive table extraction from unstructured documentsBrandon Smock, Rohith Pesala, Robin Abraham. 4624-4632 [doi]
- Focal and Global Knowledge Distillation for DetectorsZhendong Yang, Zhe Li, Xiaohu Jiang, Yuan Gong, Zehuan Yuan, Danpei Zhao, Chun Yuan. 4633-4642 [doi]
- Speed up Object Detection on Gigapixel-level Images with Patch ArrangementJiahao Fan, Huabin Liu 0001, Wenjie Yang, John See, Aixin Zhang, Weiyao Lin. 4643-4653 [doi]
- Training Object Detectors from Scratch: An Empirical Study in the Era of Vision TransformerWeixiang Hong, Jiangwei Lao, Wang Ren, Jian Wang, Jingdong Chen, Wei Chu. 4652-4661 [doi]
- Learning with Neighbor Consistency for Noisy LabelsAhmet Iscen, Jack Valmadre, Anurag Arnab, Cordelia Schmid. 4662-4671 [doi]
- Meta Convolutional Neural Networks for Single Domain GeneralizationChaoqun Wan, Xu Shen, Yonggang Zhang, Zhiheng Yin, Xinmei Tian 0001, Feng Gao, Jianqiang Huang, Xian-Sheng Hua 0001. 4672-4681 [doi]
- Dual Cross-Attention Learning for Fine-Grained Visual Categorization and Object Re-IdentificationHaowei Zhu, Wenjing Ke, Dong Li, Ji Liu, Lu Tian, Yi Shan. 4682-4692 [doi]
- Geometry-Aware Guided Loss for Deep Crack RecognitionZhuangzhuang Chen, Jin Zhang, Zhuonan Lai, Jie Chen 0027, Zun Liu, Jianqiang Li 0001. 4693-4702 [doi]
- Segment, Magnify and Reiterate: Detecting Camouflaged Objects the Hard WayQi Jia 0001, Shuilian Yao, Yu Liu 0012, Xin Fan 0001, Risheng Liu, Zhongxuan Luo. 4703-4712 [doi]
- Dynamic Sparse R-CNNQinghang Hong, Fengming Liu, Dong Li, Ji Liu, Lu Tian, Yi Shan. 4713-4722 [doi]
- Deep Hybrid Models for Out-of-Distribution DetectionSenqi Cao, Zhongfei Zhang. 4723-4733 [doi]
- AutoLoss-GMS: Searching Generalized Margin-based Softmax Loss Function for Person Re-identificationHongyang Gu, Jianmin Li 0001, Guangyuan Fu, Chifong Wong, Xinghao Chen, Jun Zhu 0001. 4734-4743 [doi]
- Feature Erasing and Diffusion Network for Occluded Person Re-IdentificationZhikang Wang, Feng Zhu, Shixiang Tang, Rui Zhao 0001, Lihuo He, Jiangning Song. 4744-4753 [doi]
- Multi-label Classification with Partial Annotations using Class-aware Selective LossEmanuel Ben Baruch, Tal Ridnik, Itamar Friedman, Avi Ben-Cohen, Nadav Zamir, Asaf Noy, Lihi Zelnik-Manor. 4754-4762 [doi]
- BoxeR: Box-Attention for 2D and 3D TransformersDuy-Kien Nguyen, Jihong Ju, Olaf Booij, Martin R. Oswald, Cees G. M. Snoek. 4763-4772 [doi]
- Multi-label Iterated Learning for Image Classification with Label AmbiguitySai Rajeswar, Pau Rodríguez, Soumye Singhal, David Vázquez 0001, Aaron C. Courville. 4773-4783 [doi]
- Vision Transformer with Deformable AttentionZhuofan Xia, Xuran Pan, Shiji Song, Li Erran Li, Gao Huang. 4784-4793 [doi]
- MViTv2: Improved Multiscale Vision Transformers for Classification and DetectionYanghao Li, Chao-Yuan Wu, Haoqi Fan 0001, Karttikeya Mangalam, Bo Xiong, Jitendra Malik, Christoph Feichtenhofer. 4794-4804 [doi]
- Dense Learning based Semi-Supervised Object DetectionBinghui Chen, Pengyu Li, Xiang Chen, Biao Wang, Lei Zhang, Xian-Sheng Hua 0001. 4805-4814 [doi]
- R(Det)²: Randomized Decision Routing for Object DetectionYali Li 0001, Shengjin Wang. 4815-4824 [doi]
- GlideNet: Global, Local and Intrinsic based Dense Embedding NETwork for Multi-category Attributes PredictionKareem Metwaly, Aerin Kim, Elliot Branson, Vishal Monga. 4825-4836 [doi]
- Self-Supervised Equivariant Learning for Oriented Keypoint DetectionJongmin Lee, Byungjin Kim, Minsu Cho. 4837-4847 [doi]
- Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity ClassificationJingzhou Chen, Peng Wang, Jian Liu, Yuntao Qian. 4848-4857 [doi]
- Object Localization under Single Coarse Point SupervisionXuehui Yu, Pengfei Chen, Di Wu, Najmul Hassan, Guorong Li, Junchi Yan, Humphrey Shi, Qixiang Ye, Zhenjun Han. 4858-4867 [doi]
- Rethinking Visual Geo-localization for Large-Scale ApplicationsGabriele Moreno Berton, Carlo Masone, Barbara Caputo. 4868-4878 [doi]
- Whose Hands are These? Hand Detection and Hand-Body Association in the WildSupreeth Narasimhaswamy, Thanh Nguyen, Mingzhen Huang, Minh Hoai. 4879-4889 [doi]
- Cloning Outfits from Real-World Images to 3D Characters for Generalizable Person Re-IdentificationYanan Wang, Xuezhi Liang, ShengCai Liao. 4890-4899 [doi]
- Towards Unsupervised Domain GeneralizationXingxuan Zhang, Linjun Zhou, Renzhe Xu, Peng Cui 0001, Zheyan Shen, Haoxin Liu. 4900-4910 [doi]
- ViM: Out-Of-Distribution with Virtual-logit MatchingHaoqi Wang, Zhizhong Li 0002, Litong Feng, Wayne Zhang. 4911-4920 [doi]
- Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization SpaceArnav Chavan, Zhiqiang Shen, Zhuang Liu 0003, Zechun Liu, Kwang-Ting Cheng, Eric P. Xing. 4921-4931 [doi]
- Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through EstimationZechun Liu, Kwang-Ting Cheng, Dong Huang, Eric P. Xing, Zhiqiang Shen. 4932-4942 [doi]
- Align and Prompt: Video-and-Language Pre-training with Entity PromptsDongxu Li, Junnan Li 0001, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi. 4943-4953 [doi]
- Language-Bridged Spatial-Temporal Interaction for Referring Video Object SegmentationZihan Ding, Tianrui Hui, Junshi Huang, Xiaoming Wei, Jizhong Han, Si Liu 0001. 4954-4963 [doi]
- Language as Queries for Referring Video Object SegmentationJiannan Wu, Yi Jiang, Peize Sun, Zehuan Yuan, Ping Luo 0002. 4964-4974 [doi]
- End-to-End Referring Video Object Segmentation with Multimodal TransformersAdam Botach, Evgenii Zheltonozhskii, Chaim Baskin. 4975-4985 [doi]
- Multi-Level Representation Learning with Semantic Alignment for Referring Video Object SegmentationDongming Wu, Xingping Dong, Ling Shao 0001, Jianbing Shen. 4986-4995 [doi]
- X-Pool: Cross-Modal Language-Video Attention for Text-Video RetrievalSatya Krishna Gorti, Noël Vouitsis, Junwei Ma, Keyvan Golestan, Maksims Volkovs, Animesh Garg, Guangwei Yu. 4996-5005 [doi]
- Video-Text Representation Learning via Differentiable Weak Temporal AlignmentDohwan Ko, Joonmyung Choi, Juyeon Ko, Shinyeong Noh, Kyoung-woon On, Eun-Sol Kim, Hyunwoo J. Kim. 5006-5015 [doi]
- MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio DescriptionsMattia Soldan, Alejandro Pardo, Juan León Alcázar, Fabian Caba Heilbron, Chen Zhao 0002, Silvio Giancola, Bernard Ghanem. 5016-5025 [doi]
- Advancing High-Resolution Video-Language Representation with Large-Scale Video TranscriptionsHongwei Xue, Tiankai Hang, Yanhong Zeng, Yuchong Sun, Bei Liu 0001, Huan Yang 0005, Jianlong Fu, Baining Guo. 5026-5035 [doi]
- Measuring Compositional Consistency for Video Question AnsweringMona Gandhi, Mustafa Omer Gul, Eva Prakash, Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala. 5036-5045 [doi]
- Sim VQA: Exploring Simulated Environments for Visual Question AnsweringPaola Cascante-Bonilla, Hui Wu, Letao Wang, Rogério Feris, Vicente Ordonez. 5046-5056 [doi]
- Transform-Retrieve-Generate: Natural Language-Centric Outside-Knowledge Visual Question AnsweringFeng Gao 0013, Qing-ping, Govind Thattai, Aishwarya N. Reganti, Ying Nian Wu, Prem Natarajan. 5057-5067 [doi]
- SwapMix: Diagnosing and Regularizing the Over-Reliance on Visual Context in Visual Question AnsweringVipul Gupta, Zhuowan Li, Adam Kortylewski, Chenyu Zhang, Yingwei Li, Alan L. Yuille. 5068-5078 [doi]
- MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question AnsweringYang Ding, Jing Yu, Bang Liu, Yue Hu, Mingxin Cui, Qi Wu 0001. 5079-5088 [doi]
- Maintaining Reasoning Consistency in Compositional Visual Question AnsweringChenchen Jing, Yunde Jia, Yuwei Wu, Xinyu Liu, Qi Wu. 5089-5098 [doi]
- MLSLT: Towards Multilingual Sign Language TranslationAoxiong Yin, Zhou Zhao, Weike Jin, Meng Zhang, Xingshan Zeng, Xiaofei He 0001. 5099-5109 [doi]
- A Simple Multi-Modality Transfer Learning Baseline for Sign Language TranslationYutong Chen, Fangyun Wei, Xiao Sun, Zhirong Wu, Stephen Lin 0001. 5110-5120 [doi]
- C²SLR: Consistency-enhanced Continuous Sign Language RecognitionRonglai Zuo, Brian Mak. 5121-5130 [doi]
- Signing at Scale: Learning to Co-Articulate Signs for Large-Scale Photo-Realistic Sign Language ProductionBen Saunders, Necati Cihan Camgöz, Richard Bowden. 5131-5141 [doi]
- Generating Diverse and Natural 3D Human Motions from TextChuan Guo, Shihao Zou, Xinxin Zuo, Sen Wang 0003, Wei Ji, Xingyu Li, Li Cheng 0001. 5142-5151 [doi]
- Sub-word Level Lip Reading With Visual AttentionK. R. Prajwal, Triantafyllos Afouras, Andrew Zisserman. 5152-5162 [doi]
- Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at ScaleRam Ramrakhya, Eric Undersander, Dhruv Batra, Abhishek Das. 5163-5173 [doi]
- ViSTA: Vision and Scene Text Aggregation for Cross-Modal RetrievalMengjun Cheng, Yipeng Sun, Longchao Wang, Xiongwei Zhu, Kun Yao, Jie Chen, Guoli Song, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang 0001. 5174-5183 [doi]
- Cross Modal Retrieval with Querybank NormalisationSimion-Vlad Bogolin, Ioana Croitoru, Hailin Jin, Yang Liu, Samuel Albanie. 5184-5195 [doi]
- Prompt Distribution LearningYuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian 0001. 5196-5205 [doi]
- VALHALLA: Visual Hallucination for Machine TranslationYi Li, Rameswar Panda, Yoon Kim, Chun-Fu Richard Chen, Rogério Feris, David D. Cox, Nuno Vasconcelos. 5206-5216 [doi]
- VL-ADAPTER: Parameter-Efficient Transfer Learning for Vision-and-Language TasksYi-Lin Sung, Jaemin Cho 0001, Mohit Bansal. 5217-5227 [doi]
- Winoground: Probing Vision and Language Models for Visio-Linguistic CompositionalityTristan Thrush, Ryan Jiang, Max Bartolo, Amanpreet Singh, Adina Williams, Douwe Kiela, Candace Ross. 5228-5238 [doi]
- MixFormer: Mixing Features across Windows and DimensionsQiang Chen, Qiman Wu, Jian Wang, Qinghao Hu, Tao Hu, Errui Ding, Jian Cheng 0001, Jingdong Wang 0001. 5239-5249 [doi]
- Recurrent Glimpse-based Decoder for Detection with TransformerZhe Chen, Jing Zhang, Dacheng Tao. 5250-5259