- Lane Detection Transformer Based on Multi-frame Horizontal and Vertical Attention and Visual Transformer Module. Han Zhang, Yunchao Gu, Xinliang Wang, JunJun Pan, Minghui Wang. 1-16
- ProposalContrast: Unsupervised Pre-training for LiDAR-Based 3D Object Detection. Junbo Yin, Dingfu Zhou, Liangjun Zhang, Jin Fang, Cheng-Zhong Xu 0001, Jianbing Shen, Wenguan Wang. 17-33
- PreTraM: Self-supervised Pre-training via Connecting Trajectory and Map. Chenfeng Xu, Tian Li, Chen Tang, Lingfeng Sun, Kurt Keutzer, Masayoshi Tomizuka, Alireza Fathi, Wei Zhan. 34-50
- Master of All: Simultaneous Generalization of Urban-Scene Segmentation to All Adverse Weather Conditions. Nikhil Reddy, Abhinav Singhal, Abhishek Kumar, Mahsa Baktashmotlagh, Chetan Arora 0001. 51-69
- LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds. Minghua Liu, Yin Zhou, Charles R. Qi, Boqing Gong, Hao Su 0001, Dragomir Anguelov. 70-89
- Visual Cross-View Metric Localization with Dense Uncertainty Estimates. Zimin Xia, Olaf Booij, Marco Manfredi, Julian F. P. Kooij. 90-106
- V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer. Runsheng Xu, Hao Xiang, Zhengzhong Tu, Xin Xia 0007, Ming-Hsuan Yang 0001, Jiaqi Ma. 107-124
- DevNet: Self-supervised Monocular Depth Learning via Density Volume Construction. Kaichen Zhou, Lanqing Hong, Changhao Chen, Hang Xu, Chaoqiang Ye, Qingyong Hu, Zhenguo Li. 125-142
- Action-Based Contrastive Learning for Trajectory Prediction. Marah Halawa, Olaf Hellwich, Pia Bideau. 143-159
- Radatron: Accurate Detection Using Multi-resolution Cascaded MIMO Radar. Sohrab Madani, Jayden Guan, Waleed Ahmed, Saurabh Gupta 0001, Haitham Hassanieh. 160-178
- LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection. Yi Wei, Zibu Wei, Yongming Rao, Jiaxin Li, Jie Zhou 0001, Jiwen Lu. 179-195
- Efficient Point Cloud Segmentation with Geometry-Aware Sparse Networks. Maosheng Ye, Rui Wan, Shuangjie Xu, Tongyi Cao, Qifeng Chen. 196-212
- FH-Net: A Fast Hierarchical Network for Scene Flow Estimation on Real-World Point Clouds. Lihe Ding, Shaocong Dong, Tingfa Xu, Xinli Xu, Jie Wang, Jianan Li. 213-229
- SpatialDETR: Robust Scalable Transformer-Based 3D Object Detection From Multi-view Camera Images With Global Cross-Sensor Attention. Simon Doll, Richard Schulz, Lukas Schneider, Viviane Benzin, Markus Enzweiler, Hendrik P. A. Lensch. 230-245
- Pixel-Wise Energy-Biased Abstention Learning for Anomaly Segmentation on Complex Urban Driving Scenes. Yu Tian, Yuyuan Liu, Guansong Pang, Fengbei Liu, Yuanhong Chen, Gustavo Carneiro. 246-263
- Rethinking Closed-Loop Training for Autonomous Driving. Chris Zhang, Runsheng Guo 0003, Wenyuan Zeng, Yuwen Xiong, Binbin Dai, Rui Hu 0001, Mengye Ren, Raquel Urtasun. 264-282
- SLiDE: Self-supervised LiDAR De-snowing Through Reconstruction Difficulty. Gwangtak Bae, Byungjun Kim, Seongyong Ahn, Jihong Min, Inwook Shim. 283-300
- Generative Meta-Adversarial Network for Unseen Object Navigation. Sixian Zhang, Weijie Li, Xinhang Song, Yubing Bai, Shuqiang Jiang. 301-320
- Object Manipulation via Visual Target Localization. Kiana Ehsani, Ali Farhadi, Aniruddha Kembhavi, Roozbeh Mottaghi. 321-337
- MoDA: Map Style Transfer for Self-supervised Domain Adaptation of Embodied Agents. Eun Sun Lee, Junho Kim, Sangwon Park, Young Min Kim 0001. 338-354
- Housekeep: Tidying Virtual Households Using Commonsense Reasoning. Yash Kant, Arun Ramachandran, Sriram Yenamandra, Igor Gilitschenski, Dhruv Batra, Andrew Szot, Harsh Agrawal. 355-373
- Domain Randomization-Enhanced Depth Simulation and Restoration for Perceiving and Grasping Specular and Transparent Objects. Qiyu Dai, Jiyao Zhang, Qiwei Li, Tianhao Wu, Hao Dong 0003, Ziyuan Liu, Ping Tan, He Wang. 374-391
- Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction. Chia-Chi Chuang, Donglin Yang, Chuan Wen, Yang Gao 0029. 392-409
- OPD: Single-View 3D Openable Part Detection. Hanxiao Jiang 0001, Yongsen Mao, Manolis Savva, Angel X. Chang. 410-426
- AirDet: Few-Shot Detection Without Fine-Tuning for Autonomous Exploration. Bowen Li, Chen Wang 0033, Pranay Reddy, Seungchan Kim, Sebastian Scherer. 427-444
- TransGrasp: Grasp Pose Estimation of a Category of Objects by Transferring Grasps from Only One Labeled Instance. Hongtao Wen, Jianhang Yan, Wanli Peng, Yi Sun 0009. 445-461
- StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning. Jinghuan Shang, Kumara Kahatapitiya, Xiang Li, Michael S. Ryoo. 462-479
- TIDEE: Tidying Up Novel Rooms Using Visuo-Semantic Commonsense Priors. Gabriel Sarch, Zhaoyuan Fang, Adam W. Harley, Paul Schydlo, Michael J. Tarr, Saurabh Gupta 0001, Katerina Fragkiadaki. 480-496
- Learning Efficient Multi-agent Cooperative Visual Exploration. Chao Yu, Xinyi Yang, Jiaxuan Gao, Huazhong Yang, Yu Wang, Yi Wu. 497-515
- Zero-Shot Category-Level Object Pose Estimation. Walter Goodwin, Sagar Vaze, Ioannis Havoutis, Ingmar Posner. 516-532
- Sim-to-Real 6D Object Pose Estimation via Iterative Self-training for Robotic Bin Picking. Kai Chen, Rui Cao, Stephen James, Yichuan Li 0002, Yun-Hui Liu, Pieter Abbeel, Qi Dou. 533-550
- Active Audio-Visual Separation of Dynamic Sound Sources. Sagnik Majumder, Kristen Grauman. 551-569
- DexMV: Imitation Learning for Dexterous Manipulation from Human Videos. Yuzhe Qin, Yueh-Hua Wu, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, Xiaolong Wang 0004. 570-587
- Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments. Jacob Krantz, Stefan Lee. 588-603
- Style-Agnostic Reinforcement Learning. Juyong Lee, Seokjun Ahn, Jaesik Park. 604-620
- Self-supervised Interactive Object Segmentation Through a Singulation-and-Grasping Approach. Houjian Yu, Changhyun Choi. 621-637
- Learning from Unlabeled 3D Environments for Vision-and-Language Navigation. Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev. 638-655
- BodySLAM: Joint Camera Localisation, Mapping, and Human Motion Tracking. Dorian Henning, Tristan Laidlow, Stefan Leutenegger. 656-673
- FusionVAE: A Deep Hierarchical Variational Autoencoder for RGB Image Fusion. Fabian Duffhauss, Ngo Anh Vien, Hanna Ziesche, Gerhard Neumann. 674-691
- Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning. Chi Zhang 0017, Sirui Xie, Baoxiong Jia, Ying Nian Wu, Song Chun Zhu, Yixin Zhu. 692-709
- Video Dialog as Conversation About Objects Living in Space-Time. Hoang-Anh Pham, Thao Minh Le, Vuong Le, Tu Minh Phuong, Truyen Tran 0001. 710-726