- CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion. Wendi Zheng, Jiayan Teng, Zhuoyi Yang, Weihan Wang, Jidong Chen, Xiaotao Gu, Yuxiao Dong, Ming Ding 0004, Jie Tang 0001. 1-22
- SiT: Exploring Flow and Diffusion-Based Generative Models with Scalable Interpolant Transformers. Nanye Ma, Mark Goldstein, Michael S. Albergo, Nicholas M. Boffi, Eric Vanden-Eijnden, Saining Xie. 23-40
- Learn to Memorize and to Forget: A Continual Learning Perspective of Dynamic SLAM. Baicheng Li, Zike Yan, Dong Wu, Hanqing Jiang, Hongbin Zha. 41-57
- Forecasting Future Videos from Novel Views via Disentangled 3D Scene Representation. Sudhir Yarram, Junsong Yuan 0001. 58-76
- GMM-IKRS: Gaussian Mixture Models for Interpretable Keypoint Refinement and Scoring. Emanuele Santellani, Martin Zach, Christian Sormann, Mattia Rossi, Andreas Kuhn 0002, Friedrich Fraundorfer. 77-93
- Get Your Embedding Space in Order: Domain-Adaptive Regression for Forest Monitoring. Sizhuo Li, Dimitri Gominski, Martin Brandt, Xiaoye Tong, Philippe Ciais. 94-111
- ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion. Daniel Winter, Matan Cohen, Shlomi Fruchter, Yael Pritch, Alex Rav-Acha, Yedid Hoshen. 112-129
- CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning. Ziyang Gong, Fuhao Li, Yupeng Deng 0002, Deblina Bhattacharjee, Xianzheng Ma, Xiangwei Zhu, Zhenming Ji. 130-148
- Curved Diffusion: A Generative Model with Optical Geometry Control. Andrey Voynov, Amir Hertz, Moab Arar, Shlomi Fruchter, Daniel Cohen-Or. 149-164
- Mini-Splatting: Representing Scenes with a Constrained Number of Gaussians. Guangchi Fang, Bing Wang 0013. 165-181
- MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture Synthesis. Ziming Zhong, Yanyu Xu, Jing Li, Jiale Xu, Zhengxin Li, Chaohui Yu, Shenghua Gao. 182-199
- OTSeg: Multi-Prompt Sinkhorn Attention for Zero-Shot Semantic Segmentation. Kwanyoung Kim, Yujin Oh, Jong Chul Ye. 200-217
- Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures. Yannick Kirchhoff, Maximilian R. Rokuss, Saikat Roy, Balint Kovacs, Constantin Ulrich, Tassilo Wald, Maximilian Zenk, Philipp Vollmuth, Jens Kleesiek, Fabian Isensee, Klaus H. Maier-Hein. 218-234
- Conceptual Codebook Learning for Vision-Language Models. Yi Zhang, Ke Yu, Siqi Wu, Zhihai He. 235-251
- LingoQA: Visual Question Answering for Autonomous Driving. Ana-Maria Marcu, Long Chen 0015, Jan Hünermann, Alice Karnsund, Benoît Hanotte, Prajwal Chidananda, Saurabh Nair, Vijay Badrinarayanan, Alex Kendall, Jamie Shotton, Elahe Arani, Oleg Sinavski. 252-269
- AnimateMe: 4D Facial Expressions via Diffusion Models. Dimitrios Gerogiannis, Foivos Paraperas Papantoniou, Rolandos-Alexandros Potamias, Alexandros Lattas, Stylianos Moschoglou, Stylianos Ploumpis, Stefanos Zafeiriou. 270-287
- HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning. Zhecan Wang, Garrett Bingham, Adams Wei Yu, Quoc V. Le, Thang Luong, Golnaz Ghiasi. 288-304
- LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis. Kevin Xie, Jonathan Lorraine, Tianshi Cao, Jun Gao 0004, James Lucas, Antonio Torralba 0001, Sanja Fidler, Xiaohui Zeng. 305-322
- PreSight: Enhancing Autonomous Vehicle Perception with City-Scale NeRF Priors. Tianyuan Yuan, Yucheng Mao, Jiawei Yang, Yicheng Liu, Yue Wang, Hang Zhao. 323-339
- Unveiling and Mitigating Memorization in Text-to-Image Diffusion Models Through Cross Attention. Jie Ren 0019, Yaxin Li 0001, Shenglai Zeng, Han Xu 0002, Lingjuan Lyu, Yue Xing 0002, Jiliang Tang. 340-356
- iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning. Tom Fischer, Yaoyao Liu 0001, Artur Jesslen, Noor Ahmed, Prakhar Kaushik, Angtian Wang, Alan L. Yuille, Adam Kortylewski, Eddy Ilg. 357-374
- Context Diffusion: In-Context Aware Image Generation. Ivona Najdenkoska, Animesh Sinha, Abhimanyu Dubey, Dhruv Mahajan 0001, Vignesh Ramanathan, Filip Radenovic. 375-391
- Pose-Guided Fine-Grained Sign Language Video Generation. Tongkai Shi, Lianyu Hu 0003, Fanhua Shang, Jichao Feng, Peidong Liu, Wei Feng 0005. 392-409
- RAP: Retrieval-Augmented Planner for Adaptive Procedure Planning in Instructional Videos. Ali Zare, Yulei Niu, Hammad A. Ayyubi, Shih-Fu Chang. 410-426
- Certifiably Robust Image Watermark. Zhengyuan Jiang, Moyang Guo, Yuepeng Hu, Jinyuan Jia 0001, Neil Zhenqiang Gong. 427-443
- Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery. Sukrut Rao, Sweta Mahajan, Moritz Böhle, Bernt Schiele. 444-461
- Online Zero-Shot Classification with CLIP. Qi Qian 0001, Juhua Hu. 462-477