- LGM: Large Multi-view Gaussian Model for High-Resolution 3D Content Creation. Jiaxiang Tang, Zhaoxi Chen 0009, Xiaokang Chen, Tengfei Wang 0002, Gang Zeng, Ziwei Liu 0002. 1-18
- Mahalanobis Distance-Based Multi-view Optimal Transport for Multi-view Crowd Localization. Qi Zhang 0041, Kaiyi Zhang, Antoni B. Chan, Hui Huang 0004. 19-36
- RAW-Adapter: Adapting Pre-trained Visual Model to Camera RAW Images. Ziteng Cui, Tatsuya Harada. 37-56
- SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic. Kashyap Chitta, Daniel Dauner, Andreas Geiger 0001. 57-74
- AFreeCA: Annotation-Free Counting for All. Adriano C. D'Alessandro, Ali Mahdavi-Amiri, Ghassan Hamarneh. 75-91
- Adversarially Robust Distillation by Reducing the Student-Teacher Variance Gap. Junhao Dong, Piotr Koniusz, Junxi Chen, Yew-Soon Ong. 92-111
- LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation. Yushi Lan, Fangzhou Hong, Shuai Yang 0001, Shangchen Zhou, Xuyi Meng, Bo Dai 0002, Xingang Pan, Chen Change Loy. 112-130
- Hierarchical Temporal Context Learning for Camera-Based Semantic Scene Completion. Bohan Li, Jiajun Deng, Wenyao Zhang, Zhujin Liang, Dalong Du, Xin Jin 0014, Wenjun Zeng. 131-148
- Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration. Xueyang Kang, Zhaoliang Luan, Kourosh Khoshelham, Bing Wang 0013. 149-167
- GTP-4o: Modality-Prompted Heterogeneous Graph Learning for Omni-Modal Biomedical Representation. Chenxin Li, Xinyu Liu 0001, Cheng Wang, Yifan Liu 0010, Weihao Yu, Jing Shao, Yixuan Yuan. 168-187
- PromptCCD: Learning Gaussian Mixture Prompt Pool for Continual Category Discovery. Fernando Julio Cendra, Bingchen Zhao, Kai Han 0001. 188-205
- Sapiens: Foundation for Human Vision Models. Rawal Khirodkar, Timur M. Bagautdinov, Julieta Martinez, Su Zhaoen, Austin James, Peter Selednik, Stuart Anderson, Shunsuke Saito. 206-228
- Linearly Controllable GAN: Unsupervised Feature Categorization and Decomposition for Image Generation and Manipulation. Sehyung Lee, Mijung Kim, Yeongnam Chae, Björn Stenger. 229-245
- Generating Human Interaction Motions in Scenes with Text Control. Hongwei Yi, Justus Thies, Michael J. Black, Xue Bin Peng, Davis Rempe. 246-263
- NOVUM: Neural Object Volumes for Robust Object Classification. Artur Jesslen, Guofeng Zhang 0025, Angtian Wang, Wufei Ma, Alan L. Yuille, Adam Kortylewski. 264-281
- Align Before Collaborate: Mitigating Feature Misalignment for Robust Multi-agent Perception. Kun Yang 0010, Dingkang Yang, Ke Li, Dongling Xiao, Zedian Shao, Peng Sun 0007, Liang Song. 282-299
- HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects. Xintao Lv, Liang Xu, Yichao Yan, Xin Jin 0014, Congsheng Xu, Shuwen Wu, Yifan Liu, Lincheng Li, Mengxiao Bi, Wenjun Zeng, Xiaokang Yang. 300-318
- SAIR: Learning Semantic-Aware Implicit Representation. Canyu Zhang, Xiaoguang Li, Qing Guo, Song Wang. 319-335
- ColorMNet: A Memory-Based Deep Spatial-Temporal Feature Propagation Network for Video Colorization. Yixin Yang, Jiangxin Dong, Jinhui Tang 0001, Jinshan Pan. 336-352
- UNIC: Universal Classification Models via Multi-teacher Distillation. Mert Bülent Sariyildiz, Philippe Weinzaepfel, Thomas Lucas 0002, Diane Larlus, Yannis Kalantidis. 353-371
- Instance-Dependent Noisy-Label Learning with Graphical Model Based Noise-Rate Estimation. Arpit Garg, Cuong Nguyen 0006, Rafael Felix, Thanh-Toan Do, Gustavo Carneiro 0001. 372-389
- Eliminating Warping Shakes for Unsupervised Online Video Stitching. Lang Nie, Chunyu Lin, Kang Liao, Yun Zhang 0024, Shuaicheng Liu, Rui Ai 0001, Yao Zhao 0001. 390-407
- Vary: Scaling up the Vision Vocabulary for Large Vision-Language Model. Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, Jinrong Yang, Jianjian Sun, Chunrui Han, Xiangyu Zhang 0005. 408-424
- Merlin: Empowering Multimodal LLMs with Foresight Minds. En Yu, Liang Zhao, Yana Wei, Jinrong Yang, Dongming Wu, Lingyu Kong, Haoran Wei, Tiancai Wang, Zheng Ge, Xiangyu Zhang 0005, Wenbing Tao. 425-443
- ViC-MAE: Self-supervised Representation Learning from Images and Video with Contrastive Masked Autoencoders. Jefferson Hernandez, Ruben Villegas, Vicente Ordonez. 444-463
- E.T. the Exceptional Trajectories: Text-to-Camera-Trajectory Generation with Character Awareness. Robin Courant, Nicolas Dufour, Xi Wang, Marc Christie, Vicky Kalogeiton. 464-480
- OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding. Ming Hu, Peng Xia, Lin Wang 0027, Siyuan Yan, Feilong Tang, Zhongxing Xu, Yimin Luo, Kaimin Song, Jürgen Leitner, Xuelian Cheng, Jun Cheng, Chi Liu, Kaijing Zhou, ZongYuan Ge. 481-500