- GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation. Yinghao Xu, Zifan Shi, Yifan Wang 0011, Hansheng Chen 0001, Ceyuan Yang, Sida Peng, Yujun Shen, Gordon Wetzstein. 1-20
- IRGen: Generative Modeling for Image Retrieval. Yidan Zhang, Ting Zhang 0002, Dong Chen 0003, Yujing Wang, Qi Chen, Xing Xie 0001, Hao Sun 0015, Weiwei Deng, Qi Zhang, Fan Yang, Mao Yang, Qingmin Liao, Jingdong Wang, Baining Guo. 21-41
- Learning Trimodal Relation for Audio-Visual Question Answering with Missing Modality. Kyu Ri Park, Hong Joo Lee 0001, Jung-Uk Kim. 42-59
- FastCAD: Real-Time CAD Retrieval and Alignment from Scans and Videos. Florian Langer, Jihong Ju, Georgi Dikov, Gerhard Reitmayr, Mohsen Ghafoorian. 60-77
- A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting. Wouter Van Gansbeke, Bert De Brabandere. 78-97
- VISA: Reasoning Video Object Segmentation via Large Language Models. Cilin Yan, Haochen Wang, Shilin Yan, Xiaolong Jiang, Yao Hu, Guoliang Kang, Weidi Xie, Efstratios Gavves. 98-115
- Lego: Learning to Disentangle and Invert Personalized Concepts Beyond Object Appearance in Text-to-Image Diffusion Models. Saman Motamed, Danda Pani Paudel, Luc Van Gool. 116-133
- IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation. Yuanhao Zhai 0001, Kevin Lin, Linjie Li, Chung-Ching Lin, Jianfeng Wang, Zhengyuan Yang, David S. Doermann, Junsong Yuan 0001, Zicheng Liu 0001, Lijuan Wang. 134-152
- Scaling Backwards: Minimal Synthetic Pre-Training? Ryo Nakamura, Ryu Tadokoro, Ryosuke Yamada, Yuki M. Asano, Iro Laina, Christian Rupprecht 0001, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka. 153-171
- BAMM: Bidirectional Autoregressive Motion Model. Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang 0001, Minwoo Lee 0001, Srijan Das, Chen Chen 0001. 172-190
- Event-Based Head Pose Estimation: Benchmark and Method. Jiahui Yuan, Hebei Li, Yansong Peng, Jin Wang, Yuheng Jiang, Yueyi Zhang, Xiaoyan Sun 0001. 191-208
- Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos. Ekta Prashnani, Koki Nagano, Shalini De Mello, David Luebke, Orazio Gallo. 209-228
- Towards Multi-modal Transformers in Federated Learning. Guangyu Sun, Matias Mendieta, Aritra Dutta, Xin Li 0022, Chen Chen 0001. 229-246
- Fisher Calibration for Backdoor-Robust Heterogeneous Federated Learning. Wenke Huang, Mang Ye, Zekun Shi, Bo Du 0001, Dacheng Tao. 247-265
- QueryCDR: Query-Based Controllable Distortion Rectification Network for Fisheye Images. Pengbo Guo, Chengxu Liu, Xingsong Hou, Xueming Qian. 266-284
- Latent-INR: A Flexible Framework for Implicit Representations of Videos with Discriminative Semantics. Shishira R. Maiya, Anubhav Gupta, Matthew Gwilliam, Max Ehrlich, Abhinav Shrivastava. 285-302
- DCDM: Diffusion-Conditioned-Diffusion Model for Scene Text Image Super-Resolution. Shrey Singh, Prateek Keserwani, Masakazu Iwamura, Partha Pratim Roy 0001. 303-320
- Per-Gaussian Embedding-Based Deformation for Deformable 3D Gaussian Splatting. Jeongmin Bae, Seoha Kim, Youngsik Yun, Hahyun Lee, Gun Bang, Youngjung Uh. 321-335
- DreamMover: Leveraging the Prior of Diffusion Models for Image Interpolation with Large Motion. Liao Shen, Tianqi Liu 0003, Huiqiang Sun, Xinyi Ye, Baopu Li, Jianming Zhang 0001, Zhiguo Cao 0001. 336-353
- CoLA: Conditional Dropout and Language-Driven Robust Dual-Modal Salient Object Detection. Shuang Hao 0015, Chunlin Zhong, He Tang. 354-371
- Image-Feature Weak-to-Strong Consistency: An Enhanced Paradigm for Semi-supervised Learning. Zhiyu Wu, Jinshi Cui. 372-388
- RPBG: Towards Robust Neural Point-Based Graphics in the Wild. Qingtian Zhu, Zizhuang Wei, Zhongtian Zheng, Yifan Zhan, Zhuyu Yao, Jiawang Zhang, Kejian Wu, Yinqiang Zheng. 389-406
- GaussReg: Fast 3D Registration with Gaussian Splatting. Jiahao Chang, Yinglin Xu, Yihao Li, Yuantao Chen, WenSen Feng, Xiaoguang Han 0001. 407-423
- Efficient Diffusion Transformer with Step-Wise Dynamic Attention Mediators. Yifan Pu, Zhuofan Xia, Jiayi Guo, Dongchen Han, Qixiu Li, Duo Li, Yuhui Yuan, Ji Li 0006, Yizeng Han, Shiji Song, Gao Huang 0001, Li Xiu 0001. 424-441
- Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation. Pengfei Wang 0012, Yuxi Wang 0001, Shuai Li 0014, Zhaoxiang Zhang 0001, Zhen Lei 0001, Lei Zhang 0006. 442-460
- IAM-VFI: Interpolate Any Motion for Video Frame Interpolation with Motion Complexity Map. Kihwan Yoon, Yong-Han Kim, Sungjei Kim, Jinwoo Jeong. 461-477
- TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data. Siyi Du, Shaoming Zheng, Yinsong Wang, Wenjia Bai, Declan P. O'Regan, Chen Qin. 478-496