- OvSW: Overcoming Silent Weights for Accurate Binary Neural Networks. Jingyang Xiang, Zuohui Chen, Siqi Li, Qing Wu, Yong Liu 0007. 1-18
- Multistain Pretraining for Slide Representation Learning in Pathology. Guillaume Jaume, Anurag Vaidya, Andrew Zhang, Andrew H. Song, Richard J. Chen, Sharifa Sahai, Dandan Mo, Emilio Madrigal, Long Phi Le, Faisal Mahmood. 19-37
- T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy. Qing Jiang, Feng Li 0040, Zhaoyang Zeng, Tianhe Ren, Shilong Liu, Lei Zhang 0001. 38-57
- Harmonizing Knowledge Transfer in Neural Network with Unified Distillation. Yaomin Huang, Zaomin Yan, Chaomin Shen, Faming Fang, Guixu Zhang. 58-74
- Mamba-ND: Selective State Space Modeling for Multi-dimensional Data. Shufan Li, Harkanwar Singh, Aditya Grover. 75-92
- Click Prompt Learning with Optimal Transport for Interactive Segmentation. Jie Liu 0043, Haochen Wang, Wenzhe Yin, Jan-Jakob Sonke, Efstratios Gavves. 93-110
- 3D Human Pose Estimation via Non-causal Retentive Networks. Kaili Zheng, Feixiang Lu, Yihao Lv, Liangjun Zhang, Chenyi Guo, Ji Wu 0002. 111-128
- OMR: Occlusion-Aware Memory-Based Refinement for Video Lane Detection. Dongkwon Jin, Chang-Su Kim 0001. 129-145
- 6DoF Head Pose Estimation Through Explicit Bidirectional Interaction with Face Geometry. Sungho Chun, Ju Yong Chang. 146-163
- Latent Diffusion Prior Enhanced Deep Unfolding for Snapshot Spectral Compressive Imaging. Zongliang Wu, Ruiying Lu, Ying Fu 0001, Xin Yuan 0002. 164-181
- Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition. Masashi Hatano, Ryo Hachiuma, Ryo Fujii, Hideo Saito. 182-199
- Enhancing Tampered Text Detection Through Frequency Feature Fusion and Decomposition. Zhongxi Chen, Shen Chen, Taiping Yao, Ke Sun, Shouhong Ding, Xianming Lin, Liujuan Cao, Rongrong Ji. 200-217
- Modeling Label Correlations with Latent Context for Multi-label Recognition. Zhaomin Chen, Quan Cui, Ruoxi Deng, Jie Hu, Guodao Zhang. 218-234
- LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model. Yulin Luo, Ruichuan An, Bocheng Zou, Yiming Tang, Jiaming Liu, Shanghang Zhang. 235-252
- Finding Needles in a Haystack: A Black-Box Approach to Invisible Watermark Detection. Minzhou Pan, Zhenting Wang, Xin Dong 0009, Vikash Sehwag, Lingjuan Lyu, Xue Lin. 253-270
- DynoSurf: Neural Deformation-Based Temporally Consistent Dynamic Surface Reconstruction. Yuxin Yao, Siyu Ren, Junhui Hou, Zhi Deng, Juyong Zhang, Wenping Wang. 271-288
- MOD-UV: Learning Mobile Object Detectors from Unlabeled Videos. Yihong Sun, Bharath Hariharan. 289-307
- ARoFace: Alignment Robustness to Improve Low-Quality Face Recognition. Mohammad Saeed Ebrahimi Saadabadi, Sahar Rahimi Malakshan, Ali Dabouei, Nasser M. Nasrabadi. 308-327
- Learning Diffusion Models for Multi-view Anomaly Detection. Chieh Liu, Yu Min Chu, Ting-I Hsieh, Hwann-Tzong Chen, Tyng-Luh Liu. 328-345
- Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation. Zhihang Zhong, Gurunandan Krishnan, Xiao Sun, Yu Qiao 0001, Sizhuo Ma, Jian Wang 0111. 346-363
- Multi-modal Relation Distillation for Unified 3D Representation Learning. Huiqun Wang, Yiping Bao, Panwang Pan, Zeming Li, Xiao Liu, Ruijie Yang, Di Huang 0001. 364-381
- Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization. Renjie Pi, Tianyang Han, Wei Xiong 0015, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang 0001. 382-398
- Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation. Siyu Jiao, Hongguang Zhu, Jiannan Huang 0002, Yao Zhao 0001, Yunchao Wei, Humphrey Shi. 399-416
- Distributionally Robust Loss for Long-Tailed Multi-label Image Classification. Dekun Lin, Tailai Peng, Rui Chen, Xinran Xie, Xiaolin Qin, Zhe Cui. 417-433
- MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation. Shuzhao Xie, Weixiang Zhang, Chen Tang, Yunpeng Bai, Rongwei Lu, Shijia Ge, Zhi Wang 0001. 434-452
- LongVLM: Efficient Long Video Understanding via Large Language Models. Yuetian Weng, Mingfei Han 0002, Haoyu He, Xiaojun Chang, Bohan Zhuang. 453-470
- The All-Seeing Project V2: Towards General Relation Comprehension of the Open World. Weiyun Wang, Yiming Ren, Haowen Luo, Tiantong Li, Chenxiang Yan, Zhe Chen, Wenhai Wang, Qingyun Li, Lewei Lu, Xizhou Zhu, Yu Qiao 0001, Jifeng Dai. 471-490