| 5126 | -- | 5139 | Lin Zhu 0012, Weiquan Yan, Yi Chang 0002, Yonghong Tian 0001, Hua Huang 0001. Simultaneous Learning Intensity and Optical Flow From High-Speed Spike Stream |
| 5140 | -- | 5152 | Fangyu Li, Junzhu Duan, Qiyu Zhang, Caifeng Shan, Honggui Han. Bi-Directional and Triangular Circulation Fusion Neural Networks for Small Object Detection |
| 5153 | -- | 5165 | Chengyou Jia, Minnan Luo, Zhuohang Dang, Guang Dai, Xiaojun Chang, Jingdong Wang 0001. PSDiff: Diffusion Model for Person Search With Iterative and Collaborative Refinement |
| 5166 | -- | 5181 | Cuixin Yang, Rongkang Dong, Jun Xiao 0010, Cong Zhang, Kin-Man Lam 0001, Fei Zhou 0001, Guoping Qiu. Geometric Distortion Guided Transformer for Omnidirectional Image Super-Resolution |
| 5182 | -- | 5194 | Zhen Hong, Bowen Wang, Haoran Duan 0001, Yawen Huang, Xiong Li, Zhenyu Wen, Xiang Wu, Wei Xiang 0001, Yefeng Zheng 0001. SP-SLAM: Neural Real-Time Dense SLAM With Scene Priors |
| 5195 | -- | 5207 | Linyin Luo, Hanjiang Lai, Yan Pan 0002, Jian Yin 0001. Efficient Multimodal Selection for Retrieval in Knowledge-Based Visual Question Answering |
| 5208 | -- | 5222 | Dezhao Zhai, Wei Chen 0148, Yinghao Ding, Ming Yu 0007, Qinwei Li, Hang Wu. Research on Robust Measurement Method of Heart Rate Using Remote Photoplethysmography Based on Adversarial Learning Network With High and Low Frequency Features |
| 5223 | -- | 5235 | Qinzhong Tan, Ao Li, Le Dong, Weisheng Dong, Xin Li 0005, Guangming Shi. CDS-Net: Contextual Difference Sensitivity Network for Pixel-Wise Road Crack Detection |
| 5236 | -- | 5250 | Chenming Li, Shiguang Liu. TM2SP: A Transformer-Based Multi-Level Spatiotemporal Feature Pyramid Network for Video Saliency Prediction |
| 5251 | -- | 5264 | Ziqian Lu, Mushui Liu, Yunlong Yu, Zhao Wang, Xi Li 0001, Jungong Han. Variational Adapter: Improving CLIP in Data-Imbalanced Scenarios |
| 5265 | -- | 5278 | Linchun Hu, Wenming Cao 0001, Zhenqi Zhang, Yuchuang Liang. Progressive Feature Reconstruction Network for Zero-Shot Learning |
| 5279 | -- | 5292 | Runxin Zhang, Xia Wu 0001, Huimin Chen, Guanxiong He, Zheng Wang 0037, Rong Wang 0001, Feiping Nie 0001. Toward Balance Adaptive Weighted Ensemble Clustering |
| 5293 | -- | 5306 | Qiangqiang Shen, Zihou Guo, Hanzhang Wang, Yanhui Xu, Yongyong Chen, Shiqi Wang 0001, Yongsheng Liang 0001. Reliable Entropy-Induced Anchor Learning for Incomplete Multi-View Subspace Clustering |
| 5307 | -- | 5317 | Zhidan Ran, Zhiyao Xiao, Xiaobo Lu, Xuan Wei, Wei Liu. Context-Aided Semantic-Aware Self-Alignment for Video-Based Person Re-Identification |
| 5318 | -- | 5330 | Shujun Liu, Ling Chang. Conditional Dual Diffusion for Multimodal Clustering of Optical and SAR Images |
| 5331 | -- | 5342 | Kaiwen Du, Weirong Ye, Hanyu Guo, Yan Yan 0001, Hanzi Wang. Edge Guided Network With Motion Enhancement for Few-Shot Action Recognition |
| 5343 | -- | 5354 | Yuan Gao, Haibo Liu, Xiaohui Wei 0001. Semantic Concept Perception Network With Interactive Prompting for Cross-View Image Geo-Localization |
| 5355 | -- | 5366 | Rui Wang, Quanxue Gao, Ming Yang 0024, Qianqian Wang 0001. Tensorized Tri-Factor Decomposition for Multi-View Clustering |
| 5367 | -- | 5379 | Zihao He, Qianyu Shu, Jinming Wen, Hing-Cheung So. Efficient Sparse Recovery With Arctangent Regularization: A Novel Iterative Thresholding Algorithm |
| 5380 | -- | 5393 | Guanchun Wang, Xiangrong Zhang, Zelin Peng, Shunli Tian, Tianyang Zhang 0002, Xu Tang, Licheng Jiao. OraL: An Observational Learning Paradigm for Unsupervised Hyperspectral Change Detection |
| 5394 | -- | 5406 | Zhenkun Zhu 0001, Ruiqin Xiong, Jing Zhao 0011, Rui Zhao 0010, Xiaopeng Fan, Shuyuan Zhu, Tie-Jun Huang 0001. High Dynamic Range Imaging for Dynamic Scenes Based on Multi-Level Spike Camera |
| 5407 | -- | 5418 | Yumeng Su, Jiachao Zhang, Rui Yan 0010, Pengpeng Li 0001, Guo-Sen Xie, Xiangbo Shu. STPM: Spatial-Temporal Token Pruning and Merging for Complex Activity Recognition |
| 5419 | -- | 5430 | Xiusheng Xu, Lei Qi 0001, Jingyang Zhou, Xin Geng 0001. BatStyler: Advancing Multi-Category Style Generation for Source-Free Domain Generalization |
| 5431 | -- | 5444 | Xinran Cao, Liang Luo, Yu Gu 0003, Fuji Ren. Co-Dance With Ambiguity: An Ambiguity-Aware Facial Expression Recognition Framework for More Robustness |
| 5445 | -- | 5460 | Hui Lin, Nan Li, Pengjuan Yao, Kexin Dong, Yuhan Guo, Danfeng Hong, Ying Zhang, Congcong Wen. Generalization-Enhanced Few-Shot Object Detection in Remote Sensing |
| 5461 | -- | 5474 | Junjie Li, Guanshuo Wang, Yichao Yan, Fufu Yu, Qiong Jia 0004, Jie Qin, Shouhong Ding, Xiaokang Yang 0001. GPS: Generalizable Person Search on Large-Scale User-Generated Video Content |
| 5475 | -- | 5488 | Youze Wang, Wenbo Hu 0001, Yinpeng Dong, Jing Liu 0001, Hanwang Zhang, Richang Hong. Align Is Not Enough: Multimodal Universal Jailbreak Attack Against Multimodal Large Language Models |
| 5489 | -- | 5500 | Min Long, Zhenyu Liu, Le-Bing Zhang, Fei Peng 0001. LGDF-Net: Local and Global Feature-Based Dual-Branch Fusion Networks for Deepfake Detection |
| 5501 | -- | 5517 | Xiaotian Wu, Bofan Song, Jia Fang, Wei Qi Yan 0001, Qing-Yu Peng. CRP2-VCS: Contrast-Oriented Region-Based Progressive Probabilistic Visual Cryptography Schemes |
| 5518 | -- | 5532 | Xuan Li, Guomin Zhang, Weiwei Chen, Li Cheng, Yining Xie, Jiayi Ma 0001. An Infrared and Visible Image Fusion Method Based on Semantic-Sensitive Mask Selection and Bidirectional-Collaboration Region Fusion |
| 5533 | -- | 5544 | Jinliang Liu, Zongxin Yang. Test-Time Adaptation for Real-World Video Adverse Weather Restoration With Meta Batch Normalization |
| 5545 | -- | 5559 | Huicong Zhang, Haozhe Xie, Shengping Zhang, Hongxun Yao. Patch-Based Spatio-Temporal Deformable Attention BiRNN for Video Deblurring |
| 5560 | -- | 5574 | Yuan Shi, Bin Xia, Xiaoyu Jin, Xing Wang, Tianyu Zhao, Xin Xia 0014, Xuefeng Xiao 0001, Wenming Yang. VmambaIR: Visual State Space Model for Image Restoration |
| 5575 | -- | 5588 | Chaopeng Zhang, Ruiqin Xiong, Xiaopeng Fan, Debin Zhao. Attentive Large Kernel Network With Mixture of Experts for Video Deblurring |
| 5589 | -- | 5601 | Haozhi Shi, Weiying Xie, Haonan Qin, Yunsong Li, Leyuan Fang. Visual State Space Model With Graph-Based Feature Aggregation for No-Reference Image Quality Assessment |
| 5602 | -- | 5616 | JunJie Zhu, Liquan Shen, Zhengyong Wang, Yihan Yu. Underwater Image Quality Assessment Using Feature Disentanglement and Dynamic Content-Distortion Guidance |
| 5617 | -- | 5632 | Xiaolong Liu, Song Qiu, Mei Zhou, Weijie Le, Qingli Li, Yan Wang 0033. WFANet-DDCL: Wavelet-Based Frequency Attention Network and Dual Domain Consistency Learning for 7T MRI Synthesis From 3T MRI |
| 5633 | -- | 5643 | Yajie Chen, Shujuan Wang, Boshuai Zhang, Lihua Lin, Qianqian Chai, Jiazheng Yang, Xin Yang 0008, Qian Liu. Multi-Granularity Topology-Aware Cell Localization and Counting in Pathological Images |
| 5644 | -- | 5658 | Kun Yang 0010, Zhi Xu, Dingkang Yang, Qiang Fu, Rui Tang, Liang Song, Lihua Zhang. Robust Multi-Agent Collaborative Perception via Spatio-Temporal Awareness |
| 5659 | -- | 5670 | Long Zhang, Peipei Song, Zhangling Duan, Shuo Wang 0008, Xiaojun Chang, Xun Yang 0001. Video Corpus Moment Retrieval With Query-Specific Context Learning and Progressive Localization |
| 5671 | -- | 5683 | Zhiming Wang, Ning Ge 0001, Jianhua Lu. Motion In-Betweening With Spatial and Temporal Transformers |
| 5684 | -- | 5696 | Meng Wang, Yan Ding 0004, Yumeng Liu, Yunchuan Qin, Ruihui Li, Zhuo Tang. MixSSC: Forward-Backward Mixture for Vision-Based 3D Semantic Scene Completion |
| 5697 | -- | 5710 | Hao Jing, Anhong Wang, Lijun Zhao 0002, Yakun Yang, Donghan Bu, Jing Zhang, Yifan Zhang, Junhui Hou. Boosting 3D Object Detection With Semantic-Aware Multi-Branch Framework |
| 5711 | -- | 5723 | Zhenbo Yu, Junjie Wang, Hang Wang, Zhiyuan Zhang, Jinxian Liu, Zefan Li, Bingbing Ni, Wenjun Zhang 0001. Mesh2Animation: Unsupervised Animating for Quadruped 3D Objects |
| 5724 | -- | 5737 | Yufeng Yin 0004, Xiaoyan Liu, Zichao Zhang. SMA-MVS: Segmentation-Guided Multi-Scale Anchor Deformation Patch Multi-View Stereo |
| 5738 | -- | 5748 | Junsong Zhang, Zisong Chen, Chunyu Lin, Zhijie Shen, Lang Nie, Kang Liao, Yao Zhao 0001. SGFormer: Spherical Geometry Transformer for 360° Depth Estimation |
| 5749 | -- | 5761 | Xin Zhang, Kun Liu 0009, Xinwang Wang, Zhong Zhou, Haiyong Chen. RMGNet: The Progressive Relationship-Mining Graph Neural Network for Text-to-Image Person Re-Identification |
| 5762 | -- | 5775 | Changhao Wang, Guanwen Zhang, Zhengyun Cheng, Wei Zhou 0020. KPDepth-VO: Self-Supervised Learning of Scale-Consistent Visual Odometry and Depth With Keypoint Features From Monocular Video |
| 5776 | -- | 5790 | Tengfei Liu, Yongli Hu, Mingjie Li 0006, Junfei Yi, Xiaojun Chang, Junbin Gao, Baocai Yin. Tackling Real-World Complexity: Hierarchical Modeling and Dynamic Prompting for Multimodal Long Document Classification |
| 5791 | -- | 5804 | Hailun Cheng, Shenjin Huang, Linghan Cai, Yangfan Xu, Runming Wang, Yongbing Zhang 0002. Focus Your Attention: Multiple Instance Learning With Attention Modification for Whole Slide Pathological Image Classification |
| 5805 | -- | 5820 | Biqing Qi, Junqi Gao, Xinquan Chen, Dong Li 0016, Jianxing Liu, Ligang Wu 0001, Bowen Zhou 0002. Contrastive Augmented Graph2Graph Memory Interaction for Few Shot Continual Learning |
| 5821 | -- | 5832 | Guoqing Zhang 0002, Yan Yang, Yuhui Zheng, Gaven J. Martin, Ruili Wang. Mask-Aware Hierarchical Aggregation Transformer for Occluded Person Re-Identification |
| 5833 | -- | 5843 | Guanghui He, Yanli Ren, Xiaoqiu Cai, Guorui Feng, Xinpeng Zhang 0001. Private Sampling of Latent Diffusion Models for Encrypted Prompt |
| 5844 | -- | 5857 | Runmin Cong, Ning Yang 0008, Hongyu Liu 0003, Dingwen Zhang, Qingming Huang, Sam Kwong, Wei Zhang 0021. TRNet: Two-Tier Recursion Network for Co-Salient Object Detection |
| 5858 | -- | 5871 | Ye Liu 0005, Pengfei Wu, Miaohui Wang, Jun Liu 0036. CPAL: Cross-Prompting Adapter With LoRAs for RGB+X Semantic Segmentation |
| 5872 | -- | 5884 | Ning Li, Bineng Zhong, Qihua Liang, Zhiyi Mo, Jian Nong, Shuxiang Song 0001. SIEVL-Track: Exploring Semantic Information Enhancement for Visual-Language Object Tracking |
| 5885 | -- | 5899 | Ning Liao, Xiaopeng Zhang 0008, Min Cao, Junchi Yan. M-Tuning: Prompt Tuning With Mitigated Label Bias in Open-Set Scenarios |
| 5900 | -- | 5911 | Xu Zhang, Bo Peng 0007, Jianjun Lei, Chao Xue, Yuxuan Yao, Qingming Huang. Adversarially Robust Object Detection via Deviation Calibration and Content Preservation |
| 5912 | -- | 5924 | Yijun Pan, Quan Zhao, Yueyi Zhang, Zilei Wang, Xiaoyan Sun 0001, Feng Wu 0005. Semantic-Aware Late-Stage Supervised Contrastive Learning for Fine-Grained Action Recognition |
| 5925 | -- | 5938 | Chenting Xu, Ke Xu 0003, Xinghao Jiang, Tanfeng Sun. PLOVAD: Prompting Vision-Language Models for Open Vocabulary Video Anomaly Detection |
| 5939 | -- | 5951 | Wenming Cao 0001, Liangxi Qian, Yicha Zhang, Xuelong Li 0001, Xinpeng Yin. Asymmetric Context-Guided Adaptive Alignment Network for Skeleton-Based Action Recognition |
| 5952 | -- | 5965 | Fengyu Liu, Yi Cao, Xianghong Cheng, Jianfeng Wu, Wendong Gu, Luhui Liu. Confidence Factor-Based Robust Localization Algorithm With Visual-Inertial-LiDAR Fusion in Underground Space |
| 5966 | -- | 5979 | Xiang Yuan, Gong Cheng 0003, Ruixiang Yao, Junwei Han. Semantic Differentiation Aids Oriented Small Object Detection |
| 5980 | -- | 5992 | Zixuan Zhao, Shuming Liu 0001, Chengze Zhao, Xu Zhao 0001. Constructing Semantical Structure by Segmentation Integrated Video Embedding for Temporal Action Detection |
| 5993 | -- | 6006 | Yueting Huang, Zhenzhe Hechen, Mingliang Zhou, Zhengguo Li, Sam Kwong. An Attention-Locating Algorithm for Eliminating Background Effects in Fine-Grained Visual Classification |
| 6007 | -- | 6020 | Xinke Wang, Jingyuan Xu, Xiao Sun 0003, Mingzheng Li, Bin Hu 0001, Wei Qian, Dan Guo 0001, Meng Wang 0001. Facial Depression Estimation via Multi-Cue Contrastive Learning |
| 6021 | -- | 6033 | Zeng-Yang Che, Zheng Zhang 0006, Yaping Wu, Meiyun Wang. Disentangle and Then Fuse: A Cross-Modal Network for Synthesizing Gadolinium-Enhanced Brain MR Images |
| 6034 | -- | 6046 | Ling Yang 0006, Yikai Zhao 0001, Zhaochen Yu, Bohan Zeng, Minkai Xu, Shenda Hong, Bin Cui 0001. Spatio-Temporal Energy-Guided Diffusion Model for Zero-Shot Video Synthesis and Editing |
| 6047 | -- | 6058 | Ruoyu Zhao, Mingrui Zhu, Shiyin Dong, De Cheng, Nannan Wang 0001, Xinbo Gao 0001. CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization |
| 6059 | -- | 6073 | Cong Wang 0018, Panwen Hu, Haoyu Zhao, Yuanfan Guo, Jiaxi Gu, Xiao Dong, Jianhua Han, Hang Xu 0004, Xiaodan Liang. UniAdapter: All-in-One Control for Flexible Video Generation |
| 6074 | -- | 6086 | Kexiang Feng, Chuanmin Jia, Jingshan Pan, Siwei Ma, Wen Gao 0001. End-to-End Optimized Lossy Compression for Neural-Morphic Spiking Camera Captured Data |
| 6087 | -- | 6100 | Lei Luo 0003, Junjie Wu, Zhi Jin, Hongwei Guo 0001, Ce Zhu. Joint Resources Optimization for Soft Video Transmission Over IRS-Assisted SR Network |
| 6101 | -- | 6113 | Yifei Wang, Gaozhi Liu, Zhiying Zhu 0001, Xinpeng Zhang 0001, Zhenxing Qian. VivID: A Visually Improved GIF Encoding Network Design |
| 6114 | -- | 6128 | Hengyu Man, Hao Wang 0212, Riyu Lu, Zhaolin Wan, Xiaopeng Fan, Debin Zhao. Content-Aware Dynamic In-Loop Filter With Adjustable Complexity for VVC Intra Coding |
| 6129 | -- | 6144 | Jinjia Peng, Mengkai Li, Bingyan Wang, Huibing Wang. Omni Contextual Aggregation Networks for High-Fidelity Image Inpainting |
| 6145 | -- | 6157 | Junyan Huo, Yanzhuo Ma, Zhenyao Zhang, Hongli Zhang, Hui Yuan 0001, Shuai Wan, FuZheng Yang 0001. Adaptive Enhanced Global Intra Prediction for Efficient Video Coding in Beyond VVC |
| 6158 | -- | 6169 | Chang-xing Li, Donglin Zhang, Zhikai Hu, Xiaojun Wu 0001. Modality Fused Class-Proxy With Knowledge Distillation for Zero-Shot Sketch-Based Image Retrieval |
| 6170 | -- | 6183 | Jingcheng Ke, Jia Wang, Waikeung Wong, Anne Toomey, Jie Wen 0001. Graph-Based Group Division Network for Referring Expression Comprehension |
| 6184 | -- | 6194 | Ming Jin 0007, Wenbo Hu 0001, Richang Hong, Lei Zhu 0002. Revealing Security Flaws in Cross-Modal Retrieval Models Through Video Poisoning |
| 6195 | -- | 6210 | Mingyue Niu, Xu Wang, Jibing Gong, Bin Liu 0041, Jianhua Tao 0001, Björn W. Schuller. Depression Scale Dictionary Decomposition Framework for Multimodal Automatic Depression Level Prediction |