| 10616 | -- | 10631 | Yu Quan, Dong Zhang, Jinhui Tang 0001. Generalized Concordant Vision Transformer With Masked Image Tokens for Object Detection |
| 10632 | -- | 10648 | Wenjin Guo, Donglai Liu, Weiying Xie, Yunsong Li 0001, Xuefei Ning, Zihan Meng, Shulin Zeng, Jie Lei 0001, Zhenman Fang, Yu Wang 0002. ShiftQuant: Toward Accurate and Efficient Sub-8-bit Integer Training |
| 10649 | -- | 10664 | Hang Sun, Qingfei Zhong, Bo Du 0001, Zhigang Tu 0001, Jun Wan 0005, Wenbin Wang 0001, Dong Ren. Bidirectional-Modulation Frequency-Heterogeneous Network for Remote Sensing Image Dehazing |
| 10665 | -- | 10678 | Yang Wei 0002, Haowei Liu, Xiaochen Yuan, Xiuli Bi, Bin Xiao 0002. Let Images Speak More: An Efficient Method for Detecting Image Manipulation History |
| 10679 | -- | 10692 | Penglei Wang, Danyang Wu, Jin Xu 0014, Feiping Nie 0001. Comprehensive Information Extraction With Separable Representation Learning for Multi-View Clustering |
| 10693 | -- | 10704 | Yeongje Im, Jione Pak, Songju Na, Jinhong Park, Jihyung Ryu, Seounghyun Moon, Beomjun Koo, Suk-Ju Kang. Supervised Denoising for Extreme Low-Light Raw Videos |
| 10705 | -- | 10715 | Junkai Fan, Xiang Li 0041, Jianjun Qian, Jun Li 0027, Jian Yang 0003. Non-Aligned Supervision for Real Image Dehazing |
| 10716 | -- | 10727 | Xinxin Wang 0003, Yongshan Zhang, Jie Zhang, Yicong Zhou. Incomplete Multiview Clustering Using Discriminative Feature Recovery and Tensorized Matrix Factorization |
| 10728 | -- | 10741 | Jahoon Jeong, Joonkyo Shim, Hyunsoo Yoon. TANet: Tri-Aspects Network for Camouflaged Object Detection |
| 10742 | -- | 10755 | Feilong Cao, Qijin Xu, Hailiang Ye. Adaptive Prior and Long-Range Dependency-Based Learners for Image Inpainting |
| 10756 | -- | 10771 | Meijun Fu, Xiaomin Wang, Jun Wang 0002, Zhang Yi 0001. Synthetic Gradient Optimization-Based Implicit Amortized Bayesian Meta-Learning for Few-Shot Pumi Spectrographic Image Recognition |
| 10772 | -- | 10786 | Wenrui Li 0001, Penghong Wang, Xingtao Wang, Wangmeng Zuo, Xiaopeng Fan 0001, Yonghong Tian 0001. Multi-Timescale Motion-Decoupled Spiking Transformer for Audio-Visual Zero-Shot Learning |
| 10787 | -- | 10800 | Jie Zhou 0031, Yongxiang Liu, Bowen Peng, Li Liu 0002, Xiang Li 0014. MaDiNet: Mamba Diffusion Network for SAR Target Detection |
| 10801 | -- | 10814 | Zhaojie Chu, Kailing Guo, Xiaofen Xing, Bolun Cai, Shan He, Xiangmin Xu. Alleviating One-to-Many Mapping in Talking Head Synthesis With Dynamic Adaptation Context and Style Adapter |
| 10815 | -- | 10827 | Yunfei Bai, Yiqiang Wu, Bin Zhu, Xiaomao Li. Contrastive-Domain Mean Teacher for Domain Adaptive Object Detection |
| 10828 | -- | 10843 | Yingge Liu, Dawei Dai, Guoyin Wang 0001, Shuyin Xia. Multivariate Feedback-Based Image-Text Joint Learning for Sketch-Less Facial Image Retrieval |
| 10844 | -- | 10861 | Shuai Liu, Yuchao Zheng 0001, Jianru Li, Huimin Lu 0001, Dong An, Zhengxiang Shen, Zhanshan Wang. Turbid Underwater Image Enhancement With Illumination-Constrained and Structure-Preserved Retinex Model |
| 10862 | -- | 10874 | Ting-Wei Zhou, Xi-Le Zhao, Wei-Hao Wu, Jian Li Wang, Yi-Si Luo. Frequency-Aware Implicit Neural Representation for Multi-Dimensional Data Recovery |
| 10875 | -- | 10890 | Zenghui Wang 0009, Songlin Du, Yaping Yan, Guobao Xiao, Xiaobo Lu. Tex2Sem: Learning From Textures to Semantics for Robust Semantic Correspondence |
| 10891 | -- | 10905 | Zhipu Liu, Lei Zhang 0038. Multi-Model Synergy Perception for Open-World Person Re-Identification |
| 10906 | -- | 10917 | Yu Bai, Liang Bai 0001, Xian Yang 0001, Jiye Liang. Label-Semantic-Based Prompt Tuning for Vision Transformer Adaptation in Medical Image Analysis |
| 10918 | -- | 10929 | Qing Tian 0001, Xiang Liu, JiaZhong Zhou, Yuhui Zheng, Jun Wan 0001, Zhen Lei 0001. Cross-Attention With Conditional Matching for Multi-Target Domain Adaptation |
| 10930 | -- | 10943 | Liyuan Guo, Lianghai Jin, Enmin Song. Queue-Augmented Correlation-Biased Orthogonality Loss and Implicit Selective Transformer for Facial Expression Recognition in the Wild |
| 10944 | -- | 10958 | Pengyu Jie, Wanquan Liu, Chenqiang Gao, Yihui Wen, Rui He, Weiping Wen, Pengcheng Li, Jintao Zhang, Deyu Meng. A Point-Neighborhood Learning Framework for Nasal Endoscopic Image Segmentation |
| 10959 | -- | 10972 | Yunsong Li 0001, Xin Zhang 0092, Weiying Xie, Xiaoyu Chen, Daixun Li, Hangyu Ye, Leyuan Fang. Dual-Depth Unified Joint Optimization: Adaptive Curvature-Based Compression |
| 10973 | -- | 10985 | Quanbo Ge, Bingtao Zhu, Mengmeng Wang 0009, Bingjun Zhang, Yanjun Huang. Airborne Camera Dynamic Target Detection Based on Background Prediction and Semantic Compensation in Surface Environment |
| 10986 | -- | 11000 | Haidong Qin, Tao Yang 0006, Xiaoshi Zhou, Dongdong Li, Yanran Dai, Jing Li 0010. ECC-NeRF: Anti-Aliasing Neural Radiance Fields With Elliptic Cone-Casting for Diverse Camera Models |
| 11001 | -- | 11012 | Yuhao Li, Jiale Cao, Muzammal Naseer, Yu Zhu 0004, Jinqiu Sun, Yanning Zhang 0001, Fahad Shahbaz Khan. Multi-Granularity Language-Guided Training for Multi-Object Tracking |
| 11013 | -- | 11027 | Biao Xiang, Hongmei Chen 0001, Yong Mi, Binbin Sang, Shi-Jinn Horng, Tianrui Li 0001. Class-Specific Discriminability and Multiscale Information-Based Multiview Feature Selection |
| 11028 | -- | 11040 | Weihao Jiang, Chang Liu, Kun He 0001. Intra-Task Mutual Attention-Based Vision Transformer for Few-Shot Learning |
| 11041 | -- | 11053 | Yu Wang 0006, Shikui Wei, Sen Xu, Ying Qin, Yao Zhao 0001. Confidence-Driven Unimodal Interference Removal for Enhanced Multimodal Object Detection |
| 11054 | -- | 11067 | Daidou Guo, Chuan Qin 0001, Xiangyang Luo 0001, Guorui Feng, Xinpeng Zhang 0001. Shields for Digital Images: A Watermarking Method With KAN Block and Simulation-Enhanced Noise Pool to Resist Screen-Camera Attacks |
| 11068 | -- | 11082 | Heng Wang 0014, Hongxia Wang 0001, Mingze He, Fei Zhang 0015, Jinghong Xia. Robust Video Watermarking Against Digital Editing and Camcording |
| 11083 | -- | 11096 | Qihang Ge, Wei Sun 0029, Yu Zhang 0133, Yunhao Li, Zhongpeng Ji, Fengyu Sun, Shangling Jui, Xiongkuo Min, Guangtao Zhai. LMM-VQA: Advancing Video Quality Assessment With Large Multimodal Models |
| 11097 | -- | 11112 | Yimei Liu, Jingchao Cao, Hao Fan 0004, Junyu Dong, Sheng Chen 0001. Real-World Multi-View Stereo via Learning RGB-D Structural Consistency From Depth Super-Resolution |
| 11113 | -- | 11128 | Bosen Lin, Junyu Dong, Xinghui Dong. Perception-Aware Underwater Image Quality Assessment: Dataset, Perceptual Quality Scores, and Assessment Network |
| 11129 | -- | 11143 | Linhan Huang, Yutao Chen, Liu Liu 0014, Jianqing Zhu, Huanqiang Zeng. Harmonizing Metric Discrepancy for Cross-Modal Object Re-Identification |
| 11144 | -- | 11157 | Qianhan Feng, Wenshuo Li, Tong Lin 0002, Xinghao Chen 0001. Full-Stage Pseudo Label Quality Enhancement for Weakly-Supervised Temporal Action Localization |
| 11158 | -- | 11171 | Zhaobo Qi, Shuhui Wang, Weigang Zhang, Qingming Huang. Uncertainty-Aware Mixture of Experts for Video Action Anticipation |
| 11172 | -- | 11185 | Chengyang Fang, Wenhui Jiang, Yuming Fang, Yuxin Peng 0001, Yang Liu 0293. Separate, Locate, and Align: Determine Context Relation of Scene Text From Multiple Perspectives in TextVQA |
| 11186 | -- | 11199 | Xiaohuan Lu, Jiang Long, Haitao Zhang, Wulin Xie, Lian Zhao, Yinghao Ye, Jie Wen 0001. Partial Multi-View Incomplete Multi-Label Learning Network With Quality-Aware Representation Fusion |
| 11200 | -- | 11215 | Xinggang Hu, Yanmin Wu, Mingyuan Zhao, Zhenzhong Cao, Xiangkui Zhang, Xiangyang Ji. DYO-SLAM: Visual Localization and Object Mapping in Dynamic Scenes |
| 11216 | -- | 11228 | Qingxuan Lv, Junyu Dong, Yuezun Li, Sheng Chen 0001, Hui Yu 0001, Shu Zhang 0002, Wenhan Wang. UWStereo: A Large Synthetic Dataset for Underwater Stereo Matching |
| 11229 | -- | 11243 | Changping Hu, Jing Xu 0011, Chifai Pun, Fei Chen 0007, Rui Chen 0019. GlassMolder: Transparent Object Reconstruction With Silhouette-Guided Object-Centric Diffusion |
| 11244 | -- | 11257 | Zhenlong Yuan, Zhidong Yang, Yujun Cai, Kuangxin Wu, Mufan Liu, Dapeng Zhang, Hao Jiang 0013, Zhaoxin Li, Zhaoqi Wang. SED-MVS: Segmentation-Driven and Edge-Aligned Deformation Multi-View Stereo With Depth Restoration and Occlusion Constraint |
| 11258 | -- | 11270 | Sigeng Chen, Jingfan Fan, Danni Ai, Deqiang Xiao, Yucong Lin, Hong Song 0003, Hongli Liu, Wenyuan Yu, Yang Yu, Jian Yang 0009. Multidomain Dependency-Aware Guided Unified-Stage Coronary Artery Branch Recognition Network |
| 11271 | -- | 11281 | Xueyuan Gong, Zhiquan Liu, Yain-Whar Si, Xiaochen Yuan, Ke Wang, Xiaoxiang Liu, Cong Lin, Xinyuan Zhang. FastFace: Fast-Converging Scheduler for Large-Scale Face Recognition Training With One GPU |
| 11282 | -- | 11296 | Qian Feng, Hanbin Zhao, Chao Zhang 0001, Jiahua Dong 0001, Henghui Ding, Yu-Gang Jiang 0001, Hui Qian 0001. PECTP: Parameter-Efficient Cross-Task Prompts for Incremental Vision Transformer |
| 11297 | -- | 11308 | An-An Liu, Hao-Chen Li, Wenhui Li 0001, Dan Song 0006, Hongshuo Tian, Lanjun Wang. ClipMix for Domain Generalization |
| 11309 | -- | 11322 | Kuiyun Huang, Menglong Chen, Hong Zheng, Baihong Lin, Shicai Fan. Soft Cluster-Aware Equivariant Contrastive Learning for Unsupervised Out-of-Distribution Detection |
| 11323 | -- | 11336 | Runzhong Zhang, Yueqi Duan, Yang Chen, Weipeng Hu, Chen Cai, Suchen Wang, Yap-Peng Tan. Boundary Voting Network for Ambiguity-Aware Timestamp-Supervised Action Segmentation |
| 11337 | -- | 11349 | Siping Zhuang, Guangyao Li, Qiangqiang Wu, Yang Lu 0009, Hai-Miao Hu, Hanzi Wang. CGATracker: Correlation-Aware Graph Alignment for Referring Multi-Object Tracking |
| 11350 | -- | 11361 | Yujin Zheng, Chu He, Xiaohan Chen, Huan Zhang, Tao Qu, Dingwen Wang. DFA-MOT: A Dynamic Field-Aware Multi-Object Tracking Framework for Uncrewed Aerial Vehicles |
| 11362 | -- | 11376 | Jun Wang 0131, Bingfei Chai, Lingtao Zhou, Yuanyun Wang. Robust Object Tracking via Long-Range Spatial Representation and Local Feature Enhancement |
| 11377 | -- | 11389 | Yuanhong Zhong, Ge Yan, Yongting Hu, Dong Zhu, Ruyue Zhu. A Two-Stage Framework With Memory for Anomaly Detection via Video Decomposition and Bidirectional Consistency |
| 11390 | -- | 11403 | Yunan Li 0001, Xi Geng, Zhuoqi Ma, Qiguang Miao, Chi-Man Pun. Boundary-Aware Sentence-Gloss Alignment With Semantic Similarity Measurement for Continuous Sign Language Recognition |
| 11404 | -- | 11415 | Zhehao Zhu, Yifei Huang 0002, Mingfang Zhang 0002, Liangyang Ouyang, Yoichi Sato 0001. Prompt-Augmented Boundary Attentive Learning for Weakly Supervised Temporal Sentence Grounding |
| 11416 | -- | 11431 | Kun Dai, Zilong Zhou, Zhiqiang Jiang, Qihao Sun, Tao Xie 0010, Hongbo Gao 0008, Tao An, Ruifeng Li, Lijun Zhao 0003. VD-Matcher: A Very Deep Local Feature Matcher With Weight Recycling and Keypoint Detection |
| 11432 | -- | 11447 | Jinyi Fang, Bingke Zhu, Jingling Yuan, Yingying Chen 0003, Ming Tang 0001, Jinqiao Wang. AMITA: Attribute-Guided Masked Image-Text Alignment for Multi-Label Image Representation |
| 11448 | -- | 11461 | Hao Wang 0073, Tong Jia 0001, Qilong Wang 0001, Wangmeng Zuo. Automatic Label Assignment for Object Detection |
| 11462 | -- | 11473 | Zhen-Xiang Ma, Zhen-Duo Chen 0001, Tai Zheng, Xin Luo 0006, Xin-Shun Xu. BTG-Net++: Enhanced Bi-Directional Task-Guided Network for Few-Shot Fine-Grained Image Classification |
| 11474 | -- | 11487 | Kaixun Jiang, Zhaoyu Chen 0001, Jiyuan Fu, Lingyi Hong, Jinglun Li, Wenqiang Zhang. VideoPure: Diffusion-Based Adversarial Purification for Video Recognition |
| 11488 | -- | 11501 | Tieyuan Chen, Huabin Liu 0001, Chern Hong Lim, John See, Xing Gao 0005, Junhui Hou, Weiyao Lin. CSTA: Spatial-Temporal Causal Adaptive Learning for Exemplar-Free Video Class-Incremental Learning |
| 11502 | -- | 11513 | Hua Yu 0006, Yaqing Hou, Xu Gui 0001, Shanshan Feng 0001, Dongsheng Zhou, Qiang Zhang 0008. A Spatio-Temporal Continuous Network for Stochastic 3D Human Motion Prediction |
| 11514 | -- | 11526 | Hao Liu 0044, Hui Yuan 0001, Raouf Hamzaoui, Weiqing Yan. PU-GSM: A Latent Geometry-Guided Self-Similarity Model for Point Cloud Upsampling |
| 11527 | -- | 11539 | Tengyao Cui, Yongfang Wang, Yihan Wang 0008, Zhijun Fang. Semantic and Saliency-Aware Scalable Image Coding Toward Human-Machine Collaboration |
| 11540 | -- | 11552 | Zhiyuan Li, Yanhui Zhou, Hao Wei 0005, Chenyang Ge, Ajmal Mian. RDEIC: Accelerating Diffusion-Based Extreme Image Compression With Relay Residual Diffusion |
| 11553 | -- | 11566 | Ziqing Ge, Zhimeng Huang, Chuanmin Jia, Siwei Ma 0001, Wen Gao 0001. Rethinking the Functionality of Latent Representation: A Logarithmic Rate-Distortion Model for Learned Image Compression |
| 11567 | -- | 11582 | Haotian Zhang, Yuqi Li, Li Li 0040, Dong Liu 0002. Learning Switchable Priors for Neural Image Compression |
| 11583 | -- | 11597 | Han Xiao, Changqiao Xu, Hongye Jiang, Wendong Wang 0003, Shujie Yang, Lujie Zhong, Xiaofeng Tao 0001, Gabriel-Miro Muntean. Bilateral Bargaining-Based Adaptive Video Transmission: A Frame Rate Perspective |
| 11598 | -- | 11612 | Penggang Qin, Tong Xu 0001, Chao Zhang 0096, Heda Wang, Yao Hu 0002, Enhong Chen. Scenario-Aware Multimodal Chain-of-Thought Prompting for Rationales of Video Social Relations |
| 11613 | -- | 11626 | Huakai Lai, Xi Wei, Rui Sun 0006, Tianzhu Zhang 0001. Agent-Based Control Prompt Tuning for Video-Text Retrieval |
| 11627 | -- | 11640 | Linlin Ji, Li Liu 0031. Multi-Scale Feature Fusion Based on Piecewise Polynomial Activation Function for Image-Text Matching |
| 11641 | -- | 11654 | Ran Ran 0001, Jiwei Wei, Shiyuan He, Yuyang Zhou, Peng Wang 0023, Yang Yang 0002, Heng Tao Shen. Fine-Grained Alignment and Interaction for Video Grounding With Cross-Modal Semantic Hierarchical Graph |
| 11655 | -- | 11666 | Ming Jin 0007, Lei Zhu 0002, Richang Hong. BiSeR-LMA: A Bidirectional Semantic Reasoning and Large Model Enhancement Approach for Text-Video Cross-Modal Retrieval |
| 11667 | -- | 11684 | Dazhi Xu, Ming Li 0004, Yan Wu 0003, Peng Zhang 0003, Xinyue Xin. Statistic-Guided Difference Enhancement Graph Transformer for Unsupervised Change Detection in PolSAR Images |
| 11685 | -- | 11697 | Yong Chen 0013, Feiwang Yuan, Wenzhen Lai, Jinshan Zeng, Wei He 0003, Qing Huang. Low-Rank Tensor Meets Deep Prior: Coupling Model-Driven and Data-Driven Methods for Hyperspectral Image Reconstruction |
| 11698 | -- | 11707 | Xinxin Li, Zichi Wang, Xinpeng Zhang 0001. Black-Box Steganography for Large Language Models |
| 11708 | -- | 11722 | Shuai Yuan, Guangyong Gao, Yimin Yu, Zhihua Xia. Reversible Data Hiding in Encrypted Images With Adaptive Multi-Directional MED and Huffman Code Based on Interval-Wise Dynamic Prediction Axes |
| 11723 | -- | 11736 | Dingcheng Gao, Yanjun Qin, Xiaoming Tao 0001, Jianhua Lu. Diversifying Latent Flows for Safety-Critical Scenarios Generation With CARLA Simulator |