- A-MESS: Anchor-based Multimodal Embedding with Semantic Synchronization for Multimodal Intent Recognition. Yaomin Shen, Xiaojian Lin, Wei Fan. 1-6
- SketchRef: a Multi-Task Evaluation Benchmark for Sketch Synthesis. Xingyue Lin, Xingjian Hu, Shuai Peng, Jianhua Zhu, Liangcai Gao. 1-6
- Perspective Makes Perfect: Prompt-tuning Vision-Language Models for Action Recognition with Diversified Multi-Modal Observation. Hailun Zhang, Qijun Zhao, Zhen Zhai, Xinrui Wang. 1-6
- Weak Semantic-Guided Entropy Model for Image Compression. Yiming Ding, Jianguo Wei. 1-6
- Non-Parametric Media Quality Recovery from Spammer-Affected Subjectively Annotated Datasets. Lohic Fotio Tiotsop, Andrés Altieri, Giuseppe Valenzise. 1-6
- Instance-Distance Active Learning for Source-Free Cross-Domain Object Detection. Kangrui Du, Yujun Qian, Juepeng Zheng. 1-6
- Dynamic Feature-Focusing with Cross-Modal Semantic Alignment for Video Moment Retrieval and Highlight Detection. Xuehui Liang, Ruomei Wang 0001, Baoquan Zhao, Jiawei Feng. 1-6
- Self-Relevance-Based Multimodal In-Context Learning for Multimodal Named Entity Recognition. Zhi Zhang, Bing Xu, Muyun Yang, Hailong Cao, Conghui Zhu, Wenpeng Lu, Tiejun Zhao. 1-6
- MROSS: Multi-Round Region-based Optimization for Scene Sketching. Yiqi Liang, Ying Liu 0027, Dandan Long, Ruihui Li. 1-6
- Merge Mode for Template-based Intra Mode Derivation (TIMD) in ECM. Mohsen Abdoli, Ramin G. Youvalari, Frank Plowman, Alexandre Tissier. 1-6
- Effective Linear Vision Transformer Via Selective Sampling Softmax and Multi-Feature Enhancement. Xianchao Zhang 0001, Senqi Guan, Yunlong Gao, Linlin Zong, Wenxin Liang, Xinyue Liu 0002. 1-6
- Computational Measures of Gaze Behavior Using the Concept of Situational Awareness. Yunxiang Jiang, Qing Xu 0002, Aoxing Xu, Simon Parkinson, Klaus Schoeffmann, Chuntie Chen. 1-6
- Where's That Voice Coming? Continual Learning for Sound Source Localization. Yang Xiao, Rohan Kumar Das. 1-6
- Divide-And-Conquer: Dual-Hierarchical Optimization for Semantic 4D Gaussian Spatting. Zhiying Yan, Yiyuan Liang, Shilv Cai, Tao Zhang 0147, Sheng Zhong 0001, Luxin Yan, Xu Zou 0002. 1-6
- Enhancing Multi-modal Models with Heterogeneous MoE Adapters for Fine-tuning. Sashuai Zhou, Yan Xia 0006, Hai Huang 0013. 1-6
- Bridging the One-to-Many Gap: Multi-label Semantic Learning and Relay for Video Captioning. Shuqin Chen, Yikang Hu, Li Yang, Zhixin Sun, Liangjun Yu, Xian Zhong. 1-6
- Adaptive Strategy Weighting with Fault Tolerant Localization for Object Navigation. Yanwei Zheng, Shaopu Feng, Bowen Huang, Changrui Li, Xiao Zhang 0015, Dongxiao Yu. 1-6
- Construct a Powerful Discriminative Relationship for Few-Shot Action Recognition. Qianhan Tang, Yanan Liu, Ningxin Wang, Kangjian He, Hao Zhang 0110, Dan Xu 0001. 1-6
- Wavelet-based Feature Representation Framework for Event Stream Recognition. Zihan Cheng, Xingyu Pan, Xi Chen, Shenghua Fan. 1-6
- A Domain Generalization Framework Based on Wavelet-Driven Structural Enhancement and Contrastive Alignment. Yuheng Xu, Taiping Zhang, Yang Liu. 1-6
- Feature and Temporal Disruption Attacks from Images to Videos. Zhanpeng Liu, Yuqiang Zhang, Tianlong Yu, Xi Lin, Yang Yang, Chenxi Huang, Bin Wang. 1-6
- MoE-based Mamba for Multi-scene Universal Remote Sensing Semantic Segmentation. Jie Zhang 0133, Mingwen Shao, Xiaodong Tan 0002, Xiangyong Cao. 1-6
- Context-Enhanced Zero-Shot Video Temporal Grounding with Adaptive Boundary Refinement. Fangkai Li, Hao Hu, Feiyu Pan, Yanzhen Wang, Yiyou Guo, Xiankai Lu. 1-6
- Enhancing Large Multimodal Models with Adaptive Sparsity and KV Cache Compression. Te Zhang, Yuheng Li, Junxiang Wang, Lujun Li. 1-6
- Chinese-LiPS: A Chinese Audio-Visual Speech Recognition Dataset with Lip-Reading and Presentation Slides. Jinghua Zhao 0004, Yuhang Jia, Shiyao Wang, Jiaming Zhou, Hui Wang 0075, Yong Qin. 1-6
- Measuring and Controlling the Spectral Bias for Self-Supervised Image Denoising. Wang Zhang, Huaqiu Li, Xiaowan Hu, Tao Jiang, Zikang Chen, Haoqian Wang. 1-6
- AU-TTT: Vision Test-Time Training model for Facial Action Unit Detection. Bohao Xing, Kaishen Yuan, Zitong Yu, Xin Liu 0012, Heikki Kälviäinen. 1-6
- ESVQA: Perceptual Quality Assessment of Egocentric Spatial Videos. Xilei Zhu, Huiyu Duan, Liu Yang, Yucheng Zhu, Xiongkuo Min, Guangtao Zhai, Patrick Le Callet. 1-6
- ALCReg: Active Label Correction for Partial Point Cloud Registration. Zongyi Xu, Xinqi Jiang, Xinyu Gao, Shanshan Zhao 0001, Qianni Zhang, Weisheng Li 0001, Xinbo Gao 0001. 1-6
- Audio-Driven Gesture Generation via Deviation Feature in the Latent Space. Jiahui Chen, Huan Yang, Runhua Shi, Chaofan Ding, Xiaoqi Mo, Siyu Xiong, Yinong He. 1-6
- Retinal OCT Anomaly Detection Based on Suspicious Strategy and Relational Learning. Minghui Zhai, Xing Wu, Liangshan Zhu, Chengliang Wang, Yonggang Luo, Peng Wang. 1-6
- MSAF-Net: A Multi-Scale Adaptive Fusion Network for Facial Expression Recognition in Mental Health Patients. Guolong Liu, Jiayu Ye, Hao Wang, Qingxiang Wang. 1-6
- EIAD: Explainable Industrial Anomaly Detection Via Multi-Modal Large Language Models. Zongyun Zhang, Jiacheng Ruan, Xian Gao, Ting Liu 0016, Yuzhuo Fu. 1-6
- Bidirectional Feature Fusion and Adaptive Decision Network for Multimodal Fake News Detection. Dilxat Abdureyim, Bo Ma 0004, Yating Yang, Rui Dong 0002, YiDu Chen, Azmat Anwar, Lei Wang 0065. 1-6
- MVPS: Multi-View Adaptive Prompt Synergy for Zero-shot Anomaly Detection. Longzhao Huang, Wenhao Xu, Changwei Wang 0001, Rongtao Xu, Peng Lu, Shibiao Xu. 1-6
- StegOT: Trade-offs in Steganography via Optimal Transport. Chengde Lin, Xuezhu Gong, Shuxue Ding, Mingzhe Yang, Xijun Lu, Chengjun Mo. 1-6
- TaxAgent: How Large Language Model Designs Fiscal Policy. Jizhou Wang, Xiaodan Fang, Lei Huang, Yongfeng Huang. 1-6
- BEV-MMC: Bird's-Eye-View-Based Multimodal Compression for Enhanced Visual Recognition. Zhiwei Dong, Ying Liu. 1-6
- Making Small Language Model Excellent Symptom Inference Expert for Mental Disorders Detection. Meiling Li, Xiaotian Xu, Shicheng Li, Bin Wu. 1-6
- Concept-Centric Learning for Weakly-Supervised Temporal Sentence Grounding. Yaru Zhang, Haichao Shi, Xiaoyu Zhang. 1-6
- SAM-FE: Segment Anything Model Guided Feature Enhancement for Semantic Change Detection of Remote Sensing Images. Junqing Huang, Tong Liu 0021, Chan-Tong Lam, Xiaochen Yuan. 1-6
- Scene Graph Generation with Large Vision-Language Model and Its Applications. Wei-Xin Chen, Yong-Yong Chen, Shi-Chao Kan. 1-6
- Redefining Image-to-Recipe Retrieval with Nutritional and Ingredient Similarity. Satayu Parinayok, Shin'ichi Satoh 0001, Kiyoharu Aizawa, Yoko Yamakata. 1-6
- Overcoming Feature Contamination by Unidirectional Information Modeling for Vision-Language Tracking. Jingchao Wang, Zhijian Wu, Wenlong Zhang, Wenhui Liu, Jianwei Zhang, Dingjiang Huang. 1-6
- From Camera to World: A Plug-and-Play Module for Human Mesh Transformation. Changhai Ma, Ziyu Wu, Yunkang Zhang, Qijun Ying, Boyan Liu, Xiaohui Cai. 1-6
- PSFD: Proactive Spatial-Frequency Defense against Malicious Exemplar-Guided Image Editing. Li Zeng, Xiaojun Mo, Meng Xie, Hangtao Zhang, Yixiang Liu, Yezhuo Peng, Yanchun Li. 1-6
- JointDistill: Adaptive Multi-Task Distillation for Joint Depth Estimation and Scene Segmentation. Tiancong Cheng, Ying Zhang 0047, Yuxuan Liang, Roger Zimmermann, Zhiwen Yu 0001, Bin Guo 0001. 1-6
- Learning Physics-Informed Color-Aware Transforms for Low-Light Image Enhancement. Xingxing Yang 0002, Jie Chen 0026, Zaifeng Yang. 1-6
- OpenDUN: To Discover Unknown Number of Visual Categories. Sik Chit Wu, Munan Ning, Dong Wei 0004, Yefeng Zheng 0001, Donghuan Lu, Li Yuan 0007. 1-6
- SFRP: Fine-Grained Point Cloud Classification via Interaction of Spatial and Feature Representation Points. Haoxiang Sun, Xiaomeng Li, Yanhao Ding, Qian Sun, Zhenbo Li. 1-6
- Joint Feature Learning and Mixing via State Space Model for Remote Sensing Change Detection. Bin Chen, Shenglong Hu, Huihui Song 0003, Kaihua Zhang 0001. 1-6
- Audio-Driven Emotion-Aware 3D Talking Face Generation from Single Image. Chun-Shuo Qiu, Feng-Lin Liu, Hongbo Fu 0001, Fan Zhang 0063, Yan-Pei Cao, Yu-Kun Lai, Lin Gao 0004. 1-6
- Evading Deepfake Detectors via Adversarially Degrading and Restoring Forged Images. Zhengli Shi, Chenhao Lin, Zhengyu Zhao 0001, Peter Peer, Chao Shen 0001. 1-6
- LL4G: Self-Supervised Dynamic Optimization for Graph-Based Personality Detection. Lingzhi Shen, Yunfei Long, Xiaohao Cai, Guanming Chen, Yuhan Wang, Imran Razzak, Shoaib Jameel. 1-6
- EG-Gaussian: Epipolar Geometry and Graph Network Enhanced 3D Gaussian Splatting. Beizhen Zhao, Yifan Zhou, Zijian Wang, Hao Wang. 1-6
- NeRFSwap: A NeRF-Based Generative Model for Face Swapping. Shuangyi Tan, Mingzhi Mao, Guanbin Li. 1-6
- Generative Image Coding with Diffusion Prior. Jianhui Chang. 1-6
- Latent Diffusion-based Face Anonymization with Identity and Attribute Decoupling. Chenrui Liu, Zhichao Lian. 1-6
- Forensicability Assessment: Not All Samples Qualify for Recapture Detection. Yongqi Chen, Lin Zhao 0017, Rizhao Cai, Zitong Yu, Changsheng Chen, Bin Li. 1-6
- Enhancing Personalized Recommendation via Metacognitive Profile. Jiaqi Yin, Jingyang Qiao, Tiong-Thye Goh, Yi Hu. 1-6
- MRKD: Monotonic Relationship-based Knowledge Distillation for SAR Image Recognition. Jielei Wang, Zihan Cheng, Guoming Lu, Kexin Li, Guangchun Luo. 1-6
- 3D-Contrastive Anchors and Structure Enhancement for Multi-modal Representations. Mingkai Sheng, Jichao Wang, Yi Liu, Wen Cheng, Lingfang Zeng. 1-6
- PCM-SAR: Physics-Driven Contrastive Mutual Learning for SAR Classification. Pengfei Wang, Hao Zheng 0009, Zhigang Hu 0001, Aikun Xu, Meiguang Zheng, Liu Yang 0015. 1-6
- Beyond Sliders: Mastering the Art of Diffusion-based Image Manipulation. Yufei Tang, Daiheng Gao, Pingyu Wu, Wenbo Zhou, Bang Zhang, Weiming Zhang 0001. 1-6
- Bayesian-Inspired Cross-Spectral Fusion Network for Robust Depth Estimation. Jiafu Yan, Wenwei Luo, Yuguo Hu, Changhua Zhang, Mengmeng Jing, Lin Zuo. 1-6
- Exploiting Long and Short Temporal Dependence for Low-Light Video Enhancement. Hao Luo, Lingyu Zhu 0006, Yudong Mao, Yixuan Li, Zhiwei Zhong, Shanshe Wang, Shiqi Wang 0001. 1-6
- Combating the Negative Optimization in Source-Free Domain Adaptive Medical Image Segmentation via Selective Online Self-Training. Wenjuan Zhou, Wei Chen 0009, Yulin He, Chen Li 0034. 1-6
- SCGRL: Graph representation learning based on edge structure contrastive self-supervised framework. Ruishuang Sun, Ruiting Wang, Enguang Zuo, Junyu Zhu, Chen Chen 0078, Cheng Chen, Xiaoyi Lv. 1-6
- DCCL: Discriminative Cosine Center Learning for 3D Cross-Modal Retrieval with Real-world Image. Zengyu Liu, Zhitao Liu, Yi Li, Zhenjiang Du, Lei Zhang, Ning Xie 0003. 1-6
- BAMNet: A Brain Area Mapping-Based Multimodal Saliency Prediction Method. Shibo Wang. 1-6
- Teaching LLMs for Step-Level Automatic Math Correction via Reinforcement Learning. Junsong Li, Jie Zhou 0015, Yutao Yang, Bihao Zhan, Qianjun Pan, Yuyang Ding, Qin Chen 0001, Jiang Bo, Xin Lin 0001, Liang He 0001. 1-6
- Privacy-Preserving Gait Authentication Scheme Based on Partial Euclidean Distance in Cloud Computing. Tong Ji, Yunting Tao, Fanyu Kong, Guoyan Zhang, Yuliang Shi, Jia Yu 0003. 1-6
- ICG-MVSNet: Learning Intra-view and Cross-view Relationships for Guidance in Multi-View Stereo. Yuxi Hu, Jun Zhang 0102, Zhe Zhang, Rafael Weilharter, Yuchen Rao, Kuangyi Chen, Runze Yuan, Friedrich Fraundorfer. 1-6
- Spatial-Spectral Fusion Neural Operator. Wei Li 0034, Jiawei Jiang 0002, Ni Xu, Ying Cui, Yan Li, Jianwei Zheng 0001. 1-6
- Incongruity-aware Cross-modal Interaction Network for Multimodal Sarcasm Detection. Yujun Wu, Chen Wang, Meixuan Chen, Tongguan Wang, Ying Sha. 1-6
- Instruction-aware Memory Network for Video Recognition. Bimei Wang, Haijiang Li, Jisheng Dang, Yun Wang, Zhixuan Chen, Jiyuan Lin, Teng Wang, Jun Yang. 1-6
- BanditRewriter: Training-free Adaptive Prompt Optimization for Text-to-Image Generation. Ao Shen, Yue Liu, Yanlei Shang. 1-6
- Clouds and Haze Co-Removal Based on Saliency-Guided Multi-Scale Diffusion Model for Remote Sensing Images. Jingxuan Zhang, Libao Zhang. 1-6
- KAN-SAM: Kolmogorov-Arnold Network Guided Segment Anything Model for RGB-T Salient Object Detection. Xingyuan Li, Ruichao Hou, Tongwei Ren, Gangshan Wu. 1-6
- Take What I Need: Active Data Distillation for Federated Learning. Hongcheng Li, Yucan Zhou, Yibin Wang, Xiaoyan Gu 0001, Bo Li 0063, Weiping Wang 0005. 1-6
- HarmonyIQA: Pioneering Benchmark and Model for Image Harmonization Quality Assessment. Zitong Xu, Huiyu Duan, Guangji Ma, Liu Yang, Jiarui Wang, Qingbo Wu, Xiongkuo Min, Guangtao Zhai, Patrick Le Callet. 1-6
- DWS-FedSeg: A Federated Learning Framework for Automatic Segmentation of CT and MRI Images. Yunhe Feng, Lingren Wang, Jiaxin Wang. 1-6
- CDFormer: Cross-Domain Few-Shot Object Detection Transformer Against Feature Confusion. Boyuan Meng, Xiaohan Zhang, Peilin Li, Zhe Wu, Yiming Li, Wenkai Zhao, Beinan Yu, Hui-Liang Shen. 1-6
- MFA-Net: A Multi-Stage Network for Facial Acupoint Localization with Global-Local Feature Fusion and Acupoint Encoding. Chao Liu, Chuanlin Liao, Tingting Zhang, Yi Lin. 1-6
- Multi-Task Self-Supervised Learning for Automated Measurement of Left Ventricular Ejection Fraction in Echocardiography. Zhanpeng Xu, Yu Lu 0001, Wei Zhang, Xiaoqing Li 0005, Shijie Shi, Xianghua Fu. 1-6
- Aparecium: Revealing Secrets from Physical Photographs. Zhe Lei, Jie Zhang, Jingtao Li, Tianwei Zhang 0004, Haibin Kan, Weiming Zhang, Nenghai Yu. 1-6
- GA-Clip: Semantic-Aware Graph Augmentation for Contrastive Learning. Shuaiqi Lu, Yi Guo 0008, Zhenlin An, Yan Zhu, Ning Huang. 1-6
- Roadside Monocular 3D Detection for Small Objects: A Novel Feature Enhancement by Pyramid Depth Prediction and Regional Refinement. Jie Tang, Haoran Pan. 1-6
- End-To-End Casual Video Reconstruction: Geometry, Pose and Motion. Wenyu Li, Peng Qiao, Sidun Liu, Zongxin Ye, Ziteng Zhang, Zhenglun Sun, Yong Dou. 1-6
- Enhanced Cross-modal 3D Retrieval via Tri-modal Reconstruction. Junlong Ren, Hao Wang. 1-6
- BiFD: A Bidirectional Feature Discrepancy Defense against Hijacking Attack in Split Learning. Xiaoyang Xu 0001, Wenzhe Yi, Juan Wang 0006, Yong Zhuang, Mengda Yang, Ziang Li, Yaxin Liu. 1-6
- CAPAA: Classifier-Agnostic Projector-Based Adversarial Attack. Zhan Li, Mingyu Zhao, Xin Dong, Haibin Ling, BingYao Huang. 1-6
- Semantic-guided Representation Learning for Multi-Label Recognition. Ruhui Zhang, Hezhe Qiao, Pengcheng Xu, Mingsheng Shang 0001, Lin Chen 0023. 1-6
- CAMCKG: A Framework for Trigger-Action Recommendation Combining Attention Mechanism and Continuous Kernel Graph Convolution. Jiangfeng Li, Shijie Wang, Zijun Huang, Yifan Li. 1-6
- Navigating the Implicit Map: Community-Aware Disentangled Experts for Multi-Modal Knowledge Graph Completion. Shichong Li, Bin Chen, Yichen Xin, Zhangtao Cheng, Qing Chen, Ting Zhong, Fan Zhou 0002. 1-6
- Adaptive Gaussian Mixture Model with Hierarchical Propagation for One-Class Graph Fraud Detection. Xiaoxiang Li, Xinyu Jiang, Zining Wang, Chang Liu, Zhibin Ni, Hai Wan, Xibin Zhao. 1-6
- Language-Conditioned Waypoint Predictor for Continuous Vision-and-Language Navigation. Zeyu Wang, Yuankai Qi, Dong An, Xu Yang, Hongxin Li, Zhaoxiang Zhang 0001. 1-6
- A Multi-Stage Framework for Multimodal Controllable Speech Synthesis. Rui Niu, Weihao Wu, Jie Chen, Long Ma, Zhiyong Wu 0001. 1-6
- Cross-modal Shared Concept Learning for Text-to-Image Person Retrieval. Di He 0008, Xinshan Zhu, Lan Zhang, Siyu Wang, Zhong Zhang. 1-6
- Enabling Haptic-Integrated Interactive Holographic Video Streaming Powered by 5G Edge Computing. Peng Qian, Ning Wang 0001, Carl C. Udora, Carlos Velez Redondo, Jingxuan Men, Rahim Tafazolli. 1-6
- DreamPBR: Text-driven High-Resolution SVBRDF Generation with Multimodal Guidance. Linxuan Xin, Zheng Zhang, Zhiyi Pan 0001, Jinfu Wei, Duan Gao, Wei Gao 0003. 1-6
- Spatio-Temporal Point Convolutional Network With Meta-motion Level Refinement for Point Cloud-Based Human Action Recognition. Qian Huang, Zhaoyu Chen, Ge Gao, Shihao Han, Qing Meng, Xing Li. 1-6
- Dual-Domain Iterative Refinement Network for Camouflaged Object Detection. Qingzheng Wang, Ning Li, Jiazhi Xie. 1-6
- CLIP-driven Few-Shot Continual Learning. Ziqi Gu, Chunyan Xu, Zhen Cui 0001. 1-6
- Image Demoiréing Using Dual Camera Fusion on Mobile Phones. Yanting Mei, Zhilu Zhang 0001, Xiaojun Wu 0001, Wangmeng Zuo. 1-6
- Keyword-Oriented Multimodal Modeling for Euphemism Identification. Yuxue Hu, Junsong Li, Meixuan Chen, Dongyu Su, Tongguan Wang, Ying Sha. 1-6
- Efficient Knowledge Transfer in Multi-Task Learning through Task-Adaptive Low-Rank Representation. Xiao Zhang, Kangsheng Wang, Tianyu Hu, Huimin Ma 0001. 1-6
- DB-NeRF: An Effective Dual-Branch Representation for Neural Radiance Fields. Hailan Shen, Yixiang Jiang, Zailiang Chen 0001, Xujing Liu, Jian Zhang. 1-6
- Temporal Invariant Feature Combined with Arbitrary Enhancement for Missing Modality Emotion Recognition. Jiahao Fan, Weiting Chen, Zheming Fan, RuiZhi Yu. 1-6
- AMUSE: Adaptive Multi-Segment Encoding for Dataset Watermarking. Saeed Ranjbar Alvar, Mohammad Akbari, David Ming Xuan Yue, Yong Zhang. 1-6
- BFPS: A Boundary-Focused Polyp Segmentation Model via Frequency Domain Separation. Wanqi Ma, Huanhuan Lv, Songru Jiang, Jiale Wu. 1-6
- Visual Semantic Description Generation with MLLMs for Image-Text Matching. Junyu Chen, Yihua Gao, Mingyong Li. 1-6
- Conditional Residual Coding with Explicit-Implicit Temporal Buffering for Learned Video Compression. Yi-Hsin Chen, Kuan-Wei Ho, Martin Benjak, Jörn Ostermann, Wen-Hsiao Peng. 1-6
- History Tracker: Retrieving Historical Image Embeddings for Efficient Fine-Grained Reasoning in Vision-Language Models. Jiahua Bao, Siyao Cheng, Jiaxing Du, Ziqian Li, Changjiang He, Jie Liu 0001. 1-6
- False Negatives Consensus Suppression for Text-to-Image Person Re-identification. Ruigeng Zeng, Wentao Ma 0003, Qinglin Wang, XinJun Mao, Jie Liu 0002. 1-6
- Beckman Adversarial Defense. A. V. Subramanyam. 1-6
- MEScan360: A Memory-Enhanced Scanpath Prediction Model for Omnidirectional Images. Yuchen Zhang, Dandan Zhu 0001, Kaiwei Zhang, Fei Jiang, Guangtao Zhai. 1-6
- Real-World Video Dehazing based on Optical Flow Deformable Attention Fusion and Contrastive Learning. Mengnan Zhang, Gang Zhou, Linghui Ma, Zhaoxi Liu, Li Zhang, Zhenhong Jia. 1-6
- Multi-Passage Retrieval-Augmented Multimodal Language Generation Model for Knowledge-Based Visual Question Answering. Siyu Cheng, Chao Yang, Bin Jiang. 1-6
- Beyond the Label: Unveiling Fairness through Dynamic Attribute Projections in Classification. Haoze Jiang, Zunlei Feng, Jiacong Hu, Binde Hu, Mingli Song, Yuanyu Wan. 1-6
- Semantic-Aware Adaptation with Hierarchical Multimodal Prompts for Few-Shot Learning. Wenhao Li, Qiangchang Wang, Jing Li, Shengnan Zhao, Mindi Ruan, Yilong Yin. 1-6
- MSD-HENet: Multi-Scale Detail-Preserving Holistic Enhancement Network for Infrared Images. Yijing Zhao, Chao Wang, Guanyu Liu, Yumeng Liu, Ruiheng Zhang. 1-6
- SS-MPP: Semi-Supervised Shape-Aware Medical Image Segmentation Based on Multi-Scale Pixel-Wise Prototype. Kanqi Wang, Xiaowei Lu, Haoyun Wang, Yang Zhao 0019, Gang Liu. 1-6
- Uncertainty-guided Multi-modal Sequential Recommendation. Li Yin, Baigang Mi, Yi Fan. 1-6
- Component Adaptive Clustering for Generalized Category Discovery. Mingfu Yan, Jiancheng Huang, Yifan Liu 0001, Shifeng Chen. 1-6
- AADN++: Latent Feature Improves Adversarial Defense Transferability on Object Tracking. Zhewei Wu, Ruilong Yu, Shilin Qiu, Qihe Liu, Shijie Zhou 0002, Zhun Zhang. 1-6
- DMDH: Decentralized Multi-agent Distributed Hashing for Multimedia Retrieval. Yunfei Chen 0015, Yitian Long, Zhan Yang 0001, Jun Long. 1-6
- Vote & Mix: Plug-and-Play Token Reduction for Efficient Vision Transformer. Shuai Peng, Di Fu, Baole Wei, Liangcai Gao, Zhi Tang 0001. 1-6
- Evidential Graph Contrastive Alignment for Source-Free Blending-Target Domain Adaptation. Juepeng Zheng, Yibin Wen. 1-6
- ScNet: Scene-Consistency Network Learning for Multi-Agent Motion Forecasting. Jianxin Shi, Xiaolong Chen, Yusen Xie, Jinhao Chen, Fali Wang, Jun Ma, Tianyu Wo. 1-6
- CE-LoRA: Consistent Person Synthesis by Exploring the Model's Spatial Consistency. Delong Liu, Cheng Lei, Shuai Jiang, Zhicheng Zhao 0001, Fei Su. 1-6
- Global Perception Federated Recommender System for Click-Through Rate Prediction. Yicheng Di, Jiansong Fan, Rui Zhang, Song Shen, Jiayu Bao, Rongsheng Hu, Yuan Liu 0021. 1-6
- HMPNet: A Feature Aggregation Architecture for Maritime Object Detection from a Shipborne Perspective. Yu Zhang, Fengyuan Liu, Juan Lyu, Yi Wei, Changdong Yu. 1-6
- IllusionBench: A Large-scale and Comprehensive Benchmark for Visual Illusion Understanding in Vision-Language Models. Yiming Zhang, Zicheng Zhang, Xinyi Wei, Xiaohong Liu 0001, Guangtao Zhai, Xiongkuo Min. 1-6
- DLLM: Enhancing Open-World Object Detection with Dynamic Learning and Large Models. Yangyang Huang, Xing Xi, Ronghua Luo. 1-6
- G4Seg: Generation for Inexact Segmentation Refinement with Diffusion Models. Tianjiao Zhang, Fei Zhang 0016, Jiangchao Yao, Ya Zhang 0002, Yanfeng Wang 0001. 1-6
- UniBind: Leveraging LLM-Augmented Knowledge Base for Scene Integration. Zhonghao Zhang, Ruonan Zhang, Libo Liu. 1-6
- Delight-UPS: Uncalibrated Photometric Stereo via Diffusion Model-Based Relighting. Zhenyu Qiao, Jiajun Sun, MingYun He, Liu Yu, Rui Zhou, Ping Kuang. 1-6
- S3SR: Towards Efficient Image Super-Resolution with Selective State Space Model. Pei Wang, Xiaotong Luo, Zekun Ai, Yanyun Qu. 1-6
- Decoupling Overlapped Feature Spaces: When Continual Learning Meets Fine-Grain Classification. Zhikun Feng, Mingyu Wu 0011, Ping Kuang, Kang Dang, Mian Zhou, Liu Yu. 1-6
- Pedestrian Trajectory Prediction Driven by Bidirectional Intention-Interaction. Hang Yu, Yansen Yu, Jiayan Qiu. 1-6
- Double-Shrink: Enhancing Model Robustness under SDN Noise by Reducing Uncertain Confidence. Naihao Wang, Can Zhang, Yunfeng Liu, Wentao Chen, Ruirui Li. 1-6
- Learning from Noisy Data Using Pretrained Vision-Language Representations. Yuqi Liao, Aodong Li, Yisha Chen, Qianfang Xu, Jiarui Xie, Anxin Li, Bo Xiao. 1-6
- Multi-Hypothesis 3D Hand Mesh Recovering from a Single Blurry Image. Yuming Chen, Rongyu Chen, Zhongqun Zhang, Yihua Cheng, Hyung Jin Chang. 1-8
- Label-guided Facial Retouching Reversion. Guanhua Zhao, Yu Gu, Xuhan Sheng, Yujie Hu, Jian Zhang. 1-6
- Robust Blind Spatio-Temporal Adaptive Video Watermarking Based on 3-D Symmetry. Fei Zhang, Hongxia Wang. 1-6
- CLIP Brings Better Features to Visual Aesthetics Learners. Liwu Xu, Jinjin Xu, Yuzhe Yang 0001, Xilu Wang 0001, Yi-Jie Huang, Yaqian Li. 1-6
- OFF3D: Object-Centric Feature Field for 3D Scene Segmentation. Qinwei Lin, Bing Wang, JunJie Zhao, Jun Xu 0019, Haoqian Wang. 1-6
- Contrastive Invariant Risk Minimization for Grounded Situation Recognition. Zhaoquan Yuan, Chengbin Zhao, Yuting Tang, Lishu Guo, Xiao Wu 0001, Changsheng Xu. 1-6
- Optimizing Efficiency and Visual-Textual Alignment for LLM-Based Radiology Report Generation. Zailong Chen, Peng Gao, Yujian Lee, Johan Barthelemy, Luping Zhou, Lei Wang 0001. 1-6
- DUPL: Domain-agnostic Unknown-aware Prompt Learning for Threshold-free Open-set Domain Generalization. Fangbin Xu, Dongyue Chen 0001, Shizhuo Deng, Tong Jia 0001, Hao Wang 0073. 1-6
- Towards Aligned Data Forgetting via Twin Machine Unlearning. Haoxuan Ji, Zheng Lin, Yuyao Sun, Fei Gao, Yuhang Wang, Haichang Gao, Zhenxing Niu. 1-6
- A Semantic-Enhanced Heterogeneous Graph Learning Method for Flexible Objects Recognition. Kunshan Yang, Wenwei Luo, Yuguo Hu, Jiafu Yan, Mengmeng Jing, Lin Zuo. 1-6
- Diff-Cleanse: Identifying and Mitigating Backdoor Attacks in Diffusion Models. Hao Jiang, Jin Xiao, Xiaoguang Hu, Tianyou Chen, Jiajia Zhao. 1-6
- Uneven Event Modeling for Partially Relevant Video Retrieval. Sa Zhu, Huashan Chen, Wanqian Zhang, Jinchao Zhang 0002, Zexian Yang, Xiaoshuai Hao, Bo Li 0063. 1-6
- SUEDE: Shared Unified Experts for Physical-Digital Face Attack Detection Enhancement. Zuying Xie, Changtao Miao, Ajian Liu 0001, Jiabao Guo, Feng Li, Dan Guo, Yunfeng Diao. 1-6
- Unified Line Segment Detection and Description. Xinyu Lin, Yingjie Zhou 0001, Zhen Long, Yipeng Liu 0001, Lu Yang 0002, Ce Zhu. 1-6
- Uncertainty-Driven Weakly Supervised Dehazing Network: Integrating Dynamic Attention and Multi-Scale Feature Fusion. Jinbin Wang, Aiping Yang, Yumeng Liu, Qinghua Hu. 1-6
- Dual Mutual Information-Driven Multimodal Recommendation with Denoising Graph Autoencoder. Mengduo Yang, Jie Zhou, Meng Xi 0002, Xiaohua Pan, Ying Li 0001, Yangyang Wu, Jinshan Zhang 0001, Jianwei Yin. 1-6
- Utilizing Additional Personalized Representations for Personalized Federated Learning. Shulan Yin, Yingxun Fu, Li Ma. 1-6
- Multi-Scale Hypergraph Relational Reasoning for Weakly Supervised Recognition of Group Activities. Chongyang Xu, Runtian Zheng, Ziliang Feng, Chengfang Zhang. 1-6
- Mamba-SLAM: Enhancing Neural Implicit SLAM with Uncertainty and Mamba. Jiaming Lu, Yunrui Zhu, Ruyu Liu, Xu Cheng 0003, Jianhua Zhang 0002, Bo Sun, Xiufeng Liu 0001. 1-6
- Promoting Segment Anything Model towards Highly Accurate Dichotomous Image Segmentation. Xianjie Liu, Keren Fu, Yao Jiang 0002, Qijun Zhao. 1-6
- PhD-GS: Real-World Underwater Scene Reconstruction Using Gaussian Splatting. Yu Du, Runfa Chen, Wenhang Ge, Fuchun Sun 0001, Ling Wang, Xiao Lv. 1-6
- Text to Trajectory: Enhancing and Evaluating LLMs for Embodied Task Planning. Yihan Tang, Yong Xu 0007, Ruotao Xu, Yan Huang 0031, Si Wu 0002, Patrick Le Callet. 1-6
- MPT-CLIP: Multi-modal Patch-level Prompt Alignment in CLIP for Zero-shot Semantic Segmentation. Boliang Hao, Fangyu Wu, Yifan Lu, Bailing Zhang. 1-6
- Neural-MCRL: Neural Multimodal Contrastive Representation Learning for EEG-based Visual Decoding. Yueyang Li, Zijian Kang, Shengyu Gong, Wenhao Dong, Weiming Zeng, Hongjie Yan, Wai-Ting Siok, Nizhuan Wang 0001. 1-6
- Embedding Compression Distortion in Video Coding for Machines. Yuxiao Sun, Yao Zhao 0001, Meiqin Liu 0002, Chao Yao, Weisi Lin. 1-6
- TC-NeRF: Temporal Consistent Neural Radiance Fields with Cross-View Complementation for Occluded Object Removal. Zicheng Wu, Li-Hsuan Chang, Kuan-Wen Chen. 1-6
- WL-MVSNet: Frequency-Aware and Regularized Learning for Multi-View Stereo. Yan Ma, Ruijie Peng, Suping Wu. 1-6
- QCG-SLAM: Quadtree-based Condensed Gaussian Splatting for Visual SLAM. Xun Fang, Zixuan Hua, Xiao Zhao, Lihua Zhang. 1-6
- SpeechPrune: Context-Aware Token Pruning for Speech Information Retrieval. Yueqian Lin, Yuzhe Fu, Jingyang Zhang, Yudong Liu, Jianyi Zhang, Jingwei Sun 0002, Hai Helen Li, Yiran Chen 0001. 1-6
- DRMOE: Towards Better Mixture of Experts via Dual Routing Strategy. Haiyang Liu, Shaojian Qiu, Hai Lin, Yingjie Kuang, Shunpeng Li. 1-6
- TSRS-Net: A Trilaterally Supervised Residual Network for Accurate Segmentation of Prostate Lesion Ablation Regions from MRI and Surgical Plan Images. Yixin Li, Haifeng Wang, Zhichao Yan, Ye Luo. 1-6
- Prototype-guided Vision Foundation Models fine-tuning for Domain Generalized Semantic Segmentation. Fengwen Liu, Huan Hu, Xiangbin Wu, Wenqiang Hu. 1-6
- Multi-Granularity Based Collaborative Learning for Semi-Supervised Hashing. Shuai Cheng 0002, Lin Wang, Xiaoshuai Hao, Wanqian Zhang, Xiaohua Chen, Wei Wang. 1-6
- SpatialMe: Stereo Video Conversion Using Depth-Warping and Blend-Inpainting. Jiale Zhang, Qianxi Jia, Yang Liu 0003, Wei Zhang 0012, Wei Wei, Xin Tian. 1-6
- Consolidating Selective SSM with Spatial-Angular and Bidirectional Structural Fusion Perception for Light Field Semantic Segmentation. Wenbin Yan, Qingwei Wu, Hua Chen, Xiaogang Zhang, Shengjie Hu. 1-6
- End-to-End Lyric-to-Melody Generation via Chord Integration and Bar-Level Modeling. Ke Gu 0003, Peng Bai, Zhen Lei, Yue Zhou 0012, Zhicong Wu, Xiaodong Shi. 1-6
- Rethinking 3D Robotic Perception: Elastic Voxel Representation with Splatting Distillation. Shaohui Pan, Yong Xu 0007, Ruotao Xu, Zihan Zhou 0007, Si Wu 0002, Zhuliang Yu, Patrick Le Callet. 1-6
- Enhancing Long Video Understanding via Hierarchical Event-Based Memory. Dingxin Cheng, Mingda Li, Jingyu Liu, Yongxin Guo 0001, Bin Jiang, Qingbin Liu, Xi Chen 0003, Bo Zhao. 1-6
- STP4D: Spatio-Temporal-Prompt Consistent Modeling for Text-To-4D Gaussian Splatting. Yunze Deng, Haijun Xiong, Bin Feng 0001, Xinggang Wang, Wenyu Liu 0001. 1-6
- Rethinking DeNoising Training for DETR-based Object Detection. Bin Jiang 0006, Yiming Fan, Chao Yang 0015, Chenglong Lei, Ruiqi Hu, Zheng Zhou. 1-6
- Interact with me: Joint Egocentric Forecasting of Intent to Interact, Attitude and Social Actions. Tongfei Bian, Yiming Ma 0003, Mathieu Chollet, Victor Sanchez, Tanaya Guha. 1-6
- Few-Shot 3D Face Generation via a Controllable Diffusion Model Guided by Text and Images. Jinfu Wei, Zheng Zhang, Qinchuan Zhang, Ran Liao, Duan Gao. 1-6
- Source-Free Domain Adaptation via Transformer-based Object-centric Perception. Ziyun Cai, Weilong Gao, Yawen Huang, Jie Song 0014, Chang-Hui Hu 0001, Tengfei Zhang 0001. 1-6
- CLAP: Overcoming Language Priors via Contrastive Learning and Answer Perturbation. Haoquan Wang, Yong Chen, Shengbo Chen, Hong Rao. 1-6
- DF-Net: A Dual Fusion Network for Accurate Video Temporal Grounding. Haolong Yan, Binghao Tang, Boda Lin, Jiachen Li, Si Li 0001. 1-6
- Object Placement for Anything. Bingjie Gao, Bo Zhang, Li Niu. 1-6
- Multi-Attention Guided Knowledge Distillation For High-Performance Object Detection. Zhihao Kong, Qifeng Lin, Qishen Shen, Jiayi Qiu, Gang Fu, Yuanlong Yu. 1-6
- Texture-aware Intrinsic Image Decomposition with Model- and Learning-based Priors. Xiaodong Wang, Zijun He, Xin Yuan. 1-6
- HDCompression-DNA: Hybrid-Diffusion Neural Image Compression via DNA Storage. Cihan Ruan, Lei Lu, Rongduo Han, Wei Jiang, Wei Wang, Haoyu Wu, Qiming Yuan, Yanting Guo, Yanzhi Wang, Nam Ling. 1-6
- FlowJD: Your Imagination Can Help You Jailbreak in Visual Language Models. Xiaotian Zou, Yongkang Chen, Qianqian Han, Ke Li. 1-6
- Continuous Lane Detection Network with Hybrid Feature Fusion and Differential Aggregation. Zhiqiang Zeng, Longpei Wu, Xiaodong Wang, Fei Yan, Haiyan Huang. 1-6
- Refining Interactions: Enhancing Anisotropy in Graph Neural Networks with Language Semantics. Zhaoxing Li, Haifeng Zhang, Xiaoming Zhang, Chengxiang Liu. 1-6
- Decoding Emotional Silences: Reliable Multimodal Sentiment Analysis with Bipolar Uncertainty. Yutao Wei, Hongzhu Fu, Yuxiang Li, Yichen Xin, Xovee Xu, Fan Zhou 0002, Ting Zhong. 1-6
- Efficient Shared KVCache Attention Inference for Multimodal Large Language Models. Shouxu Kuang, Limin Cheng, Yixin Chen, Hang Qin, Ling Li. 1-6
- Concretely Efficient Three-party Oblivious Selection. Shang Song, Lin Liu 0018, Rongmao Chen, Wei Peng 0005. 1-6
- EFDiT: Efficient Fine-grained Image Generation Using Diffusion Transformer Models. Kun Wang, Donglin Di, Tonghua Su, Lei Fan 0007. 1-6
- Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation. Xiaoyu Zhang, Teng Zhou, Xinlong Zhang, Jia Wei, Yongchuan Tang. 1-6
- Action Decomposition-based Actor-Critic for Supply Chain Optimization. Zhengrong Chen, Qinghua Zhu, An Zeng, Yuzhu Ji, Baoyao Yang, Dan Pan 0001. 1-6
- InterLayer: Efficient Inference with Interleaved Scheduling and Layer-Specific Optimization. Limin Cheng, Hang Qin, Shouxu Kuang, Xinyu Wang, Ling Li, Yanjun Wu, Chen Zhao. 1-6
- TRAMFuse: Text image Tampering Detection via Directional Residual Attention Mechanism. Xingqian Guo, Tingting Chai, Lunke Fei, Jialing Xu, Guanglu Zhou, Xiangqian Wu, Haoxing Cao. 1-6
- Mutual Semantic Bridged Tri-Tower Fusion for Audio-Visual Segmentation. Jingqi Qu, Hui Yu, Dongchen Zhu, Jiamao Li. 1-6
- SMPL Normal Map Is All You Need for Single-view Textured Human Reconstruction. Wenhao Shen, Gangjian Zhang, Jianfeng Zhang, Yu Feng, Nanjie Yao, Xuanmeng Zhang, Hao Wang. 1-6
- Video Flow as Time Series: Discovering Temporal Consistency and Variability for VideoQA. Zijie Song, Zhenzhen Hu 0004, Yixiao Ma, Jia Li 0013, Richang Hong. 1-6
- QTG-VQA: Question-Type-Guided Architectural for VideoQA Systems. Zhixian He, Pengcheng Zhao, Shujin Lin. 1-6
- Progressively Enhanced Camouflaged Object Detection via Boundary Awareness. Jinyang Wang, Wei Wu. 1-6
- UniSep: Universal Target Audio Separation with Language Models at Scale. Yuanyuan Wang, Hangting Chen, Dongchao Yang, Weiqin Li, Dan Luo, Guangzhi Li, Shan Yang, Zhiyong Wu 0001, Helen Meng, Xixin Wu. 1-6
- Adaptive Semantic Alignment for Automated Radiology Report Generation via Cross-Modal Knowledge Integration. Sibo Ju, Zhaozhen Chen, Yulong Xiao, Yiqing Shen 0003, Yanzhou Su, Kai Chen, Xiangwen Liao. 1-6
- HLV-1K: A Large-scale Hour-Long Video Benchmark for Time-Specific Long Video Understanding. Heqing Zou, Tianze Luo, Guiyang Xie, Victor Xiao Jie Zhang, Fengmao Lv, Guangcong Wang, Junyang Chen, Zhuochen Wang, Hansheng Zhang, Huaijian Zhang. 1-6
- Exploring State Space Model in Wavelet Domain: An Infrared and Visible Image Fusion Network via Wavelet Transform and State Space Model. Tianpei Zhang, Yiming Zhu, Jufeng Zhao, Guangmang Cui, Yuchen Zheng. 1-6
- Adaptive Training Meets Progressive Scaling: Elevating Efficiency in Diffusion Models. Wenhao Li, Xiu Su, Yu Han, Shan You, Tao Huang 0020, Chang Xu 0002. 1-6
- Elastic Architecture Search for Efficient Language ModelsShang Wang. 1-6 [doi]
- Multimodal Mixture of Low-Rank Experts for Sentiment Analysis and Emotion RecognitionShuo Zhang, Jinsong Zhang, Zhejun Zhang, Lei Li. 1-6 [doi]
- Key-semantics Alignment Learning with Contextual Understanding for Video Moment RetrievalChenghua Gao, Min Li, Junxing Ren, Lin Chen, Jitao Fu, Wenwen Su. 1-6 [doi]
- Cross-Modal Semantic-Aware Network for Audio-Visual Event LocalizationLiang Liu, Shuaiyong Li, Yongqiang Zhu. 1-6 [doi]
- STGGait: A Graph Transformer Network for Pose-based Gait RecognitionWansong Qin, Zhijie Han, Yaru Li. 1-6 [doi]
- RIDE: Robust and Decentralized Federated Learning with Input ValidationZhi Lu, Mengyuan Zou, Samir M. Umran, Yuhao Long, Songfeng Lu, Junjun Wu, Mu Wang. 1-6 [doi]
- SAVE-GSL: Scalable and Expressive Graph Structure Learning for Large GraphsManxin Xu, Shengjie Zhao 0001, Jin Zeng, Weichao Chen 0001, Shilong Dong. 1-6 [doi]
- A Synthetic-to-Real Dehazing Method based on Domain UnificationZhiqiang Yuan, Jie Zhou, Jinchao Zhang. 1-6 [doi]
- Redesigning Upsampling in Decoders with Aligned Feature Aggregation for Semantic SegmentationQinjie Hu, Fei Qi 0001, Kaiwen Fu, Chengyuan Chang, Xiaotian Wang 0001, Kun Liu 0015, Guangming Shi. 1-6 [doi]
- DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided DistillationPeng Chen, Xiaobao Wei, Ming Lu, Hui Chen 0020, Feng Tian 0001. 1-6 [doi]
- Mask-Guided Transformer with Hybrid Supervision for 3D Instance SegmentationQi Zeng, Jianwei Guo, Haobo Qin, Yinchang Zhou, Weiliang Meng, Xiaopeng Zhang 0001. 1-6 [doi]
- Patch-Wise Hypergraph Contrastive Learning with Dual Normal Distribution Weighting for Multi-Domain Stain TransferHaiyan Wei, Hangrui Xu, Bingxu Zhu, Yulian Geng, Aolei Liu, Wenfei Yin, Jian Liu. 1-6 [doi]
- Scalable Multi-Kernel Clustering with Dynamic ProcrustesLizhu Wu, Yan Chen 0036, Peng Zhou 0006, Liang Du 0003. 1-6 [doi]
- RBDN: A Robust Background Denoising Network for Weakly Supervised Temporal Language GroundingYifan Lyu, Zehua Zang, Hongzhou Wu, Lixiang Liu, Jiangmeng Li. 1-6 [doi]
- Multi-Modality Representation Learning for Antibody-Antigen Interactions PredictionPeijin Guo, Minghui Li, Hewen Pan, Ruixiang Huang, Lulu Xue, Shengqing Hu, Zikang Guo, Wei Wan, Shengshan Hu. 1-6 [doi]
- SwinCAE: Capsule Autoencoder using Shifted Windows for 3D Human Pose EstimationXiufeng Liu 0005, Zhongqiu Zhao, Yi Yang 0001, Donghui Hu, Zhao Zhang 0001. 1-6 [doi]
- Model-Guardian: Protecting against Data-Free Model Stealing Using Gradient Representations and Deceptive PredictionsYunfei Yang, Xiaojun Chen, Yuexin Xuan, Zhendong Zhao. 1-6 [doi]
- Target Distribution Agnostic Domain Adaptation for in-the-Wild Image Classification under Both Domain and Label ShiftsAotian Zheng, Jenq-Neng Hwang, Rania Hussein, Farron Wallace, Kelsey Magrane, Lauren Shiosaka. 1-6 [doi]
- One-Shot Federated Learning with Classifier-Free Diffusion ModelsObaidullah Zaland, Shutong Jin, Florian T. Pokorny, Monowar H. Bhuyan. 1-6 [doi]
- Confidence Breeds Success: Improving Fake News Video Detection via LVLM-Assisted InferenceYuchen Zhang, Mingxin Li, Chao Gao, Xianghua Li. 1-6 [doi]
- HSRMamba: Efficient Wavelet Stripe State Space Model for Hyperspectral Image Super-ResolutionBaisong Li, Xingwang Wang, Haixiao Xu. 1-6 [doi]
- PhysFFTFormer: A Frequency Domain-based Vision Transformer for Efficient Remote Physiological MeasurementFangyuan Liu, Sirui Zhao, Tong Xu 0001, Yu Sun 0021, Hao Wang 0076, Suojuan Zhang, Enhong Chen. 1-6 [doi]
- GLRB: Heterogeneous Federated Continual Learning via Global and Local RebalanceHaodong Zhang, Liu Yang, Zihan Jiang. 1-6 [doi]
- Trans-Diff: Transformer-based Video Summarization with DiffusionCai Pan, Guowei Zhang, Rui Zhong. 1-6 [doi]
- Few-shot Prompt Learning with Large Vision-Language Model for Image Deep HashingYe Liu, Yan Pan 0002, Jian Yin 0001. 1-6 [doi]
- DynamicGaussian: Spatio-temporally Consistent 4D Gaussian Splatting for High-Fidelity Monocular Videos ReconstructionWenhao Dong, Youwen Yuan, Bowen Zhang, Xi Zhao. 1-6 [doi]
- MLLM-DataEngine: Closing the Loop of Multimodal Instruction Tuning Data GenerationZhiyuan Zhao, Bin Wang 0065, Linke Ouyang, Yiqi Lin, Pan Zhang 0001, Xiaoyi Dong, Jiaqi Wang 0003, Conghui He. 1-6 [doi]
- From 2D Images to 3D Model: Weakly Supervised Multi-View Face Reconstruction with Deep FusionWeiguang Zhao, Chaolong Yang, Jianan Ye, Rui Zhang 0012, Yuyao Yan, Xi Yang 0008, Bin Dong 0003, Amir Hussain 0001, Kaizhu Huang. 1-6 [doi]
- Geometrically-plausible and Semantically-consistent Generation of Indoor PanoramasZhiliang Zeng, Mengyang Wu, Xianzhi Li, Wenzhao Gao, Shaohui Jiao, Chi-Wing Fu. 1-6 [doi]
- MG-STK: Weakly Supervised Multi-Granularity Learning Guided by Semantic Topological KnowledgeQi Shen, Liu Yang, Canguang Ruan. 1-6 [doi]
- WDiff: Wavelet-based Diffusion Models for Surgical Endoscopic Image Low-Light EnhancementZeyu Lei, Lidan Fu, Anqi Xiao, Jie Tian 0001, Zhenhua Hu. 1-6 [doi]
- FSRF: Factorization-guided Semantic Recovery for Incomplete Multimodal Sentiment AnalysisZiyang Liu, Pengjunfei Chu, Shuming Dong, Chen Zhang, Mingcheng Li, Jin Wang. 1-6 [doi]
- Fast CU Partition Algorithm For 360-Degree Videos on VVCDayong Wang, Shijie Du, Yu Sun, Shuyin Xia, Frédéric Dufaux, Hongwei Guo 0001, Guo-Yin Wang 0001, Ce Zhu. 1-6 [doi]
- The Motion in the Details: Adapting CLIP for Action Recognition via Dual-prompt GuidanceLongjuan Sun, Xixia Xu, Dongchen Zhu, Jiamao Li. 1-6 [doi]
- A Novel Perspective on Leveraging Hubness in VAE for Eliminating Representative Shift Vectors in Few-Shot LearningQuanlin Chen, Chunjin Ye, Yiming Ma, Jiahui Pan 0003, Jingcong Li. 1-6 [doi]
- Foreground Focus: Enhancing Coherence and Fidelity in Camouflaged Image GenerationPei-Chi Chen, Yi Yao, Chan-Feng Hsu, Hong-Xia Xie, Hung-Jen Chen 0001, Hong-Han Shuai, Wen-Huang Cheng. 1-6 [doi]
- Training Robust DNNs with Noisy Labels via Contrastive Re-Calibration LearningYongfeng Dong, Jiaji Wang, Zhen Wang, Guifang Wu, Hao Cheng. 1-6 [doi]
- Enhancing Handwritten Mathematical Expression Recognition with Structure and Counting Aware NetworkShiqi Mou, Zijie Li, Juxiang Zhou, Jun Wang 0101, Jianhou Gan. 1-6 [doi]
- Corer: Concept Residue Erasing in Text-to-Image Diffusion ModelsYufan Liu 0002, Jinyang An, Huashan Chen, Wanqian Zhang, Ming Li, Dayan Wu, Jingzi Gu, Zheng Lin 0001, Weiping Wang 0005. 1-6 [doi]
- DPCD: A Quality Assessment Database for Dynamic Point CloudsYating Liu, Yujie Zhang, Qi Yang 0003, Yiling Xu, Zhu Li 0001, Ye-Kui Wang. 1-6 [doi]
- MamFusion: Multi-Mamba with Temporal Fusion for Partially Relevant Video RetrievalXinru Ying, Jiaqi Mo, Jingyang Lin, Canghong Jin, Fangfang Wang, Lina Wei. 1-6 [doi]
- Adaptive Distribution-Aware Modeling for Transformer TrackingMingyu Cao, Huibin Tan, Xueqiong Li, Wanrong Huang, Kedi Zhang, Yuhua Tang, Shaowu Yang. 1-6 [doi]
- NOVA3D: Normal Aligned Video Diffusion Model for Single Image to 3D GenerationYuxiao Yang, Peihao Li 0003, Yuhong Zhang, Junzhe Lu 0001, Xianglong He, Minghan Qin, Weitao Wang, Haoqian Wang. 1-6 [doi]
- A Knowledge Noise Mitigation Framework for Knowledge-based Visual Question AnsweringZhiyue Liu, Sihang Liu, Jinyuan Liu, Xinru Zhang. 1-6 [doi]
- Enhancing Open-Vocabulary Panoptic Segmentation with Semantic-Guided Q-TuningYanxiang Huang, Kai Zhang, Yuxiang Wang, Dongtai Du, Yuping Yuan, Zheng Zhao. 1-6 [doi]
- HyperMAN: Hypergraph-enhanced Meta-learning Adaptive Network for Next POI RecommendationJinze Wang, Tiehua Zhang, Lu Zhang 0063, Yang Bai, Xin Li, Jiong Jin. 1-6 [doi]
- DiffLane: Diffusion Model-Based Lane Mask Generation for Accurate Video Lane DetectionWenxiang Liu, Yongkang Liu, Weiliang Meng, Gaoqi He, Jianhua Li. 1-6 [doi]
- Efficient Explicit Joint-level Interaction Modeling with Mamba for Text-guided HOI GenerationGuohong Huang, Ling-An Zeng, Zexin Zheng, Shengbo Gu, Wei-Shi Zheng 0001. 1-6 [doi]
- Nucleus-SAM: Point-Supervised SAM for Nucleus SegmentationYu Zhou, Xing Wu, Liangshan Zhu, Chengliang Wang, Zailin Yang, Yao Liu. 1-6 [doi]
- SCA-ZegCLIP: Shape- and Context-aware CLIP for Zero-shot Semantic SegmentationChunrui Li, Yi Zhang, Shu Hu. 1-6 [doi]
- Object-Centric Feature Enrichment for Single-Domain Generalized Object DetectionShukuan Yuan, Zihao Zhang, Yahong Han. 1-6 [doi]
- Fed3D: Enhancing Security in Federated Learning with Dataset DistillationCanhui Wu, Wei Xi 0003, Yuwei Fan, Yuhao Shen 0001, Jizhong Zhao. 1-6 [doi]
- Multi-granularity Frequency Difference-Aware Attention for Video Question AnsweringMingyang Liu, Fan Zhou 0001, Ruomei Wang 0001, Baoquan Zhao. 1-6 [doi]
- Supplementary Material for STTODE: Spatio-Temporal Transformer Ordinary Differential Equation Networks for Pedestrian Trajectory ForecastingYi Zou, Yingjie Liu, Jian Yang, Mingsong Chen, Xuan Tang, Xian Wei. 1-6 [doi]
- Adversarial Examples Detection Based on Adversarial Attack SensitivityCong Ming, Haojie Yuan, Xiangwen Wang, Qi Chu 0001, Tao Gong, Bin Liu 0016, Nenghai Yu. 1-6 [doi]
- HSACNet: Hierarchical Scale-Aware Consistency Regularized Semi-Supervised Change DetectionQi'ao Xu, Pengfei Wang, Yanjun Li, Tianwen Qian, Xiaoling Wang. 1-6 [doi]
- MambaMIC: An Efficient Baseline for Microscopic Image Classification with State Space ModelsShun Zou, Zhuo Zhang 0020, Yi Zou, Guangwei Gao. 1-6 [doi]
- A Watermark Updating Framework for Multi-stage Image Content DistributionYanyan Liu, Bin Liu, Jie Zhang, Xiang Zhang, Zehua Ma, Nenghai Yu. 1-6 [doi]
- Pixel-Level Adaptive Refinement Framework with Knowledge Distillation for Weakly Supervised Semantic SegmentationYulian Li, Xinfang Qin, Zhengwen Shen, Shuyu Han, Jun Wang. 1-6 [doi]
- GAMED-Snake: Gradient-aware Adaptive Momentum Evolution Deep Snake Model for Multi-organ SegmentationRuicheng Zhang, Haowei Guo, Zeyu Zhang, Puxin Yan, Shen Zhao. 1-6 [doi]
- ConAvatar: Harnessing Facial Mesh for Controllable Avatar AnimationZhen Tan, Wei Wei. 1-6 [doi]
- VG-Net: Vision Transformer based Graph Fusion Representation for Multi-label Pattern Image RetrievalErwan Ye, Ying Li. 1-6 [doi]
- Global-to-Local Color Correction with Full-Region Coverage for Multi-view Light Field ImagesYixu Huang, Rui Zhong, Ségolène Rogge, Adrian Munteanu 0001. 1-6 [doi]
- Enhancing Dynamic CAPTCHA Verification Based on Multimodal Trustworthiness Fusion NetworkChenxi Liu, Huayu Shou, Yuqing Yin, Xu Yang 0011, Qiang Niu. 1-6 [doi]
- A Generalizable and Expressive Meta-Diffusion Policy for RTC Bandwidth PredictionZhiyuan Chen, Nuowen Kan, Chenglin Li, Wenrui Dai, Junni Zou, Hongkai Xiong. 1-6 [doi]
- Federated Open-Set Domain Generalization with Adaptive Adjustment Boundary and WeightsHaoyuan Liang, Shilei Cao 0005, Yushan Lai, Juepeng Zheng. 1-6 [doi]
- Masked Generative Extractor for Synergistic Representation and 3D Generation of Point CloudsHongliang Zeng, Ping Zhang, Fang Li, Jiahua Wang, Tingyu Ye, Zichen Wei. 1-6 [doi]
- Contextualizing Borderline ECG Analysis via Multi-Modal Feature Extraction and Large Language Model InferenceYanlin Xu, Yiwei Ru, Dongsen Zhang, Yongji Liu, Zhenan Sun. 1-6 [doi]
- BeatFM: Improving Beat Tracking with Pre-trained Music Foundation ModelGanghui Ru, Jieying Wang, Jiahao Zhao, Yulun Wu, Yi Yu, Nannan Jiang, Wei Wang, Wei Li. 1-6 [doi]
- Geometrically-Inspired Irregular Expansion Techniques for Graph-based Point Cloud LearningQi Zhang 0082, Haoqian Wang, Yuanxi Peng, Teng Li 0011. 1-6 [doi]
- Relational Enhancement Network for Industrial Defect DetectionHaotian Linghu, Meiqin Liu, Senlin Zhang. 1-6 [doi]
- Think Twice: Empowering Action Recognition Models with Human-Like Deep ReasoningXiangning Ruan, Baoxing Xie, Zhaohui Hou, Qixiang Yin, Fei Su, Zhicheng Zhao 0001. 1-6 [doi]
- Perceiving Smoothness: Temporal Consistency Learning for Multi-Frame-Rate Video Quality AssessmentJinliang Han, Xiongkuo Min, Wei Sun 0029, Guangtao Zhai. 1-6 [doi]
- LFNet: Cross-Modal LiDAR-Fisheye Fusion Network for 3D Semantic SegmentationWeijian Zhang, Zhiwei Zhang 0005, Tianfang Sun, Zhizhong Zhang 0001, Xin Tan 0002, Yuan Xie 0006. 1-6 [doi]
- ITJP: Image and Text Joint Prompts for Few-Shot Whole Slide Image ClassificationZiwei Zhu, Xinzhu Zhang, Zhikang Zhao, Jing Zhao. 1-6 [doi]
- Pixel-wise Single Image Reflection Removal Method Based on Reinforcement LearningYucheng Wang, Xueshi Yu, Zhengzhe Zhang, Xiankai Lu, Yilong Yin, Qian Zheng, Wenjia Meng. 1-6 [doi]
- Dynamic Token Selective Transformer for Aerial-Ground Person Re-IdentificationYuhai Wang, Maryam Pishgar. 1-6 [doi]
- VidCtx: Context-aware Video Question Answering with Image ModelsAndreas Goulas, Vasileios Mezaris, Ioannis Patras. 1-6 [doi]
- ABC-GS: Alignment-Based Controllable Style Transfer for 3D Gaussian SplattingWenjie Liu, Zhongliang Liu, Xiaoyan Yang, Man Sha, Yang Li. 1-6 [doi]
- Prohibited Items Segmentation via Occlusion-aware Bilayer ModelingYunhan Ren, Ruihuang Li, Lingbo Liu, Changwen Chen. 1-6 [doi]
- Incrementally Constrained Tucker Decomposition for Feature Extraction of Structural Diffusion Tensor Imaging DataFei He, Houji Du, Fan Zhang, Yipeng Liu, Ce Zhu. 1-6 [doi]
- A Refined ECG Delineation Framework Incorporating Single-Beat Mode and Conditional Random FieldZhenqin Chen, Yuying Bao, Fengbo Wang, Yiwei Lin, Jinshan Xu. 1-6 [doi]
- Multimodal Emotion Recognition in Conversations via Graph Structure LearningFeng Xiong, Geng Tu, Yice Zhang, Jun Wang, Shiwei Chen, Bin Liang 0004, Yue Yu 0001, Min Yang 0007, Ruifeng Xu 0001. 1-6 [doi]
- Uncovering Personality Traits via Multimodal LLM for Personalized Image Emotion AnalysisJianzhang Gao, Hao Pu, Yuchong Sun, Ruihua Song. 1-6 [doi]
- Structure-Guided Camouflaged Object Detection with Progressive Enhancement StrategyQingzheng Wang, Jiazhi Xie, Ning Li. 1-6 [doi]
- A Simple and Better Baseline for Visual GroundingJingchao Wang, Wenlong Zhang, Dingjiang Huang, Hong Wang 0021, Yefeng Zheng 0001. 1-6 [doi]
- AAAD: Asynchronous Inter-Variable Relationship-Aware Anomaly Detection for Multivariate Time SeriesHongyi Liu, Xiaosong Huang, Mengxi Jia, Lingzhe Zhang, Tong Jia, Zhonghai Wu, Ying Li 0012. 1-6 [doi]
- SymND: Detecting Backdoor Attacks in Self-Supervised Facial Representation TasksLiYue Zhu, Changchun Yin, Liming Fang, Zhen Qin. 1-6 [doi]
- Self-ReS: Self-Reflection in Large Vision-Language Models for Long Video UnderstandingJoão Pereira 0007, Vasco Lopes, David Semedo, João Neves 0006. 1-9 [doi]
- Boosting Audio-Visual Segmentation via Triple-Modalities AlignmentYujian Lee, Peng Gao, Zailong Chen, Wentao Fan, Guquan Jing, Yiyang Hu. 1-6 [doi]
- Automated Radiology Report Generation Based on Topic-Keyword Semantic GuidanceJing Xiao 0005, Hongfei Liu, Ruiqi Dong, Jimin Liu, Haoyong Yu. 1-6 [doi]
- Automatic Natural Image Matting via Dual Encoder AggregationMeng-Lun Yu, Wen-Jiin Tsai. 1-6 [doi]
- Content-Style Disentangled Audio Style Transfer via Diffusion ModelYiran Wang, Jiasheng Lu, Jun Chen, Xinyu Zhang, Yingshan Liang, Zhicheng Du, Qingyang Shi, Shao-Lun Huang. 1-6 [doi]
- FLR: Feature-based Label Recovery in Federated Learning with Classifier-free CommunicationYibin Wang, Yucan Zhou, Xiaoyan Gu, Weiping Wang. 1-6 [doi]
- DLVQA: A Dynamic Loss Approach For Visual Question Answering with Language BiasesShuocheng Wang, Zhenzhen Wang, Qingfeng Wu. 1-6 [doi]
- Multi-Grained Alignment for Visual GroundingHongbing Li, Bo Xiao, Linyi Yang, Xinran Wang, Qi Li. 1-6 [doi]
- Multi-sentence Video Grounding for Long Video GenerationWei Feng, Xin Wang 0019, Hong Chen 0011, Zeyang Zhang 0001, Wenwu Zhu 0001. 1-6 [doi]
- GASEM: Boosting Generalized and Actionable Parts Segmentation and Pose Estimation via Object Motion PerceptionLiu Liu, Ran Zhang, Wenbo Xu, Li Zhang, Yiming Tang, Qi Wu, Hao Wu. 1-6 [doi]
- Active Object Tracking with Occluded Targets Estimation and Adversarial Reinforcement LearningZheng Chen, Wengang Zhou 0001, Houqiang Li. 1-6 [doi]
- SDS-TG: Secure Diffusion Steganography in Text-Guided Generative ImagesHaozhong Yang, Hongxia Wang, Jinhe Li, Fei Zhang. 1-6 [doi]
- Imperceptible Beam-Sensitive Adversarial Attacks for LiDAR-based Object Detection in Autonomous DrivingFuyao Cai, Daizong Liu, Xiang Fang, Jixiang Yu, Keke Tang, Pan Zhou 0001. 1-6 [doi]
- Target-oriented Multimodal Sentiment Classification with Counterfactual-enhanced DebiasingZhiyue Liu, Fanrong Ma, Xin Ling. 1-6 [doi]
- DiBAN: Dual-Drive Broad Attentive Network for Speech Emotion RecognitionGongli Zhang, C. L. Philip Chen, Tong Zhang 0015, Zhulin Liu, Xiaoman Hu, Bianna Chen. 1-6 [doi]
- FairFHTL: Achieving Task-Agnostic Fairness in Federated Hetero-Task LearningTeng Zhang, Yiqiang Chen 0001, Xinlong Jiang, Wuliang Huang, Qian Chen, Chenlong Gao, Zhirui Wang 0004, Bingjie Yan. 1-6 [doi]
- Task-Aware Knowledge Prompt and Distillation for Cross-Domain Few-Shot LearningJun Liang 0002, Yunyu Zou, Yang Peng, Yalong Cheng, Rui Luo, Yishu Liu, Bingzhi Chen. 1-6 [doi]
- OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion ModelLiuhan Chen, Zongjian Li, Bin Lin 0014, Bin Zhu, Qian Wang, Shenghai Yuan, Xing Zhou, Xinhua Cheng, Li Yuan 0007. 1-6 [doi]
- Pop-Diffuseq: Controllable Symbolic Music Multi-Instrument Infilling and Accompaniment Generation with Long-Axis AttentionYi Zou, Haonan Cheng, Long Ye, Qin Zhang. 1-6 [doi]
- Cross-View Neighborhood Contrastive Multi-View Clustering with View Mixup Feature LearningYixuan Ye, Yang Zhang 0073, Liang Peng, Rui Li 0045, Cheng Liu 0001, Si Wu 0002, Hau-San Wong. 1-6 [doi]
- STSA: Spatial-Temporal Semantic Alignment for Visual DubbingZijun Ding, Mingdie Xiong, Congcong Zhu, Jingrun Chen. 1-6 [doi]
- Inversion-Free Image Editing via Rectified FlowZhengwei Peng, Conghan Yue, Tong Duan, Dongyu Zhang 0002. 1-6 [doi]
- General Distortion Metric Based Multiple Histograms Modification for Reversible Data HidingYinan Xiao, Shijun Xiang. 1-6 [doi]
- CLIP Guided Multimodal Prototype Learning for One-Shot Semantic SegmentationYulei Jian, Lingma Sun, Xiaofeng Wang, Jin Tang. 1-6 [doi]
- Prototype Optimal Transport for Box-Supervised 3D Instance SegmentationYe Zhou, Wenfei Yang, Tianzhu Zhang, Xiang Liu. 1-6 [doi]
- CHRIS: Clothed Human Reconstruction with Side View ConsistencyDong Liu 0002, Yifan Yang, Zixiong Huang, Yuxin Gao, Mingkui Tan. 1-6 [doi]
- LogiCoTab: Controllable Tabular Data Synthesis with Logical Relationships AwarenessZiyue Wang, Hongwei Ding, Yunqi Liu, Yan Feng, Xiaohui Cui. 1-6 [doi]
- Harnessing Counterfactual Reasoning for Explainable Multi-Modal Fact Verification with Large Language ModelsChaozhuo Li, Hui Pang, Xi Zhang 0008, Litian Zhang, Feiran Huang, Ming Lu. 1-6 [doi]
- One General Plug-In for Facial Heatmap-based Keypoint DetectionHanyu Jiang 0004, Jian Xue 0002, Xing Lan, Ke Lu 0002. 1-6 [doi]
- Neeko: Model Hijacking Attacks Against Generative Adversarial NetworksJunjie Chu 0002, Yugeng Liu, Xinlei He 0001, Michael Backes 0001, Yang Zhang 0016, Ahmed Salem 0001. 1-6 [doi]
- AS-Memory: Adaptive Sparse Memory Meeting Video-Language ModelsBimei Wang, Huilin Song, Jisheng Dang, Fei Shen, Hui Zhang, Liting Wang, Mangang Xie, Jizhao Liu, Jiasi Weng. 1-6 [doi]
- BPCLIP: A Bottom-up Image Quality Assessment from Distortion to Semantics Based on CLIPChenyue Song, Chen hui, Wei Zhang 0192, Haiqi Zhu, Shaohui Liu, Hong Huang, Feng Jiang 0001. 1-6 [doi]
- DAE-Fuse: An Adaptive Discriminative Autoencoder for Multi-Modality Image FusionYuchen Guo, Ruoxiang Xu, Rongcheng Li, Weifeng Su. 1-6 [doi]
- ReDet: Effective Real-time Object Detection via Efficient Multi-scale Extraction AggregationJian Li, Xin Jiang 0010, Lu Jin 0001, Zechao Li. 1-6 [doi]
- MoPE: Mixture of Policy Experts and Verification with Multimodal Information for Instance ImageGoal NavigationYijie Zeng, Xinyi Chen, Kexun Chen, Zhixuan Shen, Haonan Luo, Tianrui Li 0001. 1-6 [doi]
- Free Try-On: Virtual Try-On without Garment-Agnostic Images and Warped GarmentsWei Zhang, Xuekang Peng, Zhichao Lian. 1-6 [doi]
- Enhancing Hateful Meme Detection via Modality Enhancement and Multi-View FusionYing Zeng, Meiling Liu, Jiyun Zhou, Jingfeng Zhang. 1-6 [doi]
- Stair-LIF: Boosting the Representation of Spiking Neural Networks with Learnable Incremental Multi-Threshold NeuronsJilong Luo, Yinsheng Chen, Yue Liu, Jinghai Wang, Zhiyi Yu, Shanlin Xiao. 1-6 [doi]
- SPMamba: Leveraging Long-Sequence Modeling with State Space Models for Speech SeparationKai Li 0047, Guo Chen, Runxuan Yang, Xiaolin Hu 0001. 1-6 [doi]
- HGCL: Semi-Supervised Polyp Segmentation via Hierarchical Granularity Contrastive LearningXiaogang Du, Dong Wang, Tao Lei 0005, Tongfei Liu, Yingbo Wang, Asoke K. Nandi. 1-6 [doi]
- TopoLayer: A Universal Neural Network Layer for Topological Feature Learning on Point Clouds using Persistent HomologyZechao Guan, Shuai Du, Qingshan Liu. 1-6 [doi]
- Relation-Aware Graph Attention Network for Nuclei ClassificationLingbo Zhang, Ye Zhang, Linghan Cai, Xianchao Guan, Kai Zhang, Yongbing Zhang 0002. 1-6 [doi]
- Spectrum-Assisted Mamba for Infrared Small Target DetectionYongji Li, LuPing Wang. 1-6 [doi]
- Faster-SNN: Towards Faster and Better Spiking Neural Networks with Hybrid Neural CodingYinsheng Chen, Jilong Luo, Zhiyi Yu, Shanlin Xiao. 1-6 [doi]
- Challenging Dataset and Multi-Modal Gated Mixture of Experts Model for Remote Sensing Copy-Move Forgery UnderstandingZe Zhang, Enyuan Zhao, Yi Jiang, Jie Nie, Xinyue Liang. 1-6 [doi]
- OGS-Mapping: Object-Level 3D Gaussian Splatting MappingXinyu Liu, Zhenghao Qi, Rong Ding. 1-6 [doi]
- Enhancing 3D Gaussian Splatting Compression via Spatial Condition-based PredictionJingui Ma, Yang Hu, Luyang Tang, Jiayu Yang, Yongqi Zhai, Ronggang Wang. 1-6 [doi]
- Learning Adaptive High-Frequency Semantic Guidance for Low-light Image EnhancementHao Li, Jingxuan Zhou, JinLong Wang, Jiangmeng Li, Xiongxin Tang, Fanjiang Xu. 1-6 [doi]
- Towards Trustworthy Model via Uncertainty Verification in Multimodal Sentiment AnalysisChen Tang, Yangle Li, Tingrui Shen, Xinrong Gong, Tong Zhang. 1-6 [doi]
- Coordinated Uni-modal Assistance for Enhancing Multi-modal LearningHongpeng Pan, Yang Yang. 1-6 [doi]
- AKVQ-VL: Attention-Aware KV Cache Adaptive 2-Bit Quantization for Vision-Language ModelsZunhai Su, Wang Shen, Linge Li, Zhe Chen, Hanyu Wei, Huangqi Yu, Kehong Yuan. 1-6 [doi]
- Learning from Stochastic LabelsMeng Wei 0006, Xinzheng Xu, Peng Ying, Renke Sun, Guanjun Wang, Zhongnian Li. 1-6 [doi]
- GTPC-SSCD: Gate-guided Two-level Perturbation Consistency-based Semi-Supervised Change DetectionYan Xing, Qi'ao Xu, Zongyu Guo, Rui Huang 0006, Yuxiang Zhang 0003. 1-6 [doi]
- FDG-Diff: Frequency-Domain-Guided Diffusion Framework for Compressed Hazy Image RestorationRuicheng Zhang, Kanghui Tian, Zeyu Zhang, Qixiang Liu, Zhi Jin. 1-6 [doi]
- MADLLM: Multivariate Anomaly Detection via Pre-trained LLMsWei Tao, Xiaoyang Qu, Kai Lu 0002, Jiguang Wan 0001, Guokuan Li, Jianzong Wang. 1-6 [doi]
- DTSNet: A Denoising Teacher-Student Network with Reverse Distillation for Anomaly DetectionTaixiang Lin, Shuyuan Lin, Yanjie Liang, Rong Chen, Yang Lu. 1-6 [doi]
- Knowledge Distilled Group Prompts Learning for HOI Detection with Large Vision-Language ModelsXiaoqian Han, Guanglin Niu, Mingliang Zhou, Xiaowei Zhang. 1-6 [doi]
- TAD-IVR: Enhancing Temporal Action Detection via Instrumental Variable RegressionMinglin Hong, Bo Sun, Jun He, Yinghui Zhang. 1-6 [doi]
- DAPL: Integration of Positive and Negative Descriptions in Text-Based Person SearchYuchuan Deng, Zhanpeng Hu, Zijie Xin, Chuang Deng, Qijun Zhao. 1-6 [doi]
- Reinforced Model MergingJiaqi Han, Jingwen Ye, Shunyu Liu 0001, Haofei Zhang, Jie Song 0011, Zunlei Feng, Mingli Song. 1-6 [doi]
- Partially View-aligned Clustering with Unbiased Semantic LearningLiang Zhao 0005, Ziyue Wang, Yukun Yuan 0002. 1-6 [doi]
- RobusTReID: Defending Vision Transformer for Robust Image ReIDHua Zhang, Tingting Xiao, Li Sun 0012, Qingli Li. 1-6 [doi]
- ADoP: A Universal, Robust, Efficient, and Plug-and-Play Adversarial Example DetectorRui Yang 0032, Qindong Sun, Jiaming Cai, Jiangtao Yu. 1-6 [doi]
- CSDet: Clutter Suppression-Aided SAR Inshore Ship Detection NetworkYao Wang 0031, Shuang Li, Ganggang Dong, Hongwei Liu. 1-6 [doi]
- Neural Implicit Reconstruction and Fast Rendering Based on Dual Spherical ShellZijian Wang, Yuqi Liu, Yan Zhao, Binghao Wang, Shen Cai, Yanting Zhang. 1-6 [doi]
- Multi-Scale Tubularity-Aware U-NetYue Sun, Jie Song 0014, Ziyun Cai, Ying Wang, Liang Xiao 0001, Yawen Huang. 1-6 [doi]
- DynaGS-SLAM: Robust Dynamic SLAM with 3D Gaussian SplattingZiyi Huang, Binbin Yan, Dongliang Wang, Jinglun Feng, Shuo Chen, Xiangcheng Yi. 1-6 [doi]
- Lightweight Learning-Based In-Loop Filter for Real-Time Video CodingYanchen Zhao, Wenhong Duan, Jiaqi Zhang, Zhimeng Huang, Lin Li 0062, Qi Wang, Siwei Ma 0001. 1-6 [doi]
- DuMo: A Dual-Model Framework for Effective Long-tailed Object DetectionChenbo Zhang, Yinglu Zhang, Jihong Guan, Shuigeng Zhou. 1-6 [doi]
- Detecting AI-Generated Video via Frame ConsistencyLong Ma, Zhiyuan Yan, Qinglang Guo, Yong Liao, Haiyang Yu, Pengyuan Zhou. 1-6 [doi]
- Human-Inspired Situated Question Answering with Large Language ModelsXinyu Zhao, Weichen Xu 0001, Jian Cao 0002, Tianhao Fu, Ruilong Ren, Xing Zhang 0002. 1-6 [doi]
- A Multi-Grained Perception Model for Sentiment Analysis with Perceived Contrastive Focal LossJin Wei, Jiajie Lin, Zhenguo Yang, Haoran Xie 0001, Fuqiang Yu, Xiaoping Li 0001. 1-6 [doi]
- Bilateral Enhanced Complementary Network for Camouflaged Object DetectionYejing Guo, Ziqi Wang, Xia Yuan, Chunxia Zhao. 1-6 [doi]
- PhysLight: Accurate rPPG Heart Rate Measurement with Adaptive Video RelightingMenglin Zhang, Xiaoxin Guo, Bohao Qu, Xiaofeng Cao 0002, Shuifa Sun, Qing Guo 0005. 1-6 [doi]
- Texture-Aware Neural Radiance Fields Watermarking for Resisting Feature-Modulation Surrogate Model AttacksLei Tan, Yuliang Xue, Guobiao Li, Zhenxing Qian, Sheng Li 0006, Xinpeng Zhang 0001. 1-6 [doi]
- OmniRestore: Robust Universal Image Restoration from Combined and Unspecified DegradationsAnjusree Karnavar, Yang Li, Jiajun Liu, Jun Zhou, Junhu Wang. 1-6 [doi]
- HCMA-UNet: A Hybrid CNN-Mamba UNet with Axial Self-Attention for Efficient Breast Cancer SegmentationHaoxuan Li, Wei Song, Peiwu Qin, Xi Yuan, Zhenglin Chen. 1-6 [doi]
- Prompt-driven Multi-modal Unsupervised Domain Adaptation for 3D Semantic SegmentationMingwei Xing, Yao Wu, Yachao Zhang 0001, Yanyun Qu. 1-6 [doi]
- LLDNet: Joint Low-light Enhancement and Local Motion Deblurring in the DarkHaigen Liu, Yanyang Yan, Wenqi Ren. 1-6 [doi]
- Wavelet Convolution and Multi-Scale Attention Network for Image Tampering LocalizationYun Song, Yaoyao Xu, Jiaxin Chen, He Yang, Dengyong Zhang, Miaohui Wang. 1-6 [doi]
- Probabilistic Embeddings with Causal Constraint for Error Detection in Egocentric Procedural VideosTong Hou, Shenshen Li, Xun Jiang 0001, Zheng Wang 0044, Fumin Shen, Xing Xu 0001. 1-6 [doi]
- Analysing and Predicting Radiologists' Expertise Using Eye-Tracking Data: Insights for Diagnostic Decision-MakingYueran Ma, Jiang Liu, Yixiao Li, Yingying Wu, Richard White, Phillip Wardle, Gualtiero Colombo 0001, Padraig Corcoran, Wei Zhou, Hantao Liu. 1-6 [doi]
- Coding-Free Multiscale Latent Variables for Lossless Point Cloud Attribute CompressionQiang Xu, Lixuan Meng, Guangjie Zhang, Wei Gao, Ge Li. 1-6 [doi]
- Prior-Guided Test Time Adaptation for Blind Image Quality AssessmentShishun Tian, Fangjie Hou, Guanghui Yue 0001, Yuanhao Gong, Wenbin Zou, Ting Su 0004. 1-6 [doi]
- REAL: Retrieval-Augmented Prototype Alignment for Improved Fake News Video DetectionYili Li, Jian Lang, Rongpei Hong, Qing Chen, Zhangtao Cheng, Jia Chen, Ting Zhong, Fan Zhou 0002. 1-6 [doi]
- Multi-Level Graph Pruning-Based Framework for Graph Retrieval-Augmented GenerationHongxu Li, Xiaodi Li, Fulin Su, Qinglang Guo. 1-6 [doi]
- Enhancing Data-Free Substitute Training for Black-Box Adversarial AttacksZijian Ling, Wenyu Zhou, Yi Ouyang, Yuting Zhou, Man Zhou. 1-6 [doi]
- VIP-PCQA: A Multi-Modal Framework for No-reference Point Cloud Quality AssessmentKang Fu, Zicheng Zhang, Huiyu Duan, Xiaohong Liu 0001, Xiongkuo Min, Jiarui Wang, Guangtao Zhai. 1-6 [doi]
- Spatio-Temporally Consistent Depth Estimation for Dynamic Scenes using 3D Scene FlowsYu Cai, Tianjiao Jing, Chang Liu, Zhengxuan Lian, Shi-Sheng Huang, Hua Huang 0001. 1-6 [doi]
- MAO: Efficient Model-Agnostic Optimization of Prompt Tuning for Vision-Language ModelsHaoyang Li, Siyu Zhou, Liang Wang, Guodong Long. 1-6 [doi]
- CEFW: A Comprehensive Evaluation Framework for Watermark in Large Language ModelsShuhao Zhang 0011, Bo Cheng 0001, Jiale Han 0001, Yuli Chen 0001, Zhixuan Wu, Changbao Li, Pingli Gu. 1-6 [doi]
- PDNet: Patch-Wise Deformation Network for Cross-Modal Point Cloud CompletionJingwen He, Zhenjiang Du, Ning Xie, Lei Zhang. 1-6 [doi]
- Object Isolated Attention for Consistent Story VisualizationXiangyang Luo 0002, Junhao Cheng, Yifan Xie, Xin Zhang, Tao Feng, Zhou Liu, Fei Ma 0006, Fei Yu. 1-6 [doi]
- ET-Talk: Effective Training Strategy to Enhance Synchrony and Fidelity for Talking Face GenerationBaiqin Wang, Xiangyu Zhu 0001, Fan Shen, Hao Xu, Shukai Chen, Zhen Lei 0001. 1-6 [doi]
- Infrared Small Target Detection via Multi-Path Deep ConductionYongji Li, LuPing Wang. 1-6 [doi]
- Assessing the Generalizability of Deep Models without Out-of-Distribution DataGuoqing Zhu, Xiaojie Gan, Lingye Zhao, Luojun Lin. 1-6 [doi]
- ASTAnet: Transformer-based Siamese Network for Robust Audio-to-Audio Alignment in Amateur User Generated Audio ClipsMalya Singh, Priyankar Choudhary, Abdulmotaleb El-Saddik, Mukesh Saini. 1-6 [doi]
- Learning from Global to Local: Adaptive Frequency-Aware and Spatial-Alignment for Knowledge DistillationWenkuan Li, Xubin Wu, Shuo Gao, Haifang Li. 1-6 [doi]
- Towards Improved Deep Metric Learning via Unsupervised Object LocationChangxin Ye, Yushan Zhang, Xinyi Xu, Wei Huangfu, Cheng Deng. 1-6 [doi]
- Aspect-attentioned Prompting for Multimodal Sentiment AnalysisYutian Li, Jiaming Yang, Yiwen Hu, Lap-Kei Lee, Fu Lee Wang, Zhenguo Yang. 1-6 [doi]
- STFTCodec: High-Fidelity Audio Compression through Time-Frequency Domain RepresentationTao Feng, Zhiyuan Zhao, Yifan Xie, Yuqi Ye, Xiangyang Luo, Xun Guan, Yu Li. 1-6 [doi]
- Adaptive Optimization Strategy for Semi-supervised Arbitrary-oriented Object DetectionJiecong Chen, Chenlin Fu, Yingying Zhu. 1-6 [doi]
- ReCLIP: Reconstruction-Refined Zero-/Few-Shot Anomaly Classification and SegmentationLanning Zhang, Yali Shi, Shujie Lan, Fei Gao, Hao Qin, Nannan Wang. 1-6 [doi]
- Diverse Audio Caption Generation with Semantic-aware Diffusion ModelHualei Wang, Yiming Li, Hong Liu, Xiangdong Wang. 1-6 [doi]
- Enhancing Object Coherence in Layout-to-Image SynthesisYibin Wang, Changhai Zhou, Honghui Xu. 1-6 [doi]
- SAMDiffusion: Semantic Segmentation with Diffusion Model and Segmentation Anything ModelYihao Wang, Xinyu Mu, Peixiang Liu, Zihao Zhang, Zhiyi Wang, Xiaoming Huang. 1-6 [doi]
- JGHand: Joint-Driven Animatable Hand Avater via 3D Gaussian SplattingZhoutao Sun, Xukun Shen, Yong Hu, Yuyou Zhong, Xueyang Zhou. 1-6 [doi]
- RWKV-UI: UI Understanding with Enhanced Perception and ReasoningJiaxi Yang, Haowen Hou. 1-6 [doi]
- Enhancing Object-Attribute Alignment in Diffusion Models via Training-Free Contrastive Parallel DenoisingWentao Xie, Xingyu Li. 1-6 [doi]
- ReF-LLE: Personalized Low-Light Enhancement via Reference-Guided Deep Reinforcement LearningMing Zhao, Pingping Liu, Tongshun Zhang, Zhe Zhang. 1-6 [doi]
- Pioneer: Encrypted Video Traffic Identification for Mixed Transmission of Video-Audio SegmentsWeitao Tang, Taizhong Xu, Meijie Du, Die Hu 0004, Xu Tang, Qingyun Liu 0001. 1-6 [doi]
- Magnitude-Phase Dual-Path Speech Enhancement Network based on Self-Supervised Embedding and Perceptual Contrast Stretch BoostingAlimjan Mattursun, Liejun Wang, Yinfeng Yu, Chunyang Ma. 1-6 [doi]
- DFDUN: Deep Infrared and Visible Image Fusion with Diffusion Prior Unfolding NetworkMaoyi Xiong, Jun-Jie Huang, Zihan Chen, Tianrui Liu 0001, Xueqiong Li, Lin Liu, Wentao Zhao, Yuhua Tang. 1-6 [doi]
- FastAno: Accelerating Defect Image Generation with Efficient SamplingHaoyu Guan, Qianzi Yu, Kai Zhu 0004, Yang Cao 0010, Yu Kang 0001. 1-6 [doi]
- Encryption and Authentication with a Lensless Camera Based on a Programmable MaskEric Bezzam, Martin Vetterli. 1-6 [doi]
- EAV-Mamba: Efficient Audio-Visual Representation Learning for Weakly-Supervised Temporal Action LocalizationQuan Zhang, Jinwei Fang, Yuxin Qi, Mingyang Wan, Guojun Ma, Ke Zhang, Chun Yuan. 1-6 [doi]
- RealityAvatar: Comprehensive Head Avatar Generation with 360° RenderingHouteng Yu, Hao Zhu, Xun Cao. 1-6 [doi]
- Group-On: Boosting One-Shot Segmentation with Supportive QueryHanjing Zhou, Mingze Yin, Danny Z. Chen, Jian Wu 0001, Jintai Chen. 1-6 [doi]
- RotatedMVPS: Multi-view Photometric Stereo with Rotated Natural LightSongyun Yang, Yufei Han, Jilong Zhang, Kongming Liang, Peng Yu, Zhaowei Qu, Heng Guo. 1-6 [doi]
- Quality-Guided Dynamic Memory for LLMs-based Long-Term Video UnderstandingBimei Wang, Jingmei Jiao, Jisheng Dang, Qingrun Jiang, Jiyuan Lin, Zhixuan Chen, Teng Wang, Jun Yang. 1-6 [doi]
- Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?Yiwen Guan, Viet Anh Trinh, Vivek Voleti, Jacob Whitehill. 1-6 [doi]
- Contrastive Adversarial Learning for Region-Aware Weakly Annotated Object Segmentation in Hazy Remote Sensing ImagesWanning Zhu, Libao Zhang. 1-6 [doi]
- ATD-AMSMamba: Improving Robustness of State Space Models for Multimodal Sentiment AnalysisYahong Li, Zhanxun Dong, Zhou Fang, Lai Li. 1-6 [doi]
- 3D Human Motion Corpus Moment Retrieval via Multi-Granularity Semantic AlignmentWenlong Wang, Dahua Gao, Pengfei He, Xinyu Liu, Danhua Liu. 1-6 [doi]
- OG-Mapping: Octree-based Structured 3D Gaussians for Online Dense MappingMeng Wang, Junyi Wang, Changqun Xia, Chen Wang, Yue Qi. 1-6 [doi]
- End to End Text to Sign Language Generation using MultiGAUNabeela Khan. 1-6 [doi]
- UniVG: Towards UNIfied-modal Video GenerationLudan Ruan, Lei Tian, Chuanwei Huang, Xu Zhang, Xinyan Xiao. 1-6 [doi]
- Research on Audio-Visual Quality Assessment Dataset and Method for User-Generated Omnidirectional VideoFei Zhao, Da Pan 0001, Zelu Qi, Ping Shi 0001. 1-6 [doi]
- AMMSM: Adaptive Motion Magnification and Sparse Mamba for Micro-Expression RecognitionXuxiong Liu, Tengteng Dong, Fei Wang, Weijie Feng, Xiao Sun. 1-6 [doi]
- TDE-VC: Timbre Disentanglement and Extraction Via Consistency for Zero-Shot Voice ConversionYing Hu, Shangkun Tu, Fan Li, Lijun He, Hai Yan, Yan Li. 1-6 [doi]
- Improving Human-AI Collaboration in Medical Diagnosis with Combination AdviceXuehan Zhao, Jiaqi Liu, Zhiwen Yu, Bin Guo. 1-6 [doi]
- ES-Parkour: Advanced Robot Parkour with Bio-inspired Event Camera and Spiking Neural NetworkQiang Zhang 0029, Jiahang Cao, Jingkai Sun, Yecheng Shao, Gang Han, Wen Zhao, Yijie Guo, Renjing Xu. 1-6 [doi]
- CosGaussian: Towards Text-to-3D Semantically Controllable 3D Object Style Transfer with Gaussian SplattingWendong Li, Gaojie Wu, Xiang Huang, Wei-Shi Zheng 0001. 1-6 [doi]
- I-Lora: Iterative Merging of Routing-Tuned Low-Rank Adapters for Multi-Task LearningGuoqing Zhao, Qi Zhang 0053, Shaopeng Zhai, Dazhong Shen, Tianyi Zhang, Yu Qiao 0001, Tong Xu 0001. 1-6 [doi]
- Multi-Modal Contrastive Fusion for Consensus Learning in Sequential Group RecommendationYue Kou, Dong Li 0023, Qixiang Tang, Derong Shen, Tiezheng Nie. 1-6 [doi]
- DreamAnimate: Temporal Consistency and Detail Preservation for Character AnimationLulu Tian, Hongxun Yao, Zhaopan Xu, Jiankun Zhu, Xi Chen 0110, Yuxin Hou. 1-6 [doi]
- Safety-constrained Reinforcement Learning with Interaction-aware for Decision-making of Autonomous DrivingDi Zhang, Haonan Luo, Honglin Dong, Jianfeng Lu. 1-6 [doi]
- GCA-SUNet: A Gated Context-Aware Swin-UNet for Exemplar-Free CountingYuzhe Wu, Yipeng Xu, Tianyu Xu, Jialu Zhang 0003, Jianfeng Ren, Xudong Jiang 0001. 1-6 [doi]
- BMCA: Weakly Supervised Semantic Segmentation via Beta Modulation and Cross-Modality AlignmentYing Gao 0004, Jing Lin, Wentian Cai, Yandan Chen, Zihao Huang, Zhiyong Xia. 1-6 [doi]
- MTSD: Simple Yet Effective Self-Distillation for Generalizable Deepfake DetectionDexu Zhu, Jie Cao 0002, Jiangnan Shao, Zhida Zhang, Junxian Duan, Ran He 0001. 1-6 [doi]
- Selective Masking Adversarial Attack on Automatic Speech Recognition SystemsZheng Fang 0014, Shenyi Zhang, Tao Wang 0081, Bowen Li 0016, Lingchen Zhao, Zhangyi Wang. 1-6 [doi]
- OmniStyle: Attention-Optimized Global and Local Image Stylization with Diffusion Model InversionJiarong Cheng, Xihang Qiu, Qing Zhou, Ming Li 0011, Chun Li, Yao Lu, Fei Richard Yu. 1-6 [doi]
- TrojFlow: Flow Models are Natural Targets for Trojan AttacksZhengyang Qi, XiaoHua Xu. 1-6 [doi]
- Universal Scene Graph Generation via Semantic Feature AlignmentXiangyu Zhang, Guoxi Qiu, Yong Xu, Jinghua Wang. 1-6 [doi]
- Enhancing Diffusion-based Dataset Distillation via Adversary-Guided Curriculum SamplingLexiao Zou, Gongwei Chen, Yanda Chen, Miao Zhang. 1-6 [doi]
- Insulator Defect Detection Method Based on Lightweight Feature Extraction and Efficient Cross-Scale FusionZhi Yang, Chunyang Ma, Liejun Wang, Zhiqing Guo. 1-6 [doi]
- Egocentric Online Action Segmentation with Behavior-Centred Feature AugmentationZhangye Han, Xun Jiang 0001, Zheng Wang 0044, Xin Liu 0011, Fumin Shen, Xing Xu 0001. 1-6 [doi]
- Masked Self-Supervised Learning and Semantic Noise Separation for Video Anomaly DetectionQiao Wang, Menghao Zhang, Lei Zhang, Qi Qi 0001, Haifeng Sun 0001, Pengfei Ren 0001, Bo He 0003, Jing Wang, Jingyu Wang. 1-6 [doi]
- Enhanced Self-Supervised Multi-View Representations with Modality-Missing Robustness for Audio-Visual Speech RecognitionFei Su, Cancan Li, Juan Liu. 1-6 [doi]
- ECAIF: Efficient Context Aware Information Fusion Network for Medical Image SegmentationLuyao Ren, Wenxin Yu, Zhiqiang Zhang, Chang Liu, Jun Gong. 1-6 [doi]
- Extended Short- and Long-Range Mesh Learning for Fast and Generalised Garment SimulationAoran Liu, Kun Hu, Clinton Mo, ChangYang Li, Zhiyong Wang 0001. 1-6 [doi]
- Advancing Multi-Hop Question Answering via Alternating Retrieval and Reasoning over Multi-view Knowledge IntegrationMengchao Liu, Chao Yang, Bin Jiang, Chenglong Lei. 1-6 [doi]
- FedRF: Input-side Client Drift Mitigation for Federated Learning via Reusing FeaturesLingxiao Kong, Jiahui Jiang, Wenchao Xu 0001, Haozhao Wang, Ruixuan Li 0001. 1-6 [doi]
- AGFT-Tracker: Adaptive Game-Based PEFT for Object Tracking with PLMsMingyu Cao, Xihuai He, Xueqiong Li, Kedi Zhang, Yuhua Tang, Wanrong Huang, Huibin Tan. 1-6 [doi]
- Hyperbolic Space Learning Method Leveraging Temporal Motion Priors for Human Mesh RecoveryXiang Zhang, Suping Wu, Weibin Qiu, Zhaocheng Jin, Sheng Yang. 1-6 [doi]
- Coarse-To-Fine Graph Reasoning for 3D Hand Mesh ReconstructionDan Fu, Wai-Keung Wong, Lunke Fei, Tingting Chai, Yuzhu Ji, Qinghua Zhu 0001. 1-6 [doi]
- SANE: Enhancing Large-scale Scene Representation with Semantic-aware NeRF ExpertsZesheng Wang 0002, Yufeng Wang 0004, Shuangkang Fang, Xinrui Zhang, Dacheng Qi, Shengxi Li, Mai Xu, Wenrui Ding. 1-6 [doi]
- Multi-Level Normalizing Flow for Comprehensive Anomaly Detection and LocalizationJie Shi, Xin Wen, Shijie Guo, Robert H. Deng, Jianan Xie, Rui Cao. 1-6 [doi]
- GT-free_XAI: A Ground Truth-Free XAI Framework for Decision Interpretation and EvaluationYanchu Wu, Feng Tian. 1-6 [doi]
- Beyond Statistical Correlation: Causal Insights into Emotion RecognitionTao Chen 0017, Yanrong Guo, Shijie Hao, Richang Hong. 1-6 [doi]
- GlanceVAD: Exploring Glance Supervision for Label-efficient Video Anomaly DetectionHuaxin Zhang, Xiang Wang 0012, Xiaohao Xu, Xiaonan Huang, Changxin Gao, Yuehuan Wang, Shanjun Zhang, Nong Sang. 1-6 [doi]
- Discrimination-based Method for Image Object Detection with Random Distinct ProposalsJingzhi Zhang, Chengjie Bai. 1-6 [doi]
- A Spatial-Frequency Domain Joint Mechanism Network for Cross-modal Semantic SegmentationYiheng Qu, Zhibing Zhang, Liqiang He. 1-6 [doi]
- Create Anything Anywhere: Layout-Controllable Personalized Diffusion Model for Multiple SubjectsWei Li, Hebei Li, Yansong Peng, Siying Wu, Yueyi Zhang, Xiaoyan Sun 0001. 1-6 [doi]
- NVPose: Novel View Data Augmentation for Human Pose EstimationYiqing Xu, Liwei Liao, Ronggang Wang. 1-6 [doi]
- Attribute-Guided Zero-Shot CLIP in Image ClassificationGuoxi Qiu, Xiangyu Zhang, Yong Xu, Jinghua Wang. 1-6 [doi]
- k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation LearningYifan Yang 0005, Jianheng Zhuo, Zengrui Jin, Ziyang Ma 0001, Xiaoyu Yang 0005, Zengwei Yao, Liyong Guo, Wei Kang 0006, Fangjun Kuang, Long Lin, Daniel Povey, Xie Chen 0001. 1-6 [doi]
- LeAffordNav: Enhancing Open-vocabulary Mobile Manipulation with LLM-guided Exploration and Affordance-aware NavigationYuanwen Chen, Haoran Li 0010, Yaran Chen, Dongbin Zhao. 1-6 [doi]
- Layer-wise Parameter Robustness for Continual Test-time AdaptationHaoyu Xiong, Qiuxia Yang, Chengchao Wang, Tianze Zhong, Zhengpeng Zhao, Yuanyuan Pu. 1-6 [doi]
- Efficient Prompt Tuning for Hierarchical Ingredient RecognitionYinxuan Gui, Bin Zhu 0006, Jingjing Chen 0001, Chong-Wah Ngo. 1-6 [doi]
- WSGS: A Speech-Driven Zero-Shot System for 6D Robotic Arm GraspingYitong Ge, Lin Zhang, Yang Chen, Ying Shen. 1-6 [doi]
- Context Consistency Learning via Sentence Removal for Semi-Supervised Video Paragraph GroundingYaokun Zhong, Siyu Jiang, Jian Zhu, Jian-Fang Hu. 1-6 [doi]
- CT-MIE: Computed Tomography Multi-Task Image Enhancement via Vision-Language ModelYucheng Zeng, Aihua Mao, Xianghong Wang, Tianye Niu. 1-6 [doi]
- Flexible Streaming Temporal Action Segmentation with Diffusion ModelsJinrong Zhang, Wenjun Wen, Shenglan Liu 0001, Sifan Zhang, Yuning Ding, Lin Feng 0001. 1-6 [doi]
- DiffDeid: High-Quality Face De-identification and Recovery via Diffusion InversionZheyuan Liu 0011, Jun Jia, Hongyi Miao, Yiwei Yang 0007, Yanwei Jiang, Yingjie Zhou 0003, Zhi Liu, Guangtao Zhai. 1-6 [doi]
- Region Confidence Refinement with Progressive Semantic Mining for Source-Free Domain Adaptive Object DetectionZichong Chen, Zeyu An, Jian Cheng 0003. 1-6 [doi]
- JointDeblur-Gs: Joint Blur-Aware Gaussian SplattingSijia Hu, Peng Chen, Xinxiao Wang, Luyue Sun, Guanghao Li, Hongyu Wang, Jian Pu. 1-6 [doi]
- DyPho-SLAM: Real-time Photorealistic SLAM in Dynamic EnvironmentsYi Liu, Keyu Fan, Bin Lan, Houde Liu. 1-6 [doi]
- Prompt-Based Two-Stage Enhancement for Low-Light Object DetectionBohan Xiong, Kan Chang, Mingyang Ling, Shilin Huang, Shucheng Xia, Yujian Yuan. 1-6 [doi]
- CA-Diff: Collaborative Anatomy Diffusion for Brain Tissue SegmentationQilong Xing, Zikai Song, Yuteng Ye, Yuke Chen, Youjia Zhang, Na Feng, Junqing Yu, Wei Yang 0034. 1-6 [doi]
- Towards Specialized and Generalizable Geometry Restoration of Compressed Point CloudsLixuan Meng, Qiang Xu, Shan Liu 0001, Wei Gao 0003, Ge Li 0002. 1-6 [doi]
- CrossMuSim: A Cross-Modal Framework for Music Similarity Retrieval with LLM-Powered Text Description Sourcing and MiningTristan Tsoi, Jiajun Deng, Yaolong Ju, Benno Weck, Holger Kirchhoff, Simon Lui. 1-6 [doi]
- Lightweight Video Super-Resolution Network Based on Pyramid Optical Flow Extraction and AlignmentXiaoqiang Cui, Kaixuan Hou, Jianping Luo. 1-6 [doi]
- Can Drowsiness be seen in the eyes? A new detection method of driver drowsiness levels based on eye-tracking dataRunlin Zhang, Qing Xu, Yueming Zhu, Chuntie Chen. 1-6 [doi]
- When Epipolar Transformers Meets Implicit Neural Super-Resolution in Multi-View StereoBoyang Song, Jin Xiao, Xiaoguang Hu, Guofeng Zhang 0002, Jiaqi Shi, Hao Jiang. 1-6 [doi]
- SemanticLoom: Category-aware Dynamic Fusion for Multi-class Few-shot Image SynthesisJie Wang, Yan Huang 0031, Yunfei Zhang, Tianyi Chen, Si Wu 0002, Yong Xu 0007, Patrick Le Callet. 1-6 [doi]
- Low-Redundancy Knowledge Generation and Modality-Aware Interaction for Multimodal Information Extraction in Social MediaShizhou Huang, Bo Xu 0023, Changqun Li, Yang Yu, Xin Lin 0001. 1-6 [doi]
- SELIC: Semantic-Enhanced Learned Image Compression via High-Level Textual GuidanceHaisheng Fu, Jie Liang 0001, Zhenman Fang, Jingning Han. 1-6 [doi]
- ExScene: Free-View 3D Scene Reconstruction with Gaussian Splatting from a Single ImageTianyi Gong, Boyan Li, Yifei Zhong, Fangxin Wang. 1-6 [doi]
- AWUR: Adaptive Wavelet and Uncertainty Refinement for Semi-Supervised Medical Image SegmentationHailan Shen, Yuqi Li, Zailiang Chen 0001, Hui Liu, Wenyan Zhong, Yudi Wang. 1-6 [doi]
- DTAD: A Distribution-Transformed Supervised Anomaly Detection MethodLingxing Chen, Yang Gu, Yi Guo, Jianqi Chen, Yingting Zhu, Yehong Zhuo, Dongmei Jiang, Yiqiang Chen. 1-6 [doi]
- Self-Supervised Point Cloud Completion based on Multi-View Augmentations of Single Partial Point CloudJingjing Lu, Huilong Pi, Yunchuan Qin, Zhuo Tang, Ruihui Li. 1-6 [doi]
- TD-BFR: Truncated Diffusion Model for Efficient Blind Face RestorationZiying Zhang, Xiang Gao, Zhixin Wang, Qiang Hu 0003, Xiaoyun Zhang 0001. 1-6 [doi]
- Edge and Localization Feature Guidance Network for Accurate Polyp SegmentationYulong Bai, Songlin Li, Xiuhong Li, Kuan Wang, Rong Wan, Haochu Ku, Mengge Lu. 1-6 [doi]
- Fine-Grained Body Part Control in Text-Driven Motion Synthesis with Interactive IntentionSiyuan Fan, Longling Sun, Bo Peng, Bo Du 0001, Xiantao Cai. 1-6 [doi]
- Localizing Step-by-Step: Multimodal Long Video Temporal Grounding with LLMHoulun Chen, Xin Wang 0019, Hong Chen 0011, Wei Feng, Zihan Song 0003, Jia Jia 0001, Wenwu Zhu 0001. 1-6 [doi]
- Weakly Supervised Object Detection Framework based on Classification-Localization ConsistencyYihuan Zhu, Simiao Wang, Mingyu Lu, Zhengxing Sun. 1-6 [doi]
- RLK-Net: An Efficient Residual Large Kernel Convolution with Channel-Wise Adaptive Feature Fusion for Medical Image SegmentationQingxue Zhao, Zhongjie Pan, Di Wu, Ge Tang, Jun Tian. 1-6 [doi]
- DPDEdit: Detail-Preserved Diffusion Models for Multimodal Fashion Image EditingXiaolong Wang, Zhiqi Cheng, Jue Wang, Huizi Xue, Xiaojiang Peng. 1-6 [doi]
- Towards Robust Time-Of-Flight Depth Denoising with Confidence-Aware Diffusion ModelChangyong He, Jin Zeng, Jiawei Zhang, Jiajie Guo. 1-6 [doi]
- Multi-mode Bidirectional Feature Fusion and Domain-consistency Refinement for Real-time Monocular 6D Object Pose EstimationShuo Yang, Junyi Wang, Yue Qi. 1-6 [doi]
- Fitted-Singer: Singing Voice Synthesis with Style Control and Rhythm ControlYu Cao, Sijia Li, Shiguang Liu. 1-6 [doi]
- Shift-Driven Learning for Unsupervised Domain AdaptationWentang Chen, Yibin Wen, Juepeng Zheng. 1-6 [doi]
- Robust Generalized Zero-Shot Learning via Dual-Stream Variational Autoencoders and Out-of-Distribution DetectionXue Han 0017, Zhixiang Li, Wenchuan Zhang, Hanyuan Huang, Wentao Fan 0001. 1-6 [doi]
- SSTD: Stripe-Like Space Target Detection Using Single-Point Weak SupervisionZijian Zhu, Ali Zia, Xuesong Li 0001, Bingbing Dan, Yuebo Ma, Enhai Liu, Rujin Zhao. 1-6 [doi]
- Enhancing Human Motion Prediction via Multi-range Decoupling Decoding with Gating-adjusting AggregationJiexin Wang, Wenwen Qiang, Zhao Yang 0006, Bing Su 0001. 1-6 [doi]
- An End-To-End Class-Aware Transformer Framework For Weakly-Supervised Semantic SegmentationWenzhe Gu, Kaiwen Li, Bin Zhang, Baosheng Liu. 1-6 [doi]
- BI-RADS Boosted Breast Cancer Diagnosis With Masked Pretraining On Imbalanced Ultrasound DataXueqian Pang, Ziyun Li, Junhui Lv, Ruiquan Ge, Zhuoxuan Wu, Fei Gao. 1-6 [doi]
- PopuDet: Autism Spectrum Disorder Detection in Population Graphs via Micro-macro Relationship Construction and Multi-feature FusionManman Yuan, Ting Xu, Jiazhen Ye, Peican Zhu, Jiacheng Wang, Keke Tang. 1-6 [doi]
- VoxelDet: Towards Accurate 3D Object Detection with Voxel Pruning and Fine Geometric ShapeJia Wen, Jialin Li, Ting Zhang. 1-6 [doi]
- Wavelet-based Global-Local Interaction Network with Cross-Attention for Multi-View Diabetic Retinopathy DetectionYongting Hu, Yuxin Lin, Chengliang Liu 0003, Xiaoling Luo 0001, Xiaoyan Dou, QiHao Xu, Yong Xu 0001. 1-6 [doi]
- AnyArtisticGlyph: Multilingual Controllable Artistic Glyph GenerationXiongbo Lu, Yaxiong Chen, Shengwu Xiong 0001. 1-6 [doi]
- MCSMoG: Multi-Conditional Diffusion for Stylized Motion Generation with Parametric ControlYi Yang, Xinzhu Li, Yufeng Chen, Guanghui Yue 0001, Wei Zhou 0021, Zhuo Su 0001, Ruomei Wang 0001, Fan Zhou 0001, Baoquan Zhao. 1-6 [doi]
- Shape-Preserving and Surface-Fitting Network for 3D Lane DetectionJianhua Li, Yongkang Liu, Gaoqi He, Wenxiang Liu, Weiliang Meng. 1-6 [doi]
- Customizing Image Codecs for Text-Rich Screen Content with Plugin Processing NetworksHao Wang 0184, Junyan Huo, Shuai Wan, Kun Yang, Gaoxing Chen, FuZheng Yang 0001. 1-6 [doi]
- TGATrack: Template-Guided Low-Rank Adaption for Robust RGB-T TrackingShihui Zhang, Junbin Su, Jiawei Zhang, Ziteng Xue, Zhipeng Zhang. 1-6 [doi]
- AIM-VR: All-in-One Video Restoration via Dual-Path Mamba with Frequency Adaptive FusionZhizhou Lu, Tianrui Liu, Zihan Chen, Junjie Huang 0001, Xueqiong Li, Baili Xiao, Wentao Zhao. 1-6 [doi]
- Mobile-StereoHPE: Real-Time Mobile 3D Hand Pose Estimation from Stereo Gray ImagesDongfang Zhao, Menghe Zhang, Yangwen Liang, Shuangquan Wang, Kee-Bong Song, Donghoon Kim. 1-6 [doi]
- Study of Finger Biometrics on Finger Semantic Segmentation and Finger Shape AuthenticationJunduan Huang, Dacan Luo, WeiLi Yang, Jiahui Pan, Wenxiong Kang. 1-6 [doi]
- Mutual Teaching: Semi-supervised Medical Image Classification with Cross Structural Consistency LearningChuankai Xu, Junhao Li, Ruxin Wang. 1-6 [doi]
- ExGAT: Build Explicit Dependencies for Incomplete Multi-Modal Learning via Graph Attention NetworkBinyu Zhao, Wei Zhang, Zhaonian Zou. 1-6 [doi]
- VL-UR: Vision-Language-guided Universal Restoration of Images Degraded by Adverse Weather ConditionsZiyan Liu, Yuxu Lu, Hushan Yu, Dong Yang. 1-6 [doi]
- Addressing Emotion Bias in Music Emotion Recognition and Generation with Frechet Audio DistanceYuanchao Li, Azalea Gui, Dimitra Emmanouilidou, Hannes Gamper. 1-6 [doi]
- ArtTypo: Multi-Level Controlled Artistic Typography with Iterative FeedbackKaiyue Liu, Lei Wu 0002, Mingzhe Yu, Xiaole Liu, Yajie Xu, Xiangxu Meng. 1-6 [doi]
- A Novel Framework for Realistic 3D Scene Regeneration with Graph of ThoughtsYitian Kou, Kaiwei Zhang, Dandan Zhu 0001, Xiongkuo Min, Guangtao Zhai. 1-6 [doi]
- Uncertainty-Guided Iterative Architecture for Stereo MatchingWeiqing Xiao, Fengjun Zhong, Hao Zhao. 1-6 [doi]
- Generalized Audio Deepfake Detection Using Frame-level Latent Information EntropyBotao Zhao 0001, Zuheng Kang, Yayun He, Xiaoyang Qu, Junqing Peng, Jing Xiao 0006, Jianzong Wang. 1-6 [doi]
- SU-SAM: A Simple Unified Framework for Adapting SAM in Underperformed SceneYiran Song, Qianyu Zhou 0001, Xuequan Lu, Zhiwen Shao, Lizhuang Ma. 1-6 [doi]
- C3S3: Complementary Competition and Contrastive Selection for Semi-Supervised Medical Image SegmentationJiaying He, Yitong Lin, Jiahe Chen, Honghui Xu, Jianwei Zheng 0001. 1-6 [doi]
- CPMDiff: Classifier Probability Measurement for Out-of-Distribution Detection via Diffusion ModelsYongheng Xu, Kaiyu Song, Hanjiang Lai. 1-6 [doi]
- Zero-shot Quantization of Vision Transformers: Leveraging Multi-model Ensembles and Attention MixupYao Li, Xinrui Chen 0001, Zhuozhen Yu, Shunzhou Wang, Wei Gao 0003. 1-6 [doi]
- MDMU: Multimodal Dynamic Mamba UNet for Multimodal sentiment analysisWeibin Li, Jiazheng Huang, Bohuan Xue, Wenhao Shao, Yijun Liu, Xiaoyu Tang. 1-6 [doi]
- HRGR: Enhancing Image Manipulation Detection via Hierarchical Region-aware Graph ReasoningXudong Wang, Jiaran Zhou, Huiyu Zhou 0001, Junyu Dong, Yuezun Li. 1-6 [doi]
- Self-Supervised Learning for Transparent Object Depth Completion Using Depth from Non-Transparent ObjectsXianghui Fan, Zhaoyu Chen, Mengyang Pan, Anping Deng, Hang Yang. 1-6 [doi]
- Sparse-view 3D Open-vocabulary Gaussian Splatting via Collaborative Contrastive LearningGuibiao Liao, Anjie Wang, Mingxuan Chen, Zhijun Fang. 1-6 [doi]
- Learning to Unify Audio, Visual and Text for Audio-Enhanced Visual Answer LocalizationZhibin Wen, Bin Li. 1-6 [doi]
- A Low-Rank Defense Method for Adversarial Attack on Diffusion ModelsJiaxuan Zhu, Siyu Huang. 1-6 [doi]
- α-SAV: Generalized Weighted Input Verification for Secure Aggregation in Federated LearningZhi Lu, Yuhao Long, Qirui Zhou, Mengyuan Zou, Wenjie Cai, Songfeng Lu. 1-6 [doi]
- UniTD: A Benchmark with Unified Text-Domain for Text-to-Image Person ReIDPing Lai, Yihang Duan, Hao Ni 0002, Liangcheng Fu, Hui Xu, Pengpeng Zeng. 1-6 [doi]
- Twin Progressive Generative Adversarial Network For High-Resolution Image InpaintingZhiying Li 0003, Weibin Chen, Zhaoxin Fan, Kaichuan Kong, Xiaobo Jin, Guanggang Geng. 1-6 [doi]
- Adaptive Semantic Compression: Compatible Bitstream for Scalable Human-Machine Perception Sample AdaptionShaokang Wang, Dingquan Li, Guoqing Xiang, Jinchang Xu, Shanghang Zhang, Xiaodong Xie. 1-6 [doi]
- Real-World Retrieval Support Zero-Shot Learning: A Novel Learning Paradigm and an Efficient Balanced Generative FrameworkGang Yang, Xinyue Ju, Yipeng Xu, Yici Zhang. 1-6 [doi]
- Mamba-Based Blind Stitched Wide Field of View Light Field Image Quality Assessment via Dual-Viewport SamplingRui Zhou, Gangyi Jiang, Linwei Zhu, Yeyao Chen, Yueli Cui, Ting Luo 0001, Haiyong Xu. 1-6 [doi]
- AdaptiveFusion: LiDAR-Camera Adaptive Fusion for 3D Object DetectionYuhan Zhou, Xiaotian Li, Baojie Fan. 1-6 [doi]
- A Zero Decoding Approach to Video ClassificationChen Ye Gan, Jiangtao Wen, Yuxing Han. 1-6 [doi]
- SituLM: Leveraging Visual Instruction Tuning and an Augmented SWiG Dataset for Enhanced Grounded Situation RecognitionYuran Wang, Zhi-Qi Cheng. 1-6 [doi]
- TriModal Enhanced Fusion Network: Advancing Multimodal Representation and Fusion for Enhanced Multimodal Intent RecognitionYixuan Wang, Kehan Wang, Huayu Zhang, Ming Fang, Shuhua Liu. 1-6 [doi]
- PDFIN: Prompt-Guided Dynamic Feature Integration Network for Few-Shot Class-Incremental Remote Sensing Scene ClassificationKaili Lu, Jian Ji, Ruoxue Li, Falin Wang, Chengwei Xu. 1-6 [doi]
- Cross-Structure and Semantic Enhancement for Diabetic Retinopathy GradingXue Xia 0005, Zipeng Lin, Jingying Zhu, Jiebin Yan, Yuming Fang. 1-6 [doi]
- Rethinking Joint Optimization in Feature Compression: Insights from Person Re-IdentificationChangsheng Gao, Zhuoyuan Li 0001, Li Li 0040, Dong Liu 0002, Feng Wu 0001, Weisi Lin. 1-6 [doi]
- Prompt-Guided Multi-Task Decoupling for Speech Presentation Skills AssessmentZihua Xiong, Jiachen Tan, Tingting Zhang, Bin Wu 0001, Chunping Zheng. 1-6 [doi]
- Attributed Synthetic Data Generation for Zero-shot Domain-specific Image ClassificationShijian Wang, Linxin Song, Ryotaro Shimizu, Masayuki Goto, Hanqian Wu. 1-6 [doi]
- A Depth Semantic Perception Network for Camouflage Object DetectionZijun Wei, Songlin Li, Xiuhong Li, Boyuan Li, Zhenhong Jia, Haochu Ku. 1-6 [doi]
- Identity-Preserving Talking Head Cross-Identity Reenactment with Adaptive Structure NormalizationZhao Jing, Hongxia Bie, Haobo Lei, Jiali Wang, Yichen Zhi, Zhisong Bie. 1-6 [doi]
- Exploring Flexibility in Incremental Few-Shot Object DetectionDongdong Gong, Tengfei Gong, Yaxiong Chen, Jinglin Yuan, Shengwu Xiong 0001. 1-6 [doi]
- Enhancing Cross-modal Semantic Consistency via Key Token Alignment for Image-text RetrievalHuilong Lin, Yangtao Wang, Meie Fang, Yanzhao Xie, Da Chen, Xiaocui Li 0001, Weilong Peng, Siyuan Chen, Maobin Tang, Ping Li 0016. 1-6 [doi]
- STPM: Spatial-Temporal Point Mamba for Activity Recognition Using mmWave Radar Point CloudsYingru Chen, Zhihao Guo, Haimin Zhang, Min Xu. 1-6 [doi]
- TEVLA: Text-oriented Enhancement for Vision-Language Alignment in Relation ExtractionJunLin Chen, Qiushan Guo, Ka-Chun Cheung, Mingrui Liang, Dezhi Chen. 1-6 [doi]
- Fast and Physically-based Neural Explicit Surface for Relightable Human AvatarsJiacheng Wu, Ruiqi Zhang, Jie Chen, Hui Zhang. 1-6 [doi]
- Stepwise Schema-Guided Prompting Framework with Parameter Efficient Instruction Tuning for Multimedia Event ExtractionXiang Yuan, Xinrong Chen, Haochen Li, Hang Yang, Guanyu Wang, Weiping Li, Tong Mo. 1-6 [doi]
- RDFNet: Real-time Object Detection Framework for Foggy ScenesTianle Fang, Zhenbing Liu, Yutao Tang, Yingxin Huang, Haoxiang Lu, Chuangtao Zheng. 1-6 [doi]
- Accurate and Efficient Privacy-Preserving Image SURF Feature ExtractionXiangyu Gao, Zhekai Luo, Peijia Zheng, Jian Li, Rui Yang. 1-6 [doi]
- A Unified Inverse-Tone-Mapped HDR Video Quality Assessment Method across Two HDR FormatsLeidong Fan, Xiongkuo Min, Qing Li, Anjie Wang. 1-6 [doi]
- VFFG-CL: Virtual Fusion Feature Generation with Curriculum Learning for Missing-Modality Emotion RecognitionXiaolan Tang, Yan Xiang, Zhengtao Yu, Yuxin Huang. 1-6 [doi]
- Efficient Text-to-Motion via Multi-Head Generative Masked ModelingHeng Li, Xing Liufu, Xiaotong Lin 0002, Jian Zhu, Jian-Fang Hu. 1-6 [doi]
- Mimicing Real-world Knowledge to Generate 3D Adversarial Point CloudsTengjun Liu, Qianbin Guo, Xuanchi Gong, Huan Zhang, Xianyi Chen. 1-6 [doi]
- Global Intervention and Distillation for Federated Out-of-Distribution GeneralizationZhuang Qi, Runhui Zhang, Lei Meng 0001, Wei Wu, Yachong Zhang, Xiangxu Meng. 1-6 [doi]
- Mitigating Knowledge Forgetting by Generative Knowledge Replay and Forgetting-aware Aggregation in Semi-Supervised Federated LearningHongquan Liu, Yixin Ren, Jihong Guan, Shuigeng Zhou. 1-6 [doi]
- ReFEdit: Rehearsal-Free Lifelong Knowledge Editing for Large Language ModelsXianjie Mo, Youcheng Pan, Yongshuai Hou, Ping Luo, Yang Xiang 0003. 1-6 [doi]
- DATTA: Domain Diversity Aware Test-Time Adaptation for Dynamic Domain Shift Data StreamsChuyang Ye, Dongyan Wei, Zhendong Liu, Yuanyi Pang, Yixi Lin, Qinting Jiang, Jingyan Jiang, Dongbiao He. 1-6 [doi]
- Imperceptible and Robust Adversarial Perturbation: Attention-Guided Watermark Vaccine Against Watermark RemovalYujiang Li, Zhili Zhou, Zhongliang Yang, Baowei Wang, Tao Qi, Xiaohua Xie, Jiantao Zhou. 1-6 [doi]
- Explore the Asymmetric Interference Sound Field for High-precision LocalizationXiaojie Yu, Mingzhi Pang, Zhongxu Bao, Xu Yang 0011, Qiang Niu, Yuqing Yin. 1-6 [doi]
- Causal Deconfounding for Spurious Correlation in Domain GeneralizationBin Qin 0001, Yi Li, Jiangmeng Li, Xuesong Wu 0005, Yupeng Wang 0005, Jianwen Cao 0001. 1-6 [doi]
- Beyond Macro-Actions: A Bio-Inspired Framework for Fine-Grained Micro-Action RecognitionYiwei Ru, Churan Yu, Dongsen Zhang, Mupei Li, Yongji Liu, Zhaofeng He. 1-6 [doi]
- CoDiff-SaK: Controllable Diffusion Model with Segment Anything Knowledge for Low-dose CT Image DenoisingFenghang Zhang, Guang Feng, Xizhan Gao, Wanying Wu, Sijie Niu. 1-6 [doi]
- Advanced Backdoor Threats and Countermeasures in Dataset CondensationCanhui Wu, Wei Xi, Dashan Gao, He Yang, Jizhong Zhao. 1-6 [doi]
- Revisiting DETR for Small Object Detection via Noise-Resilient Query OptimizationXiaocheng Fang, Jieyi Cai, Huanyu Liu, Wenxiu Cai, Yishu Liu, Bingzhi Chen. 1-6 [doi]
- MIINT: Infuse Intuitive Data Correspondence for Model InterpretationYuyang Wang, Ligeng Chen, Bing Mao. 1-6 [doi]
- Feature Affinity based Clustering for Test-Time Adaptation for Image Quality AssessmentMeghna Kapoor, Vinit Jakhetiya, Badri Narayan Subudhi, Ankur Bansal, Weisi Lin. 1-6 [doi]
- Decoupled and Interactive Regression Modeling for High-performance One-stage 3D Object DetectionWeiping Xiao, Yiqiang Wu, Yu Qin, Chenghai Mao, Jia Liu, Xiaomao Li. 1-6 [doi]
- Rectified Mixed-Label Learning for Semi-Supervised Medical Image SegmentationZeyu An, Zichong Chen. 1-6 [doi]
- Bridging the Gap: Balancing Human Perception and Detector Attention in Adversarial AttacksMingye Xie, Suncheng Xiang, Xian Gao, Ting Liu 0016, Yuzhuo Fu. 1-6 [doi]
- Incorporating Audio-Guided Visual Attention into Sound Event Localization and Detection with Source Distance EstimationQing Wang 0008, Jun Du 0002, Hengyi Hong, Maocheng Hu, Mingqi Cai, Xin Fang. 1-6 [doi]
- QEMesh: Employing A Quadric Error Metrics-Based Representation for Mesh GenerationJiaqi Li, Ruowei Wang, Yu Liu, Qijun Zhao. 1-6 [doi]
- SCJD: Sparse Correlation and Joint Distillation for Efficient 3D Human Pose EstimationWeihong Chen, Xuemiao Xu, Haoxin Yang, Yi Xie, Peng Xiao, Cheng Xu, Huaidong Zhang, Pheng-Ann Heng. 1-6 [doi]
- MMPX: Multi-modal Mamba Prompter to Large Vision Foundation Model for RGB-X Semantic SegmentationPengfei Wu, Ye Liu 0005, Hao Gao 0005, Jun Liu 0036. 1-6 [doi]
- ContrastiveGaussian: High-Fidelity 3D Generation with Contrastive Learning and Gaussian SplattingJunbang Liu, Enpei Huang, Dongxing Mao, Hui Zhang, Xinyuan Song 0002, Yongxin Ni. 1-6 [doi]
- Hierarchical Graph Learning Framework for Multimodal Conversational Emotion RecognitionJiandong Shi, Ming Li 0065, Guoheng Huang, Siwei Zhou, Yongchun Gu, Zhanle Zhu. 1-6 [doi]
- SAM-GA: SAM-Guided Grouped Aggregation Network for Weakly Supervised cardiac MRI SegmentationYang Li, Chengliang Wang, Xing Wu, Yonggang Luo, Peng Wang, Haidong Wang. 1-6 [doi]
- Hypergraph Self-Supervised Learning for Survival Prediction on Whole Slide ImagesYining Zhao, Hao Liu, Jielong Yan, Yongji Tian, Xiangmin Han. 1-6 [doi]
- GateM2Net: A Gated Multi-Modal Network for Joint Emotion and Sentiment AnalysisLi Yin, Baigang Mi, Yi Fan. 1-6 [doi]
- IGDiT: Illumination-Guided Low-light Image Enhancement with Diffusion Transformer ModelsBin Niu, Zhibin Zhang, Liqiang He. 1-6 [doi]
- Diffusion-Driven Source Consistency for Gradual Domain AdaptationWenwei Luo, Yuguo Hu, Jiafu Yan, Mengmeng Jing, Lin Zuo. 1-6 [doi]
- FALCON: Feedback-driven Adaptive Long/short-term memory reinforced Coding OptimizatioNZeyuan Li, Yangfan He, Lewei He, Jianhui Wang, Tianyu Shi, Bin Lei, Yuchen Li 0015, Qiuwu Chen. 1-6 [doi]
- Adaptive Pixel Classification and Equivalent Large Kernels for Lightweight Image Super-ResolutionPengyu Lin, Xunxun Zeng, Wanling Liu, Huayi Chen, Fei Chen. 1-6 [doi]
- Spectral Enhanced Tuning: An Efficient Plug-and-Play Framework for Frequency-Aware DehazingCheng Tang 0004, Wenqi Lou, Qianyu Cheng, Jiayi Tuo, Wei Fu, Tianhao Jiang, Chao Wang 0003, Xuehai Zhou. 1-6 [doi]
- IP-KGQA: Intent-Aware Prompt Learning for Knowledge Graph Question AnsweringZheng Dai, Chun Ding, Tianyi Chen, Si Wu 0002, Yong Xu 0007, Runzhe Liang, Tianshi Xu, Yedong Li, Dapeng Wu 0001. 1-6 [doi]
- An End-to-End Model for Photo-Sharing Multi-Modal Dialogue GenerationPeiming Guo, Sinuo Liu, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Min Zhang 0005. 1-7 [doi]
- Only One Stage: A Chemical-Aware Model for Accurate Combustion Chemical Kinetics PredictionZhenglun Sun, Peng Qiao, Yong Dou, Rongchun Li, Sidun Liu, Wenyu Li, Wenjie Hu. 1-6 [doi]
- DiffMissing: Denoising Diffusion Model for Multivariate Time Series Forecasting with Variable MissingBingheng Pang, Wei Li, Zhuoxuan Liang, Yidan Chen, Zhihong Wang, Moustafa Youssef 0002. 1-6 [doi]
- From History to Goal: Enhanced Vision-and-Language Navigation with Historical TraceabilityXinguang Zhu, Min Wang 0019, Li Li 0040, Wengang Zhou 0001, Houqiang Li. 1-6 [doi]
- Eff-DFQT: Efficient Model Inversion for Data-free Quantization of Vision TransformersMengkui Li, Xinrui Chen 0001, Hai Chen, Kang Zhao, Yanping Zhang, Shu Zhao, Fulan Qian. 1-6 [doi]
- Causal Intervention with Active Learning for Large Vision-Language Models in Egocentric ContextsWenxin Meng, Shenshen Li, Lei Wang 0185, Hao Yang, Chong Peng, Peng Yan, Xing Xu 0001. 1-6 [doi]
- LiVo: Bandwidth-Efficient Live Volumetric Video Streaming with Compact Capture and EncodingYizong Wang, Mingjia Yang, Liming Pang, Dong Zhao, Siwei Ma, Wen Gao. 1-6 [doi]
- Location-Oriented Sound Event Localization and Detection with Spatial Mapping and Regression LocalizationXueping Zhang, Yaxiong Chen, Ruilin Yao, Yunfei Zi, Shengwu Xiong 0001. 1-6 [doi]
- ATM-NeRF: Learning Adaptive Tone Mapping for Normal-Light Neural Radiance Field ReconstructionMin Wang, Xin Huang, Qing Wang. 1-6 [doi]
- Black-box Universal Adversarial Perturbations for Image and Video Quality Assessment MethodsGeorgii Bychkov, Sergey Lavrushkin, Dmitriy S. Vatolin. 1-6 [doi]
- CMRFusion: Efficient Feature Decomposition for RGB-T Fusion via Cross Modality Mask ReconstructionChao Yang, Chao Tian, Guoqing Zhu, Qiang Wang 0051, Zhenyu He 0001. 1-6 [doi]
- Towards Robust Visual Question Answering via Causal Intervention and Contrastive LearningWei Li, Zhixin Li. 1-6 [doi]
- Semantic and Temporal Integration in Latent Diffusion Space for High-Fidelity Video Super-ResolutionYiwen Wang, Xinning Chai, Yuhong Zhang, Zhengxue Cheng, Jun Zhao, Rong Xie 0004, Li Song 0001. 1-6 [doi]
- Gradient-guided Attention Fusion Network for Camouflaged Object DetectionWenrui Li, Meijun Sun, Cheng Liu, Xinyu Yan, Zheng Wang. 1-6 [doi]
- Semantic-Guided Residual Learning for the Quality Assessment of Enhanced ImagesShishun Tian, Zhiwei Lan, Zhengyu Zhang, Ting Su 0004, Xia Li 0006, Lu Zhang 0037. 1-6 [doi]
- MACA-VQA: Quality Assessment of UGC Videos via Multi-level Distortion Adaptation and Spatiotemporal Cross-Attention FusionBo Hu 0008, Yimeng Zhao, Leida Li, Lihuo He, Wen Lu, Xinbo Gao 0001. 1-6 [doi]
- Enabling Communication-efficient and Robust Federated Learning over Packet Lossy Networks via Random Interleaved Vector QuantizationYixuan Guan, Jianwei Niu 0002, Tao Ren 0001, Xuefeng Liu 0001. 1-6 [doi]
- FoCTTA: Low-Memory Continual Test-Time Adaptation with FocusYoubing Hu, Yun Cheng, Zimu Zhou, Anqi Lu, Zhiqiang Cao, Zhijun Li 0002. 1-6 [doi]
- CASD: Counterfactual Augmentation for Social Bot Detection on TwitterPin Xu, Fangfang Yuan, Yueshan Wang, Diandian Guo, Cong Cao, Yanbing Liu. 1-6 [doi]
- EasySplat: View-Adaptive Learning makes 3D Gaussian Splatting EasyAo Gao, Luosong Guo, Tao Chen, Zhao Wang, Ying Tai, Jian Yang 0003, Zhenyu Zhang 0005. 1-6 [doi]
- Interactive Sketch-Based Person Re-Identification with Text FeedbackXinyi Wu, Cuiqun Chen, Hui Zeng, Zhiping Cai, Bo Du 0001, Mang Ye. 1-6 [doi]
- A Fourier priors-Guided Diffusion Model for Image Harmonization with Structure-Preservation and Illumination-ConsistencyTianyou Wang, Xun Cai, Yanbo Gao, Yibo Wang, Shuai Li. 1-6 [doi]
- Human-MoE: Multimodal Full-Body Human Image Synthesis with Component-driven Mixture of ExpertsYu-Jiu Huang, I-Chen Lin. 1-6 [doi]
- Contrastive Intent-Disentangled Variational AutoEncoder for Sequential RecommendationYafan Yuan, Zhen Liu, Xinxin Yang, Sibo Lu. 1-6 [doi]
- Data-Free Knowledge Distillation with Diffusion ModelsXiaohua Qi, Renda Li, Long Peng, Qiang Ling 0001, Jun Yu 0001, Ziyi Chen 0005, Peng Chang 0002, Mei Han, Jing Xiao 0006. 1-6 [doi]
- Cross-Modal Task Verification via Hypergraph-based Sequential MatchingZhiyi Huang, Xun Jiang 0001, Zheng Wang 0044, Fumin Shen, Jingkuan Song, Xing Xu 0001. 1-6 [doi]
- Direct Preference Optimization for LLM-Enhanced Recommendation SystemsChao Sun, Yaobo Liang, Yaming Yang 0001, Shilin Xu 0001, Tianmeng Yang, Yunhai Tong. 1-6 [doi]
- Decoupling Representations with Quantized Vectors for Semi-Supervised Action Quality AssessmentLingfeng Ye, Kumie Gedamu, Jie Shao 0001. 1-6 [doi]
- ESTJ: Efficient Semantic Segmentation via Token Joint MergingZiniu Liu, Mingqing Liu, Fengxia Han, Xingtong Liu, Chuan Liu, Xi Zhang, Hao Deng 0002, Shengjie Zhao 0001. 1-6 [doi]
- Harnessing Pre-trained Language Models for EEG-based Epilepsy DetectionTao Lu, Shangyang Li. 1-6 [doi]
- Adaptive Illumination Transfer Network for Shadow RemovalYinan Wang, Si Wu 0002, Yong Xu 0007, Yan Huang 0031, Patrick Le Callet. 1-6 [doi]
- Perturbing Confounders via Causal Disentanglement for Domain GeneralizationJingliang Bian, Junhao Li, Jian Xu, Ruxin Wang 0001. 1-6 [doi]
- AdaMHF: Adaptive Multimodal Hierarchical Fusion for Survival PredictionShuaiyu Zhang, Xun Lin, Rongxiang Zhang, Yu Bai, Yong Xu, Tao Tan, Xubin Zheng, Zitong Yu. 1-6 [doi]
- RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion ModelZhuan Shi, Jing Yan, Xiaoli Tang, Lingjuan Lyu, Boi Faltings. 1-6 [doi]
- ECG-Chat: A Large ECG-Language Model for Cardiac Disease DiagnosisYubao Zhao, Jiaju Kang, Tian Zhang, Puyu Han, Tong Chen. 1-6 [doi]
- InpaintFormer: Prompt-guided High-Quality Face Inpainting with Mask-Aware Self-AttentionZhouhao Ouyang, Wen Xue, Tianyi Chen, Yan Huang 0031, Si Wu 0002, Yong Xu 0007, Patrick Le Callet, Dapeng Oliver Wu. 1-6 [doi]
- Leveraging 2D Annotations for Cost-Effective Dynamic Urban Scene ReconstructionChuming Wang, Yingshuang Zou, Haoqian Wang. 1-6 [doi]
- Unifying Spatio-Temporal Contexts for Advanced Text-Video RetrievalYanhao Huang, Baoyao Yang, Junxiang Chen, Wenbin Yao, Dixin Chen. 1-6 [doi]
- Privacy-Preserving Anti-Recompression Video Watermarking in Bitstream DomainZhekai Luo, Xiangyu Gao, Peijia Zheng, Jian Li, Weiqi Luo 0001. 1-6 [doi]
- AlignKT: Explicitly Modeling Knowledge State for Knowledge Tracing with Ideal State AlignmentJing Xiao, Chang You, ZhiYu Chen. 1-6 [doi]
- Latent Feature and Attention Dual Erasure Attack against Multi-View Diffusion Models for 3D Assets ProtectionJingwei Sun, Xuchong Zhang, Changfeng Sun, Qicheng Bai, Hongbin Sun 0001. 1-6 [doi]
- RetouchDiffusion: Unsupervised Personalized Image Retouching via Diffusion ModelsYang Dong, Zhuoqi Ma, Zejun You, Yunan Li 0001, Qiguang Miao. 1-6 [doi]
- Meta-Learning Empowered Meta-Face: Personalized Speaking Style Adaptation for Audio-Driven 3D Talking Face AnimationXukun Zhou, Fengxin Li, Ziqiao Peng, Xinyu Wang, Hongyan Liu 0002, Zhaoxin Fan, Jun He 0008. 1-6 [doi]
- DEQuant: Distribution-Enhanced Reconstruction for Post-Training QuantizationGuoming Lu, Guodong Zou, Dongnan Liu, Heng Yin, Jielei Wang, Guangchun Luo. 1-6 [doi]
- Injecting Cross-modal Fine-Grained Perception into LLMs for 3D Object-of-Interest UnderstandingQianqian Sun, Lu Shi, Linna Zhang, GaoYun An, Yi Jin 0001, Yidong Li, Yigang Cen. 1-6 [doi]
- Fine-tuned Multimodal Large Language Models are Zero-shot Learners in Image Quality AssessmentRui Xiong, Li Chen, Zhida Feng, Jiaxiang Liu, Shikun Feng. 1-6 [doi]
- Long-Tailed Federated Learning with Fixed ClassifierYi Li, Weichao Li, Xin Zheng 0008, Haiyan Fu, Yanqing Guo. 1-6 [doi]
- Global Semantic Extraction for Adaptive Cross-Semantic Learning: A Novel Framework for Remote Sensing Change CaptionQiaoli Sun, Yan Wang, Xiaoyu Song, Hongyi Dong. 1-6 [doi]
- WDRE-NET: Wavelet-Differential Convolution and Region-Expansion to Enhance Weakly Supervised Adjacent Nuclei SegmentationMeng Geng, Qian Huang, Yulin Chen, Xuejie Zhang. 1-6 [doi]
- Defect Detection-Guided Reconstruction Network for Ground Penetrating Radar B-Scan ImagesZilong Ling, Xinran Zhong, Siyu Zhou, Yu Yang, Zhongcheng Gui, Huabin Wang. 1-6 [doi]
- MAPLE: Modality-Agnostic Prototype Learning for Egocentric Action RecognitionDa Li, Di Zhou, Yishan Zou, Shenghua Li, Meng Liu. 1-6 [doi]
- Culture-based Adversarial Attack on Text-to-Image ModelsFuyi Yang, Chenyu Zhang 0003, Lanjun Wang. 1-6 [doi]
- Prototype Guided Multi-Scale Class Aggregation for Generalized Few-Shot Semantic SegmentationWenxin Jiang, Peng Qin, Guanhua Zhang, Kai Wu, Shengke Wang. 1-6 [doi]
- Tapping Beyond Hands: Assisting No-Handed Touch Interaction under Situational ImpairmentsYawen Zheng, Jin Huang 0009, Hao Zhang 0120, Yang Li 0058, Juan Liu 0008, Yulong Bian, Chenglei Yang, Xiangxu Meng. 1-6 [doi]
- Representation Disentanglement for Semantic CodingJinming Liu 0001, Junhao Geng, Lexiang Lv, Wenjun Zeng 0001, Xin Jin 0014. 1-6 [doi]
- Supplementary Material for "NoiseActor: A Noise-Action Collaborative Framework for Privacy-Preserving Action Recognition without Privacy Labels"Xiao Li, Xiao-Ming Wu 0002, Delong Zhang, Kun-Yu Lin, Yi-Xing Peng, Ling-An Zeng, Wei-Shi Zheng 0001. 1-3 [doi]
- Integrate-and-Fire Compressor: Learning to Compress Context for LLMs AdaptivelyYunlong Zhao, Xiyun Li, Ziyi Wang, Haoran Wu, Minglun Han, Bo Xu 0002. 1-6 [doi]
- MalDenoise: Enhancing Robustness of API-Based Malware Detection Against Adversarial AttacksXiaohui Chen, Xin Wang, Zuhui Yue, Zheng Li, Peipei Liu, Hongsong Zhu. 1-6 [doi]
- Advancing Safe Language Generation: Exploring Alternative Constrained RLHFFanyu Meng, Zhixin Bai, Yanming Wang, Jing Huo, Boyan Wang, Xi Yang, Yang Gao 0001. 1-6 [doi]
- T-Dreamer: Topology-Aware Text-to-3D GenerationXiaoxuan Wu, Qiulu Li, Lin Shu, Ke Lv, Ke Chen. 1-6 [doi]
- MotionFlow: Learning Implicit Motion Flow for Complex Camera Trajectory Control in Video GenerationGuojun Lei, Chi Wang, Yikai Wang, Hong Li, Ying Song, Weiwei Xu. 1-6 [doi]
- Boosting Adversarial Transferability by Constructing Adversarial TrajectoriesQiang Wan, Sanshuai Cui, Anjie Peng, Hui Zeng 0002, Rong Wei. 1-6 [doi]
- Enhancing Few-Shot Class-Incremental Learning via Cross-Modal Bias AlignmentDesen Wang, Zhiming Chen, Xiang Qiu, Yishu Liu, Bingzhi Chen. 1-6 [doi]
- CFF: Coarse-to-Fine-to-Fusion Semantic Prototype Generation for Zero-Shot ClassificationYuting Lin, Xuanwen Su, Tengfei Liang, Yi Jin 0001, Tao Wang 0011, Yidong Li. 1-6 [doi]
- Model Discrepancy Learning: Synthetic Faces Detection Based on Multi-ReconstructionQingchao Jiang, Zhishuo Xu, Zhiying Zhu 0001, Ning Chen 0007, Haoyue Wang, Zhongjie Ba. 1-6 [doi]
- Towards A Real-World Road Damage Detection DatasetMenghao Hu, Zuogan Tang, Xiaoshan Yang, Zhe Wu 0006, Zhouxin Yang, Shaocong Wu, Yaguang Song, Kui Hou, Qingfang Zheng, Yaowei Wang 0001. 1-6 [doi]
- SQ-Delta: Ultra-High Delta Compression for LLMs via Joint Sparsification-QuantizationYanfeng Jiang, Zelan Yang, Bohua Chen, Shen Li, Yong Li, Tao Li. 1-6 [doi]
- Pyramid-based Mamba Multi-class Unsupervised Anomaly DetectionNasar Iqbal, Niki Martinel. 1-6 [doi]
- Synopses of Movie Narratives: a Video-Language Dataset for Story UnderstandingYidan Sun, Qin Chao, Yangfeng Ji, Boyang Li 0001. 1-6 [doi]
- A Deep Single Image Rectification Approach for Pan-Tilt-Zoom CamerasTeng Xiao, Qi Hu, Qingsong Yan, Wei Liu, Zhiwei Ye, Fei Deng. 1-6 [doi]
- Knowledge Rumination for Client Utility Evaluation in Heterogeneous Federated LearningXiaorui Jiang, Yu Gao, Hengwei Xu, Qi Zhang, Yong Liao, Peng Yuan Zhou. 1-6 [doi]
- Nutrition Prediction from Food Images Using Foundation ModelsVitalii Emelianov, Niki Martinel. 1-6 [doi]
- Fraesormer: Learning Adaptive Sparse Transformer for Efficient Food RecognitionShun Zou, Yi Zou, Mingya Zhang, Shipeng Luo, Zhihao Chen, Guangwei Gao. 1-6 [doi]
- ElimPCL: Eliminating Noise Accumulation with Progressive Curriculum Labeling for Source-Free Domain AdaptationJie Cheng, Hao Zheng 0009, Meiguang Zheng, Lei Wang, Hao Wu, Jian Zhang. 1-6 [doi]
- Residual-based Efficient Bidirectional Diffusion Model for Image Dehazing and Haze GenerationBing Liu, Le Wang, Hao Liu, Mingming Liu. 1-6 [doi]
- EPIC: Efficient Prompt Interaction for Text-Image ClassificationXinyao Yu 0003, Hao Sun 0013, Zeyu Ling, Ziwei Niu, Zhenjia Bai, Rui Qin, Yen-Wei Chen 0001, Lanfen Lin. 1-6 [doi]
- Latent-Info and Low-Dimensional Learning for Human Mesh Recovery and Parallel OptimizationXiang Zhang, Suping Wu, Sheng Yang. 1-6 [doi]
- Align-AV-HuBERT: AV-HuBERT with Audio-Visual Temporal AlignmentCancan Li, Fei Su, Juan Liu. 1-6 [doi]
- OSA: Object-level Scale Alignment for Small Object Detection in Large-Scale ImagesYuxiang Wang, Yixuan Ji, Xiangqin Chen, Chuanyuan Tan, Yajin Li, Haozhong Xue, Zheng Zhao. 1-6 [doi]
- LiveImage: Motion Condition Guided Diffusion Model for Video Motion TransferGaurav Rai, Ojaswa Sharma. 1-6 [doi]
- Local Data Quantity-Aware Weighted Averaging for Federated Learning with Dishonest ClientsLeming Wu, Yaochu Jin, Kuangrong Hao, Han Yu 0001. 1-6 [doi]
- EarlyMix: Hierarchical Mixing for Early Time Series ClassificationShuguo Hu, Jun Hu, Junwei Lv, Huaiwen Zhang. 1-6 [doi]
- RKU: Relevant Knowledge-aware Unlearning for Federated Continual LearningHaodong Zhang, Liu Yang, Zihan Jiang. 1-6 [doi]
- ProDehaze: Prompting Diffusion Models Toward Faithful Image DehazingTianwen Zhou, Jing Wang, Songtao Wu, Kuanhong Xu. 1-6 [doi]
- Adapting Cross-Modal Semantic Discrepancy in Text-based Person SearchXinpan Yuan, Jiabao Li, Wei Xia, Wenguang Gan, Mengxi Ying, Liujie Hua. 1-6 [doi]
- THOR: Text to Human-Object Interaction Diffusion via Relation InterventionQianyang Wu, Ye Shi 0001, Xiaoshui Huang, Lan Xu 0003, Jingyi Yu 0001, Jingya Wang. 1-7 [doi]
- LiPlan: A Multimodal Dataset for Livable Urban Environment Layout GenerationJianrong Wang, Shuyun Zhang, Ying Guo, Qi Li, Ju Zhang, Di Jin. 1-6 [doi]
- GDNeRF: Generalizable Depth-based NeRF for sparse view synthesisSergio Montoya de Paco, Iván Huerta, Josep Escrig. 1-6 [doi]
- Poison in the Well: Feature Embedding Disruption in Backdoor AttacksZhou Feng, Jiahao Chen, Chunyi Zhou, Yuwen Pu, Qingming Li, Shouling Ji. 1-6 [doi]
- G-TADS: GUI Task-Ability Decoupling Strategy for High-Adaptability Multimodal Intelligent AgentsZhiqiang Xia, Xinyuan Zhang, Yang Li, Yuchen Liu, Runyu Shi, Jiaming Xu. 1-6 [doi]
- ROMA: Regularization for Out-of-distribution Detection with Masked AutoencodersXiaochen Feng, Yuan Jiang, Hao Sha, Yongbing Zhang 0002. 1-6 [doi]
- RealMind: Advancing Visual Decoding and Language Interaction via EEG SignalsDongyang Li, Haoyang Qin, Mingyang Wu, Jiahua Tang, Chen Wei 0006, Quanying Liu. 1-6 [doi]
- Bi-Grid Reconstruction for Image Anomaly DetectionHuichuan Huang, Zhiqing Zhong, Guangyu Wei, Yonghao Wan, Wenlong Sun, AiMin Feng. 1-6 [doi]
- Knowledge Transfer and Domain Adaptation for Fine-Grained Remote Sensing Image SegmentationShun Zhang, Xuechao Zou, Kai Li 0023, Congyan Lang, Shiying Wang, Pin Tao, Tengfei Cao. 1-6 [doi]
- Dataset Pruning: Optimizing Image Datasets with a Cross-Validation MethodYanmin Chen, Shuo Wang, Mengyao Zhou, Chenglin Liu, Jun Luo. 1-6 [doi]
- Leave the Bias in Bias: Mitigating the Label Noise Effects in Continual Visual Instruction Fine-TuningXiaoyu Tan, Teqi Hao, Xihe Qiu, Shaojie Shi, Yuan Cheng, Wei Chu, Yinghui Xu, Yuan Qi 0001. 1-7 [doi]
- Video Quality Assessment for Resolution Cross-Over in Live SportsJingwen Zhu, Yixu Chen, Hai Wei, Sriram Sethuraman, Yongjun Wu. 1-6 [doi]
- Brainstorming Brings Power to Large Language Models of Knowledge ReasoningZining Qin, Chenhao Wang 0001, Jianxiong Guo, Huiling Qin, Weijia Jia 0001. 1-6 [doi]
- Rethinking Steel Surface Defect Segmentation with Pseudo Mixup and Self DistillationJialin Xu, Jing Tang, Yankai Jin, Jun Liu, Zeyu Gong. 1-6 [doi]
- Degradation-Aware Multi-Task Image Restoration with State Space ModelsTao Wu, Purui Bai, Huaibo Huang, Jie Cao 0002, Yuang Ai, Ran He 0001. 1-6 [doi]
- CSCE: Boosting LLM Reasoning by Simultaneous Enhancing of Causal Significance and ConsistencyKangsheng Wang, Xiao Zhang, Juntao Lyu, Tianyu Hu, Huimin Ma 0001. 1-6 [doi]
- Token-Driven Linkage Network: One-Shot Adaptation of SAM for Challenging Segmentation ScenariosYao Shen, Kaiyang Zeng, Guangyao Li. 1-6 [doi]
- Hierarchical Sub-action Tree for Continuous Sign Language RecognitionDejie Yang, Zhu Xu, Xinjie Gao, Yang Liu. 1-6 [doi]
- Efficient Diffusion Bridge with Initial-Value Correction Strategy for Super-ResolutionJiati Cai, Yue Lei, Wenxin Tai, Xing He, Ting Zhong, Jia Chen, Fan Zhou 0002. 1-6 [doi]
- Cognitive Inspired Generalization Boosting for Face Forgery DetectionYunwen Huang, Hua Yang. 1-8 [doi]
- GauSurfaceAvatar: A Realistic Human Head Model with Variable Texture Based on 2D GaussiansLijie Geng, Junli Zhao, Lin Gao 0004, Ran Yi 0002, Fuqing Duan, Zhenkuan Pan 0001, Yong-Jin Liu 0001. 1-6 [doi]
- P2WNet: Homography Estimation for Part-To-Whole and Cross-Modality ScenariosShangXuan Xie, Haifeng Wu, Wen Li 0001, Lixin Duan. 1-6 [doi]
- CLEARSTR: Contextual Learning with Edge-guided and Adaptive-texture Reconstruction for Scene Text RemovalSanhita Pathak, Vinay Kaushik, Brejesh Lall. 1-6 [doi]
- 3DGCoding: Novel Framework for 3D Gaussian Video Incremental Training and CodingPeiheng Wang, Haodan Zhang, Quanlu Jia, Jiangkai Wu, Liming Liu, Haoyang Wang, Xinggong Zhang. 1-6 [doi]
- Time-Series Anomaly Detection Method Based on Frequency-Domain Decoupling and CorrectionDi Niu, Enyuan Zhao, Jie Nie, Min Ye, Shusong Yu, Xinyue Liang. 1-6 [doi]
- SE(3)-Equivariant Multi-Scale Graph Transformer for Multi-Resolution 3D Aneurysm SegmentationXudong Ru, Xingce Wang, Peng Du, Yanghui Yan, Shaolong Liu, Yi-Cheng Zhu, Wuyang Shui, Zhongke Wu. 1-6 [doi]
- Multi-view Video Coding with Decoupled Neural Representation for Multi-modal Traffic DataSiqian Nie, Xin Ding, Jiabo Wu, Sihan Lin, Qiong Liu. 1-6 [doi]
- True Match: Leveraging 2D-Assisted Queries for Multi-view 3D Detection in Polar SpaceYefei Hou, Jie Tang. 1-6 [doi]
- UniSync: A Unified Framework for Audio-Visual SynchronizationTao Feng, Yifan Xie, Xun Guan, Jiyuan Song, Zhou Liu, Fei Ma 0006, F. Richard Yu. 1-6 [doi]
- Diff-Art: Category-level Articulation Pose Estimation via Conditional DiffusionYukang Huo, XianHui Meng, Li Zhang 0104, Haonan Jiang, Yan Zhong 0001, Mingyuan Yao, Haihua Wang. 1-6 [doi]
- GraphTEN: Graph Enhanced Texture Encoding NetworkBo Peng, Jintao Chen, Mufeng Yao, Chenhao Zhang, Jianghui Zhang, Mingmin Chi, Jiang Tao. 1-6 [doi]
- Spectrum-Adaptive Distribution of 2D Gaussians for Image Representation and CompressionZunian Wan, Jiancheng Zhao, Yepeng Ding, Lingfeng Zhang 0002, Hiroyuki Sato 0002, Takefumi Ogawa. 1-6 [doi]
- Cross Knowledge Distillation between Artificial and Spiking Neural NetworksShuhan Ye, Yuanbin Qian, Chong Wang 0001, Sunqi Lin, Jiazhen Xu, Jiangbo Qian, Yuqi Li. 1-6 [doi]
- Multi-Scale Core-Peripheral Attention Network for Camouflaged Object DetectionYueqian Quan, Tiancheng Pan, Chuangjie Fang, Yan Li, Jianwei Zheng 0001. 1-6 [doi]
- Make Prototypes Perform Again: Prior-Prototypes Based Feature learning Framework for Few-Shot HashingYi Lu, Shu Li, Huanglong Dong, Shuxiang Hou, Yurong Qian. 1-6 [doi]
- MixLGN: Mixed Local-Global Network for 3D Human Pose GenerationXinyang Liu, Sanyi Zhang, Chixuan Wei, Yinghao Yang 0002, Long Ye. 1-6 [doi]
- Refined Temporal Pyramidal Compression-and-Amplification Transformer for 3D Human Pose EstimationHanbing Liu, Zhi-Qi Cheng, Wangmeng Xiang, Jun-Yan He, Bin Luo 0008, Yifeng Geng, Xuansong Xie. 1-6 [doi]
- Optimization of Multimodal Inputs Based on Diffusion Models: Zero-Shot Semantic Image GenerationLeiLei Wang, Renjie Lu, Fengzhao Sun, Yunxiang Zhang, Jun Yu 0001, Qingsong Liu, Jianqing Sun, Jiaen Liang. 1-6 [doi]
- A GAN-based Data Poisoning Backdoor Attack Method for Palmprint Recognition CNNsYuqi Wang, Bob Zhang. 1-6 [doi]
- MSA-SAM2Net: A Polyp Segmentation Framework Based on Large Kernel Multi-Scale AttentionJiyun Li, Jie Pan, Chen Qian, Ying Shen, Jiabao Zhao. 1-6 [doi]
- Visibility-GS: Visibility-Wise Densification of 3D Gaussian SplattingHaoru Deng, Jiaxiang Qian, Shuangli Du, Sha Li, Ruoling Qi, Zhenyu Xu. 1-6 [doi]
- Hyperspherical Dataset Distillation via Contrastive Embedding AlignmentShuoxi Zhang, Hanpeng Liu, Stephen Lin 0001, Kun He 0001. 1-6 [doi]
- PSUMatch: Unifying Open-Set Semi-Supervised Learning with Progressive Semantic UniversumChenyang Song, Songcan Chen. 1-6 [doi]
- Rethinking Cross-view Object Geo-Localization: Towards Many-to-Many Real-world LocalizationYuanyuan Li, Qingwang Zhang, Yingying Zhu. 1-6 [doi]
- SparseDM: Toward Sparse Efficient Diffusion ModelsKafeng Wang, Jianfei Chen, He Li, Zhenpeng Mi, Jun Zhu. 1-7 [doi]
- IMTrack: Interlayer Interoperability and Multi-scene Optimization for Visual Multimodal Target TrackingRui Zhu, Zhaokang Lu, Bohan Liu, Yun Yang, Hua Yue, Chaogang Wang, Zixin Zhou. 1-6 [doi]
- MDC: Modality Distribution Consistent Distillation for Multi-View 3D Object DetectionHuikai Liu, Junyin Wang, Wenqian Zhu, Bin Fu, Shengwu Xiong, Cheng Liu. 1-6 [doi]
- Geometric-Aware Mapping and Uncertainty Modeling for Semantic Scene CompletionXianzhu Liu, Yuhe Zhu, Weiyu Zhao, Chen hui, Jianping Zhong. 1-6 [doi]
- WT-BCP: Wavelet Transform based Bidirectional Copy-Paste for Semi-Supervised Medical Image SegmentationMingya Zhang, Liang Wang 0006, Limei Gu, Tingsheng Ling, XianPing Tao. 1-6 [doi]
- Knowledge Graphs Acquisition via Forward-Reverse Relation Enhanced Contrastive Pretraining from Large-scale ModelsLiu Yu, Fenghui Tian, Ping Kuang, Zhikun Feng, Fan Zhou. 1-6 [doi]
- Rethinking Cross-Modality Fusion Mamba from a Frequency Domain PerspectiveZeyu Wang, Huiying Xu, Yun Liu, Chen Li, Xinzhong Zhu, Xiaolei Zhang, Hongbo Li. 1-6 [doi]
- Neural Representations for Scalable Video CodingYiying Wei, Hadi Amirpour, Christian Timmerer. 1-6 [doi]
- Noise Mitigation for Unsupervised Cross-Domain Image RetrievalJiayang Liu, Kai Wang, Zheng Wang 0044, Xin Liu 0011, Fumin Shen, Xing Xu 0001. 1-6 [doi]
- Serial Low-rank Adaptation of Vision TransformerHouqiang Zhong, Shaocheng Shen, Ke Cai, Zhenlong Wu, Jiangchao Yao, Yuan Cheng, Xuefei Li, Xiaoyun Zhang 0001, Li Song 0001, Qiang Hu 0003. 1-6 [doi]
- Distraction Suppression and Feature Modulation Network for Camouflaged Object DetectionHan Lyu, Meijun Sun, Haowei Ran, Yipu Liu, Xinyu Yan 0001, Zheng Wang 0008. 1-6 [doi]
- Efficient Local-Global Collaboration Transcoding for JPEG AIYiming Wang, Zhaobin Zhang, Yaojun Wu, Qian Huang, Bin Tang, Kai Zhang, Li Zhang. 1-6 [doi]
- Iterative Multi-Collaborative Training Network for Point Cloud Learning with Noisy AnnotationsXiao Shao, Weiqi Yan, Yu Zang. 1-6 [doi]
- An EEG Dataset with Subjective-Objective Perception Data for Assessing Stereoscopic Visual Discomfort Induced by 3D Motion VideosNa Lu, Xiaojie Zhao, Li Yao. 1-6 [doi]
- Adaptive Frequency Threshold Pooling for Mitigating Aliasing in Few-Shot SegmentationShangjing Chen, Feng Xu 0008, Xin Lyu 0001, Xin Li 0090. 1-6 [doi]
- An Investigation on Audio-Prompt and Structure Guided Long-Duration Music Generation Based on Diffusion ModelsZiyu Zhao, Zilu Guo, Jun Du 0002, Feng Ma, Jia Pan. 1-6 [doi]
- Slot Inversion for Asymmetric Composed Image RetrievalHaiwen Li, Zining Chen, Ying Liu, Fei Su, Zhicheng Zhao 0001. 1-6 [doi]
- PiCo: Jailbreaking Multimodal Large Language Models via Pictorial Code ContextualizationAofan Liu, Lulu Tang, Ting Pan, Yuguo Yin, Bin Wang, Ao Yang. 1-6 [doi]
- Leveraging Multiple Deep Experts for Online Class-incremental LearningZhe Tao, Lu Yu 0004, Hantao Yao, Changsheng Xu. 1-6 [doi]
- Enhancing Federated Learning Robustness with Pre-trained Staged Modular DistillationJiankang Wei, Xu Ma, Yuan Ma, Hongwei Zhou, Jingtong Huang, Xiaoyu Zhang. 1-6 [doi]
- Gaze4ASD: A Novel Dataset and Visual Saliency Map-Based Method for Autism ScreeningYizhang Yang, Jinshi Cui, Xi Guo, Xing Su, Wei Ni, Junshi Lu, Li Wang, Huimin Ma. 1-6 [doi]
- Visual-Textual Feature Learning for Rare Human-Object Interactions DetectionMingliang Xue, Chong Cao, Zhengyang Zhao, Xiaodong Duan, Shu Cao. 1-6 [doi]
- SmartEdit: Editing-driven Engagement Prediction and Enhancement of Short-VideosSaumya Gupta, Ishita Dasgupta 0002, Stefano Petrangeli, Somdeb Sarkhel. 1-11 [doi]
- CLGC: Continuous Layout Guidance for Consistent Text-to-Video EditingXuancheng Xu, Ming Tao 0002, Bing-Kun Bao. 1-6 [doi]
- IntegralCAM: Integral-based contribution estimation and visualization for convolutional neural networksTeng-Yok Lee. 1-6 [doi]
- Dataset Quantization Augmentation: Improving Dataset Compression Through Complexity-Guided Sampling and AugmentationZiyang Li, Qin Liu, Fengshan Zhao, Yujie Wang, Takeshi Ikenaga. 1-6 [doi]
- Mitigating Cache Noise in Test-Time Adaptation for Large Vision-Language ModelsHaotian Zhai, Xinyu Chen, Can Zhang, Tianming Sha, Ruirui Li. 1-6 [doi]
- PDMambaNet: Poisson Denoising-Aided Twin-Path Mamba for Brain MRI Image SegmentationDayong Ren, Feifei Zhang, Fei Shi, Aoxue Chen. 1-6 [doi]
- High Resolution Wire Segmentation with Domain AdaptionYu Zhong, Tao Xie, Anna Zhu. 1-6 [doi]
- Redundancy Optimization via Mutual Information for Unsupervised Domain AdaptationXing Wei 0002, Dexuan Zhao, Fan Yang 0063, Taizhang Hu, Chong Zhao, Yang Lu 0015. 1-6 [doi]
- Semantic Palette-Guided Color PropagationZi-Yu Zhang, Bing-Feng Seng, Ya-Feng Du, Kang Li, Zhe-Cheng Wang, Zheng-Jun Du. 1-6 [doi]
- Learning Dual-Domain Multi-Scale Representations for Single Image DerainingShun Zou, Yi Zou, Mingya Zhang, Shipeng Luo, Guangwei Gao, Guojun Qi. 1-6 [doi]
- Dialogue Director: Bridging the Gap in Dialogue Visualization for Multimodal StorytellingMin Zhang, Zilin Wang, Liyan Chen, KunHong Liu 0001, Juncong Lin. 1-6 [doi]
- ACCL: A Plug-and-play Adaptive Confusion-aware Contrastive Loss for UAV-to-Satellite GeolocalizationYining Zhu, Zihao Deng, Jun Wang 0012, Boxuan Li, Long Xiao, Jikun Shen, Yuan Yao. 1-6 [doi]
- Tactile Information Coding for DNA Storage with Prospects for AI ApplicationsRongduo Han, Cihan Ruan, Shunye Tang, Haoyu Wu, Nam Ling, Haining Zhang. 1-6 [doi]
- Graph Anomaly Detection via Structure to Attribute ReconstructionXingshen Wei, Wei Liu, Wenzhong Li, Sanglu Lu. 1-6 [doi]
- FMNet: Frequency-Assisted Mamba-Like Linear Attention Network for Camouflaged Object DetectionMing Deng, Sijin Sun, Zihao Li, Xiaochuan Hu, Xing Wu. 1-6 [doi]
- HingeNet: A Harmonic-Aware Fine-Tuning Approach for Beat TrackingGanghui Ru, Jieying Wang, Jiahao Zhao, Yulun Wu 0002, Yi Yu 0001, Nannan Jiang, Wei Wang, Wei Li 0012. 1-6 [doi]
- DialogueAgents: A Hybrid Agent-Based Speech Synthesis Framework for Multi-Party DialogueXiang Li, Duyi Pan, Hongru Xiao, Jiale Han, Jing Tang, Jiabao Ma, Wei Wang, Bo Cheng. 1-6 [doi]
- Transferable Attack against Face Swapping in an Extended SpaceMingzhi Lyu, Yi Huang 0013, Jun Xie, Zihao Zhao, Hong Xu, Adams Wai-Kin Kong. 1-6 [doi]
- Mitigating Hallucination in Large Video-Language Models with Injected SemanticsBimei Wang, Fan Wen, Jisheng Dang, Huiguo He, Xiwen Wang, Nannan Zhu, Jiasi Weng. 1-6 [doi]
- Optimizing and Attacking Embodied Intelligence: Instruction Decomposition and Adversarial RobustnessMinghao Li, Wenpeng Xing, Yong Liu, Wei Zhang, Meng Han. 1-6 [doi]
- Dual Information Speech Language Models for Emotional ConversationsChun Wang, Chenyang Liu, Wenze Xu, Weihong Deng. 1-6 [doi]
- Pretrain like Your Inference: Masked Tuning Improves Zero-Shot Composed Image RetrievalJunyang Chen, Hanjiang Lai. 1-6 [doi]
- VSD2M: Large-scale Vision-language Sticker Dataset for Multi-frame Animated Sticker GenerationZhiqiang Yuan, Jiapei Zhang, Ying Deng, Yeshuang Zhu, Jie Zhou, Jinchao Zhang. 1-6 [doi]
- Tell and Show: A Multimodal Guidance Method for Instructional Video PlanningMingzhe Zhang, Yinghui Zhang, Fengxiang Ge. 1-6 [doi]
- MAMF-Net: Modality-Adaptive Masked Fusion Network for Speech Emotion RecognitionHengrui Li, Tianyi Lu, Jianfeng Wang, Xiaopei Chen, Yongbing Zhang, Shaohui Liu. 1-6 [doi]
- Unified-Modality Attention Network for Multimodal Sentiment AnalysisZuocheng Li, Lishuang Li. 1-6 [doi]
- ALIVE: Asynchronous Lower Body Pose Estimation with Images, Visual-Inertial Odometry and ElectromyographyGuoming Du, Zhen Ding, Xinrun Li, Su Wang, Wendi Peng, Hong Huang, Feng Jiang 0001. 1-6 [doi]
- Zero-1-to-3DGS: a Single Image to 3D Gaussian by Consistent Multi-view GenerationShenghao Yang 0006, Hongtao Zhang, Jianxing Ren, Zhihao Tang 0008, Mingbo Zhao, Yuping Liu. 1-6 [doi]
- A Multi-Branch Network for Pose Trajectory Smoothing and RefinementPanpan Chen, Ying Jiang, Haidong Hu, Chuangye Wang, Haolun Li 0001, Hao Gao 0005. 1-6 [doi]
- ForeNet: Unlocking Long-Term Series Forecasting in High-Dimensional Scenario via Forest StructureXinyu Li, Hao Xu, Zhiheng Yang, Hongxiang Zhou, Hong Lu, Xin Wang, Jin Zhao. 1-6 [doi]
- Lesion Localization for Medical Imaging Using Counter-factual Generation Prompt LearningYang Wei, Yi Pan 0001, Limai Jiang, Juan He, Bokai Yang, Yufu Huo, Yunpeng Cai, Ruitao Xie. 1-6 [doi]
- Multimodal Representation Learning Techniques for Comprehensive Facial State AnalysisKaiwen Zheng, Xuri Ge, Junchen Fu, Jun Peng, Joemon M. Jose. 1-6 [doi]
- Characterizing High-order Interactions between Eye Movement and Head Motion Variables in Augmented Reality-based Navigation ExperienceQing Xu 0002, Shunbo Wang, Yunxiang Jiang, Simon Parkinson, Klaus Schoeffmann, Chuntie Chen. 1-6 [doi]
- Hier-pFedMe: Hierarchical Personalized Federated Learning with Moreau EnvelopesXi Liu, Fanfan Ji, Bo Liu 0005, Xiao-Tong Yuan. 1-6 [doi]
- MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving PerceptionXiaoshuai Hao, Guanqun Liu 0008, Yuting Zhao, Yuheng Ji, Mengchuan Wei, Haimei Zhao, Lingdong Kong, Rong Yin 0001, Yu Liu. 1-6 [doi]
- Knowledge Calibration DistillationChun Xie, Huimin Tong, Guoxi Xu, Yipeng Chen, Li Luking, Yiwei Chen. 1-7 [doi]
- ESTI: An Efficient Spatial-Temporal Interaction Network For Video-Based Person Re-IdentificationGuquan Jing, Peng Gao, Yiyang Hu, Yujian Lee, Hui Zhang 0062. 1-6 [doi]
- StyleRWKV: High-Quality and High-Efficiency Style Transfer with RWKV-like ArchitectureMiaomiao Dai, Qianyu Zhou 0001, Lizhuang Ma. 1-6 [doi]
- TC-GS: Tri-plane based Compression for 3D Gaussian SplattingTaorui Wang, Zitong Yu, Yong Xu. 1-6 [doi]
- Dynamic Importance in Diffusion U-Net for Enhanced Image SynthesisXi Wang, Ziqi He, Yang Zhou. 1-6 [doi]
- IRSTS Generalist: Improving Generalization in Infrared Small Target Segmentation Using One ShotBingbing Dan, Xinyu Tian, Meihui Li, Tao Tang 0005, Jing Zhang. 1-6 [doi]
- MIPP-FL: Personalized Layer Privacy Protection Federated Learning Based on Mutual InformationXijun Zhao, Gang Li, Hongming Chen. 1-6 [doi]
- Variance-Reduction Guidance: Sampling Trajectory Optimization for Diffusion ModelsShifeng Xu, Yanzhu Liu, Adams Wai-Kin Kong. 1-6 [doi]
- Denoising Diffusion Probabilistic Model for Point Cloud Compression at Low Bit-RatesGabriele Spadaro, Alberto Presta, Jhony H. Giraldo, Marco Grangetto, Wei Hu, Giuseppe Valenzise, Attilio Fiandrotti, Enzo Tartaglione. 1-6 [doi]
- QoE Evaluation of Remote Physiotherapy in Volumetric Video and Video-Based Real-Time CommunicationAshutosh Singla, Irene Viola 0001, Jack Jansen 0001, Pablo César. 1-6 [doi]
- Prototype-Based Communication Topology Optimization for Decentralized Federated LearningXinlin Leng, Kangyu Hu, Hanlin Gu, Xiangui Kang, Wenyuan Yang. 1-6 [doi]
- Compression Metadata-assisted RoI Extraction and Adaptive Inference for Efficient Video AnalyticsChengzhi Wang, Peng Yang. 1-6 [doi]
- Time-Series Acoustic Network for Underwater Acoustic Target RecognitionPengyuan Qi, Ye Tian 0027, Guisheng Yin. 1-6 [doi]
- Omni-AD: Learning to Reconstruct Global and Local Features for Multi-class Anomaly DetectionJiajie Quan, Ao Tong, Yuxuan Cai, Xinwei He, Yulong Wang, Yang Zhou. 1-6 [doi]
- 3DGlobalFormer: Three Domain Global Feature Fusion in 3D Human EstimationTianyi Ma, Muqing Wu, Zijian Zhang. 1-6 [doi]
- PGD-N2L: A Parameter-Guided Disentanglement Approach for Normal-To-Lombard Speech ConversionHongyang Chen 0004, Yuhong Yang 0001, Xinmeng Xu, Xingyu Liu, Weiping Tu, Zhongyuan Wang 0001, Cedar Lin, Xin Zhao. 1-6 [doi]
- A Novel Differential Privacy Federated Learning Framework: An Adaptive Budget Allocation and Reversion MethodXu Zhao, Gang Li, Jun Cai. 1-6 [doi]
- ASimp: Automatic High-Poly 3D Mesh Simplification for Preprocessing Based on QoELehao Lin, Hong Kang, Yuqi Shi, Haihan Duan, Abdulmotaleb El-Saddik, Wei Cai 0002. 1-6 [doi]
- Semantic-aware Fine-grained Point Augmentation for 3D Multi-modal Object DetectionWei Li 0190, Kuan Zhu, Haiyun Guo, Honghui Dong, Jinqiao Wang. 1-6 [doi]
- Quantized Memory-Efficient Full-Parameter Tuning with Sign Descent OptimizationXuezhi Zhao, Haichen Bai, Qiang Li, Qi Wang. 1-6 [doi]
- MuseFace: Text-driven Face Editing via Diffusion-based Mask Generation ApproachXin Zhang, Siting Huang, Xiangyang Luo 0002, Yifan Xie, Weijiang Yu, Heng Chang, Fei Ma 0006, Fei Yu. 1-6 [doi]
- Blended-Target Domain Adaptation via Multi-Prompt Coordination LearningYuwu Lu, Yihan Yang. 1-6 [doi]
- Distribution-Aware Hadamard Quantization for Hardware-Efficient Implicit Neural RepresentationsWenyong Zhou, Jiachen Ren, Taiqiang Wu, Yuxin Cheng, Zhengwu Liu, Ngai Wong 0001. 1-6 [doi]
- Efficient RGBT Tracking via Heterogeneous Hierarchical Knowledge DistillationDengdi Sun, Shiqi Liu, Chenglong Li 0002, Andong Lu. 1-6 [doi]
- FOCUS: Fine-grained Optimization with Semantic Guided Understanding for Pedestrian Attributes RecognitionHongyan An, Kuan Zhu, Xin He, Haiyun Guo, Chaoyang Zhao, Ming Tang 0001, Jinqiao Wang. 1-6 [doi]
- GiVE: Guiding Visual Encoder to Perceive Overlooked InformationJunjie Li, Jianghong Ma, Xiaofeng Zhang 0002, Yuhang Li, Jianyang Shi. 1-6 [doi]
- A Temporal Modeling Framework for Video Pre-Training on Video Instance SegmentationQing Zhong, Peng-Tao Jiang, Wen Wang, Guodong Ding, Lin Wu, Kaiqi Huang. 1-6 [doi]
- FDAVS: Exploring Frequency-Driven Modality Enhancement in Audio-Visual SegmentationMengyuan Zhu, Yunzhi Zhuge, Sitong Gong, Lu Zhang 0053, Huchuan Lu. 1-6 [doi]
- FedMPQ: Secure and Efficient Federated Learning with Multi-codebook Product QuantizationXu Yang, Zhuo Tang, Boyao Hao, Xiong Xiao, Jiapeng Zhang. 1-6 [doi]
- CLIP-EBC: CLIP Can Count Accurately through Enhanced Blockwise ClassificationYiming Ma 0003, Victor Sanchez, Tanaya Guha. 1-6 [doi]
- Confidence-Aware Self-Distillation for Multimodal Sentiment Analysis with Incomplete ModalitiesYanqian Luo, Shijin Wang, Zhongxing Xu, Yulong Li, Feilong Tang, Jionglong Su. 1-6 [doi]
- Time-Frequency Domain Fusion Transformer for Cross-Subject Motor Imagery ClassificationZijian Xia, Jianfeng Li, Jiahui Pan. 1-6 [doi]
- Localization Hints Exploration for Object MattingYu Qiao 0001, Tianyu Meng, Huilin Ge, Xinning Wang, Jiayue Zhao, Qianchen Xia, Xin Yang 0011. 1-6 [doi]
- SKL-CLIP: Learning Skeleton-Based Action Representations via Language SupervisionKun Wang 0057, Jiuxin Cao, Jiawei Ge 0002, Chang Liu 0113, Bo Liu 0004. 1-6 [doi]
- Video Label Refinement for Temporal LocalizationJennifer Piane, Thiruvarangan Ramaraj, Jacob D. Furst, Daniela Raicu. 1-6 [doi]
- Distributed Cloud-Edge Scheduling for Multimedia Data Requests: A MARL ApproachChong Geng, Zhen Liu, Yannan Wang, Yiran Li. 1-6 [doi]
- VectorPainter: Advanced Stylized Vector Graphics Synthesis Using Stroke-Style PriorsJuncheng Hu, XiMing Xing, Jing Zhang, Qian Yu. 1-6 [doi]
- DATE: Dual Asymmetric Textual Embedding guided Person Re-IdentificationPengqi Yin, Hantao Yao, Changsheng Xu. 1-6 [doi]
- Counterfactual-Augmented Representation Learning based Event PredictionCheng Hu, Fangfang Yuan, Cong Cao 0001, Pu Li, Guangjie Zeng, Yanbing Liu 0007, Hao Peng 0001, Philip S. Yu. 1-6 [doi]
- Mixture-of-Modality-Experts for Unified Image Aesthetic Assessment with Multi-Level AdaptationFei Gao 0006, Jiaqi Shi, Yuhao Lin, Xiaodan Zhang 0005, Lihuo He, Nannan Wang 0001. 1-6 [doi]
- CAP: An Advanced No-Reference Quality Assessment Method for AI-Generated 3D MeshesYingjie Zhou 0003, Farong Wen, Zicheng Zhang, Yanwei Jiang, Jun Jia, Xiaohong Liu 0001, Xiongkuo Min, Guangtao Zhai. 1-6 [doi]
- Efficient Binarized Neural Network Intellectual Property ProtectionBowen Chen, Jiehua Zhang, YuChen Sun, Li Liu. 1-6 [doi]
- Social Optimum Assisted Gradient Modulation for Imbalanced Multimodal LearningDisen Hu, Xun Jiang 0001, Zhe Sun 0009, Hao Yang, Chong Peng, Peng Yan, Xing Xu 0001. 1-6 [doi]
- MSDet: Receptive Field Enhanced Multiscale Detection for Tiny Pulmonary NoduleGuohui Cai, Ruicheng Zhang, Hongyang He, Zeyu Zhang 0006, Daji Ergu, Yuanzhouhan Cao, Jinman Zhao, Binbin Hu, Zhibin Liao, Yang Zhao 0019, Ying Cai 0002. 1-6 [doi]
- HMSformer: Hierarchical Multi-Scale Transformer for Multivariate Long-Term Series ForecastingXinyu Li, Yunqi Cai, Hao Xu, Xinyu Sun, Zhiheng Yang, Hong Lu, Xin Wang, Jin Zhao. 1-6 [doi]
- GEST: Dual Structured Exploration with Graph ODE for Spatio-Temporal Dynamic System ModelingYonghao Li, Xiangyu Zhao, Ping Ye, QingXuan Jia. 1-6 [doi]
- OSLLM: A Retrieve-Reason-Refine Framework for Multi-Domain Relation Extraction with Large Language ModelsJie Zhou 0032, Yongxue Shan, Meihan Wu, Fei Hu 0005, Li Zheng, Xiaodong Wang 0002. 1-6 [doi]
- MoMuSE: Momentum Multi-modal Target Speaker Extraction for Real-time Scenarios with Impaired Visual CuesJunjie Li, Ke Zhang, Shuai Wang 0016, Kong-Aik Lee, Man-Wai Mak, Haizhou Li 0001. 1-6 [doi]
- InvoxSVC: Any-to-any Zero-shot Singing Voice Conversion with In-Context Learning in Latent Flow MatchingWangjin Zhou, Tianjiao Du, Wenhao Guan, Meng Xiao, Chenglin Xu, Yi Zhao 0006, Tatsuya Kawahara. 1-6 [doi]
- DCSA-UNet: Lightweight UNet with Dual Cross-Shaped Attention For Skin Lesion SegmentationBoyu Chen, Lu Han, Zherui Zhang, Li Guo 0004, Shibiao Xu. 1-6 [doi]
- Enhancing Multimodal Chain-of-Thought Reasoning with Tree-Searched Self-TrainingYiwen Luo, Tao Wei, Yong Luo, Zengmao Wang. 1-6 [doi]
- DCGNet: Detail and Context Guided Small Object Detection Network with Decoupled Detection HeadYixin Qiao, Shiyong Lan, Wenwu Wang, Haohan Chen, Yao Li, Guonan Deng. 1-6 [doi]
- SGAD: An Unsupervised Secondary-Guided Diffusion Model for Industrial Anomaly DetectionWenze Kang, Yuanming Zhang, Libo Weng, ZhenBo Cheng, Fei Gao 0014. 1-6 [doi]
- Mutual Guidance and Residual Integration for Image EnhancementKun Zhou, Xinyu Lin. 1-6 [doi]
- DynaSplat: Dynamic-Static Gaussian Splatting with Hierarchical Motion Decomposition for Scene ReconstructionJunli Deng, Ping Shi 0001, Qipei Li, Jinyang Guo. 1-6 [doi]
- Visual Feature Learning from Randomized EEG Trials for Object RecognitionXiaoya Fan, Haixiao Xue, Yufan Feng, Qi Zhao 0008, Zheng Zhao, Zhong Wang. 1-6 [doi]
- Can MLLMs Tell Jokes Based on Images? A Visual Context-Driven Humor Generation FrameworkMeixuan Chen, Chen Wang, Liu Hui, Yujun Wu, Ying Sha. 1-6 [doi]
- Domain Generalization via Discrete Codebook LearningShaocong Long, Qianyu Zhou 0001, Xi Jiang 0009, Chenhao Ying 0001, Lizhuang Ma, Yuan Luo 0003. 1-6 [doi]
- Controllable Continual Test-Time AdaptationZiqi Shi, Fan Lyu, Ye Liu, Fanhua Shang, Fuyuan Hu, Wei Feng 0005, Zhang Zhang 0001, Liang Wang 0001. 1-6 [doi]
- Instruction-Aligned Visual Attention for Mitigating Hallucinations in Large Vision-Language ModelsBin Li, Dehong Gao, Yeyuan Wang, Linbo Jin, Shanqing Yu, Xiaoyan Cai, Libin Yang. 1-6 [doi]
- Reinforcement Learning-based Token Pruning in Vision Transformers: A Markov Game ApproachChenglong Lu, Shen Liang, Xuewei Wang, Wei Wang. 1-6 [doi]
- DAG-AFL: Directed Acyclic Graph-based Asynchronous Federated LearningShuaipeng Zhang, Lanju Kong, Yixin Zhang, Wei He 0020, Yongqing Zheng, Han Yu 0001, LiZhen Cui. 1-6 [doi]
- Multi-Resolution Infrared-Visible Image Fusion using Multi-Scale Residual QuantizationHonglin Wu, Jun-Jie Huang, Huibin Tan, Wanrong Huang, Yuhua Tang, Xueqiong Li. 1-6 [doi]
- KMoP: Knowledge-injected Mixture-of-Prefix for Joint Multimodal Aspect-Based Sentiment AnalysisXinzhong Wang, Lingyong Fang, Jidong Li, Yichen Zhou, Gongshen Liu. 1-6 [doi]
- GC-ConsFlow: Leveraging Optical Flow Residuals and Global Context for Robust Deepfake DetectionJiaxin Chen, Miao Hu, Dengyong Zhang, Jingyang Meng. 1-6 [doi]
- SI23DCQA: Perceptual Quality Assessment of Single Image-to-3D ContentKang Fu, Huiyu Duan, Zicheng Zhang, Xiaohong Liu 0001, Xiongkuo Min, Jia Wang 0004, Guangtao Zhai. 1-6 [doi]
- SAG-KeyNet: Scale-Adaptive Keypoint Gaussian Heatmap Regression Network for Oriented SAR Ship DetectionXu Wang 0041, Yan Fu, Yanxia Wu 0001, Dan Lin, Ye Yuan 0011, Xue Zhang, Zhirou Ma. 1-6 [doi]
- Expansive Supervision for Neural Radiance FieldsWeixiang Zhang, Wei Yao, Shuzhao Xie, Shijia Ge, Chen Tang, Zhi Wang 0001. 1-6 [doi]
- TGSR: Template-Guided Semantic Resampling against Adversarial Tracking AttacksXuhong Ren, Jianlang Chen, Wanli Xue, Lei Ma 0003, Qing Guo 0005, Jianjun Zhao 0001, Shengyong Chen. 1-6 [doi]
- Enhanced Multimodal Chain-of-Thought with Visual Self-Contrastive DistillationGuangmin Zheng, Jun Kong, Jin Wang, Xuejie Zhang. 1-6 [doi]
- SAM2-Cap: Segment Anything 2 with using Parts and Object Spatial Hierarchical Relationships for Image SegmentationXiufeng Liu 0005, Zhongqiu Zhao, Yi Yang 0001, Donghui Hu, Zhao Zhang 0001. 1-6 [doi]
- Graph-based Meta-Learning and Feature Disentanglement for Domain Generalization Crowd CountingYang Qu, Zhencai Shen, Yingyi Chen, Ping Zhong 0003. 1-6 [doi]
- MSPoint-Gait: Multi-Scale Point Cloud Analysis for 3D Gait Recognition via Cross-Modal LearningXinzhu Li, Yi Yang, Yikun Chen, Guanghui Yue 0001, Wei Zhou 0021, Ruomei Wang 0001, Xudong Mao, Juepeng Zheng, Fan Zhou 0001, Ziqi Qiu, Baoquan Zhao. 1-6 [doi]
- CFPER: Coarse-to-Fine Part-Experts Retrieval for Efficient Person Re-identificationShiyu Wang, Mingming Lu. 1-6 [doi]
- Determined Multi-Label Learning via Similarity-Based PromptMeng Wei 0006, Zhongnian Li, Peng Ying, Ridong Han, Tongfeng Sun, Xinzheng Xu. 1-6 [doi]
- SIR: Multi-view Inverse Rendering with Decomposable Shadow Under Indoor Intense LightingXiaokang Wei, Zhuoman Liu, Ping Li 0016, Yan Luximon. 1-6 [doi]
- Multimodal Causal Reasoning-Guided Intrinsic Goals for Efficient Task Completion in Reinforcement LearningTong Wu, Yi Wen, Guangchun Luo, Lingfu Wang, Qiuran Li, Dayong Zhu. 1-6 [doi]
- Weaponizing Tokens: Backdooring Text-to-Image Generation via Token RemappingJiaming He, Wenbo Jiang 0001, Guanyu Hou, QiYang Song, Ji Guo, Hongwei Li 0001. 1-6 [doi]
- CI-MER: A Novel Causal Intervention Framework For Micro-Expression RecognitionXiqiao Fang, Qingfeng Wu, Lu Cao. 1-6 [doi]
- SimCast: Enhancing Precipitation Nowcasting with Short-to-Long Term Knowledge DistillationYifang Yin, Shengkai Chen, Yiyao Li, Lu Wang, Ruibing Jin, Wei Cui 0002, Shili Xiang. 1-6 [doi]
- EMGPose: An Efficient Multi-Granularity Representation for Human Pose EstimationGuonan Deng, Shiyong Lan, Wenwu Wang 0007, Yixin Qiao, Yao Li, Haohan Chen, Hongyu Yang. 1-6 [doi]
- An Enhanced Palmprint Adversarial Attack Against Visible and Invisible FeaturesJinrong Cui, Qiuli Zhang, Ziqi Wang, Jinghua Wang, Qi Zhu. 1-6 [doi]
- ZeroPose: Leveraging Diffusion Models and Large Language Models for Advanced Multi-Hypothesis 3D Construction Workers' Pose EstimationGaowei Zhang, Wei Wang, Yi Wang. 1-6 [doi]
- MSC-Net: Multi-Scale Cross-Modal Network for Point Cloud CompletionYan Zhang, Zhenjiang Du, Lei Zhang, Zhitao Liu, Mingda Tang, Feng Tian, Ning Xie 0003. 1-6 [doi]
- Subjective Quality Assessment for Point Clouds of Digital Humans with Shaded RenderingAmar Tious, Toinon Vigier, Vincent Ricordel. 1-6 [doi]
- Exploring Compression Strategies for Blendshape-Based Avatar Facial Animation: Subjective and Objective AnalysisAnthony Trioux, Wei Zhang 0072, Giuseppe Valenzise, FuZheng Yang 0001. 1-6 [doi]
- Synthesize Large-scale in situ Darkfield Images for Training Marine Plankton Detection AlgorithmsZhenping Li, Jianping Li. 1-6 [doi]
- MdCoT: Medical Diagnosis Chain-of-Thought with Self-Diagnostic Refinement for Alzheimer's DiseaseChunlin Lu, Yongheng Zhang 0001, Peng Wang 0168, Wenpeng Lu, Libo Qin 0001. 1-6 [doi]
- DBE: Dual Branch re-Extraction for Unseen Diffusion-Generated Image DetectionShixiang Cai, Liangzhen Liu, Zhirui Kuai, Li Kuang, Lingyan Zhang. 1-6 [doi]
- LV-VTON: Long-Video Virtual Try-On via Enhanced Visual Autoregressive ModelingLulu Tian, Hongxun Yao, Ming Li 0073. 1-6 [doi]
- SLGN: Spatiotemporal Language-Guided Graph Network for Referring Video SegmentationRongrong Lian, Xiangdong Li, Zhenkai Wu, Mengting Ma, Wei Zhang 0243. 1-6 [doi]
- Consistency Change Detection Framework for Unsupervised Remote Sensing Change DetectionYating Liu, Yan Lu. 1-6 [doi]
- DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion ModelKangwei Liu, Junwu Liu, Yun Cao, Jinlin Guo, Xiaowei Yi. 1-6 [doi]
- MP-FIRE: An End-to-End Cross-Modal Framework for Complex Multi-Page Document Question AnsweringYongqi Yu, Jinxu Zhang, Yu Zhang. 1-6 [doi]
- A Progressive Generation Framework with Speech Pre-trained Model for Expressive Voice ConversionTianrui Wang, Meng Ge, Zhikang Niu, Cheng Gong, Chunyu Qiang, Haoyu Wang, Zikang Huang, Ziyang Ma 0001, Xiaobao Wang, Xie Chen 0001, Longbiao Wang, Jianwu Dang 0001. 1-6 [doi]
- Utilizing Contrastive Learning for Locating Network Anomalies in Real-time Conferencing ApplicationsTeng Ma 0006, Dongbiao He, Zhongxing Ming, Junhao Xu, Laizhong Cui, Yunpeng Chai. 1-6 [doi]
- Adaptive Gradient Quantization with Bit Allocation for Distributed Deep LearningFei Gao 0019, Xingyu Yan, Jian Jin, Wenhan Yang, Lingyu Duan, Zhuo Chen 0006. 1-6 [doi]
- BEAR: A Video Dataset For Fine-grained Behaviors Recognition Oriented with Action and Environment FactorsChengyang Hu, Yuduo Chen, Lizhuang Ma. 1-6 [doi]
- MambaPose: Efficient 2D Human Pose Estimation with Pose-Prior Guided State Space ModelYalong Xu, Mengting Jiang, Yang Gao, Junlong Mu, Di Wang 0011, Lin Zhao 0003. 1-6 [doi]
- Unimodal-driven Distillation in Multimodal Emotion Recognition with Dynamic FusionJiagen Li, Rui Yu, Huihao Huang, Huaicheng Yan 0001. 1-6 [doi]
- SASG: Semantic-Aware Salient Guidance for Day-to-Night Domain Adaptive Object DetectionWei Yan, Xiaoman Zhao. 1-6 [doi]
- Towards Robust Image Restoration: A Multi-Type Degradation Dataset for Outdoor ScenesYongheng Zhang, Danfeng Yan. 1-6 [doi]
- RoGA: Towards Generalizable Deepfake Detection through Robust Gradient AlignmentLingyu Qiu, Ke Jiang, Xiaoyang Tan. 1-6 [doi]
- Adaptive Weighted Parameter Fusion with CLIP for Class-Incremental LearningJuncen Guo, Xiaoguang Zhu, Liangyu Teng, Hao Yang, Jing Liu, Yang Liu, Liang Song. 1-6 [doi]
- TACOS: Open Tagging and Comparative Scoring for Instruction Fine-Tuning Data SelectionXixiang He, Hao Yu, Qiyao Sun, Ao Cheng, Tailai Zhang, Cong Liu, Shuxuan Guo. 1-6 [doi]
- Recovering Human Mesh from Videos by 2D and 3D Deformable AttentionsYulei Kang, Teng-Yue Chen, Xiaotong Lin 0002, Siyu Jiang, Jian-Fang Hu. 1-6 [doi]
- Global-Local Aware Scene Text EditingFuxiang Yang, Tonghua Su, Donglin Di, Yin Chen, Xiangqian Wu, Zhongjie Wang 0003, Lei Fan 0007. 1-6 [doi]
- DMDM: Photorealistic Face Age Transformation by Dual-Modal Collaborative Attention using Diffusion ModelsZepeng Su, Zhulin Liu, Zongyan Zhang, Tong Zhang 0015, C. L. Philip Chen. 1-6 [doi]
- Spatial 3D-LLM: Exploring Spatial Awareness in 3D Vision-Language ModelsXiaoyan Wang, Zeju Li, Yifan Xu, Jiaxing Qi, Zhifei Yang 0004, Ruifei Ma, Xiangde Liu, Chao Zhang. 1-6 [doi]
- TRR-LGF: a Simple yet Efficient Classification NetworkZhen Long, Qingqing Cao, Hu Yao, Yipeng Liu, Le Zhang, Ce Zhu. 1-6 [doi]
- Trustworthy Localized Corrections-guided Mutual Learning for Multi-View LearningQiuran Li, Yi Luo, Yan Sun, Tong Wu, Aiguo Chen. 1-6 [doi]
- D2AD: Diffusion Distillation for Unsupervised Image Anomaly DetectionYuheng Shao, Zhangkai Ni, Qinyuan Liu. 1-6 [doi]
- Towards Advanced Emotional Care: Embodied Emotional Care System for Humanoid RobotsYang Chang, Aoxing Li, Yuxuan Lin, Jianan Wang, Lizheng Liu, Yang Liu, Jing Liu, Liang Cao, Yan Wang, Zhongxue Gan, Wenqiang Zhang. 1-6 [doi]
- DAGait: Generalized Skeleton-Guided Data Alignment for Gait RecognitionZhengxian Wu, Chuanrui Zhang, Hangrui Xu, Peng Jiao, Haoqian Wang. 1-6 [doi]
- AnoCLIP: Text-Guided Zero-shot Anomaly Localization via Self-Supervised AdaptationHanqiu Deng, Zhaoxiang Zhang 0003, Jinan Bao, Xingyu Li. 1-6 [doi]
- FedAdamZO: a Zeroth-order Adaptive Momentum Method for Memory-efficient Fine-tuning of Federated Large Language ModelsBo Ma, Yongqiang Gao, Yongmei Liu. 1-6 [doi]
- OcSplats: Rendering Occluded Humans with Prior KnowledgeJie Zhang 0002, Qiongjie Cui, XuLei Yang, Na Zhao 0004. 1-6 [doi]
- Towards End-to-End Neuromorphic Voxel-based 3D Object Reconstruction Without Physical PriorsChuanzhi Xu, Langyi Chen, Haodong Chen, Vera Chung, Vincent Qu. 1-6 [doi]
- TCFI: Topology-Consistent Pruning with Fisher Information for Efficient Medical Image SegmentationYi Wang, Renda Han, Yihao Chen. 1-6 [doi]
- Open-Scene Understanding-oriented 3D Scene Graph GenerationYuansu Hao, Fei Yu, Yanhao Wang, Yuehua Li, Quan Deng, Yuan Yu, Chen Huang, Nan Che. 1-6 [doi]
- Boosting Road Event Detection with Adaptive Multi-Modal ModelsLinkai Liu, Xiaoyan Xiao, Yijian Yang, Yuchen Zhou 0002, Zipeng Guo, Chao Gou. 1-6 [doi]
- VividPose: Vividly 3D-driven Stable Pose Diffusion of High Facial FidelityQilin Wang, Zhengkai Jiang 0001, Chengming Xu 0001, Jiangning Zhang, Yabiao Wang, Xinyi Zhang, Yun Cao 0002, Weijian Cao, Chengjie Wang, Zhanxiong Wang, Yanwei Fu 0001. 1-6 [doi]
- Boosting the Transferability of Audio Adversarial Examples with Acoustic Representation OptimizationWeifei Jin, Junjie Su, Hejia Wang, Yulin Ye, Jie Hao 0001. 1-6 [doi]
- VLCO: A Dual-Optimization Framework for Precise Camouflaged Object Localization and SegmentationMaosheng Su, Shuo Wang, Zhichuan Wang, Jun Luo. 1-6 [doi]
- Unfolding Framework with Complex-Valued Deformable Attention for High-Quality Computer-Generated Hologram GenerationHaomiao Zhang, Zhangyuan Li, Yanling Piao, Zhi Li, Xiaodong Wang, Miao Cao, Xiongfei Su, Qiang Song, Xin Yuan. 1-6 [doi]
- MPCSFL: A Privacy-Preserving Split Federated Learning Framework in Edge NetworkJianfeng Guan, Haoyang Meng, Yizhong Hu, Pengcheng Wang, Kexian Liu. 1-6 [doi]
- Semantic Alignment and Hard Sample Retraining for Visible-Infrared Person Re-IdentificationJingchen Ni, Keyu Lyu, Yu Guo, Chun Yuan. 1-6 [doi]
- Local Model Trajectory Matching for Data Heterogeneity in Federated LearningMan Zhao, Tingting Leng, Jun Zhou. 1-6 [doi]
- Zero-Shot Speech Perception Decoding via Advancing Representation ConsistencyYi Xiao, Xuyi Qiao, Yu-Xuan Zhang, Xianchuan Yu. 1-6 [doi]
- AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-Speech SynthesisDan Luo, Chengyuan Ma, Weiqin Li, Jun Wang, Wei Chen 0071, Zhiyong Wu 0001. 1-6 [doi]
- GE-Talker: Generalizable and Efficient Neural Rendering for Talking Head GenerationZixuan Wang, Li Fang, Fei Hu, Long Ye. 1-6 [doi]
- FoodWeight1.4M: A Large-scale Multi-modal Dataset for Weight EstimationLu Yuan, Zhenbo Xu, Dehua Ma, Jinghan Yang, Liuyu Xiang, Huijia Wu, Zhaofeng He. 1-6 [doi]
- GameMLD: A Game-Sourced Motion-Language Dataset for Stylized Motion GenerationYiyu Fu, Ziming Cheng, Yihao Liao, Jiangfeiyang Wang, Ruomei Wang 0001, Guanghui Yue 0001, Chenlei Lv, Baoquan Zhao. 1-6 [doi]
- Dynamic Weighting Loss for Decision Boundary Adjustment based on Robust Distance in Adversarial TrainingYiqun Xu, Zhen Wei, Zhehao Li 0001, Xing Wei 0002, Yang Lu 0015. 1-6 [doi]
- Magnetic Framelet-Based Graph Contrastive Learning for Signed-Directed GraphYuting Chu, Yanfeng Sun, Fujiao Ju, Junbin Gao, Shaofan Wang, Baocai Yin. 1-6 [doi]
- Multimodal Conversational Emotion Analysis with Robustness to Incomplete Modality DetailsSidharth Anand, Chaitanya Sai Chandu Yendru, Sreyasee Das Bhattacharjee, Junsong Yuan 0001. 1-6 [doi]
- USGT: A Unified Syntax-Guided Transformer Framework for Sentiment Classification and Aspect Term ExtractionXiaohong Xiang, Zhe Zhang, Yi Zhou, Xin Deng. 1-6 [doi]
- Text-to-Image Diffusion Models are AI-Generated Image Quality ScorersXiangfei Sheng, Weidong Zou, Pengfei Chen 0003, Li Cai, Chao He, Leida Li. 1-6 [doi]
- Missing Pieces, Complete Picture: Navigating Micro-Video Popularity with Flexible Mixture of Modality ExpertsYang Liu 0245, Zhangtao Cheng, Bin Chen, Yan Liu, Xing He, Ting Zhong, Fan Zhou 0002. 1-6 [doi]
- PatchSegDet: Attack-Agnostic Detection of Physical Adversarial Patches in Face Recognition SystemsZhiqiang Shen, Qinfeng Li, Xuhong Zhang 0002, Yuxiang Cai, Xiaochu Chen, Ping An, Haiqin Weng, Yang Liu 0003. 1-6 [doi]
- SODMAMBA-DETR: A Small Object DETR Detector Based on a Mamba EncoderYiheng Sun, Xiaopeng Hu, Fan Wang, Xinrong Wu, Ying Zhou, Jie Zhao, Rongqi Zhu. 1-6 [doi]
- Zero-shot Face Editing via ID-Attribute Decoupled InversionYang Hou, Minggu Wang, Jianjun Zhao. 1-6 [doi]
- Guiding Yourself with Your Own Insights: Student-Driven Knowledge DistillationDacheng Qi, Huayu Zhang, Yufeng Wang 0004, Shuangkang Fang, Zehao Zhang, Zesheng Wang 0002, Wenrui Ding. 1-6 [doi]
- Unlocking Instance Semantic Awareness for Domain Adaptive Semantic SegmentationFan Li, Xuan Wang, Min Qi, Zhaoxiang Zhang 0002, Chengming Xu, Yuelei Xu. 1-6 [doi]
- Beyond Multimodal Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference OptimizationZhiyuan Zhao, Bin Wang 0065, Linke Ouyang, Xiaoyi Dong, Jiaqi Wang 0003, Conghui He. 1-6 [doi]
- Make Multi-source Task Greater Again: Adaptive Causal Diffusion StrategyZiyun Cai, Yawen Huang, Jie Song 0014, Chang-Hui Hu 0001, Tengfei Zhang 0001. 1-6 [doi]
- HeteroGNN: A Heterogeneous Stage Division Based GNN Training Framework to Maximize CPU-GPU ParallelismXiangrui Yang 0001, ZhiHao Zeng, Jiawei Yang, Yekang Zhan, Qiang Cao 0001, Jie Yao 0001. 1-6 [doi]
- Achieving Seamless Camouflage: Attention Fusion Diffusion Model for Image SynthesisHao Xi, Meiqin Liu, Zechen Yang, Ping Wei. 1-6 [doi]
- 2Scan: A Lightweight Dual-Temporal Constrained Scanpath Prediction Model for Omnidirectional ImagesNana Zhang, Qian Liu, Dandan Zhu 0001, Kun Zhu 0024, Xiongkuo Min, Guangtao Zhai. 1-6 [doi]
- Achieving Zero-Glance Unlearning with Data-Free Inversion and Selective Parameters SuppressionPuwei Lian, Xiao Ke, Zhou Tan, Jianping Cai, Ximeng Liu. 1-9 [doi]
- Mitigating Object Hallucination in Large Vision-Language Models via Visual Attention Direct Preference OptimizationYixiao He, Haifeng Sun 0001, Qi Qi 0001, Zirui Zhuang, Pengfei Ren 0001, Huazheng Wang, Yafeng Nan, Jing-Yu Wang 0001. 1-6 [doi]
- CLIP-based Robust Pedestrian Attribute Recognition via Attribute Localization and Data AugmentationYunpeng Zhou, Qiwen Liang, Xin Li, Jianping Ren, Liujinxiang Zhu, Shuhua Liu. 1-6 [doi]
- InterID: Improving Multi-ID Interaction for Personalized Image GenerationSiting Chen, Weijie Chen, Jiji Tang, Rongsheng Zhang, Xiaoshuai Sun. 1-6 [doi]
- Unsupervised Domain Adaptation for Fetal R-peak Detection at Trans-Pregnancy Stages based on Multiview MixingYiwei Lin, Yuying Bao, Tao Yu, Zhenqin Chen, Xu Cheng, Jinshan Xu. 1-6 [doi]
- TalkFashion: Intelligent Virtual Try-On Assistant Based on Multimodal Large Language ModelYujie Hu, Xuanyu Zhang, Weiqi Li, Jian Zhang. 1-6 [doi]
- Multi-branch Strong Perturbation Contrastive Learning for Semi-supervised Medical Image SegmentationFeng Xiao. 1-6 [doi]
- Frequency-guided Camouflaged Object Detection with Perceptual Enhancement and Dynamic BalanceYuetong Li, Yilin Zhao, Qing Zhang, Qiangqiang Zhou, Yanjiao Shi. 1-6 [doi]
- NLOSdiffuser: Generalized Steady-State Non-Line-of-sight Imaging toward Indoor ScenariosXian Gao, Luyang Wang, Jiacheng Ruan, Yuyang Zhang, Zongyun Zhang, Ting Liu 0016, Yuzhuo Fu. 1-6 [doi]
- VGAT: A Cancer Survival Analysis Framework Transitioning from Generative Visual Question Answering to Genomic ReconstructionZizhi Chen, Minghao Han, Xukun Zhang, Shuwei Ma, Tao Liu 0050, Xing Wei, Lihua Zhang 0002. 1-6 [doi]
- Exploring Part-Informed Visual-Language Learning for Person Re-IdentificationYin Lin, Yehansen Chen, Baocai Yin, Jinshui Hu, Bing Yin, Cong Liu 0006, Zengfu Wang. 1-6 [doi]
- LM-net: Integrating Linear Temporal Features and Multi-Scale Attention for Crop Yield EstimationHu Li, Long Long, Lin Cheng, Zichen Liu, Jing Wang, Yucheng Zhang, Feng Dai. 1-6 [doi]
- Teochew-Wild: The First In-the-wild Teochew Dataset with Orthographic AnnotationsLinrong Pan, Chenglong Jiang, Gaoze Hou, Ying Gao 0004. 1-6 [doi]
- EmoHead: Emotional Talking Head via Manipulating Semantic Expression ParametersXuli Shen, Hua Cai, Dingding Yu, Weilin Shen, Qing Xu 0017, Xiangyang Xue 0001. 1-6 [doi]
- Visual Relationships Are Different: Appropriate Way To Predict Each RelationshipZhenhua Lei, Xuemei Xie. 1-6 [doi]
- Perception-Oriented Latent Coding for High-Performance Compressed Domain Semantic InferenceXu Zhang 0006, Ming Lu, Yan Chen, Zhan Ma 0001. 1-6 [doi]
- Compositional Text-Modality Completion Model for Partially Relevant Video RetrievalYi Pan, Yujia Zhang, Xiaoguang Zhao. 1-6 [doi]
- Scene Text Image Super-Resolution with Visual Text Cues Transfer and EnhancementMingjun Li, Zeming Zhuang, Feng Su. 1-6 [doi]
- Controllable Expressive 3D Facial Animation via Diffusion in a Unified Multimodal SpaceKangwei Liu, Junwu Liu, Xiaowei Yi, Jinlin Guo, Yun Cao. 1-6 [doi]
- Complementary Multi-dimensional Variance Attention Learning for 3D Human Mesh Reconstruction from VideosTuo Xiong, Suping Wu, Xiang Zhang, Ruijie Peng, Bing Wang, Xitie Zhang, Zhijian Duan 0003. 1-6 [doi]
- Dual-Branch Attention Network for Salient Object Detection in Optical Remote Sensing ImagesYaqian Wang, Chunyang Ma, Yumei Tong, Liejun Wang, Panpan Zheng. 1-6 [doi]
- Spatial-Spectral Aware Learning with Deformable Affinity for Weakly Supervised Semantic SegmentationYuzhen Zhou, Pan Gao, Li Yu. 1-6 [doi]
- FAST: Facial Avatar Animation via Spatial-Temporal AggregationGangyi Hong, Ming Lu, Senmao Tian, Xiangyi Chen, Hui Zhang. 1-6 [doi]
- Content-Adaptive Motion Compensated Temporal Filter for Versatile Video CodingYunrui Jian, Yi Xue, Yue Huang, Xueli Cheng, Weilun Feng, Zhenan Lin, Chao Zhou. 1-6 [doi]
- Spatial-Temporal Prior Knowledge Guidance for Long-term Action AnticipationYiming Li 0008, Miao Ji, Sisi You, Bing-Kun Bao. 1-6 [doi]
- Scanpath Prediction via Utilizing Peripheral Information of the Human Visual SystemKepei Zhang, Ge Tong, Xuetao Zhang. 1-6 [doi]
- GraphDEH: Graph Diffusion Enhanced Hypergraph Method for Class-Imbalanced Node ClassificationLiu Yang, Mengni Chen, Tingxuan Chen, Jinqi Hu, Zidong Wang. 1-6 [doi]
- 4: A Two-Stage Framework for Detecting and Grounding Multi-Modal Media ManipulationJunjie Wu 0005, Yumeng Fu, Nan Yu, Chen Gong 0004, Guohong Fu. 1-6 [doi]
- Large Language Models Meet Contrastive Learning: Zero-Shot Emotion Recognition Across LanguagesHeqing Zou, Fengmao Lv, Desheng Zheng, Eng Siong Chng, Deepu Rajan. 1-6 [doi]
- CCUP: A Controllable Synthetic Data Generation Pipeline for Pretraining Cloth-Changing Person Re-Identification ModelsYujian Zhao, Chengru Wu, Yinong Xu, Xuanzheng Du, Ruiyu Li, Guanglin Niu. 1-6 [doi]
- DRTNet: Diffusion Reconstruction Texture Network for AI-generated Image DetectionQian Yao, Jun-Jie Huang, Yongjun Wang, Zihan Chen. 1-6 [doi]
- AVENet: Disentangling Features by Approximating Average Features for Voice ConversionWenyu Wang, Yiquan Zhou, Jihua Zhu, Hongwu Ding, Jiacheng Xu, Shihao Li. 1-6 [doi]
- Semantic Communication Using Intent-guided Coarse- and Fine-grained Codec with Pre-trained Diffusion ModelsRui Tang, Dahua Gao, Minxi Yang. 1-6 [doi]
- AMS-Counter: Text-Guided Zero-shot Object Counting via Adaptive Multi-view Similarity-mapCheng Qian, Jiwu Cao, Ying Mao, Kai Liu, Peng Zhu, Jun Sang. 1-6 [doi]
- CASA: Class-Agnostic Shared Attributes in Vision-Language Models for Efficient Incremental Object DetectionMingyi Guo, Yuyang Liu, Zhiyuan Yan, Zongying Lin, Peixi Peng, Yonghong Tian 0001. 1-6 [doi]
- VADMamba: Exploring State Space Models for Fast Video Anomaly DetectionJiahao Lyu 0001, Minghua Zhao, Jing Hu 0005, Xuewen Huang, Yifei Chen 0006, Shuangli Du. 1-6 [doi]
- Soften the Mask: Adaptive Temporal Soft Mask for Efficient Dynamic Facial Expression RecognitionMengzhu Li, Quanxing Zha, Hongjun Wu. 1-6 [doi]
- WCG-Net: Warping Consistency Compensation Guided Multi-Feature Fusion For Stereo MatchingYan Hong, Chao He, Zhibo Rao, Zhen Chen 0004, Nan Li, Congxuan Zhang. 1-6 [doi]
- CDIQA: Collaborative Learning with Diffusion Extension for Semi-supervised Blind Image Quality AssessmentXudong Wang. 1-6 [doi]
- Toward Uncontrolled Palmprint Recognition via Multi-View Block Diagonal Structure LearningShuping Zhao, Chongli Zhuang, Li Yang, Yanling Zhong, Yanping Li, Yonghan Chen. 1-6 [doi]
- TSTMotion: Training-free Scene-aware Text-to-motion GenerationZiyan Guo, Haoxuan Qu, Hossein Rahmani 0001, De Wen Soh, Ping Hu 0001, Qiuhong Ke, Jun Liu 0036. 1-6 [doi]
- Quality Control For HEVC: A Deep Reinforcement Learning ApproachYichen Guo, Rui Ding, Mai Xu, Lai Jiang 0004, Shengxi Li, Xin Deng 0002. 1-6 [doi]
- Diffusion-Based Hierarchical Image SteganographyYoumin Xu, Xuanyu Zhang, Xiandong Meng, Chong Mou, Jian Zhang. 1-6 [doi]
- 2: Alternate Reconstruction and Recognition for Non-Line-of-Sight UnderstandingYi Wang, Ruixu Geng, Jiarui Zhang, Xiaolong Du, Yan Chen, Yang Hu. 1-6 [doi]
- HSS-IAD: A Heterogeneous Same-Sort Industrial Anomaly Detection DatasetQishan Wang 0002, Shuyong Gao, Junjie Hu, Jiawen Yu, Xuan Tong, You Li, Wenqiang Zhang. 1-6 [doi]
- Generative Adversarial Network-based Image and Tabular Data Generation with Differential PrivacyJiming Yang, Xu Wang 0053, Yi Jin 0001, Yidong Li, Hui Yu 0001. 1-6 [doi]
- Dancing with Noise: Advancing Generative Speech Enhancement with Distribution AugmentationYue Lei, Siqi Yang, Wenxin Tai, Xueting Liu 0005, Ting Zhong, Fan Zhou 0002. 1-6 [doi]
- Multi-soft-label Guided Supervised Contrastive Learning for Gait Emotion RecognitionChengju Zhou, Mengxin Xu, Xiaotong Fan, Liangyu Lu, Jiahui Pan, Lewei He. 1-6 [doi]
- A Multi-stage and Multi-target Knowledge Distillation Framework for Multimodal Conversational Emotion RecognitionTaiyu Niu, Geng Tu, Hui Wang 0030, Bing Qin 0001, Ruifeng Xu 0001. 1-6 [doi]
- Adversarial Attacks and Robust Defenses in Speaker Embedding based Zero-Shot Text-to-Speech SystemZe Li, Yao Shi, Yunfei Xu, Ming Li. 1-6 [doi]
- OptiDiff: Unsupervised Deep-Sea Image Enhancement via Optical Priors Guided Stable DiffusionWenhui Wu, Yuemiao Wang, Hua Li, Yuanhao Gong. 1-6 [doi]
- LKPM: Large Kernel Point Mamba for 3D Point CloudsSong Zhao, Shuhua Wang, Xiaobing Zhou. 1-6 [doi]
- Leveraging Hierarchical Spatio-Temporal Distribution Prompt for Zero-Shot Species RecognitionTie Liu, Yue Yang, Peng Chen, Qijun Zhao. 1-6 [doi]
- A GAN Framework for Asymmetric Embedding Costs Learning in JPEG SteganographyBohong Li, Weiqi Luo 0001, Peijia Zheng, Shunquan Tan, Jiwu Huang. 1-6 [doi]
- Adaptive Mobile Agent for Dynamic InteractionsYanda Li, Chi Zhang 0007, Wenjia Jiang, Wanqi Yang, Bin Fu, Pei Cheng, Xin Chen 0040, Meng Fang, Ling Chen 0006, Yunchao Wei. 1-6 [doi]
- Harmony in Chaos: A Progressive Noise-Resilient Network for Robust Fake News Video DetectionXiangzheng Kong, Zhi Zeng, Chenxi Zhu, Zihan Ma 0001, Minnan Luo. 1-6 [doi]
- Center-Oriented Prototype Contrastive ClusteringShihao Dong, Xiaotong Zhou, Yuhui Zheng, Huiying Xu, Xinzhong Zhu. 1-6 [doi]
- Exploring Active Learning for Label-Efficient Training of Semantic Neural Radiance FieldYuzhe Zhu, Lile Cai, Kangkang Lu 0001, Fayao Liu, XuLei Yang. 1-6 [doi]
- DreamScape: 3D Scene Creation via Gaussian Splatting joint Correlation ModelingYueming Zhao, Xuening Yuan, Hongyu Yang, Di Huang 0001. 1-6 [doi]
- Towards Practical Real-Time Low-Latency Music Source SeparationJunyu Wu, Jie Liu 0040, Tianrui Pan, Jie Tang 0006, Gangshan Wu. 1-6 [doi]