- Dynamic Speech Generation to Enhance Intelligibility in Noisy Environments. Olympia Simantiraki, Maria E. Markaki, Yannis Pantazis. 1-5 [doi]
- Username-Password Models Beyond Traditional Password Guessability Assessment. Jiahong Yang, Wenting Li, Haibo Cheng, Ping Wang. 1-5 [doi]
- Efficient Defocus Deblurring Networks based on Diffusion Models. Kang Chen, Yuanjie Liu. 1-5 [doi]
- Role of the Pretraining and the Adaptation data sizes for low-resource real-time MRI video segmentation. Masoud Thajudeen Tholan, Vinayaka Hegde, Chetan Sharma, Prasanta Kumar Ghosh. 1-5 [doi]
- A2GP-SF: Enhancing Few-shot Class Incremental Learning via Attribute Generative Prompting and Adaptive Sharpness Flattening. Zhiming Chen, Desen Wang, Sisi Fu, Congcong Wen, Hui Lin, Bingzhi Chen. 1-5 [doi]
- Recursive Feature Learning from Pre-Trained Models for Spoofing Speech Detection. Yu Guan, Yang Ai, Zuoliang Li, Shengyu Peng, Wu Guo. 1-5 [doi]
- Reliable Imputed-Sample Assisted Vertical Federated Learning. Yaopei Zeng, Lei Liu 0049, Shaoguo Liu, Hongjian Dou, Baoyuan Wu, Li Liu 0036. 1-5 [doi]
- Self-Supervised Learning and Image-Prompt Fusion for AIGC Image Quality Assessment. Yan Zhao, Qingbing Sang, Zhaohong Deng, Xiaojun Wu 0001. 1-5 [doi]
- SLiCK: Exploiting Subsequences for Length-Constrained Keyword Spotting. Kumari Nishu, Minsik Cho, Devang Naik. 1-5 [doi]
- HATTM: A Novel Hybrid Attention Model for Ethereum Phishing Scams Detection. Bo Cui, Zhenyu Zhang, Wenhan Hou. 1-5 [doi]
- Similarity-based Accent Recognition with Continuous and Discrete Self-supervised Speech Representations. Jun-You Wang, Sheng Li 0010, Li-An Lu, Sydney Chia-Chun Kao, Jyh-Shing Roger Jang. 1-5 [doi]
- Robust Target Speaker Direction of Arrival Estimation. Zixuan Li, Shulin He, Xueliang Zhang. 1-5 [doi]
- Multi-Context Temporal Consistent Modeling for Referring Video Object Segmentation. Sun-Hyuk Choi, Hayoung Jo, Seong-Whan Lee. 1-5 [doi]
- Dual-Path Model for Pulmonary Artery Segmentation. Lu Shen, Yingwen Chen 0001, Changjian Wang, Zhengbo Zhang, Shuyi Zhou. 1-5 [doi]
- Mixed-Precision Graph Neural Quantization for Low Bit Large Language Models. Wanlong Liu, Yichen Xiao, Dingyi Zeng, Hongyang Zhao, Wenyu Chen 0001, Malu Zhang. 1-5 [doi]
- LNeRV: Learnable Hierarchical Encoding Improve Neural Representation Video Codec. Jiahong Chen, Xiang Liu, Bin Chen 0011, Baoyi An, Tao Dai 0001, Shu-Tao Xia. 1-5 [doi]
- META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR. Jinhan Wang, Weiqing Wang, Kunal Dhawan, Taejin Park, Myungjong Kim, Ivan Medennikov, He Huang, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg. 1-5 [doi]
- MVCBRec: Multi-View Contrastive Learning for Bundle Recommendation. Mengmeng Li, Jinlong Tian, Yongqiang Zhao, Hongmei Li, Xudong Fang. 1-5 [doi]
- Predicting local fMRI activations from EEG: a Feasibility Study Using Both Classical and Modern Machine Learning Pipelines. Tomer Amit, Taly Markovits, Guy Gurevitch, Talma Hendler, Lior Wolf. 1-5 [doi]
- Learning Rate Optimization for Deep Neural Networks Using Lipschitz Bandits. Padma Priyanka, Sheetal Kalyani, Avhishek Chatterjee. 1-5 [doi]
- A Distillation-based Future-aware Graph Neural Network for Stock Trend Prediction. ZhiPeng Liu, Peibo Duan, Mingyang Geng, Bin Zhang 0001. 1-5 [doi]
- Non-Autoregressive Image Captioning with Multi-Label Classification and Self-Critical Sequence Training. Yuanqiu Liu, Hong Yu 0005, Hui Li, Xiaotong Zhang 0003, Xin Han, Han Liu 0008. 1-5 [doi]
- Leveraging IPA and Articulatory Features as Effective Inductive Biases for Multilingual ASR Training. Jaeyoung Lee, Masato Mimura, Tatsuya Kawahara. 1-5 [doi]
- ConSinger: Efficient High-Fidelity Singing Voice Generation with Minimal Steps. Yulin Song, Guorui Sang, Jing Yu 0016, Chuangbai Xiao. 1-5 [doi]
- SOA: A Sparsity-Oriented Activation on Sub-layers of FFN of Transformers. Yulong Meng, Yuan Li, Binhan Chen, Yi Kang. 1-5 [doi]
- DCFormer: Divide-and-Conquer in 3D Human Pose Estimation Tasks. Tianyi Ma, Muqing Wu, Zijian Zhang. 1-5 [doi]
- The USTC System for EEG-Music Emotion Recognition Challenge. Jiaxin Chen, Yiming Wang, Yin-Long Liu, Rui Feng, Jiahong Yuan, Zhen-Hua Ling. 1-2 [doi]
- Feature Refinement Decomposition and Relation Preference Enhancement for Remote Sensing Change Detection. Wenqi Zheng, Jianing Chen, Junze Yang, Chuhao Chen 0002, Wei Li 0109, Rahul Yadav, Xiangxu Meng. 1-5 [doi]
- Accompaniment Prompt Adherence: A measure for evaluating music accompaniment systems. Maarten Grachten, Javier Nistal. 1-5 [doi]
- Quickest Change Detection of Unknown Mean-Shifts using the James-Stein Estimator. Topi Halme, Venugopal V. Veeravalli, Visa Koivunen. 1-5 [doi]
- AP-Net: Semi-Supervised Ultrasound Cardiac Segmentation Using Enhanced Anatomical Prior. Yuhuan Lu, Jintang Li, Jianxin Lin, Ying Yuan, Jagath C. Rajapakse, Ningbo Zhu, Chunlian Wang, Kenli Li 0001. 1-5 [doi]
- Class Relevance Learning for Out-of-Distribution Detection. Liguang Zhou, Butian Xiong, Tin Lun Lam, Yangsheng Xu. 1-5 [doi]
- MutualForce: Mutual-Aware Enhancement for 4D Radar-LiDAR 3D Object Detection. Xiangyuan Peng, Huawei Sun, Kay Bierzynski, Anton Fischbacher, Lorenzo Servadei, Robert Wille. 1-5 [doi]
- Unified Arbitrary-Time Video Frame Interpolation and Prediction. Xin Jin, Longhai Wu, Jie Chen, Ilhyun Cho, Cheul-Hee Hahm. 1-5 [doi]
- ACRL-10K: A Dataset for Air Conditioner Refrigerant Leak Smoke Detection. Yuhan Jiang, Zhongyuan Wang 0001, Jie Hua 0005, Jinbi Liang, Zengmin Xu. 1-5 [doi]
- SABiT-MNet: Scale-Adaptive Autoencoder with BiT-M Model for Identifying AMD Grades. Niveen Nasr El-Den, Mohamed Elsharkawy, Mohammed Ghazal, Ali Mahmoud 0001, Harpal Sandhu, Hani Mahdi 0001, Ayman El-Baz. 1-5 [doi]
- TD-RD: A Top-Down Benchmark with Real-Time Framework for Road Damage Detection. Xi Xiao, Zhengji Li, Wentao Wang, Jiacheng Xie, Houjie Lin, Swalpa Kumar Roy, Tianyang Wang, Min Xu. 1-5 [doi]
- Latent Watermarking of Audio Generative Models. Robin San Roman, Pierre Fernandez, Antoine Deleforge, Yossi Adi, Romain Serizel. 1-5 [doi]
- On Decentralized Learning with Stochastic Subspace Descent. Shivangi Dubey Sharma, Pranay Sharma, Ketan Rajawat. 1-5 [doi]
- Feature Disentangling Dual-stream Network for User Bias Alleviation in Social Media Prediction. Wenhao Hu, Weilong Chen, WeiMin Yuan, Xiaolu Chen, Han Yang, Yanru Zhang, Zhu Han 0001. 1-5 [doi]
- Track-MDP: Reinforcement Learning for Target Tracking with Controlled Sensing. Adarsh M. Subramaniam, Argyrios Gerogiannis, James Zachary Hare, Venugopal V. Veeravalli. 1-5 [doi]
- Exploring Group Theory for Optimal Cognitive Radar Waveform Design. Jonathan Monsalve, Kumar Vijay Mishra, A. Robert Calderbank. 1-5 [doi]
- IPP-Net: A Generalizable Deep Neural Network Model for Indoor Pathloss Radio Map Prediction. Bin Feng, Meng Zheng 0001, Wei Liang 0001, Lei Zhang. 1-2 [doi]
- KARLM: Enhancing LLM-based Recommendation Systems with Knowledge Bases. Ze Song, Dehong Chen, Xiaoyi Shen, Xiangyu Zhou, Ji Qi, Yi Zhou. 1-5 [doi]
- AuscMLLM: Bridging Classification and Reasoning in Heart Sound Analysis with a Multimodal Large Language Model. Zihan Zhao, Pingjie Wang, Liudan Zhao, Ya Zhang 0002, Kun Sun, Xin Sun, Xin Zhou, Yanfeng Wang 0001, Yu Wang 0027. 1-5 [doi]
- Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference. Shuqi Dai, Yunyun Wang, Roger B. Dannenberg, Zeyu Jin. 1-5 [doi]
- Improving Continuous Sign Language Recognition via Cross-Frame Interactions in Expanded Contextual Spaces. Yiheng Yu, Sheng Liu, Yuan Feng, Min Xu, Zhelun Jin, Xuhua Yang. 1-5 [doi]
- KABON: Knowledge Aggregation with Vision-Language Model for Black-Box Open-Set Domain Adaptation. Zhixin Zeng, Yusen Zhang, Ji Wang 0001. 1-5 [doi]
- FedTG: Text-guided Federated Domain Generalization. Yiming Chen, Nan He, Lifeng Sun. 1-5 [doi]
- Credible and Detailed 3D Face Reconstruction in Large Pose. Xinyu Li, Xitie Zhang, Suping Wu, Ruijie Peng, Kehua Ma, Xiang Zhang. 1-5 [doi]
- Sampling Nonsmooth Log-Concave Densities: A Comparative Study of Primal-Dual Based Proposal Distributions. Juliette Chevallier, Gersende Fort. 1-5 [doi]
- Epigraph Based Multilevel Optimization (EMO) for Enhancing Chain-of-Thought Reasoning Capabilities. Songtao Lu, Yanna Ding, Lior Horesh, Jianxi Gao, Malik Magdon-Ismail. 1-5 [doi]
- Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech. Shuwei He, Rui Liu 0008. 1-5 [doi]
- Keeping the Balance: Anomaly Score Calculation for Domain Generalization. Kevin Wilkinghoff, Haici Yang, Janek Ebbers, François G. Germain, Gordon Wichern, Jonathan Le Roux. 1-5 [doi]
- Slungt: Even Faster Spoken Language Understanding with N-Grams and Tries. Daniel Bermuth, Wolfgang Reif. 1-5 [doi]
- ILDiff: Generate Transparent Animated Stickers by Implicit Layout Distillation. Ting Zhang, Zhiqiang Yuan, Yeshuang Zhu, Jie Zhou 0016, Jinchao Zhang. 1-5 [doi]
- SPRGAN: Streamlined Progressive Refinement for Adversarial Point Cloud Video Upsampling. Zhiyong Zhang, Ruyu Liu, Chaochao Wang, Xianchao Zhang 0003, Jianhua Zhang, Xiufeng Liu. 1-5 [doi]
- SYKI-SVC: Advancing Singing Voice Conversion with Post-Processing Innovations and an Open-Source Professional Testset. Yiquan Zhou, Wenyu Wang, Hongwu Ding, Jiacheng Xu, Jihua Zhu, Xin Gao, Shihao Li. 1-5 [doi]
- Point-UMAE: Unet-like Masked Autoencoders for Point Cloud Self-supervised Learning. Hongliang Zeng, Ping Zhang, Fang Li, Tingyu Ye, Jiahua Wang, Xianbo Yang. 1-5 [doi]
- Improving Cross-Lingual Phonetic Representation of Low-Resource Languages Through Language Similarity Analysis. Minu Kim, Kangwook Jang, Hoirin Kim. 1-5 [doi]
- Investigating Numerical Translation with Large Language Models. Wei Tang 0013, Jiawei Yu, Yuang Li, Yanqing Zhao, Weidong Zhang, Wei Feng, Min Zhang, Hao Yang. 1-5 [doi]
- RF Distillation Diffusion Model: An Efficient RFF Data Augmentation Method. Kai Lin, Caidan Zhao, Jingqian Chen, Liang Xiao 0003. 1-5 [doi]
- Sequential Diffusion-Guided Deep Image Prior for Medical Image Reconstruction. Shijun Liang, Ismail Alkhouri, Qing Qu 0001, Rongrong Wang, Saiprasad Ravishankar. 1-5 [doi]
- Frequency Agnostic Tissue Characterization in Ultrasound Imaging using Backscattered Signal Statistics. Abhinav Gadge, Abhishek Kumar, Debdoot Sheet. 1-5 [doi]
- 2T: Adversarial Distortion Domain Translation for Robust Watermarking against Non-differentiable Distortions. Chengxin Zhao, Hefei Ling, Jiazhong Chen, Han Fang, Zongyi Li, Sijing Xie. 1-5 [doi]
- Delving Into Coarse-Fine Feature Interaction Alignment for UAV Object Detection. Yanchao Bi, Yang Ning, Xiushan Nie. 1-5 [doi]
- One Shot is Enough for Sequential Infrared Small Target Segmentation. Bingbing Dan, Meihui Li, Tao Tang 0005, Jing Zhang. 1-5 [doi]
- RQTalker: Speech-driven 3D Facial Animation via Region-aware Vector Quantization. YingYing Fan, Kaisiyuan Wang, Hang Zhou 0009, Shengyi He, Yu Wu 0001. 1-5 [doi]
- Automotive Radar Target Detection in Widely Separated and Distributed Aperture Radar Systems. Moein Ahmadi, Björn E. Ottersten, Bhavani Shankar M. R., Thomas Stifter. 1-5 [doi]
- CPT-Boosted Wav2vec2.0: Towards Noise Robust Speech Recognition for Classroom Environments. Ahmed Adel Attia, Dorottya Demszky, Tolúlopé Ògúnremí, Jing Liu, Carol Y. Espy-Wilson. 1-5 [doi]
- GEGA: Graph Convolutional Networks and Evidence Retrieval Guided Attention for Enhanced Document-level Relation Extraction. Yanxu Mao, Xiaohui Chen, Peipei Liu, Tiehan Cui, Zuhui Yue, Zheng Li. 1-5 [doi]
- LLMProto: A Hardware-Efficient Finetuning Model for Few-Shot Relation Extraction with Large Language Model. Longyi Ye, Huaping Zhang. 1-5 [doi]
- AUIED3K: A New Andaman Underwater Image Enhancement Dataset for Deep Learning-Driven Image Enhancement with Minimum Loss Dehazing. Praveen Saini, Navjot Singh, Anshu S. Anand. 1-5 [doi]
- MixHD: A Method for Detecting Hallucinations Based on the Internal State and Output Probability of Large Language Models. Chuang Li, Bingnan Xing, Dongdong Huo, Qihui Zhou, Zhen Xu 0009, Yu Wang 0134. 1-5 [doi]
- Larger Language Models Don't Care How You Think: Why Chain-of-Thought Prompting Fails in Subjective Tasks. Georgios Chochlakis, Niyantha Maruthu Pandiyan, Kristina Lerman, Shrikanth Narayanan. 1-5 [doi]
- SepMamba: State-Space Models for Speaker Separation Using Mamba. Thor Højhus Avenstrup, Boldizsár Elek, István László Mádi, András Bence Schin, Morten Mørup, Bjørn Sand Jensen, Kenny Olsen. 1-5 [doi]
- XAI for Gender Representation in Media Analysis. François Buet, Camille Guinaudeau, Cyril Grouin, Sahar Ghannay, Shin'ichi Satoh. 1-5 [doi]
- DoA-Aided MMSE Channel Estimation for Wireless Communication Systems. Franz Weißer, Nurettin Turan, Wolfgang Utschick. 1-5 [doi]
- Spatial Frequency-Aware Self-Distillation for Weakly-Supervised Semantic Segmentation. Jingyuan Fang, Yang Ning, Xiushan Nie. 1-5 [doi]
- Radar2ECG: Multi-Scale Bottleneck Fusion and Cross-modal Semantic Distillation for Conditional Electrocardiogram Generation from Radar Heart Sound. Jinye Li, Aidong Men, Yang Liu 0105, Pengda Han, Qingchao Chen. 1-5 [doi]
- Audio-Visual Representation Learning For Lip-Sync Estimation Through Ranking Augmented Contrastive Training. Bhavin Jawade, Ravi Teja Gadde, Christophe Bejjani, Yinghong Lan. 1-5 [doi]
- MotionComposer: Enhancing Rhythmic Music Generation with Adaptive Retrieval Reference. Jinting Wang, Li Liu 0036, Jun Wang. 1-5 [doi]
- Infrared and Visible Image Fusion with Hierarchical Human Perception. Guang Yang, Jie Li 0001, Xin Liu, Zhusi Zhong, Xinbo Gao 0001. 1-5 [doi]
- Trustworthy Recommendation for Consumer Electronics Using Hypernetworks. Yicheng Di, Song Shen, Jiayu Bao, Yuan Liu. 1-5 [doi]
- Cooperative Neural Radiance Field for Dynamic Scene Deblurring. Dong Liu, Zhiyong Wang, Linlin Guo. 1-5 [doi]
- Hard Sample Aware Robust Contrastive Learning for Multi-View Clustering. Yuanzhe Cai, Zhikui Chen, Jing Gao 0007, Peng Li 0027, Jianing Zhang. 1-5 [doi]
- Boosting the Transferability of Adversarial Examples via Local Mixup and Adaptive Step Size. Junlin Liu, Chenyu Zhang, Xinchen Lyu. 1-5 [doi]
- Enhancing Whisper's Accuracy and Speed for Indian Languages through Prompt-Tuning and Tokenization. Kumud Tripathi, Raj Gothi, Pankaj Wasnik. 1-5 [doi]
- Dynamic Soft Contrastive Learning for Time Series Anomaly Detection. Yifan Song, Yu Liu, Shaolong Shu. 1-5 [doi]
- Vision Transformers for X-ray Diffraction Patterns Analysis. Titouan Simonnet, Mame Diarra Fall, Sylvain Grangeon, Bruno Galerne. 1-5 [doi]
- Optimization of Beamwidth in Automotive Radars Based on Statistics of Street Geometry. Mohammad Taha Shah, Gourab Ghatak, Shobha Sundar Ram. 1-5 [doi]
- Post-Hoc Adversarial Stickers Against Micro-Expression Leakage. Pei-Sze Tan, Sailaja Rajanala, Yee Fan Tan, Arghya Pal, Chun-Ling Tan, Raphaël C.-W. Phan, Huey Fang Ong. 1-5 [doi]
- Critically-Damped Third-Order Langevin Dynamics. Benjamin Sterling, Mónica F. Bugallo. 1-5 [doi]
- Alignment-Free Training for Transducer-based Multi-Talker ASR. Takafumi Moriya, Shota Horiguchi, Marc Delcroix, Ryo Masumura, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, Masato Mimura. 1-5 [doi]
- V-Phanton: Voltage-Based Physically-Triggered Backdoor Attack Against Facial Recognition. Yan Jiang, Ruishan Li, Yushi Cheng, Xiaoyu Ji 0001, Wenyuan Xu 0001. 1-5 [doi]
- SS-BRPE: Self-Supervised Blind Room Parameter Estimation Using Attention Mechanisms. Chunxi Wang, Maoshen Jia, Meiran Li, Changchun Bao, Wenyu Jin 0003. 1-5 [doi]
- Generative Model based Optical Response Prediction for Plasmonic Sensing. Anish Datta, Soma Bandyopadhyay, Subhasri Chatterjee, Tapas Chakravarty, Arpan Pal 0001. 1-5 [doi]
- Adaptive Receptive Field Convolution for Top-view Fisheye Images Segmentation. Wenwei Lin, Gang Chen 0023, Changcai Li. 1-5 [doi]
- BS-Breath: Respiration Sensing with Cell-free Massive MIMO. Haoqiu Xiong, Robbert Beerten, Zhuangzhuang Cui, Yang Miao, Sofie Pollin. 1-5 [doi]
- Fast DPCNs for Feature Extraction without Labels. Wenqian Xue, Chi Ding, José C. Príncipe. 1-5 [doi]
- Adaptive Lossless Compression for Genomics Data by Multiple (s, k)-mer Encoding and XLSTM. Hui Sun 0002, Yanfeng Ding, Liping Yi, Huidong Ma, Haonan Xie, Gang Wang 0001, Xiaoguang Liu 0001. 1-5 [doi]
- Subpart Suppression Network for Few-Shot Object Counting. Lanxin Liu, Xinyan Liu, Guorong Li. 1-5 [doi]
- InstAD: Instance-aware Segmentation Framework for Zero-shot Multi-instance Anomaly Detection. Cheng-Yu Ho, Shang-Hong Lai. 1-5 [doi]
- EFL-PEFT: A communication Efficient Federated Learning framework using PEFT sparsification for ASR. Mohamed Nabih Ali, Daniele Falavigna, Alessio Brutti. 1-5 [doi]
- Shifting Spotlight for Co-supervision: A Simple yet Efficient Single-branch Network to See Through Camouflage. Yang Hu, Jinxia Zhang, Kaihua Zhang, Yin Yuan, Jiale Huang, Zechao Zhan, Xin Wang. 1-5 [doi]
- Linking Known and Unknown: Generalized Cross-Instance Feature Helps Category Discovery. Yuanhao Zuo, Yichao Liu, Xiwei Liu, Tingzhang Luo. 1-5 [doi]
- Multi-scale Spectral Mixture Neural Operator. Fengrui Jing, Hongzhen Ding, TaoSong. 1-5 [doi]
- Enhancing Autonomous Vehicle Planning With a Robust Fault-Tolerant Mechanism for Action-Induced Agent Detection. Zheng Fu, Hezhe Lin, Kangan Qian, Tuopu Wen, Hao Gao 0005, Zhihua Zhong, Diange Yang. 1-5 [doi]
- DICS: Find Domain-Invariant and Class-Specific Features for Out-of-Distribution Generalization. Qiaowei Miao, Yawei Luo, Yi Yang 0001. 1-5 [doi]
- Domain Connection based Unsupervised Domain Adaptation for Semantic Segmentation. Chunze Yang, Xiaodong Zhang 0036, PeiYuan Tang, Haoran Yuan, Haojie Xin, Zijiang James Yang. 1-5 [doi]
- Robust Identifiability for Symbolic Recovery of Differential Equations. Hillary Hauger, Philipp Scholl 0003, Gitta Kutyniok. 1-5 [doi]
- Boosting Text-To-Image Generation via Multilingual Prompting in Large Multimodal Models. Yongyu Mu, Hengyu Li, Junxin Wang, Xiaoxuan Zhou, Chenglong Wang, Yingfeng Luo, Qiaozhi He, Tong Xiao, Guocheng Chen, Jingbo Zhu. 1-5 [doi]
- Enhancing Multimodal Analogical Reasoning Through Triplet Interaction. Hua Cai, Xuli Shen, Shuaishuai Li, Weilin Shen, Qing Xu 0017. 1-5 [doi]
- Preference Alignment Improves Language Model-Based TTS. Jinchuan Tian, Chunlei Zhang, Jiatong Shi, Hao Zhang 0112, Jianwei Yu 0001, Shinji Watanabe 0001, Dong Yu 0001. 1-5 [doi]
- Efficient Visual Storytelling through Descriptive Words Distillation and Dynamic Decoding. Zixuan Xiong, Hai Lin, Lichen Bai, Yinghui Li, Hai-Tao Zheng 0002, Hong-Gee Kim. 1-5 [doi]
- Boosting Large Language Model for Speech Synthesis: An Empirical Study. Hongkun Hao, Long Zhou, Shujie Liu 0001, Jinyu Li 0001, Shujie Hu, Rui Wang 0073, Furu Wei. 1-5 [doi]
- Reducing the Sensitivity of Neural Physics Simulators to Mesh Topology via Pretraining. Nathan Vaska, Justin Goodwin, Robin Walters 0001, Rajmonda Sulo Caceres. 1-5 [doi]
- What Does an Audio Deepfake Detector Focus on? A Study in the Time Domain. Petr Grinberg, Ankur Kumar, Surya Koppisetti, Gaurav Bharaj. 1-5 [doi]
- Linguistics-Vision Monotonic Consistent Network for Sign Language Production. Xu Wang, Shengeng Tang, Peipei Song, Shuo Wang 0008, Dan Guo 0001, Richang Hong. 1-5 [doi]
- Support Recovery in 1-Bit Compressed Sensing with Burst Sparse Noise. Saikiran Bulusu, Venkata Gandikota, Pramod K. Varshney. 1-5 [doi]
- Monte Carlo Score Matching for Image Generation. Nishanth Shetty, Chandra Sekhar Seelamantula. 1-5 [doi]
- Visual Entity-Centric Prompting for Knowledge Retrieval in Knowledge-based VQA. Jiaming Yang, Jiuxiang You, Ziyue Qiu, Guobo Xie, Yi Yu, Zhenguo Yang. 1-5 [doi]
- Enhancing EEG-based Covert Speech Decoding through Knowledge Transfer. Zhiwei Guo, Muyun Jiang, Chenyu Liu, Min Wu 0008, Jia Lu, Balázs Gulyás, Cuntai Guan. 1-5 [doi]
- Compressing a Flow-Based Privacy Protection Model via a Novel Joint Distilling and Pruning Method. Sissi Xiaoxiao Wu, Zhicong Liang, Zehong Huang. 1-5 [doi]
- Attention-Enhanced Feature Fusion Network for No-Reference Image Quality Assessment. Jiliang Ma, Yihua Chen 0001, Pengsheng Huang, Zhenjun Tang. 1-5 [doi]
- Prompting DirectSAM for Semantic Contour Extraction in Remote Sensing Images. Shiyu Miao, Delong Chen, Fan Liu, Chuanyi Zhang, Yanhui Gu, Shengjie Guo, Jun Zhou 0011. 1-5 [doi]
- FlashSR: One-step Versatile Audio Super-resolution via Diffusion Distillation. Jaekwon Im, Juhan Nam. 1-5 [doi]
- iReWindColor: Vision Transformer with Residual Embedding and Window Encoder for Point-Interactive Image Colorization. Hideyuki Ogura, Masaaki Ikehara. 1-5 [doi]
- Deep Joint Source-Channel Coding for Wireless Point Cloud Transmission. Cixiao Zhang, Mufan Liu, Wenjie Huang, Yin Xu 0001, Yiling Xu, Dazhi He. 1-5 [doi]
- SSAST-Adapter: A Parameter-efficient Incremental Learning Algorithm for Underwater Acoustic Target Recognition. Qian Zhu, Qisheng Xu, Boqing Zhu, Zijian Gao, Lingbin Zeng, Kele Xu. 1-5 [doi]
- PRimuS: Pretraining IMU Encoders with Multimodal Self-Supervision. Arnav M. Das, Chi Ian Tang, Fahim Kawsar, Mohammad Malekzadeh. 1-5 [doi]
- Modular Gaussian Splatting: Instance Decomposable Learning and Adaptive Rendering of 3D Scenes via Mixture of Experts. Jiansong Sha, Haoyu Zhang, Qiangjuan Huang, Guang Kou. 1-5 [doi]
- MedFocusCLIP: Improving few shot classification in medical datasets using pixel wise attention. Aadya Arora, Vinay P. Namboodiri. 1-5 [doi]
- Self-Prompting Driven SAM2 for 3D Medical Image Segmentation. Sheng Wei, Song Qiu, Mei Zhou, He Zhang, Yan Wang 0033, Qingli Li. 1-5 [doi]
- Crowdsourced Homophily Ties Based Graph Annotation Via Large Language Model. Yu Bu, Yulin Zhu, Kai Zhou 0001. 1-5 [doi]
- MILE: Multi-Instance Learning for Document Event Argument Extraction. Jiaxian Wang, Yong Zhang, Xiang Peng. 1-5 [doi]
- Double Domain Converter Transformer For Improving EEG-Based Emotion Recognition from Video to Game Scenarios. Jun-Yu Pan, Hao-Long Yin, Wei-Long Zheng. 1-5 [doi]
- A Self-Evolving Framework for Multi-Agent Medical Consultation Based on Large Language Models. Kai Chen, Ji Qi, Jing Huo, Pinzhuo Tian, Fanyu Meng, Xi Yang, Yang Gao. 1-5 [doi]
- Towards Time-Frequency Deformation Stability Bounds for Deep Convolutional Neural Networks. Albert Chua. 1-5 [doi]
- DEFormer: DCT-driven Enhancement Transformer for Low-light Image and Dark Vision. Xiangchen Yin, Zhenda Yu, Xin Gao, Xiao Sun. 1-5 [doi]
- Edge-aware Laplacian Pyramid Network for Efficient Image Deblurring. Zhe Xu, Zhipei Lei, Dingyong Gou, Yanlin Wu, Liwen Zhang, Cong Li. 1-5 [doi]
- LKA-ReID: Vehicle Re-Identification with Large Kernel Attention. Xuezhi Xiang, Zhushan Ma, Lei Zhang 0093, Denis Ombati, Himaloy Himu, Xiantong Zhen. 1-5 [doi]
- Robust Sensor Selection By Deep Unfolding. Yuvraj Singh, Jahnvi Singh Rohela, Kaushani Majumder, Satish Mulleti. 1-5 [doi]
- A novel multimodal personality prediction method based on pretrained models and graph relational transformer network. Rongquan Wang, Xianyu Xu, Hao Yang, Lin Wei, Huimin Ma 0001. 1-5 [doi]
- A Modified Nonlinear Matched Filter for Skewed Noise Based on the Gram-Charlier Expansion. Arie Yeredor. 1-4 [doi]
- Grouping-Based Crowding Differential Evolution Approaches for Multimodal Feature Selection. Junliu Zhu, Zong-Gan Chen, Jian-Yu Li, Yuncheng Jiang, Zhi-hui Zhan, Jun Zhang 0003. 1-5 [doi]
- Sparse Generation: Making Pseudo Labels Sparse for Point Weakly Supervised Object Detection on Low Data Volume. Chuyang Shang, Tian Ma, Wanzhu Ren, Yuancheng Li, Jiayi Yang. 1-5 [doi]
- Generating Vocals from Lyrics and Musical Accompaniment. Georg Streich, Luca A. Lanzendörfer, Florian Grötschla, Roger Wattenhofer. 1-5 [doi]
- Data-driven Processing using Parametric Neural Network for Improved Bluetooth Channel Sounding Distance Estimation. Andrii Tsemko, Avik Santra, Oleg Kapshii, Ashutosh Pandey. 1-5 [doi]
- Advancing Streaming ASR with Chunk-wise Attention and Trans-chunk Selective State Spaces. Masato Mimura, Takafumi Moriya, Kohei Matsuura. 1-5 [doi]
- A Fast Saturation Based Dehazing Framework with Accelerated Convolution and Attention Block. Shuocheng Wang, Jiaming Liu, Yilian Zhong, Ruoxi Zhu, Jiazheng Lian, Hao Zhang, Yibo Fan. 1-5 [doi]
- Vision Mamba-Based Approach for Incomplete Boundary Document Image Rectification. Weihao Zhang, Xin Xia, Maopeng Li, Yunbo Zhao. 1-5 [doi]
- Graph Contrastive Learning with Decoupled Augmentation. Shihao Gao, Caoshuo Li, Cunli Mao, Xulong Zhang 0001, Xiaoyang Qu, Taisong Jin, Jianzong Wang. 1-5 [doi]
- DTR: Dynamic Tree-Ring Watermarking Framework for Diffusion-Based Video Generation. Shunyang Zeng, Linlin Yang, Jin Yang, Yezhen Wang, Tianyu Gao. 1-5 [doi]
- Spatially-variant Blur Degradation Model Based on Depth Estimation. Rui Xie, Shuzhan Guo, Li Zou, Jiaxiong Liu, Qian Wang, Jun Zhou. 1-5 [doi]
- ADC-GS: Pose-Free 3D Gaussian Splatting with Adaptive Depth Consistency. Runling Liu, Shihe Shen, Na Jiang, Jiahao Wu, Guanhua Wu, Zhiyan Wang, Lu Xiao, Zhanke Wang, Ronggang Wang. 1-5 [doi]
- Joint Task Offloading and Routing in Wireless Multi-hop Networks Using Biased Backpressure Algorithm. Zhongyuan Zhao 0002, Jake B. Perazzone, Gunjan Verma, Kevin S. Chan, Ananthram Swami, Santiago Segarra. 1-5 [doi]
- Guided Speaker Embedding. Shota Horiguchi, Takafumi Moriya, Atsushi Ando, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, Marc Delcroix. 1-5 [doi]
- Causal fMRI-Mamba: Causal State Space Model for Neural Decoding and Brain Task States Recognition. Weihao Deng, Fei Han 0001, Qinghua Ling, Qing Liu 0010, Henry Han. 1-5 [doi]
- Low-Resource Text-to-Speech Synthesis Using Noise-Augmented Training of ForwardTacotron. Kishor Kayyar Lakshminarayana, Frank Zalkow, Christian Dittmar, Nicola Pia, Emanuël A. P. Habets. 1-5 [doi]
- RefleXGen: The unexamined code is not worth using. Bin Wang, Hui Li, Aofan Liu, Botao Yang, Ao Yang, YiLu Zhong, Weixiang Huang, Runhuai Huang, Weimin Zeng, Yanping Zhang. 1-5 [doi]
- Toward Comprehensive Semantic Prompt for Region Contrastive Learning Underwater Image Enhancement. Xiao Wang, Yongsheng Fu, Wei Wang, Wei Liu. 1-5 [doi]
- Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection. Pengfei Cai, Yan Song 0001, Nan Jiang 0022, Qing Gu 0002, Ian McLoughlin 0001. 1-5 [doi]
- Egocentric Speaker Diarization with Vision-Guided Clustering and Adaptive Speech Re-detection. He Huang, Haoyuan Yu, Daibo Liu, Haowen Chen, Minjie Cai. 1-5 [doi]
- LOFI: Harnessing Attention Dynamics for Facial Expression Recognition with Noisy Labels. Jinglin Zhang, Qiangchang Wang, Xinxin Zhang 0004, Yilong Yin. 1-5 [doi]
- HCLTS: Mining Customers' Consumption Patterns in Natural Gas Time Series with Hierarchical Contrastive Learning. Yuhang Niu, Jiaqi Ye, Shubao Zhao, Zhaoxiang Hou, Chengyi Yang, Zengxiang Li, Yanlong Wen, Xiaojie Yuan. 1-5 [doi]
- Pruning then Reweighting: Towards Data-Efficient Training of Diffusion Models. Yize Li, Yihua Zhang, Sijia Liu 0001, Xue Lin 0001. 1-5 [doi]
- Learned Video Compression With Refined Adaptive Flow Pyramid And Coordinate-Aware Attention. Qian Huang, Wenting Liu, Xin Li, Yiming Wang. 1-5 [doi]
- CSD: Weather forecasting with graph neural network based on cross-scale diffusivity. Jinrun Li, Gaowei Zhang, Wei Wang 0353, Yi Wang 0013. 1-5 [doi]
- Fast and Robust High Resolution Frequency Estimation of Damped Signals. Qi Dai, Ruiming Guo, Thierry Blu. 1-5 [doi]
- Multi-channel Speaker Counting for EEND-VC-based Speaker Diarization on Multi-domain Conversation. Naohiro Tawara, Atsushi Ando, Shota Horiguchi, Marc Delcroix. 1-5 [doi]
- Robust and Efficient Adversarial Defense in SNNs via Image Purification and Joint Detection. Weiran Chen, Qi Xu. 1-5 [doi]
- Multi-Scale Dehaze Network Based on Frequency Domain Assistance and Detailed Brightness Information Guidance. Pengwei Yang, Lei Wang. 1-5 [doi]
- Clutter Resilient Occlusion Avoidance for Tightly-Coupled Motion-Assisted Detection. Zhixuan Xie, Jianjun Chen, Guoliang Li, Shuai Wang 0004, Kejiang Ye, Yonina C. Eldar, Chengzhong Xu 0001. 1-5 [doi]
- Wasserstein Heterogeneous Graph Neural Networks for Uncertainty-Aware Anomaly Detection. Chen Chen 0094, Yunchun Li, Boxuan Jiao, Guorui Zhao, Wei Li. 1-5 [doi]
- MMTP: Meta-learning-based Multi-Textual Prompt Tuning for Visual-Language Models. Fangtong Sun, JunJie Zhu, Zunlin Fan, Yiying Li, Zhiyuan Wang, Ke Yang. 1-5 [doi]
- Robust Adversarial Training for Industrial Defect Classification with Long-Tailed Data. Shuchun Xu, Jiguang Lyu, Dapeng Man, Hengheng Xiong, Tao Liu, Wu Yang 0001. 1-5 [doi]
- Threshold Sensitivity in Two-Channel Modulo ADCs: Analysis and Robust Reconstruction. Wenyi Yan, Lu Gan 0005, Yimin D. Zhang. 1-5 [doi]
- Self-Supervised Image Harmonization via Holistic Feature Fusion. Chenyang Tian, Qing Zhang. 1-5 [doi]
- Mamba Meets Financial Markets: A Graph-Mamba Approach for Stock Price Prediction. Ali Mehrabian, Ehsan Hoseinzade, Mahdi Mazloum, Xiaohong Chen. 1-5 [doi]
- Bridge-SR: Schrödinger Bridge for Efficient SR. Chang Li, Zehua Chen, Fan Bao, Jun Zhu. 1-5 [doi]
- Optimizing Dual-Mode UAV-Assisted Remote IoT Data Collection with Decentralized DRL in 6G-Enabled Space-Air-Ground Networks. Ru Jin, Rongheng Lin. 1-5 [doi]
- Using Ear-EEG to Decode Auditory Attention in Multiple-speaker Environment. Haolin Zhu, Yujie Yan, Xiran Xu, Zhongshu Ge, Pei Tian, Xihong Wu, Jing Chen 0019. 1-5 [doi]
- Guitar-TECHS: An Electric Guitar Dataset Covering Techniques, Musical Excerpts, Chords and Scales Using a Diverse Array of Hardware. Hegel Pedroza, Wallace Abreu, Ryan M. Corey, Irán R. Román. 1-5 [doi]
- Few-shot Keyword-incremental Learning Using Compositional Information. Ilseok Kim, Ju-Seok Seong, Joon-Hyuk Chang. 1-5 [doi]
- Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning. Jisi Zhang, Pablo Peso Parada, Md Asif Jalal, Karthikeyan Saravanan. 1-5 [doi]
- Prototypical Graph Alignment for Text-based Person Search. Yu Huang, Canlong Zhang, Zhixin Li, Zhiwen Wang, Chunrong Wei. 1-5 [doi]
- Spectral-Aware Low-Rank Adaptation for Speaker Verification. Zhe Li, Man-Wai Mak, Mert Pilanci, Hung-yi Lee, Helen Meng. 1-5 [doi]
- NCDI-Diffusion: Neural Contextual and Directional Inversion for Novel View Synthesis through Diffusion Models. Wenpeng Xing, Jie Chen 0026, Zaifeng Yang, Xin Tong, Changting Lin, Meng Han. 1-5 [doi]
- KCE-Unet: A novel music denoising method with KANConv ECA Unet. Shijie Zhang, Yulun Wu, Ganghui Ru, Yi Yu, Wei Li. 1-5 [doi]
- C3D-VIT: Consistency-Aware 3D Vision Transformer for Face Forgery Detection. Jingyi Zhang, Peng Zhang, Jingjing Wang. 1-5 [doi]
- AS-Net: Adaptive Style-aware Network for Handwritten Text Generation. Yiming Wang, Hongxi Wei, Heng Wang. 1-5 [doi]
- Rethinking Adversarial Attacks in Reinforcement Learning from Policy Distribution Perspective. Tianyang Duan, Zongyuan Zhang, Zheng Lin, Yue Gao 0001, Ling Xiong, Yong Cui 0001, Hongbin Liang, Xianhao Chen, Heming Cui, Dong Huang 0005. 1-5 [doi]
- Secure Analog Beamforming Design for Wireless Communication Systems With Movable Antennas. Weijie Xiong, Kai Zhong, Zhiling Xiao, Jingran Lin, Qiang Li 0017. 1-5 [doi]
- Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation. Maohao Shen, Shun Zhang, Jilong Wu, Zhiping Xiu, Ehab A. AlBadawy, Yiting Lu, Mike Seltzer, Qing He. 1-5 [doi]
- Classifying Music-Induced Emotion Using Multi-Modal Ensembles of EEG and Audio Feature Models. Philipp Paukner, Marisa Ripoll, Dilvan Sabir, Deniz Onat Erdogan, Luca Sacchetto, Klaus Diepold. 1-5 [doi]
- Efficient and Expandable Token-Level Approach for Multi-Domain Sensitive Information Classification. Hongyi Li, Jiawei Ye, Jie Wu 0003, Lijun Zu. 1-5 [doi]
- Point Clean-label Backdoor Attack for Specific Classes via Feature Entanglement. Yang Wang, Wei Li, Shengbo Chen, Hong Rao, Azman Mohammad. 1-5 [doi]
- Novel Deep Gaussian Process Structures with Flexible Depths. Yuanqing Song, Yuhao Liu 0002, Petar M. Djuric. 1-5 [doi]
- Adversarial Learning For End-To-End Cochlear Speech Denoising Using Lightweight Deep Learning Models. Tom Gajecki, Waldo Nogueira. 1-5 [doi]
- CSS: Overcoming Pose and Scene Challenges in Crowd-Sourced 3D Gaussian Splatting. Runze Chen, Mingyu Xiao 0003, Haiyong Luo, Fang Zhao 0003, Fan Wu 0006, Hao Xiong, Qi Liu, Meng Song. 1-5 [doi]
- CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Downstream Vision Tasks Under Unknown Degradations. Yuwei Zhang, Yan Wu, Yanming Liu, Xinyue Peng. 1-5 [doi]
- A Reinforcement Learning Agent Controlled Multi-branch Small Object Detection Framework. Junkun Hong, Yitian Long, Yueyi Luo, Liujie Hua, Jun Long, Qianqian Qi 0009. 1-5 [doi]
- Diffusion Model Based Image Reconstruction in Lensless Imaging. Ashish Verma, Vivek Boominathan, Ashok Veeraraghavan, Chandra Sekhar Seelamantula. 1-5 [doi]
- Out-of-Distribution Radar Detection in Compound Clutter and Thermal Noise through Variational Autoencoders. Y. A. Rouzoumka, Eugénie Terreaux, Christèle Morisseau, Jean Philippe Ovarlez, C. Ren. 1-5 [doi]
- SegAug: CTC-Aligned Segmented Augmentation For Robust RNN-Transducer Based Speech Recognition. Khanh Le, Tuan Vu Ho, Dung Tran, Duc Thanh Chau. 1-5 [doi]
- Multi-Objective Large Language Model Unlearning. Zibin Pan, Shuwen Zhang, Yuesheng Zheng, Chi Li, Yuheng Cheng, Junhua Zhao 0001. 1-5 [doi]
- BAD: Bidirectional Auto-Regressive Diffusion for Text-to-Motion Generation. Seyed Rohollah Hosseyni, Ali Ahmad Rahmani, Seyed Jamal Seyed-Mohammadi, Sanaz Seyedin, Arash Mohammadi 0001. 1-5 [doi]
- Subdomain Uncertainty Optimization for Cross-Speed Fault Diagnosis. Jianbo Zheng, Lida Huang, Tairui Zhang, Bin Jiang 0006, Chao Yang 0015. 1-5 [doi]
- TokenSynth: A Token-based Neural Synthesizer for Instrument Cloning and Text-to-Instrument. Kyungsu Kim, Junghyun Koo, Sungho Lee, Haesun Joung, Kyogu Lee. 1-5 [doi]
- TopoRefine: Iterative Refinement with Reasoning Topology as High-Level Feedback. Haoran Liao, Shaohua Hu, Zhihao Zhu, Hao He 0007, Yaohui Jin. 1-5 [doi]
- Rethinking Encoder-Decoder Flow Through Shared Structures. Frederik Laboyrie, Mehmet Kerim Yucel, Albert Saà-Garriga. 1-5 [doi]
- DPC: Large Model Alignment Method based on Decoding Probability Correction. Yanyi Huang, Yuying Liu, Lei Tian, Yue Zhang, Xuechen Zhao, Bin Zhou 0004, Guodong Ma. 1-5 [doi]
- Sentiment Analysis for Live Video Comments with Diffused Fusion of Cross-modal Representations. Changfan Luo, Ling Fang, Bensheng Qiu. 1-5 [doi]
- RSM: Refined Saliency Map For Explainable 3D Object Tracking. Riran Cheng, Xupeng Wang 0001, Ferdous Sohel, Hang Lei. 1-5 [doi]
- CabiNet: A Deep Learning Framework for Multiclass Medical Image Segmentation from Multiple Single Class DatasetsAman Soni, Ishita Maiti, Nirmalya Ghosh. 1-5 [doi]
- PulmoScan: A Practical Pulmonary Disease Pre-Screening SystemBaixu Yan, Shijia Ge, Meizi Lu, Weixiang Zhang, Shuzhao Xie, Zhi Wang. 1-5 [doi]
- SpeechCaps: Advancing Instruction-Based Universal Speech Models with Multi-Talker Speaking Style CaptioningChien-Yu Huang, Min-Han Shih, Ke-Han Lu, Chi-Yuan Hsiao, Hung-yi Lee. 1-5 [doi]
- Exploring Acoustic Similarity in Emotional Speech and Music via Self-Supervised RepresentationsYujia Sun, Zeyu Zhao, Korin Richmond, Yuanchao Li. 1-5 [doi]
- Enhancing Boundary-Handling Strategies for Convolutional Sparse Representation Model with 46 × 46 Convolution-Multiplication PropertiesYuto Tsukiashi, Yoshimitsu Kuroki. 1-5 [doi]
- Inter- and Intra-Sentence Cuer-Invariant Representation Learning for Generalizable Cued Speech RecognitionTianxin Xie, Li Liu. 1-5 [doi]
- Mixture of Experts Fusion for Fake Audio Detection Using Frozen wav2vec 2.0Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Jianhua Tao 0001, Xiaopeng Wang, Yuankun Xie, Xin Qi, Shuchen Shi, Yi Lu, Yukun Liu, Chenxing Li, Xuefei Liu, Guanjun Li. 1-5 [doi]
- Single-View Clothed Human Reconstruction using Symmetric FeatureYanping Fu, Zhuangzhuang Zhao, Hao Geng, Haifeng Zhao. 1-5 [doi]
- Sample Efficient Reinforcement Learning via Large Vision Language Model DistillationDonghoon Lee, Tung Minh Luu, Younghwan Lee, Chang D. Yoo. 1-5 [doi]
- GENIE: Socially Unbiased Generative Text-to-Image EditingJulia Kaiwen Lau, Raphaël C.-W. Phan, Sailaja Rajanala, Ingemar J. Cox, Arghya Pal. 1-5 [doi]
- Dual-energy CT metal artifact reduction by combined material decomposition and projection domain threshold segmentationKai Chen, Tianling Lyu, Jean-Louis Coatrieux, Yang Chen 0008. 1-5 [doi]
- MambaCPU: Enhanced Correlation Mining with State Space Models for CPU Performance PredictionXiaoman Liu. 1-5 [doi]
- Hyperbolic PHATE: Visualizing Continuous Hierarchy of Latent Differentiation StructuresMasahiro Nakano, Hiroki Sakuma, Ryo Nishikimi, Kenji Komiya, Tomoharu Iwata, Kunio Kashino. 1-5 [doi]
- Enhancing Chest X-ray Classification through Knowledge Injection in Cross-Modality LearningYang Yan, Bingqing Yue, Qiaxuan Li, Man Huang, Jingyu Chen, Zhenzhong Lan. 1-5 [doi]
- CardioFlow: Learning to Generate ECG from PPG with Rectified FlowYuta Nambu, Masahiro Kohjima, Ryuji Yamamoto. 1-5 [doi]
- DSDIR: A Two-Stage Method for Addressing Noisy Long-Tailed Problems in Malicious Traffic DetectionGuoliang Li, Ruiqi Zhang, Zhe Sun, Lingkai Xing, Yu Zhang 0009. 1-5 [doi]
- Raw Audio Deep Learning Filter Banks for Acoustic Scene ClassificationDaniele Salvati. 1-5 [doi]
- AGR: Age Group fairness Reward for Bias Mitigation in LLMsShuirong Cao, Ruoxi Cheng, Zhiqiang Wang. 1-5 [doi]
- Large-Scale Recurrent Neural Networks with Fully Homomorphic Encryption for Privacy-Enhanced Speaker IdentificationVele Tosevski, Glenn Gulak. 1-5 [doi]
- O-EENC-SD: Efficient Online End-to-End Neural Clustering for Speaker DiarizationElio Gruttadauria, Mathieu Fontaine 0002, Jonathan Le Roux, Slim Essid. 1-5 [doi]
- Boosting Lightweight Camouflaged Object Detection with Multi-Scale Context and Boundary AwarenessZihan Xu, Zheng Wang 0008, Haoyu Wang, Cheng Liu, Yan Zhou, Meijun Sun. 1-5 [doi]
- Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and RestorationPin-Jui Ku, Alexander H. Liu, Roman Korostik, Sung-Feng Huang, Szu-Wei Fu, Ante Jukic. 1-5 [doi]
- Local Adaptive Time-Frequency Bidirectional Synchrosqueezing TransformCuiwentong Xu, Yuhe Liao. 1-5 [doi]
- Interference-Resilient Hybrid Multi-Antenna ARQNicholas D. Sidiropoulos, Yuanyuan Tang. 1-5 [doi]
- LAVCap: LLM-based Audio-Visual Captioning using Optimal TransportKyeongha Rho, Hyeongkeun Lee, Valentio Iverson, Joon Son Chung. 1-5 [doi]
- FADEL: Uncertainty-aware Fake Audio Detection with Evidential Deep LearningJu Yeon Kang, Ji Won Yoon, Semin Kim, Min Hyun Han, Nam Soo Kim. 1-5 [doi]
- Bilevel Learning for Low-Light Image Enhancement and DetectionWei Wang, Bojian Song, Xi Chen, Yaxin Gao, Weimin Lei. 1-5 [doi]
- Popularity and Interest Signal Detection for Sequential Recommendation DenoisingXuewei Li 0001, Kunyi Yang, Yue Zhao, Tianyi Xu, Jian Yu 0003, Mei Yu 0004, Mankun Zhao. 1-5 [doi]
- Data-Free Post-Training Quantization with Block-wise Enhanced Sample GenerationRuiyao Zhang, Zhiwei Dong, Long Huang, Shutong Ti, Songlu Chen, XuCheng Yin. 1-5 [doi]
- SCNN: Spike Coupling Neural Network for Multimodal Brain Network AnalysisShaolong Wei, Jiashuang Huang, Mingliang Wang, Shu Jiang, Weiping Ding 0001. 1-5 [doi]
- Generating Apoptosis-Inducing Anticancer Peptides Targeting BCL-xL Using Latent Diffusion Models on Small DatasetsTiara Natasha Binte Sayuti, Conghao Wang, Jagath C. Rajapakse. 1-5 [doi]
- Diverse Collaboration in Multi-Agent Reinforcement Learning via Self-Adaptive MethodXiang Xue, Quan Liu 0004, Meilong Shi, Yuchao Jin. 1-5 [doi]
- Controllable Forgetting Mechanism for Few-Shot Class-Incremental LearningKirill Paramonov, Mete Ozay, Eunju Yang, Jijoong Moon, Umberto Michieli. 1-5 [doi]
- Energy Consumption Trends in Sound Event Detection SystemsConstance Douwes, Romain Serizel. 1-5 [doi]
- Delving into Transformer-based Network Architecture for Guided Depth Super-ResolutionXinchen Ye, Aokai Zhang, Rui Xu, Haojie Li. 1-5 [doi]
- Test Time Prompt Tuning for Domain Adaptive Gaze EstimationJingjing Wang, Pengwei Yin. 1-5 [doi]
- The First Indoor Pathloss Radio Map Prediction ChallengeStefanos Bakirtzis, Çagkan Yapar, Kehai Qiu, Ian J. Wassell, Jie Zhang 0003. 1-2 [doi]
- Loudspeaker Beamforming to Enhance Speech Recognition Performance of Voice Driven ApplicationsD. de Groot, Baturalp Karslioglu, Odette Scharenborg, J. Martinez. 1-5 [doi]
- ST-HCSS: Deep Spatio-Temporal Hypergraph Convolutional Neural Network for Soft SensingHwa Hui Tew, Fan Ding, Gaoxuan Li, Junn Yong Loo, Chee-Ming Ting, Ze Yang Ding, Chee Pin Tan. 1-5 [doi]
- SVTNet: Dual Branch of Swin Transformer and Vision Transformer for Monocular Depth EstimationShuwen Jia, Yongxiong Wang, Han Chen, Shuai Huang. 1-5 [doi]
- A first-order DirAC-based parametric Ambisonic coder for immersive communicationsGuillaume Fuchs, Florin Ghido, Dominik Weckbecker, Oliver Thiergart. 1-5 [doi]
- Multi-Objective Reinforcement Learning for Cognitive Radar Resource ManagementZiyang Lu, Subodh Kalia, Mustafa Cenk Gursoy, Chilukuri K. Mohan, Pramod K. Varshney. 1-5 [doi]
- Positioning and transmission in cell-free networks: ambiguity function, and MRC/MRT array gainsLuc Vandendorpe, Laurence Defraigne, Guillaume Thiran, Thomas Pairon, Christophe Craeye. 1-5 [doi]
- Evidential-TTS: High Fidelity Zero-Shot Text-to-Speech Using Evidential Deep LearningMyeonghun Jeong, Minchan Kim, Semin Kim, Nam Soo Kim. 1-5 [doi]
- A-PeARCNN: a Physics-encoded AutoRegressive Convolutional Neural Network with AttentionNet for Solving Partial Differential EquationsYibo Han, Ruixuan Ren, Tiejun Li, Jingyi Chen, Jianmin Zhang. 1-5 [doi]
- Distillation and Pruning for Scalable Self-Supervised Representation-Based Speech Quality AssessmentBenjamin Stahl, Hannes Gamper. 1-5 [doi]
- Hypergradient-free Training for Deep Equilibrium ModelsYuhan Lin, Shengxiang Deng, Xudong Li. 1-5 [doi]
- Subtractive Training for Music Stem Insertion Using Latent Diffusion ModelsIvan Villa-Renteria, Mason Long Wang, Zachary Shah, Zhe Li, Soohyun Kim, Neelesh Ramachandran, Mert Pilanci. 1-5 [doi]
- WebSurfer: Enhancing LLM Agents with Web-Wise Feedback for Web NavigationDie Hu 0004, Jingguo Ge, Weitao Tang, Guoyi Li, Liangxiong Li, Bingzhen Wu. 1-5 [doi]
- DSINet: Towards Real-Time Target Speaker Extraction with Dynamic Speaker Information FusionFengyuan Hao, Andong Li, Xiaodong Li 0002, Chengshi Zheng. 1-5 [doi]
- Fully Connected Tensor Network based Brain Structural Feature Extraction for Early Alzheimer's Disease DetectionFei He, Xinyue Li, Ce Zhu, Fan Zhang, Yipeng Liu. 1-5 [doi]
- Singing Voice Conversion with Accompaniment Using Self-Supervised Representation-Based Melody FeaturesWei Chen 0071, Binzhu Sha, Jing Yang, Zhuo Wang, Fan Fan, Zhiyong Wu. 1-5 [doi]
- Compressive Imaging Reconstruction via Conditional Diffusion Model With Augmented MeasurementsEmmanuel Martínez 0002, Leon Suarez, Romario Gualdrón-Hurtado, Roman Jacome, Henry Arguello. 1-5 [doi]
- A Bandwidth Efficient Dual Function Radar Communication System Based on a MIMO Radar Using OTFS WaveformsKailong Wang 0003, Athina P. Petropulu. 1-5 [doi]
- Multi-modal Salient Object Detection via a Unified Diffusion ModelShuo Zhang 0013, Jiaming Huang, Wenbing Tang 0001, Lili Tian, Yuang Wei, Jing Liu 0012. 1-5 [doi]
- Codec-ASV: Exploring Neural Audio Codec For Speaker Representation LearningYuke Lin, Fulin Zhang, Yingying Gao, Shilei Zhang, Ming Li. 1-5 [doi]
- Dual-Pyramid Attention Collaborative Network for Oracle Bone Inscription ClassificationJiaying Gao, Fausto Giunchiglia, Tongyu Zhao, Chuntao Li, Hao Xu 0012. 1-5 [doi]
- NTC-KWS: Noise-aware CTC for Robust Keyword SpottingYu Xi, Haoyu Li, Hao Li, Jiaqi Guo, Xu Li, Wen Ding, Kai Yu. 1-5 [doi]
- Enhancing the Robustness of LiDAR-based Object Detection under Disappearing AttacksHuiying Wang, Lisong Zhang, Wenbo Wang, Yu Wen 0001. 1-5 [doi]
- Doppler Single-Photon LidarRuangrawee Kitichotkul, Joshua Rapp, Yanting Ma, Hassan Mansour. 1-5 [doi]
- DOA Estimation Based on Enhanced SRP-MVDR Using Kronecker Product Decomposition for Large Rectangular Microphone ArraysYichen Zeng, Jilu Jin, Gongping Huang, Jingdong Chen, Jacob Benesty. 1-5 [doi]
- Subjective Voice Quality of the IVAS CodecAnssi Rämö, Henri Toukomaa. 1-5 [doi]
- Deep Variational Sequential Monte Carlo for High-Dimensional ObservationsWessel L. van Nierop, Nir Shlezinger, Ruud J. G. van Sloun. 1-5 [doi]
- An LSTM Feature Imitation Network for Hand Movement Recognition from sEMG SignalsChuheng Wu, Seyed Farokh Atashzar, Mohammad M. Ghassemi, Tuka Alhanai. 1-5 [doi]
- Path Signatures are Unsupervised Time Series Anomaly ExtractorsRuiqi Wang, Zhenwei Zhang, Yuantao Gu. 1-5 [doi]
- TriFP-NGram: Integrating Three Complementary Fingerprint and N-Gram Features for Enhanced Drug-Target Affinity PredictionJiao Wang, Ge Kong, Juan Wang. 1-5 [doi]
- A Multi-Modal Information Fusion Model for Automatic Sleep StagingXuhui Wang, Yuanyuan Zhu, Xiaodong Jia. 1-5 [doi]
- Digital Twin-Driven Bearing-Fault Detection in Induction Motor and Drives using Graph Sampling and Aggregation NetworkHaraprasad Badajena, Suryanarayan Majhi, Bivash Chakraborty, Mamata Jenamani, Aurobinda Routray, Ronit Dutta. 1-5 [doi]
- MFANet: Multi-Feature Aggregation Network for Multi-focus Image FusionLibo Zhao, Xiaoli Zhang, Bo Huang, Mingjie Tian, Zeyu Wang. 1-5 [doi]
- LEP: Leveraging Local Entropy Pruning for Sparsity in Large Language ModelsYuli Chen, Bo Cheng, Yingying Zhang, Shuhao Zhang, Zhixuan Wu, Fanshen Meng. 1-5 [doi]
- FuzzyMIL: Decoupling Pathological Phenotypes through Deep Fuzzy Clustering for Efficient Whole Slide Image AnalysisAnran Liu, Tong Li, Jing Cai, Srinivasa Sampath Veer Vajrala. 1-5 [doi]
- Complex Coprime Frequency Sum Based Signal Representation for Period EstimationShaik Basheeruddin Shah, Nazar T. Ali, Ahmed Altunaiji, Vijay Kumar Chakka, Mohamed I. AlHajri. 1-5 [doi]
- Mining Scene Structural Guidance for Thermal Images in Self-Supervised Monocular Depth EstimationXinchen Ye, Xia Mao, Rui Xu, Haojie Li. 1-5 [doi]
- Relative Localization of Asynchronous Agents Based on Hybrid Active-Passive Two-Way RangingJianing Zhang, Wei Wang, Shiyu Dong, Feng Pan, Baoguo Yu, Xin Li. 1-5 [doi]
- StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation ModelsYeona Hong, Hyewon Han, Woo Jin Chung, Hong-Goo Kang. 1-5 [doi]
- WinStega: An Adaptive Robust Enhancement Framework for Generative Linguistic SteganographyKaiyi Pang, Minhao Bai, Jinshuai Yang, Wei-Qiang Zhang, Minghu Jiang, Yongfeng Huang 0001. 1-5 [doi]
- SAR Ship Detector Using Cross-stage Feature Fusion and Decoupled Head with Mutual GuidanceYixin Qiao, Xiaoxiao Yin, Xinyuan Zhou, Shiyong Lan, Guonan Deng. 1-5 [doi]
- Two-stream Semantic Alignment Networks for Multi-label Image ClassificationWenlan Kuang, Zhixin Li 0001. 1-5 [doi]
- Gaze-GZ: Generalized Gaze Estimation with Multi-scale Gaze Zone PredictionZheng Gao, Puneet Kumar 0003, Hao Zou, Xiaobai Li. 1-5 [doi]
- Few-shot Image Classification based on Attribute Prediction and SelectionXin Sun, Boqian Liu, Xinchen Ye, Guanqiao Chen, Rui Xu, Haojie Li. 1-5 [doi]
- Leveraging Boolean Directivity Embedding for Binaural Target Speaker ExtractionYichi Wang 0001, Jie Zhang 0042, Chengqian Jiang, Weitai Zhang, Zhongyi Ye, Lirong Dai 0001. 1-5 [doi]
- Multichannel-to-Multichannel Target Sound Extraction Using Direction and Timestamp CluesDayun Choi, Jung-Woo Choi. 1-5 [doi]
- Agentic Copyright Watermarking against Adversarial Evidence Forgery with Purification-Agnostic Curriculum Proxy LearningErjin Bao, Ching-Chun Chang, Hanrui Wang 0005, Isao Echizen. 1-5 [doi]
- Quality and Complexity Tradeoffs for DNN-Based Binaural Speech Enhancement in Hearing AidsParth Mishra, Deepak Kadetotad, Eric A. Durant, Terence Betlehem, Martin F. McKinney. 1-5 [doi]
- Multi-Style Facial Sketch Synthesis through Masked Generative ModelingBowen Sun, Guo Lu, Shibao Zheng. 1-5 [doi]
- Separate Estimation of Angular Velocity and Angle for Digital Array RadarTsubasa Terada, Toshihiro Ito, Ryuhei Takahashi. 1-5 [doi]
- Dimensionality-Reduced Spatial Bipartite Graph Clustering for Hyperspectral and LiDAR DataZhe Cao, Haonan Xin, Bo Yan, Jinping Sui, Rong Wang 0001. 1-5 [doi]
- Hierarchical Multimodal Decoupling-Fusion Framework for offline Multiple Appropriate Facial Reaction GenerationQincheng Lv, Xiaofeng Liu 0006, Jie Li 0009, Rongrong Ni, Pujun Xue, Siyang Song. 1-5 [doi]
- Mora-Level Prosody Prediction for Text-to-Speech Using Japanese BERT Without Accentual LabelsTadashi Ogura, Takuma Okamoto, Yamato Ohtani, Erica Cooper, Tomoki Toda, Hisashi Kawai. 1-5 [doi]
- MSACC: A Unified Multimodal Sentiment Analysis Framework for High Interpretability and Zero-shot PerformanceYi Liang, Turdi Tohti, Bo Kong, Dongfang Han, Tianwei Yan 0001, Askar Hamdulla. 1-5 [doi]
- Cross-Component Residual Prediction for Geometry-Based Point Cloud CompressionBharath Vishwanath, Yingzhan Xu, Kai Zhang, Li Zhang. 1-5 [doi]
- Adaptive Decoding for Efficient Automatic Speech RecognitionXiangnan Ma, Peizhuo Liu, Yuhao Zhang, Kaiqi Kou, Chenghao Gao, Tong Xiao, Jingbo Zhu. 1-5 [doi]
- CoGAP: A Personalized Federated Learning Method Using Collaborative Optimization for Medical Image ClassificationShenhai Zheng, Congyu Li, Sian Wen, Xi Gao, Lei Yu. 1-5 [doi]
- AudioBERT: Audio Knowledge Augmented Language ModelHyunjong Ok, Suho Yoo, Jaeho Lee. 1-5 [doi]
- A Multi-scenario Attention-based Generative Model for Personalized Blood Pressure Time Series ForecastingCheng Wan, Chenjie Xie, Longfei Liu, Dan Wu, Ye Li. 1-5 [doi]
- Robust Kernel Sparse Subspace ClusteringIvica Kopriva. 1-5 [doi]
- Foreground-aware Prototypical Network for Prohibited Item Detection from X-ray ScansXu Yang, Yufei Li, Long Tian, Haonan Shi, Ting Lan, Xiyang Liu. 1-5 [doi]
- LLM supervised Pre-training for Multimodal Emotion Recognition in ConversationsSoumya Dutta, Sriram Ganapathy. 1-5 [doi]
- Ultra low-compute complex spectral masking for multichannel speech enhancementAshutosh Pandey 0004, Juan Azcarreta. 1-5 [doi]
- The First VoicePrivacy Attacker ChallengeNatalia A. Tomashenko, Xiaoxiao Miao, Emmanuel Vincent 0001, Junichi Yamagishi. 1-2 [doi]
- AAD-DCE: An Aggregated Multimodal Attention Mechanism for Early and Late Dynamic Contrast Enhanced Prostate MRI SynthesisDivya Bharti, Sriprabha Ramanarayanan, Sadhana S, Kishore Kumar M, Keerthi Ram, Harsh Agarwal, Ramesh Venkatesan, Mohanasankar Sivaprakasam. 1-5 [doi]
- SFE-Net: Harnessing Biological Principles of Differential Gene Expression for Improved Feature Selection in Deep Learning NetworksYuqi Li, Yuanzhong Zheng, Yaoxuan Wang, Jianjun Yin, Haojun Fei. 1-5 [doi]
- Mobile-friendly Image de-noising: Hardware Conscious Optimization for Edge ApplicationSrinivas Soumitri Miriyala, Sowmya Vajrala, Hitesh Kumar, Sravanth Kodavanti, Vikram Nelvoy Rajendiran. 1-5 [doi]
- SPSinger: Multi-Singer Singing Voice Synthesis with Short Reference PromptJunchuan Zhao, Chetwin Low, Ye Wang 0007. 1-5 [doi]
- 3D TDOA-AOA Quaternion Based Acoustic SLAM for Drone Localization and Source MappingHala Abualsaud, Peter Gerstoft. 1-5 [doi]
- Audio-Faces Intra-Frame Alignment with Graph Attention Networks for Active Speaker DetectionYongkang Yin, Xusheng Yang, Liming Liang, Xu Li, Yuexian Zou. 1-5 [doi]
- Cloth-debiasing with Stable Diffusion in Cloth-changing Person Re-identificationHaiyang Zhang, Xinshuang Wang. 1-5 [doi]
- Masked Image Pretraining on Language Assisted RepresentationZejiang Hou, Sun-Yuan Kung. 1-5 [doi]
- Attribute Conditional Diffusion-Augmented Person Re-IdentificationShijie Nie, Ziqiang Shi, Rujie Liu, Song Guo, Meng Zhang, Mengjiao Wang 0001, Kazuki Osamura, Lina Septiana, Narishige Abe. 1-5 [doi]
- Energy-based Model Guided Self-Supervised Learning for Speaker VerificationYaqian Hao, Chenguang Hu, Chong Bian, Junlan Feng, Yingying Gao, Shilei Zhang. 1-5 [doi]
- Exploiting Wavelet Scattering Transform & Squeeze-Excitation Blocks with Cross-Modal Attention for Multi-modal Emotion RecognitionJunchen Liu, Jesin James, Karan Nathwani. 1-5 [doi]
- Prototype Alignment with LoRA Fusion for Class-Incremental LearningWei Zhang, Yuan Xie, Zhizhong Zhang, Xin Tan. 1-5 [doi]
- Decentralized Federated Dataset Dictionary Learning for Multi-Source Domain AdaptationRebecca Clain, Eduardo Fernandes Montesuma, Fred Ngolè Mboula. 1-5 [doi]
- Integrating Multi-Scale Compression Attention with Edge Detection for Ultrasound Tumor SegmentationRuoshi Li, Hao Qi, Xing Chen, Yinran Chen. 1-5 [doi]
- MFMamba: A Multimodal Fusion State Space Model for Depression RecognitionJingyi Liu, Yuanyuan Shang, Mengyuan Yang, Zhuhong Shao, Jiaxi Lu, Tie Liu. 1-5 [doi]
- A Cost-effective Solution for Remote Sensing Image Segmentation via Train/Test-Time AdaptationWei Chen 0009, Xin Luo 0009, Yulin He, Tianrui Liu, Di Wu, Tianhang Guo, Yuhang Li, Yuhua Tang. 1-5 [doi]
- Advancing NAM-to-Speech Conversion with Novel Methods and the MultiNAM DatasetNeil Kumar Shah, Shirish S. Karande, Vineet Gandhi. 1-5 [doi]
- GaussianEnhancer: A General Rendering Enhancer for Gaussian SplattingChen Zou 0007, Qingsen Ma, Jia Wang, Ming Lu, Shanghang Zhang, Zhaofeng He. 1-5 [doi]
- GREST: Ghost Targets Removal Algorithm Using Multipath Angle EstimationRyuhei Takahashi, Pu Wang 0004. 1-5 [doi]
- RecNet: Optimization for Dense Object Detection in Retail Scenarios Based on View RectificationJunhao Xiao, Yi Chen, Xiao Feng, Ruoyu Wang, Zhiyu Wu. 1-5 [doi]
- Leveraging Audio-Only Data for Text-Queried Target Sound ExtractionKohei Saijo, Janek Ebbers, François G. Germain, Sameer Khurana, Gordon Wichern, Jonathan Le Roux. 1-5 [doi]
- Discrete Unit-based Low-latency Multi-lingual Speech Synthesis for LIMMITS'25 ChallengeYu Jiang, Cheng Gong, Tianrui Wang, Chunyu Qiang, Haoyu Wang, Qiuyu Liu, Yuheng Lu, Xiaobao Wang, Xiaolei Zhang, Longbiao Wang, Jianwu Dang 0001. 1-2 [doi]
- Efficient Modeling and Low Complexity Implementation of Rate Estimation in Versatile Video CodingYanze Wang, Tianyi Sun, Hui Zhao, Jun Sun. 1-5 [doi]
- Efficient MDCT-Based Multi-Channel Coding with Perceptual Whitening and Broadband ILD CompensationGoran Markovic, Eleni Fotopoulou, Jan Frederik Kiene, Christian R. Helmrich. 1-5 [doi]
- Estimating the Number and Locations of Boundaries in Reverberant Environments with Deep LearningToros Arikan, Luca M. Chackalackal, Fatima Ahsan, Konrad Tittel, Andrew C. Singer, Gregory W. Wornell, Richard G. Baraniuk. 1-5 [doi]
- Enhancing DETR Efficiency with Inter-Object Relationship and Semantic Spectral Decomposition-Based DistillationZiyu Huang. 1-5 [doi]
- Advancing Paired Image-Mask Synthesis for Automated Nanoparticle PhenotypingXiaoqin Tang, Chaohui Liu, Guoqiang Xiao 0001. 1-5 [doi]
- Wave-Spectrogram Cross-Modal Aggregation for Audio Deepfake DetectionZehui Jin, Linlong Lang, Biao Leng. 1-5 [doi]
- Training Dialogue Systems by AI Feedback for Improving Overall Dialogue ImpressionKai Yoshida, Masahiro Mizukami, Seiya Kawano, Canasai Kruengkrai, Hiroaki Sugiyama, Koichiro Yoshino. 1-5 [doi]
- Time-domain Beamforming for Room Acoustics Analysis based on Reverberant Field EstimationTatiana Gelvez-Barrera, Quentin Leclere, Barbara Nicolas, Jérôme Antoni, Adrian Basarab. 1-5 [doi]
- Improving Pronunciation and Accent Conversion through Knowledge Distillation And Synthetic Ground-Truth from Native TTSTuan Nam Nguyen, Seymanur Akti, Ngoc-Quan Pham, Alexander Waibel. 1-5 [doi]
- Scaling Bioacoustic Signal Pre-training with Million Samples Via Mask-ModelingXuyao Deng, Tianjiao Wan, Kele Xu, Tian Gao, Peng Qiao, Dawei Feng, Yong Dou. 1-5 [doi]
- Learning-Based Utility Estimation with Application to Speech Enhancement of a Moving SpeakerJie Zhang 0042, Chengqian Jiang, Yichi Wang 0001, Haoyin Yan, Miao Sun. 1-5 [doi]
- GraphVCM: Virtual Center Mixing with Distance-Aware Regulation for Class Imbalanced Node ClassificationYixiao Ren, Yunfei Han, Yi Wang, Zhengdong Luo, Jinlong Liu, Yupeng Ma. 1-5 [doi]
- Stable Control Visual AutoRegressive Model: Precise and Efficient Image Generation via Scale AlignmentFeng Xie, Dahua Gao, Ruichao Liu, Minxi Yang, Yibo Zhang, Wenlong Wang. 1-5 [doi]
- Communication-efficient Verifiable and Oblivious Aggregation with Client DropoutsZhangshuang Guan, Yulin Zhao, Longyun Yang, Zhiguo Wan, Jinsong Han. 1-5 [doi]
- Indoor Sensing with MeasurementsVijaya Yajnanarayana, Philipp Geuer, Satyam Dwivedi. 1-5 [doi]
- Audio Explanation Synthesis with Generative Foundation ModelsAlican Akman, Qiyang Sun, Björn W. Schuller. 1-5 [doi]
- Resource Allocation for Semantic Segmentation Tasks in Autonomous Driving: A Likelihood Active Inference ApproachXinxin Zhu, F. Richard Yu, Ying He 0006, Biao He. 1-5 [doi]
- Multi-Task Joint 3D Swin Transformer Learning for Segmentation and Classification of Hyperspectral Medicine ImagesDong Zhang, Meijun Sun. 1-5 [doi]
- Cross-Layer Cache Aggregation for Token Reduction in Ultra-Fine-Grained Image RecognitionEdwin Arkel Rios, Jansen Christopher Yuanda, Vincent Leon Ghanz, Cheng-Wei Yu, Bo-Cheng Lai, Min-Chun Hu 0001. 1-5 [doi]
- Algorithm Design for Continual Learning in IoT NetworksShugang Hao, Lingjie Duan. 1-5 [doi]
- UniPlane: Unified Plane Detection and Reconstruction from Posed Monocular VideosYuzhong Huang, Fred Morstatter. 1-5 [doi]
- LAVViT: Latent Audio-Visual Vision Transformers for Speaker VerificationR. Gnana Praveen, Jahangir Alam 0001. 1-5 [doi]
- A Parametric Non-Negative Coupled Canonical Polyadic Decomposition Algorithm for Hyperspectral Super-ResolutionXi-Yuan Liu, Xiao-Feng Gong, Lei Wang, Wei Feng, Qiu-Hua Lin. 1-5 [doi]
- Harnessing Content and Structure in ID for Multimodal RecommendationYuting Liu 0003, Enneng Yang, Yizhou Dang, Guibing Guo, Qiang Liu 0006, Yuliang Liang, Linying Jiang, Xingwei Wang 0001. 1-5 [doi]
- Hybrid Feature Collaborative Reconstruction Network for Few-Shot Fine-Grained Image ClassificationShulei Qiu, Wanqi Yang, Ming Yang 0014. 1-5 [doi]
- Iterative Operator Sketching Framework for Large-Scale Imaging Inverse ProblemsJunqi Tang, Subhadip Mukherjee, Carola-Bibiane Schönlieb. 1-5 [doi]
- SAM Adaptation with Refocused Attention and Diverse Prompts for Medical Image SegmentationLiangshan Zhu, Xing Wu, Chengliang Wang, Haidong Wang. 1-5 [doi]
- Bone Conducted Signal Guided Speech Enhancement For Voice Assistant on EarbudsJens Heitkaemper, Joe Caroselli, Max McKinnon, Arun Narayanan, Nathan Howard. 1-5 [doi]
- An Attribute-Enriched Dataset and Auto-Annotated Pipeline for Open DetectionPengfei Qi, Yifei Zhang, Wenqiang Li, Youwen Hu, Kunlong Bai. 1-5 [doi]
- Spatiotemporal Causal Decoupling Model for Air Quality ForecastingJiaming Ma, Guanjun Wang, Sheng Huang, Kuo Yang, Binwu Wang, Pengkun Wang 0001, Yang Wang 0015. 1-5 [doi]
- RadDet: A Wideband Dataset for Real-Time Radar Spectrum DetectionZi Huang, Simon Denman, Akila Pemasiri, Terrence Martin, Clinton Fookes. 1-5 [doi]
- Correlated Attention in Transformers for Multivariate Time SeriesQuang Minh Nguyen, Lam M. Nguyen, Subhro Das. 1-5 [doi]
- BDGAN: Boundary and Diversity-aware Generative Adversarial Network for Imbalanced Medical Image AugmentationHongwei Ding, Qi Tao, Nana Huang. 1-5 [doi]
- Keypoint Aware Masked Image ModellingMadhava Krishna, A. V. Subramanyam. 1-5 [doi]
- LkSFocalNets: Video Action Recognition With Large Kernel Selective Focal NetworksJian Xiao, Ping Shi 0001, Qipei Li. 1-5 [doi]
- Exploring the Distribution of Cell Subpopulations in Pancreatic Ductal Adenocarcinoma Slides by Joint Spatial Transcriptomics and Pathology DataYaqi Deng, Wenjie Cai, Bentao Song, Bin Yang, Lingming Kong, Qingfeng Wang, Jun Huang. 1-5 [doi]
- DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion ModelsWeihao Wu, Zhiwei Lin, Yixuan Zhou 0002, Jingbei Li, Rui Niu, Qinghua Wu, Songjun Cao, Long Ma, Zhiyong Wu. 1-5 [doi]
- Multi-Degradation Oriented Deep Unfolding Model for Hyperspectral Image ReconstructionXianhua Han, Jian Wang 0004. 1-5 [doi]
- Enhancing Few-Shot Out-of-Distribution Detection with Gradient Aligned Context OptimizationBaoshun Tong, Kaiyu Song, Hanjiang Lai. 1-5 [doi]
- MSRFormer: Hybrid Scale Self-Attention and Local Fast Convolution Transformer for Facial Expression RecognitionZi-Qiang Shen, Yu-Yi Tang, Jun-feng Yan, Yang Li, Guo-Ying Zhao. 1-5 [doi]
- Multichannel Modulo Sampling with Unlimited NoiseDorian Florescu. 1-5 [doi]
- Extending Whisper for Emotion Prediction Using Word-level Pseudo LabelsKwok Chin Yuen, Sheng Li 0010, Jia Qi Yip, Chenhui Chu, Tatsuya Kawahara, Eng Siong Chng. 1-5 [doi]
- Effective Techniques for Scaling Audio Encoder PretrainingByeonggeun Kim, Andrew Bydlon, Qingming Tang, Huy Phan, Chieh-Chi Kao, Tao Zhang, Chao Wang. 1-5 [doi]
- Massive MIMO Channel-aware Decision Fusion Aided by Reconfigurable Intelligent SurfacesDomenico Ciuonzo, Alessio Zappone, Marco Di Renzo, Linlong Wu. 1-5 [doi]
- Can AI See What We Can't? Leveraging Deep Learning and Multi-Temporal Satellite Data to Revolutionize Crop Type Mapping and Yield PredictionGautam Siddharth Kashyap, Harsh Joshi, Manaswi Kulahara, Rajkumar Dhakar, Atul Sajjanhar, Jiechao Gao, Sarthak Jain, Shahab Saquib Sohail. 1-5 [doi]
- Cross-Channel Unlabeled Sensing over a Union of Signal SubspacesTaulant Koka, Manolis C. Tsakiris, Benjamín Béjar Haro, Michael Muma. 1-5 [doi]
- Learning Simultaneous Facial Canonical Correlation Representation for Face HallucinationYun-Hao Yuan 0001, Jin Li, Jipeng Qiang, Yi Zhu, Xiaobo Shen 0001, Yun Li. 1-5 [doi]
- DASSL: Domain Agnostic Self-Supervised Learning with Multiple Missing Information Reconstruction BranchesJiang Fang, Haonan He, Chen Guo, Jiyan Sun, Zhaorui Guo, Chao Xu, Mohan Su, YinLong Liu, Wei Ma. 1-5 [doi]
- Hyperspectral Image Reconstruction with Unseen Material DetectionYe Ma, Songnan Lin, Bihan Wen. 1-5 [doi]
- HDA-GS: Hierarchical Density-Controlled for Anisotropic 3D Gaussian SplattingZhanke Wang, Guanhua Wu, Zhiyan Wang, Lu Xiao, Runling Liu, Jiahao Wu, Ronggang Wang. 1-5 [doi]
- Enhancing Extrapolation Reasoning on Temporal Knowledge Graphs with Logic Rules and QueriesTingxuan Chen, Liu Yang, Zidong Wang 0005, Shuai Luo, Jun Long. 1-5 [doi]
- Reinforced Domain Selection for Continuous Domain AdaptationHanbing Liu, Huaze Tang, Yanru Wu, Yang Li 0104, Xiao-Ping Zhang 0002. 1-5 [doi]
- Identifying Bots on Social Media through Coordinated Group PerceptionBoyu Qiao, Kun Li, Wei Zhou 0019, Shilong Li, Qianqian Lu, Songlin Hu 0001. 1-5 [doi]
- Large Language Model Should Understand Pinyin for Chinese ASR Error CorrectionYuang Li, Xiaosong Qiao, Xiaofeng Zhao, Huan Zhao, Wei Tang 0013, Min Zhang 0042, Hao Yang 0006. 1-5 [doi]
- Efficient Hierarchical Domain Adaptive Thermal Infrared Tracking. Qiao Li, Kanlun Tan, Qiao Liu, Di Yuan, Xin Li, Yunpeng Liu. 1-5 [doi]
- Enhancing Autonomous Driving through Dual-Process Learning with Behavior and Reflection Integration. Xiao Zhang, Kangsheng Wang, Tianyu Hu, Huimin Ma 0001. 1-5 [doi]
- SPEA: Large-Scale Entity Alignment via Self-Partitioning. Weiguo Chen, Changjian Wang, Kele Xu, Yuan Yuan 0034, Wei Chen, Zixuan Dong. 1-5 [doi]
- Unveiling Performance Bias in ASR Systems: A Study on Gender, Age, Accent, and More. Maliha Jahan, Priyam Mazumdar, Thomas Thebaud, Mark Hasegawa-Johnson, Jesús Villalba 0001, Najim Dehak, Laureano Moro-Velázquez. 1-5 [doi]
- Rapid Online Bayesian Learning for Deep Receivers. Yakov Gusakov, Osvaldo Simeone, Tirza Routtenberg, Nir Shlezinger. 1-5 [doi]
- Autoregressive Language Model with Historical Context Re-encoding. Yimeng Zhuang. 1-5 [doi]
- Signaling Endothelial Reactivity Induced by Acute Hand Grip: Analyzing Vessel Diameter, Flow Velocity, and EMG Responses. Nimmi Sudarsan, Smit Shah 0001, Raj Kiran V, P. M. Nabeel, Dinu S. Chandran, Jayaraj Joseph. 1-5 [doi]
- Object-Centric Discriminative Learning for Text-Based Person Retrieval. Haiwen Li, Delong Liu, Fei Su, Zhicheng Zhao. 1-5 [doi]
- Efficient Anchor Graph Clustering Through Enhanced Within-Cluster Homogeneity. Fangyuan Xie, Lin Zhao 0003, Jingjing Xue, Feiping Nie 0001, Weizhong Yu, Xuelong Li 0001. 1-5 [doi]
- Multi-level Encoder with Global Topic for Task-oriented Dialogue Summarization. Zhuoqi He, Peijie Huang, Yuhong Xu, Youming Peng, Mingzhi Xu, Xinyang Lin. 1-5 [doi]
- Padnet: a Patch-Based Anomaly Detection Framework for Industrial Pipeline Damage Detection. Erofili Alexaki, Christos Papaioannidis, Vasileios Mygdalis, Ioannis Pitas. 1-5 [doi]
- Modifying Flow Matching for Generative Speech Enhancement. Roman Korostik, Rauf Nasretdinov, Ante Jukic. 1-5 [doi]
- Learning Music Audio Representations With Limited Data. Christos Plachouras, Emmanouil Benetos, Johan Pauwels. 1-5 [doi]
- Palm-vein images reconstruction against adversarial attacks. Lunke Fei, Jiacheng Yang, Wai-Keung Wong, Shuping Zhao, Anne Toomey, Jiehang Deng. 1-5 [doi]
- MemKD: Memory-Discrepancy Knowledge Distillation for Efficient Time Series Classification. Nilushika Udayangani, Kishor Nandakishor, Marimuththu Palaniswami. 1-5 [doi]
- GraphDAE-PU: Graph Denosing Auto-Encoder for Arbitrary-Scale Point Cloud Upsampling. Yuzhong Deng, Zhiheng Su, Penghui Shang, Dongzhen Liu, Di Wu, Jianxiao Zou, Shicai Fan. 1-5 [doi]
- Defying Imbalanced Forgetting for Class Incremental Learning. Yu Feng, Linqiang Deng, Huizi Yan, Xiaofan Li, Wenhao Wang, Yan Chang, Fuzhong Li, Weiqi Zhang. 1-5 [doi]
- Multilingual Speaker-Invariant Dysarthria Severity Assessment Using Adversarial Domain Adaptation and Self-Supervised Learning. Lauren Stumpf, Balasundaram Kadirvelu, A. Aldo Faisal. 1-5 [doi]
- Blind Estimation of Sub-band Acoustic Parameters from Ambisonics Recordings using Spectro-Spatial Covariance Features. Hanyu Meng, Jeroen Breebaart, Jeremy Stoddard, Vidhyasaharan Sethu, Eliathamby Ambikairajah. 1-5 [doi]
- CPSNet: Comprehensive Enhancement Representation for Polyp Segmentation Task. Jiati Cai, Xiaogang Liu, Hongjie Yang, Yi Ding, Ting Zhong, Zhen Qin. 1-5 [doi]
- A Transmitter-Model Unaware Generative Image Compression Framework for Semantic Communication. Rongcan Zheng, Xiaodan Song, Xuguang Zuo, Minxi Yang, Dahua Gao, Xuemei Xie. 1-5 [doi]
- Covariance Change Point Detection for Graph Signals. Even Matencio, Charles Truong, Laurent Oudre. 1-5 [doi]
- Advancing Dark Action Recognition via Modality Fusion and Dark-to-Light Diffusion Model. Yuxuan Wang, Zhen Xing, Zuxuan Wu. 1-5 [doi]
- Particle-based Data-driven Nonlinear State Estimation of Model-free Process from Nonlinear Measurements. Anubhab Ghosh, Yonina C. Eldar, Saikat Chatterjee. 1-5 [doi]
- MUPO-Net: A Multilevel Dual-domain Progressive Enhancement Network with Embedded Attention for CT Metal Artifact Reduction. Xiaoli Yao, Jia Tan, Zijian Deng, Deng Xiong, Qijun Zhao, Min Wu. 1-5 [doi]
- DARNet: A Dual Attention Residual Network for Medical Image Classification. Ao Zhang, Zhenghua Guan, Tengda Zhang, Wenzheng Hu, Yi Liu, Baiying Lei. 1-5 [doi]
- Wave-U-Mamba: An End-To-End Framework For High-Quality And Efficient Speech Super Resolution. Yongjoon Lee, Chanwoo Kim. 1-5 [doi]
- On The Role of Prompt Construction In Enhancing Efficacy and Efficiency of LLM-Based Tabular Data Generation. Banooqa H. Banday, Kowshik Thopalli, Tanzima Z. Islam, Jayaraman J. Thiagarajan. 1-5 [doi]
- Point Cloud Registration via Reconstruction with Local Geometry Information Aggregation. Zewei Pan, Linsen Li, Tongxin Yuan, Jiarong Yang. 1-5 [doi]
- Animation Anycolor: Enhancing Line Drawing Colorization with Keypoint Matching. Liyao Wang, Zuzeng Lin, Danni Wu, Zihao Yu, Suzhe Zhang, Zixian Wu, Feng Wang 0015. 1-5 [doi]
- PIER: A Novel Metric for Evaluating What Matters in Code-Switching. Enes Yavuz Ugan, Ngoc-Quan Pham, Leonard Bärmann, Alex Waibel. 1-5 [doi]
- MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation. Cheng Liu, Hui Wang 0075, Jinghua Zhao 0004, Shiwan Zhao, Hui Bu, Xin Xu, Jiaming Zhou, Haoqin Sun, Yong Qin. 1-5 [doi]
- Right Label Context in End-to-End Training of Time-Synchronous ASR Models. Tina Raissi, Ralf Schlüter, Hermann Ney. 1-5 [doi]
- MEIJU - The 1st Multimodal Emotion and Intent Joint Understanding Challenge. Rui Liu 0008, Xiaofen Xing, Zheng Lian, Haizhou Li 0001, Björn W. Schuller, Haolin Zuo. 1-2 [doi]
- TDMER: A Task-Driven Method for Multimodal Emotion Recognition. Qian Xu, Yu Gu 0015, Chenyu Li, He Zhang, Hai Xiang Lin, Linsong Liu. 1-5 [doi]
- Enhancing Zero-Shot Relation Extraction through Staged Interaction with Large Language Models. Yifang Zhang, Pengfei Duan, Yiwen Yang, Shengwu Xiong 0001. 1-5 [doi]
- MacST: Multi-Accent Speech Synthesis via Text Transliteration for Accent Conversion. Sho Inoue, Shuai Wang 0016, Wanxing Wang, Pengcheng Zhu 0004, Mengxiao Bi, Haizhou Li 0001. 1-5 [doi]
- HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution. Shengkui Zhao, Kun Zhou 0003, Zexu Pan, Yukun Ma, Chong Zhang 0003, Bin Ma 0001. 1-5 [doi]
- MTDA-HSED: Mutual-Assistance Tuning and Dual-Branch Aggregating for Heterogeneous Sound Event Detection. Zehao Wang, Haobo Yue, Zhicheng Zhang, Da Mu, Jin Tang 0007, Jianqin Yin. 1-5 [doi]
- Dynamic-static Feature Fusion with Multi-scale Attention for Continuous Blood Glucose Prediction. Jing Gao 0007, Chenhua Guo, Yingshu Liu, Peng Li 0027, Jianing Zhang, Meng Liu. 1-5 [doi]
- VoicePrompter: Robust Zero-Shot Voice Conversion with Voice Prompt and Conditional Flow Matching. Ha-Yeong Choi, Jaehan Park. 1-5 [doi]
- Schoenberg Kernel Loss for Spiking Neural Network Training. Liyang Ru, Kan Li 0002, José C. Príncipe. 1-5 [doi]
- Beyond Point Annotation: A Weakly Supervised Network Guided by Multi-Level Labels Generated from Four-Point Annotation for Thyroid Nodule Segmentation in Ultrasound Image. Jianning Chi, Zelan Li, Huixuan Wu, Wenjun Zhang, Ying Huang. 1-5 [doi]
- IOR: Inversed Objects Replay for Incremental Object Detection. Zijia An, Boyu Diao, Libo Huang, Ruiqi Liu, Zhulin An, Yongjun Xu 0001. 1-5 [doi]
- Exploring Spectral Signatures of Chinese liquor using Machine Learning and SHapley Additive exPlanations. Danlei Chen, Yun Wang, Linruize Tang, Zhengqiao Zhao, Jie Chen, Jingdong Chen. 1-5 [doi]
- Person-In-Bed Detection using Frequency Domain Features and GLR-based CuSum. G. Dhinesh Chandran, Srikrishna Bhashyam, Srinivas Reddy Kota. 1-2 [doi]
- Kronecker-structured Sparse Vector Recovery with Application to IRS-MIMO Channel Estimation. Yanbin He, Geethu Joseph. 1-5 [doi]
- Class-wise Adaptive Logits Distillation with Meta-Learning. Xiao Huang, Wu Chen, Wei Zhou. 1-5 [doi]
- Trusted Mamba Contrastive Network for Multi-View Clustering. Jian Zhu, Xin Zou 0001, Lei Liu 0029, Zhangmin Huang, Ying Zhang, Chang Tang, Li-Rong Dai 0001. 1-5 [doi]
- HyperKAN: Hypergraph Representation Learning with Kolmogorov-Arnold Networks. Xiangfei Fang, Boying Wang, Chengying Huan, Shaonan Ma, Heng Zhang, Chen Zhao. 1-5 [doi]
- A Method for Removing Reflections from Water Surface Images Based on Pre-trained Image Restoration. Minghua Zhao, Rui Zhi, Shuangli Du, Jing Hu 0005, Cheng Shi 0002, Lin Wang. 1-5 [doi]
- AccentBox: Towards High-Fidelity Zero-Shot Accent Generation. Jinzuomu Zhong, Korin Richmond, Zhiba Su, Siqi Sun. 1-5 [doi]
- Self-Geometry-Guided Direct Pose Regression Based on Dual Perspective Fusion for 2D-3D Cross Dimensional Spinal Surgery Navigation. Jing Ling, Zhengyang Wu 0002, Changqing Li, Weisheng Li, Chao Zhang, Yucheng Shu. 1-5 [doi]
- A Singing Melody Extraction Network Via Self-Distillation and Multi-Level Supervision. Ying Hu, Jiabo Jing, Fan Li, Lijun He, Li Lin, Wenzhong Yang. 1-5 [doi]
- An Efficient Hybrid Quantum Variational Classifier With Matrix Product State. Wanqi Sun, Jungang Xu, Chenghua Duan. 1-5 [doi]
- MRI2Speech: Speech Synthesis from Articulatory Movements Recorded by Real-time MRI. Neil Kumar Shah, Ayan Kashyap, Shirish S. Karande, Vineet Gandhi. 1-5 [doi]
- Adapting Without Seeing: Text-Aided Domain Adaptation for Adapting CLIP-like Models to Novel Domains. Louis Hémadou, Héléne Vorobieva, Ewa Kijak, Frédéric Jurie. 1-5 [doi]
- LCFed: An Efficient Clustered Federated Learning Framework for Heterogeneous Data. Yuxin Zhang, Haoyu Chen, Zheng Lin, Zhe Chen 0015, Jin Zhao 0001. 1-5 [doi]
- RefCap: Zero-shot Video Corpus Moment Retrieval Based on Refined Dense Video Captioning. Yi Pan, Yujia Zhang 0001, Michael Kampffmeyer, Xiaoguang Zhao. 1-5 [doi]
- Quantum Neural Networks: A Path to Lower Emissions Through Fuel Consumption Prediction in Shipping. So Fong Chien, Julien J. M. Hermans, Austin A. Kana, Charilaos C. Zarakovitis, Stathis Zavvos, H. S. Lim. 1-5 [doi]
- Ultrasound-Guided Registration Pseudo-Labels for Semi-Supervised Brachial Plexus Segmentation. Jia Luo, Ze Zhang, Yi Ding, Yiqian Wang, Jian Zhang. 1-5 [doi]
- Edge First: Edge-Guided Geometry for Superior 3D Roof Wireframe Reconstruction. Qiaoqiao Hao, Ting Han 0001, Yujun Liu 0005, Shangfeng Huang, Duxin Zhu, Jinhe Su, Yundong Wu, Guorong Cai. 1-5 [doi]
- Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models. Adriana Fernandez-Lopez, Shiwei Liu 0003, Lu Yin 0006, Stavros Petridis, Maja Pantic. 1-5 [doi]
- Adaptive Spatiotemporal Augmentation for Improving Dynamic Graph Learning. Xu Chu 0001, Hanlin Xue, Bingce Wang, Xiaoyang Liu, Weiping Li, Tong Mo, Tuoyu Feng, Zhijie Tan. 1-5 [doi]
- Optimizing Multimodal Image Fusion: A Novel Approach with Nystrom Attention Mechanisms in Transformer Models. Yuqin Zeng, Ze Wen, Shuqian Fan. 1-5 [doi]
- SimulTron: On-Device Simultaneous Speech to Speech Translation. Alex Agranovich, Eliya Nachmani, Oleg Rybakov, Yifan Ding, Ye Jia, Nadav Bar, Heiga Zen, Michelle Tadmor Ramanovich. 1-5 [doi]
- DSFormer: Deformable Pointformer for 3D Salient Object Detection. Yibo Hu, Lai Wei, Yanzhe Wang, Qiuyu Huang, Yanding Wei, Qiang Fang. 1-5 [doi]
- Overcoming Uncertain Incompleteness for Robust Multimodal Sequential Diagnosis Prediction via Curriculum Data Erasing Guided Knowledge Distillation. Heejoon Koo. 1-5 [doi]
- Generalized Linear Models with 1-Bit Measurements: Asymptotics of the Maximum Likelihood Estimator. Jaimin Shah, Martina Cardone, Cynthia Rush, Alex Dytso. 1-5 [doi]
- LawDNet: Enhanced Audio-Driven Lip Synthesis via Local Affine Warping Deformation. Junli Deng, Yihao Luo, Xueting Yang, Siyou Li, Wei Wang, Jinyang Guo, Ping Shi. 1-5 [doi]
- Improved Motion Plane Adaptive 360-Degree Video Compression Using Affine Motion Models. Marina Ritthaler, Andy Regensky, André Kaup. 1-5 [doi]
- Spy Inside: Scalable Verification of Dependable Transformers for Event Time Series Systems. Haodong Deng, Qi Qi 0001, Lu Lu 0015, Zirui Zhuang, Xingyu Zeng, Jinguang Wang, Bo He, Wei Li, Jingyu Wang 0001. 1-5 [doi]
- HFedPFS: Heterogeneous Federated Learning with Personalized Data Feature Sharing. Jingxian Xu, Liping Yi, Gang Wang, Xiaoguang Liu. 1-5 [doi]
- HyperMST: Multi-scale Spatio-Temporal Hypercorrelation Network for POI Recommendation. Zeyun Zhao, Changjian Wang, Kele Xu, Zhen Huang, Gaojin He, Xu Liu. 1-5 [doi]
- Concentrating Harder for Faster Audio Transformer. Lorenz P. Schmidt, Nils Peters. 1-5 [doi]
- Multi-Layer Knowledge Distillation for Continual Semantic Segmentation. Yan Wang, Pengju Xu, Bingye Wang, Haiying Zhao. 1-5 [doi]
- WaveSpect: A Hybrid Approach to Synthetic Audio Detection via Waveform and Spectrogram Analysis. Dong Chen, Fan Huang, Zhengxuan Song, Wei Zhu, Yin Yang, Kun Zeng. 1-5 [doi]
- SLIDE: Integrating Speech Language Model with LLM for Spontaneous Spoken Dialogue Generation. Haitian Lu, Gaofeng Cheng, Liuping Luo, Leying Zhang, Yanmin Qian, Pengyuan Zhang. 1-5 [doi]
- Medical Image Segmentation with Auxiliary Points Prediction of Lesion Location and Boundary. Yu Fan, Zhihui Lai 0001, Heng Kong, Tianying Feng. 1-5 [doi]
- LLM-Guided Dual-Branch Diffusion Model for Fine-Grained Motion Synthesis. Wenlong Wang, Dahua Gao, Xinyu Liu. 1-5 [doi]
- CASleepNet: A Cross Attention-based multimodal fusion approach for sleep staging with EEG and EOG. Wei Yu, Jinlong Yang. 1-5 [doi]
- Efficient Long Speech Sequence Modelling for Time-Domain Depression Level Estimation. Shuanglin Li, Zhijie Xie, Syed Mohsen Naqvi. 1-5 [doi]
- A Comparative Analysis of Generalised Echo and Interference Cancelling and Extended Multichannel Wiener Filtering for Combined Noise Reduction and Acoustic Echo Cancellation. Arnout Roebben, Toon van Waterschoot, Marc Moonen. 1-5 [doi]
- Neural Architecture Search for Ultra-low Memory Blood Glucose Forecasting on the Edge. Hadi Al Zein, Nick van de Waterlaat, Tunç Alkanat. 1-5 [doi]
- Face Relighting with Ratio Function for Explicit Geometric Representation. Yiyang Hu, Zequn Zhang, Hui Zhang, Guquan Jing, Peng Gao. 1-5 [doi]
- ViM-Disparity: Bridging the Gap of Speed, Accuracy and Memory for Disparity Map Generation. Maheswar Bora, Tushar Anand, Saurabh Atreya, Aritra Mukherjee, Abhijit Das 0001. 1-5 [doi]
- DRCap: Decoding CLAP Latents with Retrieval-Augmented Generation for Zero-shot Audio Captioning. Xiquan Li, Wenxi Chen, Ziyang Ma 0001, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Qiuqiang Kong, Xie Chen 0001. 1-5 [doi]
- A Riemannian Approach to Ground Metric Learning for Optimal Transport. Pratik Jawanpuria, Dai Shi, Bamdev Mishra, Junbin Gao. 1-5 [doi]
- SAF: Local Shape-aware Face-based Garment Collision Handling via Neural SDFs. Minzhe Tang, Ruisheng Yuan, Dongliang Kou, Mingyang Sun, Lihua Zhang. 1-5 [doi]
- Elevating Robust ASR By Decoupling Multi-Channel Speaker Separation and Speech Recognition. Yufeng Yang, Hassan Taherian, Vahid Ahmadi Kalkhorani, DeLiang Wang. 1-5 [doi]
- VisQ2SQL: Towards SQL-Driven Data Visualization via LLMs-Grounded Preference Learning. Shengze Shi, Tao Ren, Jun Hu. 1-5 [doi]
- DEGSTalk: Decomposed Per-Embedding Gaussian Fields for Hair-Preserving Talking Face Synthesis. Kaijun Deng, Dezhi Zheng, Jindong Xie, Jinbao Wang, Weicheng Xie, LinLin Shen, Siyang Song. 1-5 [doi]
- SpecViT: A Custom Vision-Transformer based Approach for Audio Deepfake Detection. Sharmistha Modak, Arnab Kumar Das, Ruchira Naskar. 1-5 [doi]
- Fine-tuning TitaNet-Large Model for Speaker Anonymization Attacker Systems. Candy Olivia Mawalim, Aulia Adila, Masashi Unoki. 1-2 [doi]
- Enhancing Privacy in Radar-Based Vital Sign Monitoring Via Non-Linear FMCW Waveforms. Zhihao Tao, Athina P. Petropulu. 1-5 [doi]
- DETCP: Self-Detoxifying Language Models With Contrastive Pairs. Dianqing Liu, Yi Liu, Junbo Guo, Zhendong Mao 0001. 1-5 [doi]
- Less Yet Robust: Crucial Region Selection for Scene Recognition. Jianqi Zhang, Mengxuan Wang, Jingyao Wang, Lingyu Si, Changwen Zheng, Fanjiang Xu. 1-5 [doi]
- MINR: Efficient Implicit Neural Representations for Multi-Image Encoding. Wenyong Zhou, Taiqiang Wu, Zhengwu Liu, Yuxin Cheng, Chen Zhang, Ngai Wong. 1-5 [doi]
- Detecting Neurodegenerative Diseases using Frame-Level Handwriting Embeddings. Sarah Laouedj, Yuzhe Wang, Jesús Villalba 0001, Thomas Thebaud, Laureano Moro-Velázquez, Najim Dehak. 1-5 [doi]
- Commonality Augmented Disentanglement for Multimodal Crowdfunding Success Prediction. Jiayang Li, Xovee Xu, Yili Li, Ting Zhong, Kunpeng Zhang 0001, Fan Zhou 0002. 1-5 [doi]
- Enhancing Image Generation Fidelity via Progressive Prompts. Zhen Xiong, Yuqi Li, Chuanguang Yang, Tiao Tan, Zhihong Zhu, Siyuan Li, Yue Ma. 1-5 [doi]
- Enhancing Out-of-Distribution Detection through Dynamic Activation Function. Yingrui Ji, Yao Zhu, Zhigang Li, Jiansheng Chen, Yunlong Kong, Jingbo Chen. 1-5 [doi]
- Enhancing Fairness in Gaussian Mixture Clustering through Impact Factor. Zhijing Yang, Chuan Qian, Junjie Zheng, Yiding Tang, Boyang Yan, Hui Zhang 0055. 1-5 [doi]
- A Proximal Variable Smoothing for Nonsmooth Minimization Involving Weakly Convex Composite with MIMO Application. Keita Kume, Isao Yamada. 1-5 [doi]
- Mitigating Hallucinations on Object Attributes using Multiview Images and Negative Instructions. Zhijie Tan, Yuzhi Li, Shengwei Meng, Xiang Yuan, Weiping Li, Tong Mo, Bingce Wang, Xu Chu 0001. 1-5 [doi]
- LMAC-TD: Producing Time Domain Explanations for Audio Classifiers. Eleonora Mancini, Francesco Paissan, Mirco Ravanelli, Cem Subakan. 1-5 [doi]
- DirichNet Model for Detection of TMS-Induced Speech Errors in Patients Undergoing Epilepsy Surgery. Kodali Radha, Shalini Narayana. 1-5 [doi]
- GEE Maximization in UAV-Aided Mobile IoT Networks Using Deep Reinforcement Learning. Aditya Singh, Rajesh M. Hegde. 1-5 [doi]
- Learning Hierarchical Attribute Prompt for Vision-Language Models. Jun Liang 0002, Yang Peng, Rui Luo, Yunyu Zou, Yalong Cheng, Bingzhi Chen. 1-5 [doi]
- Enhancing Vision: Harmonizing Frequency for Imaging Quality and Perception Accuracy. Hongyang Chen, Kaisheng Ma. 1-5 [doi]
- A Joint Time-Frequency Attention for Leakage Detection in Water Distribution Networks Using Time Series Decomposition. Juan Luo 0001, Yiyang Chen, Jielong Yang, Xionghu Zhong. 1-5 [doi]
- MSE-based Sampling of Bandlimited Product Graph Signals via Joint Low-pass Impulse Responses. Fen Wang, Baoyi Xu, Xuyao Kang, Peng Ren, Long Yang 0002. 1-5 [doi]
- Convolutional Prompting for Broad-Domain Retinal Vessel Segmentation. Qijie Wei, Weihong Yu, Xirong Li 0001. 1-5 [doi]
- Ultra Lightweight Singing Melody Extraction via Combination of Convolution and MLP. Jun Liu, Kangjie Dong, Qiubo Huang, Shuai Yu 0002, Wei Li 0012. 1-5 [doi]
- CiGA: A Cross-Layer Fine-Grained Attention Correction Method for Large Language Model. Duo Li, Jing Zhao. 1-5 [doi]
- A Unified Model for Oral Reading Fluency and Student Prosody. Yihao Wang, Zhongdi Wu, Joseph Nese, Akihito Kamata, Vedant Nilabh, Eric C. Larson. 1-5 [doi]
- Constructing Datasets From Public Police Body Camera Footage. Jamie Rosas-Smith, Martijn Bartelds, Ruizhe Huang, Leibny Paola García-Perera, Karen Livescu, Dan Jurafsky, Anjalie Field. 1-5 [doi]
- Vision-text Enhancement Network For Weakly Supervised Video Anomaly Detection. Yiheng Chen, Shuai Fu, Niantong Qin, Xinning Du, Jianping Ren, Shuhua Liu. 1-5 [doi]
- VAGeo: View-specific Attention for Cross-View Object Geo-Localization. Zhongyang Li, Xin Yuan, Wei Liu, Xin Xu. 1-5 [doi]
- Zero-shot Stance Detection with Logically Consistent Data Augmentation. Bowen Zhang 0005, Xu Li, Jun Ma, Xi Zhang, Genan Dai, Jianhua Ye. 1-5 [doi]
- PhysID: Physics-based Interactive Dynamics from a Single-view Image. Sourabh Vasant Gothe, Ayon Chattopadhyay, Gunturi Venkata Sai Phani Kiran, Pratik, Vibhav Agarwal, Jayesh Rajkumar Vachhani, Sourav Ghosh, Parameswaranath VM, Barath Raj KR. 1-5 [doi]
- Controllable Generative Model for Brain Evolution. Gengshuo Liu, Nikhil N. Chaudhari, Nikos Kanakaris, Chenzhong Yin, Paul Bogdan, Andrei Irimia. 1-5 [doi]
- Cauchy-Schwarz Divergence Transfer Entropy. Zhaozhao Ma, Shujian Yu. 1-5 [doi]
- Rethinking Decoding in Multi-intent Spoken Language Understanding. Ying Xia, Zhen Xiong, Kefan Shen, Zhihong Zhu, Shaorong Xie, Wei Liu. 1-5 [doi]
- Enhancing Time Series Prediction with Evolutionary Algorithm-based Optimization of LSTM. Jingyu Sun, Hanting Zhang, Jianfeng Wang. 1-5 [doi]
- Controlling the Number of Sample-Contributive Vertices in Generalized Sampling of Graph Signals. Keitaro Yamashita, Kazuki Naganuma, Shunsuke Ono. 1-5 [doi]
- An Information-Theoretic Analysis of Thompson Sampling with Infinite Action Spaces. Amaury Gouverneur, Borja Rodríguez Gálvez, Tobias J. Oechtering, Mikael Skoglund. 1-5 [doi]
- Selective Consistency Gradient Attack: Resolving Multi-Target Gradient Conflicts in Object Detection. Dong Huang, Tianrun Jia, Pengyu Zhang, Ruihang Ji, Shuzhi Sam Ge. 1-5 [doi]
- When SparseMoE Meets Noisy Interactions: An Ensemble View on Denoising Recommendation. Weipu Chen, Zhuangzhuang He, Fei Liu. 1-5 [doi]
- Classification Inconsistency Alignment Network for Cross-corpus Speech Emotion Recognition. Xiaoyan Zhou, Jiajie Li, Qida Yu, Quan Wu. 1-5 [doi]
- Adversarial Knowledge Transfer for Black-Box Model Inversion Attack. Xinhao Liu 0008, Zetao Lin, Yingzhao Jiang, Qiao Yan. 1-5 [doi]
- Dynamic Structure Hypergraph for Document-level Event Extraction. Qi Ren, Weihua Wang, Jie Yu, Guanglai Gao. 1-5 [doi]
- Improving Compressive Imaging Recovery via Measurement Augmentation. Romario Gualdrón-Hurtado, Roman Jacome, Leon Suarez-Rodriguez, Emmanuel Martínez 0002, Henry Arguello. 1-5 [doi]
- Optimal Device Selection and Resource Allocation in Federated Learning. Deepali Kushwaha, Rajesh M. Hegde. 1-5 [doi]
- SemiGPS: GraphGPS-based Semi-supervised Graph Learning for Sector-Specific GDP Mapping. Jinzhou Cao, Xiangxu Wang, Jiashi Chen, Bowen Zhang, Yahan Ma, Tianhong Zhao. 1-5 [doi]
- Anchor-Prompt-based Segmentation and Embedding Model. Shuman Li, Zhipeng Lin, Haotian Wang, Wenjing Yang, Hengzhu Liu. 1-5 [doi]
- A Geometry-Based Node Activation Method for Relative Localization. Licheng Wang, Yi Li, Hanying Zhao, Yuan Shen 0001. 1-5 [doi]
- CGEDN: Approximation of Graph Edit Distance with Path Generation via Learning Node Matching. Liu Yang, Qiankun Zheng, Zidong Wang. 1-5 [doi]
- PicoAudio: Enabling Precise Temporal Controllability in Text-to-Audio Generation. Zeyu Xie, Xuenan Xu, Zhizheng Wu 0001, Mengyue Wu. 1-5 [doi]
- Generalized Approximate Message-Passing for Compressed Sensing with Sublinear Sparsity. Keigo Takeuchi. 1-5 [doi]
- Efficient Large-Scale Scene Point Cloud Upsampling with Implicit Neural Networks and Spatial Hashing. Zhiyong Zhang, Yunrui Zhu, Ruyu Liu, Xu Cheng 0003, Jianhua Zhang 0002, Xiufeng Liu. 1-5 [doi]
- Speech Few-Shot Learning for Language Learners' Speech Recognition. Jian Cheng, Sam Nguyen. 1-5 [doi]
- Non-negative Weighted DAG Structure Learning. Samuel Rey, Seyed Saman Saboksayr, Gonzalo Mateos. 1-5 [doi]
- Music Tagging with Classifier Group Chains. Takuya Hasumi, Tatsuya Komatsu, Yusuke Fujita. 1-5 [doi]
- A Multi-Stage Feature Pipeline on Timestamped Speech Transcriptions for Dementia Assessment. Bernhard Thallinger, Laurin Wagner, Theresa Bloder, Mario Zusag. 1-2 [doi]
- FAST: Fast Audio Spectrogram Transformer. Anugunj Naman, Gaibo Zhang. 1-5 [doi]
- Speech-N-LlaMA: Improving Speech LLMs with Multi-Pass Training. Amit Kumar Singh Yadav, Gil Keren, Desh Raj, Wei Zhou, Junteng Jia, Ke Li 0018, Ying Xu, Chunyang Wu, Jay Mahadeokar, Ozlem Kalinli. 1-5 [doi]
- A2B: Neural Rendering of Ambisonic Recordings to Binaural. Israel D. Gebru, Todd Keebler, Jake Sandakly, Steven Krenn, Dejan Markovic, Julia Buffalini, Samuel Hassel, Alexander Richard. 1-5 [doi]
- BrainChat: Interactive Semantic Information Decoding from fMRI Using Large-Scale Vision-Language Pretrained Models. Wanqiu Huang, Ke Ma, Tingyu Xie, Hongwei Wang 0001. 1-5 [doi]
- DA-LIF: Dual Adaptive Leaky Integrate-and-Fire Model for Deep Spiking Neural Networks. Tianqing Zhang, Kairong Yu, Jian Zhang, Hongwei Wang. 1-5 [doi]
- CompMTL: Layer-Wise Competitive Multi-Task Learning. Tiancong Cheng, Ying Zhang 0047, Rajiv Ratn Shah, Roger Zimmermann, Zhiwen Yu 0001, Bin Guo 0001. 1-5 [doi]
- Temporal-Frequency State Space Duality: An Efficient Paradigm for Speech Emotion Recognition. Jiaqi Zhao, Fei Wang 0067, Kun Li 0008, Yanyan Wei, Shengeng Tang, Shu Zhao, Xiao Sun 0003. 1-5 [doi]
- Efficient Long Document Ranking via Adaptive Token Pruning with Query-Document Alignment. Beiya Dai, Meilin Chen, Peng Yu, Xinbing Wang, Chenghu Zhou, Zhouhan Lin. 1-5 [doi]
- Effective Integration of KAN for Keyword Spotting. Anfeng Xu, Biqiao Zhang, Shuyu Kong, Yiteng Huang, Zhaojun Yang, Sangeeta Srivastava, Ming Sun. 1-5 [doi]
- Adaptive Multi-Scale Local Correction for Semi-Supervised 3D Medical Image Segmentation. Xinqiang Wang, Wenhuan Lu, Ke Zheng, Junhai Xu. 1-5 [doi]
- BID-Net: Balanced Incremental Distillation Network for Fair Dermatological Disease Diagnosis. Yiqin Luo, Tianlong Gu, Fengrui Hao, Liang Chang 0003. 1-5 [doi]
- Super Capacity SRS Design for 5G and beyond using Channel In-painting. Usman Akram, Fan Zhang, Shawn Ma, Yang Li, Haris Vikalo. 1-5 [doi]
- Improved Techniques for Offline Reinforcement Learning: Advantage Value Estimation and Layernorm. Xiaosong Liu, Quan Liu, Lan Wu. 1-5 [doi]
- BCS-Net: Multi-Task Breast Cancer Screening Network Enhanced by Multi-Modality Attention. Ruili Li, Ruiyu Li, Eichi Takaya, Zizhen Lin, Tomoya Kobayashi, Nanako Mtsuda, Takuya Ueda. 1-5 [doi]
- A GNSS-IR Aided Multispectral Satellite Data Fusion for Meter-Level Wide-Area Volumetric Soil Moisture Estimation. Nicolás Padrón, Sergiy Vorobyov. 1-5 [doi]
- DistillW2N: A Lightweight One-Shot Whisper to Normal Voice Conversion Model Using Distillation of Self-Supervised Features. Tianyi Tan, Haoxin Ruan, Xin'an Chen, Kai Chen, Zhibin Lin, Jing Lu. 1-5 [doi]
- AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions. Yuanyuan Wang, Hangting Chen, Dongchao Yang, Zhiyong Wu 0001, Xixin Wu. 1-5 [doi]
- SHAP-Integrated Convolutional Diagnostic Networks for Feature-Selective Medical Analysis. Yan Hu, Ahmad Chaddad. 1-5 [doi]
- Interpreting Deep Neural Network-Based Receiver Under Varying Signal-To-Noise Ratios. Marko Tuononen, Dani Korpi, Ville Hautamäki. 1-5 [doi]
- LocRef-Diffusion: Tuning-Free Layout and Appearance-Guided Generation. Fan Deng, Yaguang Wu, XinYang Yu, Xiangjun Huang, Jian Yang, Guangyu Yan, Qiang Xu. 1-5 [doi]
- Exploiting Application-to-Architecture Dependencies for Designing Scalable OS. Yao Xiao, Nikos Kanakaris, Anzhe Cheng, Chenzhong Yin, Nesreen K. Ahmed, Shahin Nazarian, Andrei Irimia, Paul Bogdan. 1-5 [doi]
- PQNAS: Mixed-precision Quantization-aware Neural Architecture Search with Pseudo Quantizer. Tianxiao Gao, Li Guo 0006, Shihao Wang, Shiai Zhu, Dajiang Zhou. 1-5 [doi]
- Emotion-aware Structural Enhancement Graph Auto-Encoder for Rumor Detection. Guoyi Li, Zhongjiang Yao, Die Hu 0004, Yingrui Xu, Xiaodan Zhang 0004, Honglei Lyu. 1-5 [doi]
- Boli: A dataset for understanding stuttering experience and analyzing stuttered speech. Ashita Batra, Mannas Narang, Neeraj Kumar Sharma 0007, Pradip K. Das. 1-4 [doi]
- SCDiar: a streaming diarization system based on speaker change detection and speech recognition. Naijun Zheng, Xucheng Wan, Kai Liu, Huan Zhou 0008. 1-5 [doi]
- FAWL: Weakly-Supervised Video Corpus Moment Retrieval with Frame-Wise Auxiliary Alignment and Weighted Contrastive Learning. Yi Pan, Yujia Zhang 0001, Xiaoguang Zhao. 1-5 [doi]
- ASFC-NeRF: Large-Scale Scene Rendering with Adaptive Sampling and Feature-aware Compression. Xinrui Zhang, Yufeng Wang 0004, Shuangkang Fang, Zesheng Wang 0002, Huayu Zhang, Dacheng Qi, Wenrui Ding. 1-5 [doi]
- kNN-CL: Enhancing Continual Learning with Nearest Neighbor Retrieval. Enzhi Wang, Qicheng Li, Hao Chen, Ruiqi Sun, Xin Zhou. 1-5 [doi]
- DEBT: Enhancing Entity Alignment in Knowledge Graphs through Description Enrichment and Bootstrap Training. Ting Xiang, Jiapeng Zhang, Changjian Chen, Zhuo Tang. 1-5 [doi]
- ZOQO: Zero-Order Quantized Optimization. Noga Bar, Raja Giryes. 1-5 [doi]
- FedTLU: Federated Learning with Targeted Layer Updates. Jong-Ik Park, Carlee Joe-Wong. 1-5 [doi]
- Cross-Modality Fusion Mamba for All-in-One Extreme Weather-Degraded Image Restoration. Jiangang Ding 0001, Yihui Shan, Lili Pei, Yiquan Du, Yuanlin Zhao, Wei Li 0120. 1-5 [doi]
- Exact Solutions of the Inner Optimization Problem of Adversarial Robustness. Deepak Maurya, Adarsh Barik, Jean Honorio. 1-5 [doi]
- Personalized Graph Transformer for Federated Graph Learning. Haohe Jia, Yi Huang, Hongbin Zhu, Hongfeng Chai. 1-5 [doi]
- S-KEY: Self-supervised Learning of Major and Minor Keys from Audio. Yuexuan Kong, Gabriel Meseguer-Brocal, Vincent Lostanlen, Mathieu Lagrange, Romain Hennequin. 1-5 [doi]
- Mask Augmentation For Tumor Classification In Medical Images. Chun Chieh Weng, Huan-Yu Chen, Jing-Tong Tzeng, Ching-Heng Lin, Po-Chih Kuo, Chi-Chun Lee. 1-5 [doi]
- Localised Frequency Latent Domain Watermarking of DDIM Generated Images. Qiran Lai, Adrian G. Bors. 1-5 [doi]
- On Overlap Ratio in Defocused Electron Ptychography. Amirafshar Moshtaghpour, Angus I. Kirkland. 1-5 [doi]
- RetinaStereo: Dynamic-Volume Stereo Matching Network. Xiaoyan Liao, Haoliang Zhao, Fan Yang, Kwokching Cheung, Jun Jiang, Yong Zhao, Jie Chen, Xinan Wang. 1-5 [doi]
- Enhancing Speech Emotion Recognition with Speech Dynamic Modeling and Multi-Modal Knowledge Distillation. Chuanbo Zhu 0002, Chao Sun, Yifan Liu 0015, Jincai Chen, Ke Luo. 1-5 [doi]
- Gradient Norm-based Fine-Tuning for Backdoor Defense in Automatic Speech Recognition. Nanjun Zhou, Weilin Lin, Li Liu. 1-5 [doi]
- Reconciling AMP Algorithms derived from Belief Propagation or the Large System Limit Bethe Free Energy. Zilu Zhao, Fangqing Xiao, Christo Kurisummoottil Thomas, Dirk Slock. 1-5 [doi]
- Subsampling Decomposition based k-Space Refinement for Accelerated MRI Reconstruction. Xiaoyu Qiao, Weisheng Li 0001, Bin Xiao 0002, Yuping Huang, Lijian Yang. 1-5 [doi]
- Adaptive Time-Frequency Attention Network for Sleep Stage Classification Using Respiratory Signals. Zihang Liang, Kejing He. 1-5 [doi]
- Bayesian Filtering on Graphs. Bishwadeep Das, Madeline Navarro, Santiago Segarra, Elvin Isufi. 1-5 [doi]
- Deep Support Vein Machine for Lung Parcellation. Haichao Peng, Hao Fang, Wenkang Fan, Yong Wang, Sunkui Ke, Jie Luo, Xiongbiao Luo. 1-5 [doi]
- 3D Shape Classification by Registration: Neural-Network-Free and Training-Free. Chang Gou, Yuanqu Mou, Wenjie Li 0002, Neetesh Purohit, Suneel Yadav, Haiyang Bai, Xu Zhang, Lijun Chen 0006. 1-5 [doi]
- RelaI2P: Relational Learning for Image-to-Point Cloud Registration. Minghui Hou, Gang Wang, Zhiyang Wang, Baorui Ma. 1-5 [doi]
- Enhancing Information Extraction with METORIE: A Metaphor and Trap-Based Dataset for Cross-Domain Fine-Tuning. Zhengyuan Pan, Yilian Peng, ZhongQuan Jian, Yanhao Chen, Wentao Qiu, Haonan Ma, Junfeng Yao, Meihong Wang, Qingqiang Wu 0001. 1-5 [doi]
- Structural-Aware Disentangled Learning with CLIP for Hyperbolic Zero-Shot Sketch-Based Image Retrieval. Qing Zhang, Jing Zhang, Feilong Bao, Xiangdong Su, Guanglai Gao. 1-5 [doi]
- Hybrid Coding and Weakly-Supervised Approach for Depth Estimation from Wrapped Phase. Jie Ren, Chunqian Tan, Wanzhong Song. 1-5 [doi]
- Homogeneous Graph Extraction: An Approach to Learning Heterogeneous Graph Embedding. Shihao Gao, Xiaoyan Yu, Yu Cai, Xulong Zhang 0001, Jianzong Wang, Taisong Jin. 1-5 [doi]
- FedCAda: Adaptive Client-Side Optimization for Accelerated and Stable Federated Learning. Liuzhi Zhou, Yu He, Kun Zhai, Xiang Liu, Sen Liu, Xingjun Ma, Guangnan Ye, Hongfeng Chai. 1-5 [doi]
- Prompt-UIE: A Unified Prompt-Driven Framework for Underwater Image Enhancement. Yanling Zhang, Linxuan Luo, Pan Mu, Cong Bai. 1-5 [doi]
- Neural Variational Mode Decomposition and Its Application for ECG Denoising. De-Yan Lu, Jian-Jiun Ding, Yu Tsao. 1-5 [doi]
- Prototype Matching with Domain Alignment for Open-world Specific Emitter Identification. Wang Xiao, Yalan Ye, Tongjie Pan, Chenyang Li. 1-5 [doi]
- Multi-Modal Semantic Communication With Point Cloud Diffusion. Weiyan Feng, Zhenyu Shi. 1-5 [doi]
- Hierarchical Bayesian Estimation of COVID-19 Reproduction Number. Patrice Abry, Juliette Chevallier, Gersende Fort, Barbara Pascal. 1-5 [doi]
- CEMSSL: Conditional Embodied Self-Supervised Learning is All You Need for High-precision Multi-solution Inverse Kinematics of Robot Arms. Weiming Qu, Tianlin Liu, Jiawei Du, Dingsheng Luo. 1-5 [doi]
- The Impact of Decorrelation on Transformer Interpretation Methods: Applications to Clinical Speech AI. Lingfeng Xu, Kimberly D. Mueller, Julie Liss, Visar Berisha. 1-5 [doi]
- Full-Reference Point Cloud Quality Assessment with Multimodal Large Language Models. Ryosuke Watanabe, Tomoaki Konno, Hiroshi Sankoh, Bryan Tanaka, Tatsuya Kobayashi. 1-5 [doi]
- Unsupervised Motion-Robust Self-Distillation Framework for Remote Physiological Measurement. Anbang Liu, Shanlin Xiao, Wenming Zheng. 1-5 [doi]
- Multilingual Parameter-Sharing Adapters: A Method for Optimizing Low-Resource Neural Machine Translation. YunLong Zhang, Nan Chen, Yonghe Wang, Xiangdong Su, Feilong Bao. 1-5 [doi]
- Low-Complexity Cramér-Rao Lower Bound and Sum Rate Optimization in ISAC Systems. Tianyu Fang, Nhan Thanh Nguyen 0001, Markku J. Juntti. 1-5 [doi]
- Revisiting and Refining Lagunas' Beamforming for Acoustic Imaging. Xun Wang 0002, Jérôme Antoni, Jianing Li, Jean-Daniel Chazot, Jing Lin. 1-5 [doi]
- A Margin-Maximizing Fine-Grained Ensemble Method. Jinghui Yuan, Hao Chen, Renwei Luo, Feiping Nie 0001. 1-5 [doi]
- CSMT: Combining Snoring and Metadata-based Text for Sleep Apnea Severity Classification. Heng Li 0013, Yukun Qian, Yun Lu 0004, Mingjiang Wang. 1-5 [doi]
- Progressive Subband Modeling for Artifacts-free Speech Super-resolution. Donghyun Kim, Joon-Hyuk Chang. 1-5 [doi]
- Federated Hybrid-Supervised Learning for Universal Medical Image Segmentation. Shenhai Zheng, Sian Wen, Congyu Li, Qing Chen, Laquan Li. 1-5 [doi]
- IDE: A Multi-Agent-Driven Iterative Framework for Dynamic Evaluation of LLMs. Xin Tong, Bo Jin, Jingya Wang, Wenpeng Xing, Tian Xia, Meng Han. 1-5 [doi]
- Automatic recognition of rodent call types using deep supervectors. Fasih Haider, Raven Hickson, Peter Kind, Saturnino Luz. 1-5 [doi]
- Sequential DOA Trajectory Estimation using Deep Complex Network and Residual Signals. Shreyas Jaiswal, Peter Gerstoft, Santosh Nannuru. 1-5 [doi]
- Transformer-Enhanced Iterative Feedback Mechanism For Polyp Segmentation. Nikhil Kumar Tomar, Debesh Jha, Koushik Biswas, Ulas Bagci. 1-5 [doi]
- Uncertainty-aware Masked Modeling in Medical Imaging. Jiayu Zhang, Yuxin Cao, Dexuan Xu. 1-5 [doi]
- Swin-VasMamba: A Topologically Constrained Model For 3D Vascular Segmentation. Ziyu Liu, Jiaxuan Li, Xiangjian He, Qing Xu 0014, Xin Chen, Shoujun Zhou. 1-5 [doi]
- Convolutional Retentive Network for EEG Decoding. Junliang Wang, Wenlong Hang, Shuang Liang 0015, Qiong Wang 0001, Badong Chen, Jing Qin 0001. 1-5 [doi]
- SEE: Semantically Aligned EEG-to-Text Translation. Yitian Tao, Yan Liang, Luoyu Wang, Yongqing Li, Qing Yang, Han Zhang 0002. 1-5 [doi]
- WiSenseNet: A Unified Foundation Model for Diverse Wi-Fi Sensing Tasks Using Channel State Information. Niall Lyons, Ashutosh Pandey, Avik Santra. 1-5 [doi]
- Learning from Reconstruction: A Two-Stage Global-to-Local Framework for Temporal Knowledge Graph Completion. Wenjie Xu, Kai Liu, Zihao Jiang 0009, Mengting Song, Boyi Zhang, Min Peng 0002. 1-5 [doi]
- Channel and space-based joint rate allocation algorithm. Dayong Wang, Chao Yuan, Yu Sun, Xin Lu, Hui Guo, Frédéric Dufaux, Ce Zhu. 1-5 [doi]
- Efficient-USR: Prompt Guided Dual-Domain Feature Information for Efficient Underwater Image Super-Resolution. Alik Pramanick, Utsav Bheda, Arijit Sur. 1-5 [doi]
- UME: Upcycling Mixture-of-Experts for Scalable and Efficient Automatic Speech Recognition. Li Fu, Shanyong Yu, Siqi Li, Fan Lu 0003, Youzheng Wu, Xiaodong He 0001. 1-5 [doi]
- Divide-and-Conquer Variational Bayesian Inference for Multi-task Learning of High-resolution SAR Imagery. Lei Yang, Ming Sun, Zhongwei Hu, Zenan Zhang, Wenxuan Yuan. 1-5 [doi]
- EventLens: Enhancing Visual Commonsense Reasoning by Leveraging Event-Aware Pretraining and Cross-modal Linking. Mingjie Ma, Zhihuan Yu, Yichao Ma, Guohui Li 0001, Zhong Yang. 1-5 [doi]
- Unsupervised Hierarchical Dynamic Similarity Hashing for Multimedia Retrieval. Yunfei Chen 0015, Zhan Yang, Jun Long. 1-5 [doi]
- Efficient Quantization and Denoising Using Local Graph Fourier FramesPhilipp Reingruber, Gerald Matz. 1-5 [doi]
- A Two-Stage AIGC Image Quality Assessment with T2I Correspondence and Visual PerceptionJili Xia, Lihuo He, Bo Hu 0008, Bo Han 0004, Xinbo Gao 0001. 1-5 [doi]
- Cognitive Decline Detection using DLB Extraction PipelinesShibingfeng Zhang, Nadia Khlif, Marcello Ferro, Gloria Gagliardi, Fabio Tamburini. 1-2 [doi]
- Evidential Neural GPLDA: A Novel Approach to Quantify Prediction Uncertainty in Speaker Verification SystemsMiao Jing, Vidhyasaharan Sethu, Beena Ahmed. 1-5 [doi]
- Dike: Enhancing Fairness and Efficiency in GPU Clusters for Deep LearningBingting Jiang, Jing Yao, Heyi Mu, Xin Su 0001, Zhuo Tang. 1-5 [doi]
- Patch Attention Excitation Based Vision Transformer for Small-Sized DatasetsAkash Verma, Shiv Ram Dubey, Satish Kumar Singh. 1-5 [doi]
- Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and InferenceEdresson Casanova, Ryan Langman, Paarth Neekhara, Shehzeen Hussain, Jason Li, Subhankar Ghosh, Ante Jukic, Sang Gil Lee. 1-5 [doi]
- Object-Based Video Tampering Localization via Trace Consistency AnalysisPengfei Pei, Yun Cao, Jinchuan Li, Zeyu Zhang, Yuqi Pang. 1-5 [doi]
- Extending MPR for Locating a Moving Object Based on TDOA and FDOABeichuan Tang, Yimao Sun, Xiantao Heng, Yanbing Yang, Liangyin Chen. 1-5 [doi]
- Latent Diffusion Bridges for Unsupervised Musical Audio Timbre TransferMichele Mancusi, Yurii Halychanskyi, Kin Wai Cheuk, Eloi Moliner, Chieh-Hsin Lai, Stefan Uhlich, Junghyun Koo, Marco A. Martínez Ramírez, Wei-Hsiang Liao 0001, Giorgio Fabbro, Yuki Mitsufuji. 1-5 [doi]
- Learning Deep Frequency Degradation Prior for Remote Sensing Spatio-temporal FusionYiting Bian, Shenglong Hu, Huihui Song, Kaihua Zhang 0001. 1-5 [doi]
- DiffRS: An Extensible Diffusion Model for Remote Sensing Image GenerationXinyue Huang, Xin Niu 0002, Jingfei Jiang, Hengyue Pan. 1-5 [doi]
- OARecon: Object-Aware Viewpoint Augmentation for Indoor Compositional ReconstructionYuanyuan Ding, Yiming Fei, Jiandang Yang, Xiaobin Wei, Jiajun Lv, Yong Liu 0007. 1-5 [doi]
- Towards HRTF Personalization using Denoising Diffusion ModelsJuan Camilo Albarracín Sánchez, Luca Comanducci, Mirco Pezzoli, Fabio Antonacci. 1-5 [doi]
- Apollo: Band-sequence Modeling for High-Quality Audio RestorationKai Li, Yi Luo 0004. 1-5 [doi]
- Improving Speech Emotion Recognition in Under-Resourced Languages via Speech-to-Speech Translation with Bootstrapping Data SelectionHsi-Che Lin, Yi-Cheng Lin, Huang-Cheng Chou, Hung-yi Lee. 1-5 [doi]
- Identical-Delay Based 2-D DOA and Frequency Joint Estimation With Sub-Nyquist Sampling for URALiang Liu 0004, Xinyun Zhang, Xinyi Zhou, Lu Gan 0003, Jiancheng An, Hongbin Li 0001. 1-5 [doi]
- A Multi-Agent Multi-Environment Mixed Q-Learning for Partially Decentralized Wireless Network OptimizationTalha Bozkus, Urbashi Mitra. 1-5 [doi]
- CoLLAP: Contrastive Long-form Language-Audio Pretraining with Musical Temporal Structure AugmentationJunda Wu, Warren Li, Zachary Novack, Amit Namburi, Carol Chen, Julian J. McAuley. 1-5 [doi]
- Semantic Consistency And Integrity Network For Cloth-changing Person Re-identificationAnqi Wang, Liyan Zhang 0001. 1-5 [doi]
- A Study of Improving The Privacy-Utility Trade-off of Task-specific Models with Learnable PrivacySavas Özkan, Taha Ceritli, Jeongwon Min, Eunchung Noh, Jung Min Cho, Dookun Park, Mete Ozay. 1-5 [doi]
- Movable Antenna Aided Physical Layer Security with No Eavesdropper CSIZhenqiao Cheng, Chongjun Ouyang, Xingqi Zhang. 1-5 [doi]
- *Rick S. Blum, Brian M. Sadler. 1-5 [doi]
- Self-Trained Model for ECG Complex DelineationAram Avetisyan, Nikolas Khachaturov, Ariana A. Asatryan, Shahane Tigranyan, Yury Markin. 1-5 [doi]
- NCNet: Learning to Find Non-Consistent Correspondence Using Learnable Frequency Response FunctionRuiyuan Li, Zhaolin Xiao, Meng Zhang, Haiyan Jin, Haonan Su. 1-5 [doi]
- Text-guided Multimodal Fusion for the Multimodal Emotion and Intent Joint UnderstandingYu Zhang, Bin Chen, Hongfei Ye, Zijian Gao, Tianjiao Wan, Long Lan, Kele Xu. 1-2 [doi]
- Rethinking Early-Fusion Strategies for Improved Multimodal Image SegmentationZhengwen Shen, Yulian Li, Han Zhang, Yuchen Weng, Jun Wang. 1-5 [doi]
- Integrating Spectro-Temporal Cross Aggregation and Multi-Scale Dynamic Learning for Audio Deepfake DetectionYunqi Hao, Minqiang Xu, Yihao Chen, Yanyan Liu, Liang He, Lei Fang, Lin Liu. 1-5 [doi]
- Deep Dynamic Probabilistic Canonical Correlation AnalysisShiqin Tang, Shujian Yu, Yining Dong, S. Joe Qin. 1-5 [doi]
- Mel-Spectrogram Inversion via Alternating Direction Method of MultipliersYoshiki Masuyama, Natsuki Ueno, Nobutaka Ono. 1-5 [doi]
- Improving Height Prediction for Vision-Based Roadside 3D Object DetectionTengfei Zhang, Heng Zhang, RenGang Li, Yaqian Zhao, Qi Deng, Ruyang Li. 1-5 [doi]
- An Ensemble Approach to Short-form Video Quality Assessment Using Multimodal LLMWen Wen, Yilin Wang 0001, Neil Birkbeck, Balu Adsumilli. 1-5 [doi]
- 2AD: Dual Consistency Learning for Zero-Shot Anomaly DetectionXiaoling Wang, Ruilong Xing, Zhuotao Tian, Yijun Liu, Senqiao Yang, Yaowei Wang, Jingyong Su. 1-5 [doi]
- Dynamic Category Queries Transformer for Generalized Few-shot Semantic SegmentationKunze Huang, Jieyuan Yang, Andreas Jakobsson, Luyao Tang, Xiaotong Tu, Xinghao Ding, Yue Huang 0001. 1-5 [doi]
- Map-Free Visual Relocalization Enhanced by Instance Knowledge and Depth KnowledgeMingyu Xiao 0003, Runze Chen, Haiyong Luo, Fang Zhao 0003, Fan Wu 0006, Hao Xiong, Xuepeng Ma, Juan Wang. 1-5 [doi]
- InfoMin-based Query Embedding Optimization For Query-based Universal Sound SeparationZhen Wang, Jiqing Han 0001, Liwen Zhang, Youcheng Zhang. 1-5 [doi]
- Enhanced Multimodal Emotion Recognition in Conversations via Contextual Filtering and Multi-Frequency Graph PropagationHuan Zhao 0003, Yingxue Gao, Haijiao Chen, Bo Li, Guanghui Ye, Zixing Zhang 0001. 1-5 [doi]
- Successive Interference Cancellation-aided Diffusion Models for Joint Channel Estimation and Data Detection in Low Rank Channel ScenariosSagnik Bhattacharya, Muhammad Ahmed Mohsin, Kamyar Rajabalifardi, John M. Cioffi. 1-5 [doi]
- Bridging Task Boundaries: Remote Sensing Image-Text Retrieval via Dictionary-Driven AdaptationJunwei Xu, Tao Huang, Zhenyu Wang, Weisheng Dong, Xin Li. 1-5 [doi]
- TFS: Revisiting Temporal Language Grounding from Frequency Spiking PerspectiveYifan Lyu, Hongzhou Wu, Lixiang Liu, Chuxiong Sun. 1-5 [doi]
- Decentralized Stochastic Successive Convex Approximation for composite non-convex problems with non-linear functional constraintsBasil M. Idrees, Shivangi Dubey Sharma, Ketan Rajawat. 1-5 [doi]
- Learned Approximated Optimization for Rapid Low-Complexity Hybrid Beamforming DesignAmit Milstein, Tomer Yablonka, Nir Shlezinger. 1-5 [doi]
- Pancreatic Cystic Neoplasms Lesion Detection for Non-contrast CT Image via Teacher-student ModelMengjie Pan, Qiu Guan, Zhiqiang Yang, Zhongwen Yu, Haixia Long, Xinli Xu, Ruihui Wang, Zhehao An, Feng Chen 0038. 1-5 [doi]
- Continuous-Discrete Differentiable Particle Filters for Irregular Time SeriesHao Wen, Paul Krause, Lee Gillam. 1-5 [doi]
- Adapting Single-Channel Pre-trained Transformer Models for Multi-Channel Sound Event Localization and DetectionChangjiang He, Siyao Cheng, Jiahua Bao, Jie Liu 0001. 1-5 [doi]
- Low Complexity Riemannian Coordinate-Descent over Symmetric Positive Definite MatricesYogesh Darmwal, Ketan Rajawat. 1-5 [doi]
- Text-dependent Speaker Verification Challenge 2024: Exploring Shared and User-defined PassphrasesHossein Zeinali, Kong-Aik Lee, Jahangir Alam 0001, Lukás Burget. 1-5 [doi]
- Addressing Speed-Induced Dispersion in Stepped-Frequency PMCW Radar SystemsMoritz Kahlert, Tai Fei, Claas Tebruegge, Shunqiao Sun, Markus Gardill. 1-4 [doi]
- Martin: Mobility-Aware Reputation Mechanism for Federated LearningLixin Liu, Xin Chang, Hanqing Yang 0008, Jingyu Wang, Xiaolin Zhang, Cuiyun Shi. 1-5 [doi]
- Simple Adaptive Spectrum Graph Filters for Rumor DetectionNanjun Yu, Qiang Cao, Zheng Dong. 1-5 [doi]
- ECG-guided individual identification via PPGRiling Wei, Hanjie Chen, Kelu Yao, Chuanguang Yang, Jun Wang, Chao Li 0028. 1-5 [doi]
- Open-Vocabulary Visual Emotion Adaptation via Prompt LearningZhaopan Xu, Sicheng Zhao, Xiaojiang Peng, Hongxun Yao. 1-5 [doi]
- Fair MP-BOOST: Fair and Interpretable Minipatch BoostingCamille Olivia Little, Genevera I. Allen. 1-5 [doi]
- FBSE-FTFCWT-Based Novel Automated Framework for Dysarthric Speech DetectionAmishi Vijay, Ram Bilas Pachori, Balasubramanyam Appina, Nitya Tiwari. 1-5 [doi]
- Denoising and Restoring Channel State Information for 5G Indoor Positioning in Low-SNR ScenariosJianing Chen, Junze Yang, Chuhao Chen 0002, Xiangxu Meng, Wenqi Zheng. 1-5 [doi]
- A Hierarchical Taxonomy For Deep State Space ModelsShiqin Tang, Pengxing Feng, Shujian Yu, Yining Dong, S. Joe Qin. 1-5 [doi]
- Robust Semantic Communications for Speech TransmissionZhenzi Weng, Zhijin Qin, Geoffrey Ye Li. 1-5 [doi]
- RF-GML: Reference-Free Generative Machine ListenerArijit Biswas, Guanxin Jiang. 1-5 [doi]
- Learning Control of Neural Sound Effects Synthesis from Physically Inspired ModelsYisu Zong, Joshua Reiss. 1-5 [doi]
- Dynamic Frequency-Adaptive Knowledge Distillation for Speech EnhancementXihao Yuan, Siqi Liu, Hanting Chen, Lu Zhou, Jian Li, Jie Hu. 1-5 [doi]
- GeMIMO: Searching the Cores of X-formers for Time Series ForecastingZhicheng Zhang, Yong Wang, Shaoqi Tan, Yujie Luo. 1-5 [doi]
- Improved Recognition of the Speech of People with Parkinson's Who StutterJonghwan Na, Xiuwen Zheng 0003, Bowon Lee, Mark Hasegawa-Johnson. 1-5 [doi]
- Leveraging Multimodal Methods and Spontaneous Speech for Alzheimer's Disease IdentificationYiFan Gao, Long Guo, Hong Liu. 1-2 [doi]
- No-Reference Point Cloud Quality Assessment Based on Graph Signal VariationRyosuke Watanabe, Keisuke Nonaka, Eduardo Pavez, Tatsuya Kobayashi, Antonio Ortega. 1-5 [doi]
- A Diffusion Model over Directed Acyclic Graphs for Event Schema GenerationGuoxuan Ding, Haotian Jin, Xiaobo Guo, Xin Wang, Nan Mu, Lei Wang, Daren Zha. 1-5 [doi]
- Class-Difficulty Aware Hybrid Active LearningYaling Ge, Xun Pu, Jun Zhou. 1-5 [doi]
- AnimateSketches: Animate Sketches with Instance-Aware MaskHaoge Deng, Xin Dai, Jijin Hu, Yonggang Qi. 1-5 [doi]
- LDGNet: LLMs Debate-Guided Network for Multimodal Sarcasm DetectionHengyang Zhou, Jinwu Yan, Yaqing Chen, Rongman Hong, Wenbo Zuo, Keyan Jin. 1-5 [doi]
- Contrast Memory for Unsupervised Anomaly DetectionJiahao Li 0007, Yiqiang Chen 0001, Yunbing Xing, Yang Gu 0001, Xiangyuan Lan. 1-5 [doi]
- KCGAFormer: When Large-Kernel ConvFormer Meets KAN in Semantic SegmentationShihao Wang, Zhengxing Huang, Enguang Zuo, Alimjan Aysa, Kurban Ubul. 1-5 [doi]
- Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning DataKe-Han Lu, Zhehuai Chen, Szu-Wei Fu, Chao-Han Huck Yang, Jagadeesh Balam, Boris Ginsburg, Yu-Chiang Frank Wang, Hung-yi Lee. 1-5 [doi]
- Enhanced Multimodal Depression Detection With Emotion PromptsShiyu Teng, Jiaqing Liu, Hao Sun 0013, Shurong Chai, Tomoko Tateyama, Lanfen Lin, Yen-Wei Chen 0001. 1-5 [doi]
- Graph-Driven Insights: Enhancing Stock Market Prediction with Relational Temporal DynamicsRenjun Jia, Kaiming Yang, Dawei Cheng, Li Han 0001, Yuqi Liang. 1-5 [doi]
- COREMIL: Contextual Position Encoding-based Retrievable Multiple Instance Learning for Slide-level ClassificationBingchen Li, Qiming He, Junru Cheng, Tian Guan, Yonghong He, Guangde Zhou. 1-5 [doi]
- Mamba-based Segmentation Model for Speaker DiarizationAlexis Plaquet, Naohiro Tawara, Marc Delcroix, Shota Horiguchi, Atsushi Ando, Shoko Araki. 1-5 [doi]
- Uniform Distribution Based Learnable Quantization via Self-Knowledge DistillationPeng Hu, Biao Leng. 1-5 [doi]
- SymGaussian: Occluded Human Rendering with Multi-scale Symmetry Feature from Monocular VideoZekai Jiang, Tong Duan, Dongyu Zhang. 1-5 [doi]
- Efficient and Effective Model ExtractionHongyu Zhu 0004, Wentao Hu, Sichu Liang, Fangqi Li, Wenwen Wang, Shilin Wang. 1-5 [doi]
- Adaptive-Similarity-Based Brain Dynamic Functional Connectivity with Spatial-Temporal Attention and Domain Adaptation for Schizophrenia DiagnosisYixin Ji, Vince D. Calhoun, Rongtao Jiang, Daoqiang Zhang, Shile Qi. 1-5 [doi]
- Implicit and Explicit Rule Injection for Complex Query Answering over Knowledge GraphsXin Zhang, Zhe Wang, Guozheng Rao, Kewen Wang. 1-5 [doi]
- FCConDubber: Fine And Coarse Grained Prosody Alignment For Expressive Video Dubbing via Contrastive Audio-Motion PretrainingQiulin Li, Zhichao Wu, Hanwei Li, Xin Dong, Qun Yang. 1-5 [doi]
- MPNAS: Multimodal Sentiment Analysis Pruning via Neural Architecture SearchBinyan Zhang, Ao Ren, Zihao Zhang, Moming Duan, Duo Liu, Yujuan Tan, Kan Zhong. 1-5 [doi]
- Dual-Function Waveform Design in Wireless Sensor Networks via SoS OptimizationRobin Amar, Linlong Wu, Saeid Sedighi, Mohammad Alaee Kerahroodi, M. R. Bhavani Shankar. 1-5 [doi]
- Step-by-Step Correction of LLM-based Math Word Problems SolutionsYiyao Li, Dhanish Musharraf Ubaidali, Lu Wang, Wenyu Zhang. 1-5 [doi]
- GATOmics: A Novel Multi-Omics Graph Attention Network Model for Cancer Driver Gene DetectionGe Kong, Jiao Wang, Juan Wang. 1-5 [doi]
- Transfer Risk Map: Mitigating Pixel-level Negative Transfer in Medical SegmentationShutong Duan, Jingyun Yang, Yang Tan, Guoqing Zhang, Yang Li, Xiao-Ping Zhang. 1-5 [doi]
- Augmenting Short Enrollment Speech via Synthesis for Target Speaker ExtractionZikang Huang, Jingru Lin, Meng Ge, Yu Jiang, Xiaobao Wang, Longbiao Wang, Jianwu Dang 0001. 1-5 [doi]
- kNN-SVC: Robust Zero-Shot Singing Voice Conversion with Additive Synthesis and Concatenation Smoothness OptimizationKeren Shao, Ke Chen 0021, Matthew Baas, Shlomo Dubnov. 1-5 [doi]
- Scaling Multilingual Visual Speech RecognitionK. R. Prajwal, Sindhu B. Hegde, Andrew Zisserman. 1-5 [doi]
- Disentanglement Analysis in Deep Latent Variable Models Matching Aggregate Posterior DistributionsSurojit Saha, Sarang C. Joshi, Ross T. Whitaker. 1-5 [doi]
- Model-Driven Learning Approach for Robust WiFi-based Fall DetectionSai Deepika Regani, Beibei Wang 0001, K. J. Ray Liu. 1-5 [doi]
- Learning Joint Appearance and Shape Co-Representations for Co-Saliency DetectionGuanting Guo, Shenglong Hu, Huihui Song, Kaihua Zhang 0001. 1-5 [doi]
- CA-MHFA: A Context-Aware Multi-Head Factorized Attentive Pooling for SSL-Based Speaker VerificationJunyi Peng, Ladislav Mosner, Lin Zhang, Oldrich Plchot, Themos Stafylakis, Lukás Burget, Jan Cernocký. 1-5 [doi]
- SPT: Sequence Prompt Transformer for Interactive Image SegmentationSenlin Cheng, Haopeng Sun, Tao Xie, Hangyue Zhao, Yiqiang Chen 0001, Bolei Xu, Xiaobo Li. 1-5 [doi]
- Synergistic Integration of Cross-Spatial Learning for Lightweight Crack DetectionSenyao Li, Jingling Yuan, Huilin Zhu, Xian Zhong. 1-5 [doi]
- CIEGCL: Counterfactual Intervention Enhancing Graph Contrastive Learning in Implicit FeedbackZijian Huang 0016, Tian Jiang, Yan Feng, Zerui Wen, Xiaohui Cui. 1-5 [doi]
- Speech Recognition for Automatically Assessing Afrikaans and isiXhosa Preschool Oral NarrativesChristiaan Jacobs, Annelien Smith, Daleen Klop, Ondrej Klejch, Febe de Wet, Herman Kamper. 1-5 [doi]
- EfficientNet-Gaze: Integrating Multi-Scale Feature Extraction with Frequency Domain Analysis for Efficient Gaze EstimationYanxia Wang, Guoyu Xia. 1-5 [doi]
- OV-HHIR: Open Vocabulary Human Interaction Recognition Using Cross-modal Integration of Large Language ModelsLala Shakti Swarup Ray, Bo Zhou 0005, Sungho Suh, Paul Lukowicz. 1-5 [doi]
- Context-Guided Active Domain Adaptation for Blended Target DomainYuwu Lu, Yihan Yang. 1-5 [doi]
- Multi-Agent Hierarchical Graph Attention Actor-Critic Reinforcement LearningTongyue Li, Dianxi Shi, Songchang Jin, Zhen Wang, Huanhuan Yang, Yang Chen. 1-5 [doi]
- Whisper in Medusa's Ear: Multi-head Efficient Decoding for Transformer-based ASRYael Segal-Feldman, Aviv Shamsian, Aviv Navon, Gill Hetz, Joseph Keshet. 1-5 [doi]
- LiDAR Light Scattering Augmentation (LISA): Physics-based Simulation of Adverse Weather Conditions for 3D Object DetectionVelat Kilic, Deepti Hegde, A. Brinton Cooper, Vishal M. Patel, Mark A. Foster. 1-5 [doi]
- Large Language Models are Strong Audio-Visual Speech Recognition LearnersUmberto Cappellazzo, Minsu Kim, Honglie Chen, Pingchuan Ma 0001, Stavros Petridis, Daniele Falavigna, Alessio Brutti, Maja Pantic. 1-5 [doi]
- KSSANet: KAN-Driven Spatial-Spectral Attention Networks for Hyperspectral Image Super-ResolutionBaisong Li, Xingwang Wang, Haixiao Xu. 1-5 [doi]
- Gaussian-Face: Talking Head Generation with Hybrid Density via 3D Gaussian SplattingGuanwen Feng, Yilin Zhang, Yunan Li 0001, Siyu Jin, Qiguang Miao. 1-5 [doi]
- Hierarchical Context Interaction and Reasoning with Transformer for Emotion RecognitionWenxuan Wang, Chenglei Wang, Xiaomei Wang, Xuli Shen, Qing Xu 0017, Xuelin Qian. 1-5 [doi]
- Learned ReLU-Based Soft Thresholding: A Data-Driven Method for Non-Negative Sparse Signal RecoveryAkash Sen, Pradyumna Pradhan, Ramunaidu Randhi, C. S. Sastry. 1-5 [doi]
- Enhancing 3D Medical Image Understanding with 2D Multimodal Large Language ModelsQiuhui Chen, Xuancheng Yao, Huping Ye, Yi Hong. 1-5 [doi]
- VPCI: Self-Supervised Visual Prompt-Guided Cross-Domain Interactive Image Fusion FrameworkYong Liu, Chengyu Wu, Jiayuan Cui, Bin Jiang. 1-5 [doi]
- Interactive and Balanced Multimodal Learning via Cross Attention and Gradient Modulation for Compressed Video Action RecognitionXinqi Li, Shaojie Li, Ming Ma 0006. 1-5 [doi]
- FEA-DETR: An Enhanced ConvNet for Detecting Prohibited Objects in X-Ray Images Using Frequency and Edge Aware InformationShilong Hong, Yanzhou Zhou, Weichao Xu. 1-5 [doi]
- Enhancing Federated Domain Adaptation via Multi-Granular Fine-Grained AlignmentZiyun Cai, Shangshang Song, Jie Song 0014, Yawen Huang, Changhui Hu 0001, Xiao-Yuan Jing. 1-5 [doi]
- MetaCon: Revitalizing Internet Congestion Control with Meta-Reinforcement LearningHe Bai 0011, Hui Li, Jianming Que, Minglong Zhang, Peter Han Joo Chong, Kalupahana Liyanage Kushan Sudheera, Xinyuan Pei. 1-5 [doi]
- Appearance-adapter: A Self-supervised Pose-guided Human Image Synthesis ApproachSameer Malik, Moyuru Yamada. 1-5 [doi]
- LP-Gaussians: Learnable Parametric Gaussian Splatting for Efficient Dynamic Reconstruction of Single-View ScenesShaoqi Wu, Weixing Xie, Youhong Peng, Jinwen Li, Jiawei Yao, Junfeng Yao. 1-5 [doi]
- Planing It by Ear: Convolutional Neural Networks for Acoustic Anomaly Detection in Industrial Wood PlanersAnthony Deschênes, Rémi Georges, Cem Subakan, Bruna Ugulino, Antoine Henry, Michael Morin. 1-5 [doi]
- DAREK - Distance Aware Error for Kolmogorov NetworksMasoud Ataei, Mohammad Javad Khojasteh, Vikas Dhiman. 1-5 [doi]
- Unsupervised Multi-View Outlier Detection via Optimal Graph FilteringZhiguo Hu, Ning Wang, Peng Zhou 0006, Liang Du 0003, Yuhua Qian, Cheng Wang, YanMing Zhang. 1-5 [doi]
- LTOS: Layout-controllable Text-object Synthesis via Adaptive Cross-attention FusionsXiaoran Zhao, Tianhao Wu, Yu Lai, Zhiliang Tian, Zhen Huang, Yahui Liu, Zejiang He, Dongsheng Li. 1-5 [doi]
- Hedging Is Not All You Need: A Simple Baseline for Online Learning Under Haphazard InputsHimanshu Buckchash, Momojit Biswas, Rohit Agarwal, Dilip K. Prasad. 1-5 [doi]
- SUGAR: Leveraging Contextual Confidence for Smarter RetrievalHanna Zubkova, Ji Hoon Park, Seong-Whan Lee. 1-5 [doi]
- Triple Path Enhanced Neural Architecture Search for Multimodal Fake News DetectionBo Xu 0009, Qiujie Xie, Jiahui Zhou, Linlin Zong. 1-5 [doi]
- Unsupervised Domain Adaptation for Music Transcription: Exploiting Cross-Version ConsistencyLele Liu, Christof Weiß. 1-5 [doi]
- Micro-expression Spotting based on Multi-modal Hierarchical Semantic-guided Deep Fusion ModelZhihua Xie, Haolin Chang. 1-5 [doi]
- Brick-Diffusion: Generating Long Videos with Brick-to-Wall DenoisingYunlong Yuan, Yuanfan Guo, Chunwei Wang, Hang Xu, Li Zhang. 1-5 [doi]
- Catch Causal Signals from Edges for Label Imbalance in Graph ClassificationFengrui Zhang, Yujia Yin, Hongzong Li, Yifan Chen, Tianyi Qu. 1-5 [doi]
- REAct: Rational Exponential Activation for Better Learning and Generalization in PINNsSourav Mishra, Shreya Hallikeri, Suresh Sundaram. 1-5 [doi]
- RWKVMatch: Vision RWKV-based Multi-scale Feature Matching Network for Unsupervised Deformable Medical Image RegistrationZixuan He, Jing Tang, Zitong Zhao, Zeyu Gong. 1-5 [doi]
- Intent-driven In-context Learning for Few-shot Dialogue State TrackingZihao Yi, Zhe Xu, Ying Shen 0001. 1-5 [doi]
- Semi-Supervised Speaker Diarization Using Graph Transformers and LLMs on Naturalistic Apollo 11 DataMeena M. Chandra Shekar, John H. L. Hansen. 1-5 [doi]
- RF-Pose Estimation based on Contrastive Camera-Radar-Images PretrainingYen-Hsiang Tseng, Po-Hsuan Tseng. 1-5 [doi]
- Token-Level Contextual Network with Ladder-Shaped Attention for End-to-End ASRMing Fang, Kai Guo, Tao Wei 0003, Ziyang Zhuang, Yan Shi, Ning Cheng 0001, Shaojun Wang, Jing Xiao 0006. 1-5 [doi]
- Mamba Fusion: Learning Actions Through QuestioningApoorva Beedu, Zhikang Dong, Jason Sheinkopf, Irfan Essa. 1-5 [doi]
- Editing Music with Melody and Text: Using ControlNet for Diffusion TransformerSiyuan Hou, Shansong Liu, Ruibin Yuan, Wei Xue, Ying Shan, Mangsuo Zhao, Chao Zhang. 1-5 [doi]
- KAnoCLIP: Zero-Shot Anomaly Detection through Knowledge-Driven Prompt Learning and Enhanced Cross-Modal IntegrationChengyuan Li, Suyang Zhou, Jieping Kong, Lei Qi, Hui Xue. 1-5 [doi]
- MHSDB: A Comprehensive Benchmark for Multimodal Humor and Sarcasm Detection Leveraging Foundation ModelsZhongren Dong, Donghao Wang, Ciqiang Chen, Dong-Yan Huang, Zixing Zhang 0001. 1-5 [doi]
- VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-SpeechChenpeng Du, Yiwei Guo, Hankun Wang, Yifan Yang 0005, Zhikang Niu, Shuai Wang 0016, Hui Zhang, Xie Chen 0001, Kai Yu 0004. 1-5 [doi]
- A Novel Audio-Visual Multimodal Semi-Supervised Model Based on Graph Neural Networks for Depression DetectionYaqin Li, Chenjian Sun, Yihong Dong. 1-5 [doi]
- Self-Incremental Training for Personalized Voice Command Recognition in a Wireless Audio Sensor NetworkManuele Rusci, Hugo Van Hamme, Tinne Tuytelaars. 1-5 [doi]
- ToolFiVe: Enhancing Tool-Augmented LLMs via Tool Filtering and VerificationHailun Lu, Xingming Li, Xuanyu Ji, Zhigang Kan, Qingyong Hu. 1-5 [doi]
- DMIBot: Dynamic Multimodal Interaction for Twitter Bot DetectionXiezhuo Lin, Qingfeng Wu, YuRui Huang. 1-5 [doi]
- D3RM: A Discrete Denoising Diffusion Refinement Model for Piano TranscriptionHounsu Kim, Taegyun Kwon, Juhan Nam. 1-5 [doi]
- Developing a Multilingual Dataset and Evaluation Metrics for Code-Switching: A Focus on Hong Kong's Polylingual DynamicsPeng Xie, Kani Chen. 1-5 [doi]
- When CLIP Meets PHOC: A Dual-Branch Network for Historical Document Image RetrievalJing Zhang, Hongxi Wei, Qing Zhang. 1-5 [doi]
- Topology Decoupled All-reduce AlgorithmRuixing Zong, Jiapeng Zhang, Zhuo Tang, Anwitaman Datta. 1-5 [doi]
- Causal Speech Enhancement Based on a Two-Branch Nested U-Net Architecture Using Self-Supervised Speech EmbeddingsSeorim Hwang, Sungwook Park, Youngcheol Park. 1-5 [doi]
- Lightweight Multi-Frequency Enhancement Network for RGB-D Video Salient Object DetectionDaerji Suolang, Jiahao He, Wangchuk Tsering, Keren Fu, Xiaofeng Li, Qijun Zhao. 1-5 [doi]
- Deep Learning for Modulo Sampling of FRI SignalsSem Koenen, Vincent van de Schaft, Ruud J. G. van Sloun. 1-5 [doi]
- Universal Low-Resource Speech Synthesis Via Phoneme Fusion Coordinating Low-Rank DecompositionYanliang Li, Zhengtao Yu, Linqin Wang, Shengxiang Gao, Ling Dong, Wenjun Wang. 1-5 [doi]
- Soft Knowledge Distillation with Multi-Dimensional Cross-Net Attention for Image Restoration Models CompressionYongheng Zhang, Danfeng Yan. 1-5 [doi]
- Learning to Follow Infrared Prior Representation for Image DehazingYiquan Du, Ming Yang, Haiyang Huo, Jiaqi Shi, Jiangang Ding 0001, Lili Pei. 1-5 [doi]
- Meta-Conscious Driven Domain-Aware Federated LearningZilong Yin, Haoyu Wang, Xiaogang Lin, Xin Zhang, Bin Chen, Chenyu Zhou. 1-5 [doi]
- LLaQo: Towards a Query-Based Coach in Expressive Music Performance AssessmentHuan Zhang, Vincent K. M. Cheung, Hayato Nishioka, Simon Dixon, Shinichi Furuya. 1-5 [doi]
- Advanced Zero-Shot Text-to-Speech for Background Removal and Preservation with Controllable Masked Speech PredictionLeying Zhang, Wangyou Zhang, Zhengyang Chen, Yanmin Qian. 1-5 [doi]
- Dynamic SRM Curriculum for Trustworthy Multi-modal ClassificationJian Zhu, Cui Yu, Xin Zou 0001, Zhangmin Huang, Chenshu Hu, Jun Sun, Bo Lyu, Lei Liu 0029, Chang Tang, Li-Rong Dai 0001. 1-5 [doi]
- LABEL-SAM: A Semi-Automatic Interactive Annotation Model for Aortic Dissection Segmentation in 3D CTA ImageWenjie Cai, Tao Tang, Balachander J., Lingming Kong, Ying Zhou, Qingfeng Wang, Jing Li. 1-5 [doi]
- Continual Unsupervised Domain Adaptation for Audio Deepfake DetectionXiaohuan Chen, Wenhuan Lu, Ruiteng Zhang, Junhai Xu, Xugang Lu, Lin Zhang, Jianguo Wei. 1-5 [doi]
- DRANet: Dual-threshold Guided Reliability Aware Network for Semi-Supervised Image Semantic SegmentationLizhe Qiu, Mingyang Zhang, Yuan Zheng. 1-5 [doi]
- Single Trial Reaction Time Prediction Using Optimal Synchrony Window DetectionAdarsh V. Parekkattil, Sanjeev Kumar Varun, Tharun Kumar Reddy Bollu. 1-5 [doi]
- UIDAPLE: Unsupervised Incremental Domain Adaptation through Adaptive Prompt LearningSamrat Mukherjee, Tanuj Sur, Saurish Seksaria, Subhasis Chaudhuri, Gemma Roig, Biplab Banerjee. 1-5 [doi]
- MambaInst: Lightweight State Space Model for Real-Time Instance SegmentationZeyu Wang, Chen Li, Huiying Xu, Xinzhong Zhu, Xiao Huang, Hongbo Li. 1-5 [doi]
- Fully Spiking Neural Network for Legged RobotsXiaoyang Jiang, Qiang Zhang 0029, Jingkai Sun, Jiahang Cao, Jingtong Ma, Renjing Xu. 1-5 [doi]
- Improving Micro-expression Recognition using Multi-sequence Driven Face GenerationYuan Chen, Chongju Zhong, Pinyi Huang, Wangyang Cai, Lei Wang. 1-5 [doi]
- Model-Based Machine Learning for Max-Min Fairness Beamforming Design in JCAS SystemsMengyuan Ma, Tianyu Fang, Nir Shlezinger, A. Lee Swindlehurst, Markku J. Juntti, Nhan Nguyen. 1-5 [doi]
- FedImp: Federated Learning Using Important Layers of Client Models for the Diagnosis of Breast Cancer Histopathology ImagesMangaldeep Banerjee, Angshuman Paul. 1-5 [doi]
- 3D-Speaker-Toolkit: An Open-Source Toolkit for Multimodal Speaker Verification and DiarizationYafeng Chen, Siqi Zheng, Hui Wang, Luyao Cheng, Tinglong Zhu, Rongjie Huang, Chong Deng, Qian Chen, Shiliang Zhang, Wen Wang, Xihao Li. 1-5 [doi]
- EEG Correlation Analysis-guided Graph Local Enhanced Feature Learning For Emotion RecognitionXinhui Li, Guowang Zhuang, Minchao Wu, Zhao Lv. 1-5 [doi]
- PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music ProcessingPhillip Long, Zachary Novack, Taylor Berg-Kirkpatrick, Julian J. McAuley. 1-5 [doi]
- PPDformer: Channel-Specific Periodic Patch Division for Time Series ForecastingMeng Wan, Qi Su, Huan Hao, Jue Wang 0013, Yuexiu Cui, Yuxuan Bi, Rongqiang Cao, Peng Shi 0006, Yangang Wang, Zonghua Qiu, Zongshan Zhang. 1-5 [doi]
- Uniform Convergence of Lipschitz Functions with Dependent Gaussian SamplesMina Sadat Mahmoudi, Saeed Foroutan, Seyed Abolfazl Motahari, Babak H. Khalaj. 1-5 [doi]
- DiffSSD: A Diffusion-Based Dataset For Speech ForensicsKratika Bhagtani, Amit Kumar Singh Yadav, Paolo Bestagini, Edward J. Delp. 1-5 [doi]
- Data-Driven White Noise Gain Constrained Robust Superdirective Beamformer for Speech EnhancementHanchen Pei, Gongping Huang, Jilu Jin, Jianbo Ma, Zhizheng Wu 0001, Jingdong Chen, Jacob Benesty. 1-5 [doi]
- Quantum-Behaved Particle Swarm Optimization for the Segmentation of Kidney Stone CT ImagesSajad Ahmad Rather, Akhilesh Kandwal, Mohammad Khalid Pandit, Partha Pratim Roy 0001. 1-5 [doi]
- Physically Robust and Imperceptible Adversarial Examples Generation Based on FrequencyChengyao Hua, Yitian Chen, Shigeng Zhang, Xuan Liu 0001, Senzhang Wang, Weiping Wang 0003, Kai Chen 0012. 1-5 [doi]
- Enhancing Incomplete Multimodal Learning via Modal Complementary RecoveringMeirong Ding, Hongyi Lin, Jiawei Zhu, Chuang Zou, Wenxiu Cai, Bingzhi Chen. 1-5 [doi]
- Bridging Neural and Symbolic Reasoning: A Dual-System Framework for Interpretable Question AnsweringJihao Shi, Xiao Ding, Hengwei Zhao, Ting Liu 0001, Bing Qin 0001. 1-5 [doi]
- Spatial Frequency Interleaving Residual Autoencoder for Indoor Radio Map ReconstructionShitong Chai, Jiahui Li 0006, Mengyao Ma, Junwen Xie, Xiaopeng Fan, Xianqi Zhang. 1-5 [doi]
- LLM-Augmented Symbolic RL with Landmark-Based Task DecompositionAlireza Kheirandish, Duo Xu, Faramarz Fekri. 1-5 [doi]
- Enhancing Federated Knowledge Distillation in Heterogeneous and Non-IID ScenariosWenjie Lv, Yu He, Sen Liu, Xingjun Ma, Xiang Liu, Guangnan Ye, Hongfeng Chai. 1-5 [doi]
- SUFT: Sparse and Uncertain Fusion Transformers for Multi-Atlas Brain Network AnalysisZhan Su, Jiashuang Huang, Shu Jiang, Mingliang Wang, Weiping Ding 0001. 1-5 [doi]
- PGD-Imp: Rethinking and Unleashing Potential of Classic PGD with Dual Strategies for Imperceptible Adversarial AttacksJin Li, Zitong Yu, Ziqiang He, Z. Jane Wang 0001, Xiangui Kang. 1-5 [doi]
- Meta-Analogy Learning Based on Dynamic Graph Neural Networks for Inductive Knowledge Graph Link PredictionJingyu Wang, Zhijuan Du, Tao Sun. 1-5 [doi]
- Graph Embedded Stochastic Configuration Networks for Imbalanced Data ClassificationYuanhang Qiu, Dianhui Wang. 1-4 [doi]
- Robust CLIP-Guided Deep Thinking: A Two-Stage Optimization Strategy for Enhancing Adversarial Robustness and Reliability in LVLMsYize Sui, Wanrong Huang, Wenjing Yang 0002, Chaofan Zhao, Jing Ren, Ji Wang 0001. 1-5 [doi]
- Enhancing Multilingual ASR for Unseen Languages via Language Embedding ModelingShao-Syuan Huang, Kuan-Po Huang, Andy T. Liu, Hung-yi Lee. 1-5 [doi]
- CGNet: Classification-Guided Multi-Task Interactive Network for Hyperspectral and Multispectral Image FusionGuohua Lv, Yanlong Xu, Yongbiao Gao, Guixin Zhao, Guotao Wang, Xiangcheng Sun. 1-5 [doi]
- MusicMamba: A Dual-Feature Modeling Approach for Generating Chinese Traditional Music with Modal PrecisionJiatao Chen, Xing Tang 0001, Tianming Xie, Jing Wang 0063, Wenjing Dong, Bing Shi 0002. 1-5 [doi]
- VoiceGuider: Enhancing Out-of-Domain Performance in Parameter-Efficient Speaker-Adaptive Text-to-Speech via AutoguidanceJiheum Yeom, Heeseung Kim, Jooyoung Choi, Che Hyun Lee, Nohil Park, Sungroh Yoon. 1-5 [doi]
- LiDAR-SPD: Improving Adversarial Robustness of 3D Object Detection via Spherical Projection and DiffusionMumuxin Cai, Xupeng Wang 0001, Ferdous Sohel, Hang Lei. 1-5 [doi]
- Hierarchical Loss for Bi-Level Classification of Speech into Language and DialectsAnanya Angra, Muralikrishna H, Dileep Aroor Dinesh, Veena Thenkanidiyoor. 1-5 [doi]
- EffectiveASR: A Single-Step Non-Autoregressive Mandarin Speech Recognition Architecture with High Accuracy and Inference SpeedZiyang Zhuang, Chenfeng Miao, Kun Zou, Ming Fang, Tao Wei 0003, Zijian Li, Ning Cheng 0001, Wei Hu, Shaojun Wang, Jing Xiao 0006. 1-5 [doi]
- MMEditor: Multimodal Prompt-Driven 3D Gaussian Splatting EditingLihua Lu, RenGang Li, Yaqian Zhao, Xiaohui Zhang 0017, Hui Wei 0005, Ruyang Li. 1-5 [doi]
- SlimSpeech: Lightweight and Efficient Text-to-Speech with Slim Rectified FlowKaidi Wang, Wenhao Guan, Shenghui Lu, Jianglong Yao, Lin Li, Qingyang Hong. 1-5 [doi]
- A Key to Effective Multi-task Learning: Separate Query Selection for Task-Synergized Handling and Node UtilizationShan-Ya Yang, Hao-Chung Cheng, Chien-Yao Wang, Jia-Ching Wang, Chun-Yi Lee. 1-5 [doi]
- KAN-Face: Efficient Resource Usage and Precision Lip-Sync in Talking Head GenerationGuanwen Feng, Siyu Jin, Zhihao Qian, Yunan Li 0001, Qiguang Miao. 1-5 [doi]
- A Non-autoregressive Model for Joint STT and TTSVishal Sunder, Brian Kingsbury, George Saon, Samuel Thomas 0001, Slava Shechtman, Hagai Aronowitz, Eric Fosler-Lussier, Luis A. Lastras. 1-5 [doi]
- Learning Class Unique Features in Fine-Grained Visual ClassificationRunkai Zheng, Li Liu 0036, Zhijia Yu, Yinqi Zhang, Hei Victor Cheng, Chris Ding. 1-5 [doi]
- Adaptive Compression of Supervised and Self-Supervised Models for Green Speech RecognitionMouaad Oujabour, Leila Ben Letaifa, Jean-François Dollinger, Jean-Luc Rouas. 1-5 [doi]
- A Privacy-Preserving Cross-Modal Retrieval Scheme Based on CLIP and Deep HashingYuan Cao 0005, Hui Zhang, Xinzheng Shang. 1-5 [doi]
- Voice Conversion via Structural EntropyLinqin Wang, Zhengtao Yu, Shengxiang Gao, Cunli Mao, Ling Dong, Yuxin Huang. 1-5 [doi]
- Joint Beamforming Design for Multi-Functional RIS-Aided Over-the-Air ComputationXinran Zhang, Hui Tian 0003, Wanli Ni. 1-5 [doi]
- Can Quality Survive Scale? Toward an Equal-Quality Instance-Dependent Label Noise ModelHaohao Song, Qiao Xiang, Jiwu Shu. 1-5 [doi]
- Semantic Prior-Guided Scalable Image CodingWuzhen Shi, Wennan Yin, Fei Tao 0005, Yang Wen. 1-5 [doi]
- WMAJL: Watcher-Mediated Attention Joint Learning Model for Multimodal Relation ExtractionYunrui Dong, Guiduo Duan, Tianxi Huang, Yunhao Li. 1-5 [doi]
- Debiased Prototype Evolving for Point Cloud Domain Adaptation via 3D Foundation ModelsFeng Yang, Yichao Cao, Xuanpeng Li. 1-5 [doi]
- An End-to-End Graph-Guided Spatiotemporal Model for Adaptive Frame-Level Facial Affect Analysis in the WildYan Liang, Yan Hao, Zenan Yao, Jiacheng Liao, Jiahui Pan. 1-5 [doi]
- FasterGold-DETR: An Efficient End-to-End Fire Detection Model via Gather-and-Distribute MechanismChengming Liu, Fan Wu, Lei Shi 0001. 1-5 [doi]
- Multi-Scale Conditional Generative Adversarial Networks for Wind Speed Data Imputation in Earthen Ruins ProtectionHang Li, Hai Wang, Rui Cao 0003, Shuo Ji, Jie Zheng. 1-5 [doi]
- MLSwinTNet: A Multi-Level Feature Interaction Network for Low-Light Image EnhancementMingyang Sun, Xinxin Wang, Ru Yi. 1-5 [doi]
- Causal Feature Supervision Decoupling: A Novel Method for Clothes-Changing Person Re-identification AlgorithmWenxin Hu, Caidan Zhao, Chenxing Gao, Zhiqiang Wu 0001. 1-5 [doi]
- CorrGAN: Simultaneous Learning of Speech Enhancement and Perceptual Quality Loss FunctionsVasily Zadorozhnyy, Saeed Amizadeh, Qiang Ye 0003, Kazuhito Koishida. 1-5 [doi]
- Adapter-Based Multi-Agent AVSR Extension for Pre-Trained ASR ModelsChristopher Simic, Korbinian Riedhammer, Tobias Bocklet. 1-5 [doi]
- DSSM: Dual State Space Model For Human Motions GenerationYiming Liu, Huan Zhao 0003, Yaqian Liu, Haijiao Chen, Bo Li, Guanghui Ye, Zixing Zhang 0001. 1-5 [doi]
- Generating Customized 4D Motions from Text Inputs Using Spatial-Temporal Slicing ApproachesZhichao Zhang, Hui Chen, Ming Xu, Jinsheng Deng, Xingshen Song. 1-5 [doi]
- FSENet: Frequency Separation Enhancement Network for Super-ResolutionLiangdong Li, Zhihuai Xie. 1-5 [doi]
- Multi-scale Graph Convolution with Corrective Contrastive Learning for Skeleton-based Action RecognitionTianming Zhuang, Erqiang Zhou, Hanwen Zhang, Yi Ding, Ji Geng, Zhen Qin 0002. 1-5 [doi]
- Simplified Augmented Real-Valued Time-Delay Neural Network for Digital PredistortionLesthuruge Silva, Sri Satish Krishna Chaitanya Bulusu, Nuutti Tervo, Premanandana Rajatheva. 1-5 [doi]
- Gradient-Oriented Clustered Federated Learning With Efficient Knowledge Sharing in Non-IID SettingsKenta Kubota, Ren Togo, Keisuke Maeda, Takahiro Ogawa 0001, Miki Haseyama. 1-5 [doi]
- Calibration of Multiple Asynchronous Microphone Arrays using Hybrid TDOAChengjie Zhang, Wenda Pan, Xinyang Han, He Kong. 1-5 [doi]
- Towards Clinically Feasible Nonintrusive Quality and Intelligibility Indices for Hearing AidsVahid Ashkani Chenarlogh, Paula Folkeard, Vijay Parsa. 1-5 [doi]
- Identical Human Preference Alignment Paradigm for Text-to-Image ModelsHaoyuan Sun, Bo Xia, Yifei Zhao, Yongzhe Chang, Xueqian Wang. 1-5 [doi]
- Diffused Poses and Distilled Expressions for Controllable Audio-driven Talking Face GenerationZiqi Zhou, Weize Quan, Zhaojin Lu, Dong-Ming Yan 0001. 1-5 [doi]
- Detecting and Defending Against Adversarial Attacks on Automatic Speech Recognition via Diffusion ModelsNikolai Lund Kühne, Astrid H. F. Kitchena, Marie S. Jensen, Mikkel S. L. Brøndt, Martin Gonzalez, Christophe Biscio, Zheng-Hua Tan. 1-5 [doi]
- Stable Extended U-Net for Noise-Robust Speaker VerificationZonghui Wang, Zhihua Fang, Liang He 0003. 1-5 [doi]
- Pushing Wi-Fi Towards Fine-Grained Sensing Via Spectrogram EnhancementHongbo Jiang 0001, Yiwei Chen, Jingyang Hu, Siyu Chen. 1-5 [doi]
- Prelude echoes Finale: Video Domain Adaptation with Fine-grained Temporal ConsistencyShuxian Wang, Mengmeng Jing, Xianlong Tian, Yuguo Hu, Lin Zuo. 1-5 [doi]
- LMFCA-Net: A Lightweight Model for Multi-Channel Speech Enhancement with Efficient Narrow-Band and Cross-Band AttentionYaokai Zhang, Hanchen Pei, Wanqi Wang, Gongping Huang. 1-5 [doi]
- mmHIU: a human-to-human interaction understanding system based on mmWave sensingFenglin Zhang, Chenglin Wu, Anfu Zhou, Huadong Ma. 1-5 [doi]
- FLowHigh: Towards Efficient and High-Quality Audio Super-Resolution with Single-Step Flow MatchingJun-Hak Yun, Seung-bin Kim, Seong-Whan Lee. 1-5 [doi]
- Target word activity detector: An approach to obtain ASR word boundaries without lexiconSunit Sivasankaran, Eric Sun, Jinyu Li 0001, Yan Huang 0028, Jing Pan. 1-5 [doi]
- Ambisonics Binaural Rendering via Masked Magnitude Least SquaresOr Berebi, Fabian Brinkmann, Stefan Weinzierl, Boaz Rafaely. 1-5 [doi]
- Person-in-Bed Detection from Mattress-Integrated AccelerometersAkhil Bhimaraju. 1-2 [doi]
- Transformer-Based Contrastive Meta-Learning For Low-Resource Generalizable Activity RecognitionJunyao Wang 0001, Mohammad Abdullah Al Faruque. 1-5 [doi]
- Uncertainty-Participation Context Consistency Learning for Semi-supervised Semantic SegmentationJianjian Yin, Yi Chen 0023, Zhichao Zheng 0006, Junsheng Zhou, Yanhui Gu. 1-5 [doi]
- Leave-One-EquiVariant: Alleviating Invariance-Related Information Loss in Contrastive Music RepresentationsJulien Guinot, Elio Quinton, György Fazekas. 1-5 [doi]
- A Robust Online Miscalibration Detection and Correction Method for LiDAR-CameraFeng Pan, Wei Wang 0076, Jianing Zhang. 1-5 [doi]
- UML: A Unified Multimodal Learning Framework for Cataract Postoperative Visual Acuity Prediction with Uncertain Missing ModalitiesTongyu Yang, Qian Zhou, Hua Zou, Haifeng Jiang, Yong Wang. 1-5 [doi]
- GPA: Enhancing Generalizable Physical Adversarial Attacks Across Multiple Vision TasksMingye Xie, Suncheng Xiang, Jiacheng Ruan, Xian Gao, Zefang Yu, Ting Liu, Yuzhuo Fu. 1-5 [doi]
- Sound Zone Control Robust To Sound Speed ChangeSankha Subhra Bhattacharjee, Jesper Rindom Jensen, Mads Græsbøll Christensen. 1-5 [doi]
- Position-aware Hypergraph Message-Passing Neural NetworkXinyu Zhang, Qize Jiang, Hanyuan Zhang, Weiwei Sun 0008. 1-5 [doi]
- Beyond Uniformity: Deblurring Images With Complex Noise Patterns Using Half Quadratic SplittingAvinash Kumar, Koyyada Dinesh Kumar, Sujit Kumar Sahoo. 1-4 [doi]
- Self-Supervised Learning-Based Multimodal Prediction on Prosocial Behavior IntentionsAbinay Reddy Naini, Zhaobo K. Zheng, Teruhisa Misu, Kumar Akash. 1-5 [doi]
- A Graph-Based Generative Adversarial Network Model for Inferring Task-State from Resting-State Functional Connectivity NetworksTao Jin, Hongzheng Guan, Li Xiao, Gang Qu, Yu-Ping Wang. 1-5 [doi]
- Rethinking the Fragility and Robustness of Fingerprints of Deep Neural NetworksFangqi Li 0001, Shilin Wang, Lei Yang 0062. 1-5 [doi]
- Superpoints Guided Local Explanation For Deep 3D TrackersRiran Cheng, Xupeng Wang 0001, Ferdous Sohel, Hang Lei. 1-5 [doi]
- Refiner: Fine-grained Cross-modal Concepts Refinement for Compositional Zero-Shot LearningXiao Zhang, Haodong Jing, Hui Chen, Yongqiang Ma, Nanning Zheng 0001. 1-5 [doi]
- Knowledge Distillation From Ensemble for Spoken Language IdentificationRaghuveer Peri, Seyed Omid Sadjadi, Daniel Garcia-Romero, Srikanth Vishnubhotla, Kyu J. Han. 1-5 [doi]
- EGENN: An Efficient Graph-Enhanced Neural Network for Multivariate Time Series ForecastingHaoxuan Xu, Haiqi Zhu, Yifan Chen, Chunzhi Yi, Baichun Wei, Feng Jiang 0001. 1-5 [doi]
- Reference-Guided Parallel Independent Component Analysis: Estimating Cognition Associated Multimodal Patterns In SchizophreniaJingxian Hu, Chuang Liang, Tülay Adali, Qi Zhu 0001, Daoqiang Zhang, Rongtao Jiang, Vince D. Calhoun, Shile Qi. 1-5 [doi]
- USD: Unsupervised Soft Contrastive Learning for Fault Detection in Multivariate Time SeriesHong Liu, Xiuxiu Qiu, Yiming Shi, Zelin Zang. 1-5 [doi]
- A Novel Weighted Sparse Component Analysis for Underdetermined Blind Speech SeparationYudong He, Baeck Hyun Woo, Richard Hau Yue So. 1-5 [doi]
- Hybrid Offline Passive Grammatical Inference and Online Planning for Non-Markovian TasksMahyar Alinejad, Alvaro Velasquez, Yue Wang 0068, George K. Atia. 1-5 [doi]
- Enhancing Large Language Model Inference Efficiency via Lookahead Cache FilteringJie Ou, Yueming Chen, Shuaihong Jiang, Wenhong Tian. 1-5 [doi]
- Lightweight neural front-ends for low-resource on-device Text-to-SpeechGiulia Comini, Heereen Shim, Manuel Sam Ribeiro. 1-5 [doi]
- Towards Expressive Video Dubbing with Multiscale Multimodal Context InteractionYuan Zhao, Rui Liu, Gaoxiang Cong 0001. 1-5 [doi]
- CardioRiskNet: Attention-based CVAE-enabled GCN for Risk Prediction in STEMIAkshat Gupta, Anubha Gupta, Manu Kumar Shetty, Dixit Goyal, Girish M. P, Mohit D. Gupta. 1-5 [doi]
- JointSwinUNETR: an Efficient Feature-enhanced Architecture for Small Intestine Cine MRI SegmentationYue Wang, Wenhui Li, Ziming Wang, Taoli Du, Ming Ma, Mengchao Zhang. 1-5 [doi]
- Semantic-Aware Prompt Learning for Multimodal Sarcasm DetectionGuangjin Wang, Bao Wang, Fuyong Xu, Zhenfang Zhu, Peipei Wang, Ru Wang, Peiyu Liu 0001. 1-5 [doi]
- Enhancing Imaging Generation through Implicit Neural Representations and HyperNetwork for Spatial VariabilityJaehoon Cha, Siu Lun Yeung, Siddharth Dhanpal, Jeyan Thiyagalingam. 1-5 [doi]
- Learning to Optimally Sample in MRI for Denoising-Driven RegularizationPavan Kumar Reddy K, Kunal N. Chaudhury. 1-5 [doi]
- Mask augmented Object-Centric Contrastive Learning for Amodal Instance SegmentationTomokazu Kaneko, Ryosuke Sakai, Takashi Shibata 0001, Soma Shiraishi. 1-5 [doi]
- E-RNS: Enhancing Negative Sample Quality from Gradient Perspective for Graph RecommendationQiangsheng Feng, Jiwei Qin, Jie Ma. 1-5 [doi]
- A Zero-Shot Physics-Informed Dictionary Learning Approach for Sound Field ReconstructionStefano Damiano, Federico Miotello, Mirco Pezzoli, Alberto Bernardini, Fabio Antonacci, Augusto Sarti, Toon van Waterschoot. 1-5 [doi]
- SNNPTrack: Spiking Neural Network Based Prompt for High-Accuracy RGBE TrackingYixi Ji, Qinghang Zhao, Yuping Liang, Jinjian Wu. 1-5 [doi]
- Automated Graph Attention Network for Heterogeneous Entity ResolutionChen Liu, Xiaohui Rong. 1-5 [doi]
- Robust Supervised Graph Embedding Method For EEG-Based Brain Network Emotion RecognitionPengcheng Zhu, Cunbo Li, Peiyang Li, Fali Li, Dezhong Yao 0001, Peng Xu 0001. 1-5 [doi]
- Multi-Microphone Speech Emotion Recognition Using the Hierarchical Token-Semantic Audio Transformer ArchitectureOhad Cohen, Gershon Hazan, Sharon Gannot. 1-5 [doi]
- AMuSE: Attentive Multilingual Speech Encoding for Zero-Prior ASRAshutosh Varshney, Debmalya Chakrabarty, Akshat Jaiswal, Harish Arsikere, Abhinav Jain, Swayambhu Nath Ray, Frederick Weber, Anand Mohan, Prantik Sen, Garima Lalwani, Sambuddha Bhattacharya, Sri Garimella. 1-5 [doi]
- Triplet Synthesis for Enhancing Composed Image Retrieval via Counterfactual Image GenerationKenta Uesugi, Naoki Saito 0006, Keisuke Maeda, Takahiro Ogawa 0001, Miki Haseyama. 1-5 [doi]
- Adaptive Large Language Models via Attention ShortcutsPrateek Verma, Mert Pilanci. 1-5 [doi]
- Graph Structure Learning via Transfer Entropy for Multivariate Time Series Anomaly DetectionMingyu Liu, Yijie Wang 0001, Xiaohui Zhou, Yongjun Wang. 1-5 [doi]
- A Fuzzy C-Means Clustering Algorithm for Real Medical Image SegmentationFeifei Zhang, Fei Shi, Dayong Ren, Yue Li. 1-5 [doi]
- TRACE: A Robust Framework for Malicious Traffic Detection with Noisy LabelsYitong Cai, Chengwei Peng, Shu Li, Yuyi Liu, Hongfei Zhang, Binxing Fang. 1-5 [doi]
- RAS-GNN: Reconstructing APT Attack Scenario Using Graph Neural NetworkZhicheng Huang, Ping Wang. 1-5 [doi]
- Investigation of Whisper ASR Hallucinations Induced by Non-Speech AudioMateusz Baranski, Jan Jasinski, Julitta Bartolewska, Stanislaw Kacprzak, Marcin Witkowski, Konrad Kowalczyk. 1-5 [doi]
- Distributed Interference Alignment Precoding and Detection for MU-MIMO OTSM Downlink in Time-Varying ChannelsSapta Girish Neelam. 1-5 [doi]
- SECC-Stega: Generative Linguistic Steganographic Framework Based on Error Correcting CodesYuzhe Guo, Zhongliang Yang, Zhuang Wang, Zhili Zhou, Linna Zhou. 1-5 [doi]
- VP-NTK: Exploring the Benefits of Visual Prompting in Differentially Private Data SynthesisChia-Yi Hsu, Jia-You Chen, Yu-Lin Tsai, Chih-Hsun Lin, Pin-Yu Chen, Chia-Mu Yu, Chun-Ying Huang. 1-5 [doi]
- Meta-MMD Fusion: Enhancing Cross-Subject Motor Imagery ClassificationMinghui Chen, Chao Qu, Jiahui Pan. 1-5 [doi]
- Learning in the Model Space: Fault Diagnosis by Co-objective Learning in DynInt Model SpaceZiyu Tang, Xiren Zhou, Shikang Liu, Chuyang Wei, Ao Chen 0002, Huanhuan Chen. 1-5 [doi]
- Audio Features Investigation for Singing Voice Deepfake DetectionMahyar Gohari, Davide Salvi, Paolo Bestagini, Nicola Adami. 1-5 [doi]
- GDRIVE: Adaptive Object Detection in Autonomous Vehicles via Graph-Based Feature LearningSuyang Xi, Yunhao Liu, Hao Lu, Yi Ding, Hong Ding. 1-5 [doi]
- Improved Pitch and Voicing Determination Using the Reflected Root Chirp Group Delay SpectrumNishant Singh, Mudit D. Batra, C. S. Ramalingam. 1-5 [doi]
- Subjective Quality Evaluation of Point Clouds Using a Head-Mounted DisplayJoão Prazeres, Rafael Rodrigues, Manuela Pereira, António M. G. Pinheiro. 1-5 [doi]
- Inside and Inside: Efficient Anomaly Detection by Fully Capturing the Detailed DynamicsZiyu Tang, Xiren Zhou, Ao Chen 0002, Shikang Liu, Chuyang Wei, Huanhuan Chen. 1-5 [doi]
- Regularized Weighted Descent: Model-Based Learner for Multi-Target Radar Waveform DesignJunho Kweon, Fulvio Gini, Maria S. Greco, Muralidhar Rangaswamy, Vishal Monga. 1-5 [doi]
- Piano Transcription by Hierarchical Language Modeling with Pretrained Roll-based EncodersDichucheng Li, Yongyi Zang, Qiuqiang Kong. 1-5 [doi]
- Adaptive Skeleton Prompt Tuning for Cross-Dataset 3D Human Pose EstimationHaolun Li 0001, Fuchen Zheng, Ye Liu 0005, Jian Xiong 0005, Wenhua Zhang, Haidong Hu, Hao Gao 0005. 1-5 [doi]
- Semantics-Guided Dynamic Hypergraph Network for Human Mobility Nowcasting in DisasterBowen Zhang 0005, Yunlong Xing, Zinao Su, Jinzhou Cao, Tianhong Zhao, Genan Dai. 1-5 [doi]
- CoMT: Chain-of-Medical-Thought Reduces Hallucination in Medical Report GenerationYue Jiang, Jiawei Chen 0012, Dingkang Yang, Mingcheng Li, Shunli Wang 0001, Tong Wu, Ke Li, Lihua Zhang. 1-5 [doi]
- The Importance of Spatial and Spectral Information in Multiple Speaker TrackingHanan Beit-On, Vladimir Tourbabin, Boaz Rafaely. 1-5 [doi]
- SelfDeblur with Sparsity Enforced Bregman LearningYaoyun Zeng, Beier Chen, Hongxia Wang. 1-5 [doi]
- A Pre-trained Plug-in Mixture-of-LoRAs Model for Transferable Sequential RecommendationWenqi Sun, Ruobing Xie, Junjie Zhang 0009, Zitian Guo, Wayne Xin Zhao, Zhanhui Kang, Ji-Rong Wen. 1-5 [doi]
- MedCAM-OsteoCls: Medical Context Aware Multimodal Classification of Knee OsteoarthritisAkshay Daydar, Alik Pramanick, Arijit Sur, Subramani Kanagaraj. 1-5 [doi]
- Stable and Lightweight Deep Primal-Dual Unrolling for Constrained Image Restoration with Convolutional Sparse CodingTakafumi Ueki, Kazuki Naganuma, Shunsuke Ono. 1-5 [doi]
- Self-supervised Hyperspectral and Multispectral Fusion via Deep Low-Rank Prior and Learnable Degradation NetworksNa Liu 0014, Lianming Xu, Suxian Fu, Li Wang 0039. 1-5 [doi]
- Subject Representation Learning from EEG using Graph Convolutional Variational AutoencodersAditya Mishra, Ahnaf Mozib Samin, Ali Etemad, Javad Hashemi. 1-5 [doi]
- An Exceptional Dataset For Rare Pancreatic Tumor SegmentationWenqi Li, Yingli Chen, Keyang Zhou, Xiaoxiao Hu, Zilu Zheng, Yue Yan, Xinpeng Zhang 0001, Wei Tang, Zhenxing Qian. 1-5 [doi]
- FedDiT: Federated Learning by Distillation Token Enhanced Vision TransformerJue Xiao, Zepu Yi, Hewang Nie, Zhi Lu, Xueming Tang, Songfeng Lu, Zhiguo Huang, Runqing Zhang. 1-5 [doi]
- Robust Audio Deepfake Detection using Ensemble Confidence CalibrationKwok Chin Yuen, Duc-Tuan Truong, Jia Qi Yip. 1-5 [doi]
- Asymptotic Behavior Analysis of Antenna Selection via Sparsity-Induced PrecoderXiuxiu Ma, Abla Kammoun, Mohamed-Slim Alouini, Tareq Y. Al-Naffouri. 1-5 [doi]
- Communication-efficient Exact Diffusion for Decentralized LearningGustavo Faia, Stefan Vlaski, Roula Nassif. 1-5 [doi]
- Multi-Prototype-based Embedding Refinement for Medical Image SegmentationYali Bi, Enyu Che, Yinan Chen, Yuanpeng He, Jingwei Qu. 1-5 [doi]
- MPFL: A Decentralised Federated Learning Framework Based on Multi-Population Genetic AlgorithmWenqi Ding, Yuanchao Liu, Zhongjie Wang 0003, Zheng Chu. 1-5 [doi]
- A Unified Metric for Simultaneous Evaluation of Error Rate and Annotation CostMark Lindsey, Francis Kubala, Richard M. Stern. 1-5 [doi]
- Video-Poetry Retrieval with Multimodal Knowledge Graph Guided Unsupervised Pre-trainingXinru Wei, Yuqing Li, Bin Wu. 1-5 [doi]
- Self-supervised Speaker Verification with Batch-scale Pseudo-labels CorrectionJunxu Wang, Zhihua Fang, Liang He 0003. 1-5 [doi]
- Low Rank and Sparse Fourier Structure in Recurrent Networks Trained on Modular AdditionAkshay Rangamani. 1-5 [doi]
- Artistic Image Aesthetics Assessment Assisted by Photographic Visual AttributesHaiyong Tang, Yihua Chen 0001, Xiaoping Liang, Lv Chen, Pengsheng Huang, Zhenjun Tang. 1-5 [doi]
- TAGMO: Temporal Control Audio Generation for Multiple Visual Objects Without TrainingXinyu Zhang, Keyu Fan, Yiran Wang, Yingshan Liang, Jiasheng Lu, Zhicheng Du, Qingyang Shi, Peiwu Qin. 1-5 [doi]
- COCO-OLAC: A Benchmark for Occluded Panoptic Segmentation and Image UnderstandingWenbo Wei, Jun Wang 0121, Abhir Bhalerao. 1-5 [doi]
- Robust Speech Recognition with Schrödinger Bridge-Based Speech EnhancementRauf Nasretdinov, Roman Korostik, Ante Jukic. 1-5 [doi]
- Sparse Bayesian Integrated CNN Framework for Enhanced Acoustic Source LocalizationPriyadarshini Dwivedi, Gyanajyoti Routray, Rajesh M. Hegde. 1-5 [doi]
- Full-text Error Correction for Chinese Speech Recognition with Large Language ModelZhiyuan Tang, Dong Wang, Shen Huang, Shidong Shang. 1-5 [doi]
- MMA-Net: Multi-Modal Attention Network for 2-D Object Detection in Autonomous DrivingAbhilash Gaur, Shubh Goel, Kanishk Goel, Seshan Srirangarajan, Po-Hsuan Tseng, Kai-Ten Feng. 1-5 [doi]
- Speech Separation for Low-Resource LanguagesMarvin Borsdorf, Zexu Pan, Pascal Himmelmann, Haizhou Li 0001, Tanja Schultz. 1-5 [doi]
- Mouth Articulation-Based Anchoring for Improved Cross-Corpus Speech Emotion RecognitionShreya G. Upadhyay, Ali N. Salman, Carlos Busso, Chi-Chun Lee. 1-5 [doi]
- Hierarchical Spatiotemporal Attention Network for Fine-grained Brain Cognitive State RecognitionYike Wu, Ning An, Zixuan Zeng, Youyong Kong. 1-5 [doi]
- Deep Sylvester Posterior Inference for Adaptive Compressed Sensing in Ultrasound ImagingSimon W. Penninga, Hans Van Gorp, Ruud J. G. van Sloun. 1-5 [doi]
- Towards a Single ASR Model That Generalizes to Disordered SpeechJimmy Tobin, Katrin Tomanek, Subhashini Venugopalan. 1-5 [doi]
- PTQ4ADM: Post-Training Quantization for Efficient Text Conditional Audio Diffusion ModelsJayneel Vora, Aditya Krishnan 0003, Nader Bouacida, Prabhu R. V. Shankar, Prasant Mohapatra. 1-5 [doi]
- HANet: A Harmonic Attention-Based Network for Singing Melody Extraction from Polyphonic MusicShijun Wang, Xiangzhu Kong, Hao Huang, Kai Wang, Ying Hu. 1-5 [doi]
- SwinGAN-AVSS: Audio-Visual Speech Synthesis Leveraging Swin Transformer-Enhanced Generative Adversarial NetworksSubhayu Ghosh, Swapnil Saha, Nanda Dulal Jana. 1-5 [doi]
- KAN v.s. MLP for Offline Reinforcement LearningHaihong Guo, Fengxin Li, Jiao Li, Hongyan Liu. 1-5 [doi]
- ChannelMixer: A Hybrid CNN-Transformer Framework for Enhanced Multivariate Long-Term Time Series ForecastingErlei Zhang, Wenxuan Yuan, Xiangsen Liu. 1-5 [doi]
- Knowledge Enhanced Multi-Domain Recommendations in an AI Assistant ApplicationElan Markowitz, Ziyan Jiang, Fan Yang, Xing Fan, Zheng Chen 0010, Greg Ver Steeg, Aram Galstyan. 1-5 [doi]
- Revisiting Neighborhood Aggregation in Graph Neural Networks for Node Classification using Statistical Signal ProcessingMounir Ghogho. 1-5 [doi]
- Dual-Modality Guided Artistic Style Transfer with Pre-trained Diffusion ModelsJiaxiong Liu, Xiaolong Xiong, Jun Zhou. 1-5 [doi]
- FG3DFormer: Fine-Grained 3D Shape Classification Based on Vision TransformerXiangyu Ma, Jing Bai 0004, Jinzhe Jiang, Bin Peng. 1-5 [doi]
- Covert and Potent: A Weather-Camouflaged Backdoor Attacks on Self-Supervised LearningYang Wei, Yonghao Yang, Bo Liu 0047, Bin Xiao 0002. 1-5 [doi]
- *Anfei Fan, Jun Yang 0025, Wei Li, Chiyu Zhang. 1-5 [doi]
- Training-Free Task Planning by Parsing Language Signals With Common SenseXianqi Zhang, WenRui Wang, Shitong Chai, Xingtao Wang, Xiaopeng Fan. 1-5 [doi]
- Opportunities and Challenges for Bluetooth LE Audio Assistive Listening SystemsIan C. Bruce, Steve Armstrong, Daniel J. Bosnyak, Hany Tawfik. 1-5 [doi]
- Improving Knowledge Base Question Answering via Retrieval Enhancement and Stepwise ReasoningDian Huang, Jianqi Gao 0001, Xiangfeng Luo, Hao Wu. 1-5 [doi]
- A Critical Assessment of Visual Sound Source Localization Models Including Negative AudioXavier Juanola, Gloria Haro, Magdalena Fuentes. 1-5 [doi]
- 3GPP IVAS Codec - Perspectives on Development, Testing and StandardizationStefan Bruhn, Tomas Toftgård, S. Döhla, H.-Y. Su, L. Laaksonen, T. Moriya, Stéphane Ragot, Hiroyuki Ehara, M. Szczerba, Imre Varga, A. Schevciw, Milan Jelinek. 1-5 [doi]
- Investigation of Spatial Self-Supervised Learning and Its Application to Target Speaker Speech RecognitionYoshiaki Bando, Samuele Cornell, Satoru Fukayama, Shinji Watanabe 0001. 1-5 [doi]
- SpectralCam: High-Resolution Low-Cost Spectral Imaging Using DSLR CamerasA. Paruchuri, Andres Ramirez-Jaime, Gonzalo R. Arce, A. Alrushud, Xu Ma, R. Radpour. 1-5 [doi]
- Comprehensive Feature Processing Based on Attention Mechanism for Co-Salient Object DetectionGuohua Lv, Mao Yuan, Zengbin Zhang, Zhengyang Zhang, Zhenhui Ding, Guangxiao Ma. 1-5 [doi]
- CMFNThinker: A Novel Cross-source Multi-modal Fake News Detection ModelKaijia Tian, Guozheng Rao, Xin Wang 0030, Mufan Yu, Jiayin Zhang, Li Zhang 0059. 1-5 [doi]
- SUVAD: Semantic Understanding Based Video Anomaly Detection Using MLLMShibo Gao, Peipei Yang, LinLin Huang. 1-5 [doi]
- Efficient Estimation of Kernel Matrix Spectral Norm using Random FeaturesYiting Cao, Shayan Shafaei, Luyuan Yang, Chao Lan. 1-5 [doi]
- A CT-based Prediction System for Determining Respiratory Support Level in COVID-19 PatientsAhmed Sharafeldeen, Hossam Magdy Balaha, Ibrahim Shawky Farahat, Mohammed Ghazal, James Connelly, Eric Van Bogaert, Ayman El-Baz. 1-5 [doi]
- Description-Based Controllable Text-to-Speech With Cross-Lingual Voice ControlRyuichi Yamamoto, Yuma Shirahata, Masaya Kawamura, Kentaro Tachibana. 1-5 [doi]
- Sample-level Self-paced Learning to Tackle Multimodal Imbalance ProblemYing Zhou, Xuefeng Liang, Yue Xu, Bowen Gao. 1-5 [doi]
- Fast Sparse DFT Computation for Arbitrary Length by Circular ConvolutionSoo-Chang Pei, Kuo-Wei Chang. 1-5 [doi]
- Twenty-Five Years of MIR Research: Achievements, Practices, Evaluations, and Future ChallengesGeoffroy Peeters, Zafar Rafii, Magdalena Fuentes, Zhiyao Duan, Emmanouil Benetos, Juhan Nam, Yuki Mitsufuji. 1-5 [doi]
- Towards Feature-Consistent Parameter Collaboration for Personalized Federated LearningXintong Lu, Jiahe Li, Yuchao Zhang, Wendong Wang. 1-5 [doi]
- DiffMEL: A large-scale difficulty-graded dataset for Multimodal Entity LinkingFang Wang 0011, Xiaoying Bai, Tianwei Yan 0001, Minghao Hu, Yi Liang. 1-5 [doi]
- MDNet: Multi-Decoder Network for Abdominal CT Organs SegmentationDebesh Jha, Nikhil Kumar Tomar, Koushik Biswas, Gorkem Durak, Matthew Antalek, Zheyuan Zhang, Bin Wang, Md Mostafijur Rahman, Hongyi Pan, Alpay Medetalibeyoglu, Vandan Gorade, Abhijit Das, Yury Velichko, Daniela P. Ladner, Amir Borhani, Ulas Bagci. 1-5 [doi]
- Targeted Password Guessing Using Neural Language ModelsJiahong Yang, Wenting Li, Haibo Cheng, Ping Wang. 1-5 [doi]
- DFT-Spread-Based OTFS Waveform Design With Good Peak-to-Average Power Ratio for Joint Sensing and CommunicationsZhiying Chen, Yongzhe Li, Ran Tao 0003. 1-5 [doi]
- Less is more: Efficient Scene Graph Generation with reparameterizationJonghwan Hong, Seonghyeok Noh, Bonhwa Ku, Hanseok Ko. 1-5 [doi]
- TROI: Cross-Subject Pretraining with Sparse Voxel Selection for Enhanced fMRI Visual DecodingZiyu Wang, Tengyu Pan, Zhenyu Li, Ji Wu, Xiuxing Li, Jianyong Wang 0001. 1-5 [doi]
- Bi-attention pyramid network for small defect with complex background in industrial detectionYihang Li, Zhiyuan Zou, Xu Liang. 1-5 [doi]
- Precisely Controllable Neural Speech SynthesisPaul Konstantin Krug, Christoph Wagner, Peter Birkholz, Timo Stich. 1-5 [doi]
- MoHGNN: Enhanced Heterogeneous Graph Neural Network via Metapath OptimizationTaiyao Zhang, Xingyu Fu, Yuxin Zhang, Qingyun Liu. 1-5 [doi]
- Why disentanglement-based speaker anonymization systems fail at preserving emotions?Ünal Ege Gaznepoglu, Nils Peters. 1-5 [doi]
- Fast and High-Quality Auto-Regressive Speech Synthesis via Speculative DecodingBohan Li, Hankun Wang, Situo Zhang, Yiwei Guo, Kai Yu 0004. 1-5 [doi]
- PIN: A Prompt-based Implicit Sentiment Analysis Network for ChineseKun Bu, Yuanchao Liu, Wenbo Wang, Ziyi Cao. 1-5 [doi]
- Conditional Latent Diffusion-Based Speech Enhancement via Dual Context LearningShengkui Zhao, Zexu Pan, Kun Zhou 0003, Yukun Ma, Chong Zhang 0003, Bin Ma 0001. 1-5 [doi]
- Injecting Global Context for Multivariate Time Series Forecasting on Variable SubsetsXin-Yi Li, Yu-Bin Yang. 1-5 [doi]
- Enhancing Generalized EEG Classification with Decomposed Statistics-diverse Feature AugmentationYubin He, C. L. Philip Chen, Bianna Chen, Tong Zhang 0015. 1-5 [doi]
- Decentralized Online Ensembles of Gaussian Processes for Multi-Agent SystemsFernando Llorente 0001, Daniel Waxman 0002, Petar M. Djuric. 1-5 [doi]
- Low-Resolution Hierarchical Training for Efficient 3D Gaussian SplattingYilin Jin, Shaohui Li, Zhi Li, Yu Liu. 1-5 [doi]
- SoCov: Semi-Orthogonal Parametric Pooling of Covariance Matrix for Speaker RecognitionRongjin Li, Weibin Zhang, Dongpeng Chen, Jintao Kang, Xiaofen Xing. 1-5 [doi]
- PromptArtisan: Multi-instruction Image Editing in Single Pass with Complete Attention ControlKunal Swami, Raghu Chittersu, Pranav Adlinge, Rajeev Irny, Shashavali Doodekula, Alok Shukla. 1-5 [doi]
- Parameter-Efficient Federal-Tuning Enhances Privacy Preserving for Speech Emotion RecognitionHaijiao Chen, Huan Zhao 0003, Yingxue Gao, Yiming Liu, Zixing Zhang 0001. 1-5 [doi]
- Generative Diffusion Model-based Energy Management in Networked Energy SystemsXinyu Lu, Zhanbo Feng, Jiawei Sun, Jiong Lou, Chentao Wu, Wugedele Bao, Jie Li 0002. 1-5 [doi]
- Advancing Few-Shot Class-Incremental Learning with Virtual Prototype Guidance PromptingXiang Qiu, Huanjia Zhu, Xiaocheng Fang, Jun Liang, Bingzhi Chen, Hui Lin. 1-5 [doi]
- ATGnet: Adaptive Temporal Graph Network for EEG-enabled Sound Source Tracking in Cocktail Party ScenariosSaurav Pahuja, Gabriel Ivucic, Siqi Cai, Dashanka De Silva, Tanja Schultz, Haizhou Li 0001. 1-5 [doi]
- Adaptive Layered-Trust Robust Defense Mechanism for Personalized Federated LearningHe Wang, Zhen Xu, Yan Zhang, Yu Wang. 1-5 [doi]
- Mates of Cross Z-Complementary Pairs for Channel Estimation in Generalized SM-MIMO SystemShibsankar Das, Adrish Banerjee. 1-5 [doi]
- Maximum Mutual Information Estimation based Graph Attention Network for Knowledge Graph CompletionWenbin Zhang 0010, Shimei Luo, Zechen Meng, Mankun Zhao, Tianyi Xu, Jian Yu 0003, Jiale Mei, Mei Yu 0004. 1-5 [doi]
- Streaming Keyword Spotting Boosted by Cross-layer Discrimination ConsistencyYu Xi, Haoyu Li, Xiaoyu Gu, Hao Li, Yidi Jiang, Kai Yu 0004. 1-5 [doi]
- MTTM: Memory-Augmented with Mamba for 3D Medical Images AnalysisHongkai Wei, Yang Yang, Shijie Sun 0001, Huansheng Song, Keyu Guo, Yongfeng Bu. 1-5 [doi]
- Diffusion Augmentation Sub-center Modeling for Unsupervised Anomalous Sound Detection with Partially Attribute-Unavailable ConditionsJiawei Yin, Yu Gao, Wenbin Zhang, Tianyi Wang, Mingjun Zhang. 1-5 [doi]
- MVANet: Multi-Stage Video Attention Network for Sound Event Localization and Detection with Source Distance EstimationHengyi Hong, Qing Wang, Ruoyu Wei, Mingqi Cai, Xin Fang. 1-5 [doi]
- Enhancing Age-Related Robustness in Children Speaker VerificationVishwas M. Shetty, Jiusi Zheng, Steven M. Lulich, Abeer Alwan. 1-5 [doi]
- Unsupervised Word Discovery: Boundary Detection with Clustering vs. Dynamic ProgrammingSimon Malan, Benjamin van Niekerk, Herman Kamper. 1-5 [doi]
- Identity-Agnostic Learning for Deepfake Face DetectionXuan Zhou, Zongyong Deng, Qijun Zhao. 1-5 [doi]
- Segment-Recurrent Transformer with Multi-Scale Fusion for Long-Term Time Series ForecastingZiang Yang, Lingwei Wei, Biyu Zhou, Xuehai Tang, Ruixuan Li 0001, Songlin Hu 0001. 1-5 [doi]
- Distribution Alignment Informed Thresholding for Semi-Supervised Curvilinear Structure SegmentationYuhao Mo, Bo Peng, Bihan Wen, XuLei Yang, Ce Zhu, Xun Xu 0002. 1-5 [doi]
- CAMEL: Cross-Attention Enhanced Mixture-of-Experts and Language Bias for Code-Switching Speech RecognitionHe Wang 0022, Xucheng Wan, Naijun Zheng, Kai Liu, Huan Zhou 0004, Guojian Li, Lei Xie 0001. 1-5 [doi]
- Efficient Infrared Image Super-Resolution Reconstruction via Guided Filter Coefficients Estimation with Parallax Attention MechanismQingyao Wu, Bosheng Chen, Chen Li, Xiaotong Tu, Xinghao Ding, Yue Huang 0001. 1-5 [doi]
- BIGFR: Bridging Individual and Group Fairness in Recommendation SystemsYaorui Gan, Xuemin Wang, Tieyuan Liu, Liang Chang 0003, Qicang Gen, Yu Zeng. 1-5 [doi]
- Classification Error Bound for Low Bayes Error Conditions in Machine LearningZijian Yang, Vahe Eminyan, Ralf Schlüter, Hermann Ney. 1-5 [doi]
- TCTformer: Long-term forecasting with dual attention transformersLong Sun, Xiaoyan Gao, Kai Xia, Xuwei Hu, Yuan Feng. 1-5 [doi]
- Multi-hop Self-augmented Graph Contrastive Learning for Node ClassificationYutong Wang, Xiaofeng Meng, Minhao Zou, Siyang Leng. 1-5 [doi]
- Poisoning The Diffusion: A Simple and Robust Watermarking Method for Audio GenerationYi Tang. 1-5 [doi]
- Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient TuningXinlong Li, Weijieying Ren, Wei Qin, Lei Wang 0185, Tianxiang Zhao 0001, Richang Hong. 1-5 [doi]
- Prompt Fusion and Aspect-Oriented Filtration for Aspect-Based Multimodal Sentiment AnalysisJiachang Sun, Xiuhong Li, Fuxian Zhu. 1-5 [doi]
- Adaptive Acquisition in Bayesian Optimization with Agnostic EnsemblesAnand Ravishankar, Fernando Llorente 0001, Yuanqing Song, Petar M. Djuric. 1-5 [doi]
- SCAT: Shared-Convolution Adaptation Tuning for Foreground SegmentationKaiwen Li, Dezheng Gao, Zelin Yang, Xing Wei 0001. 1-5 [doi]
- Stereo Downmix in 3GPP IVAS for EVS CompatibilityTakehiro Moriya, Stéphane Ragot, Arnaud Lefort, Alexandre Guérin, Noboru Harada, Ryosuke Sugiura, Yutaka Kamamoto. 1-5 [doi]
- General Dynamic Regularization Federated Learning with Hybrid Sharpness-Aware MinimizationFengchun Zhang, Dongfen Li, Jinshan Lai, Yang Zhang, Fengli Zhang, Ruijin Wang. 1-5 [doi]
- ImmerseDiffusion: A Generative Spatial Audio Latent Diffusion ModelMojtaba Heydari, Mehrez Souden, Bruno Conejo, Joshua Atkins. 1-5 [doi]
- Mean-Field Aided QMIX: A Scalable and Flexible Q-Learning Approach for Large-Scale Agent GroupsEnze Zhang, Huaze Tang, Xiao-Ping Zhang 0002, Wenbo Ding 0001. 1-5 [doi]
- AdaptVC: High Quality Voice Conversion with Adaptive LearningJaehun Kim, Ji-Hoon Kim, Yeunju Choi, Tan Dat Nguyen, Seongkyu Mun, Joon Son Chung. 1-5 [doi]
- Noise-Resilient Unlimited Sampling and Recovery of Sparse SignalsGeethu Joseph. 1-5 [doi]
- Medical Image Segmentation via Sparse Coding DecoderLong Zeng 0005, Mingwei Zhu, Kaigui Wu, Zefang Li. 1-5 [doi]
- Advances in Microphone Array Processing and Multichannel Speech EnhancementGongping Huang, Jesper Rindom Jensen, Jingdong Chen, Jacob Benesty, Mads Græsbøll Christensen, Akihiko Sugiyama, Gary W. Elko, Tomas Gänsler. 1-5 [doi]
- A Unified Spatiotemporal Frequency Graph Neural Network for fMRI-based Brain Functional Connectivity AnalysisYulang Huang, Zhiyuan Ding, Guokai Duan, Yan Liu 0054, Xiangzhu Zeng, Zheng Wang, Yingying Xu, Ling Wang 0013. 1-5 [doi]
- Diffusion-based Unsupervised Audio-visual Speech EnhancementJean-Eudes Ayilo, Mostafa Sadeghi, Romain Serizel, Xavier Alameda-Pineda. 1-5 [doi]
- Diversified Augmentation with Domain Adaptation for Debiased Video Temporal GroundingJunlong Ren, Gangjian Zhang, Haifeng Sun, Hao Wang. 1-5 [doi]
- MTMDC-GAN: Self-Attention Driven Multi-Scale Temporal Synthesis with Multi-Domain Analysis and Contrastive LearningWeihai Zhi, Kejing He. 1-5 [doi]
- GMMCL: Adaptive Concept Drift in Data Streams with Gaussian Mixture Models based on Contrastive LearningHongwei Wu, Jin Pan, Rong Yang, Hong Zhang, Guang Shi, Zhuojun Jiang, Qingyun Liu. 1-5 [doi]
- Toward Robust Early Detection of Alzheimer's Disease via an Integrated Multimodal Learning ApproachYifei Chen, Shenghao Zhu, Zhaojie Fang, Chang Liu, Binfeng Zou, Linwei Qiu, Yuhe Wang, Shuo Chang, Fan Jia, Feiwei Qin, Jin Fan, Yong Peng 0001, Changmiao Wang. 1-5 [doi]
- SmartNet: One-shot Talking Head Synthesis via Subtle Motion and Appearance CompensationWei Hu, Yuzhu Ji, An Zeng, Dan Pan 0001, Yiqun Zhang 0006, Haijun Zhang 0002. 1-5 [doi]
- Let There Be Light: Robust Lensless Imaging Under External Illumination With Deep LearningEric Bezzam, Stefan Peters, Martin Vetterli. 1-5 [doi]
- FASTER: Face Attribute Sliders with Semantic RewardsJingyan Chen, Lanxiang Zhou, Han Fang, Zerun Feng, Chao Ban, Yaqi Li, Hao Sun, Jiani Hu. 1-5 [doi]
- Semi-Supervised Cognitive State Classification from Speech with Multi-View Pseudo-LabelingYuanchao Li, Zixing Zhang 0001, Jing Han 0010, Peter Bell 0001, Catherine Lai. 1-5 [doi]
- Efficient Global Attention and Correlation-Aware Fusion for Hyperspectral Image ClassificationHongkang Zhang, Shao-Lun Huang, Ercan Engin Kuruoglu. 1-5 [doi]
- Accelerating Convergence in Bounding Box Regression with a Refined IoU Loss FunctionEnhui Chai, Xingyu Li, Tianxiang Cui, Zheng Lu 0002, Fiseha Berhanu Tesema. 1-5 [doi]
- Disentangling Hierarchical Features for Anomalous Sound Detection Under Domain ShiftJian Guan 0001, Jiantong Tian, Qiaoxi Zhu, Feiyang Xiao, Hejing Zhang, Xubo Liu 0001. 1-5 [doi]
- Data Efficient Child-Adult Speaker Diarization with Simulated ConversationsAnfeng Xu, TianTian Feng, Helen Tager-Flusberg, Catherine Lord, Shrikanth Narayanan. 1-5 [doi]
- KLFormer: Karhunen-Loève Transform for Robust 3D Human Pose EstimationXin Zeng, Haonan Luo, Zihang Wang, Sijia Li, Leyu Zhang, Tianrui Li 0001. 1-5 [doi]
- Fioma: Towards Open-Set Semi-Supervised Specific Emitter IdentificationQingyun Xu, Lixiang Liu, Xin Zhou. 1-5 [doi]
- Stacking U-Nets in U-shape: Redesigning the Information Flow in Model-based Networks for MRI ReconstructionXiaoyu Qiao, Weisheng Li 0001, Bin Xiao 0002, Yuping Huang, Lijian Yang. 1-5 [doi]
- Multi-Descriptor Mesh Animation CompressionTai Qin, Chunyang Fu, Ge Li 0002, Shan Liu 0001. 1-5 [doi]
- Effective Context Modeling Framework for Emotion Recognition in ConversationsVan Cuong Tran, Thanh V. T. Tran, Van Nguyen, Truong-Son Hy. 1-5 [doi]
- Identifying and Mitigating Mismatched Language Code in Multilingual ASRJaeyoung Kim, Sepand Mavandadi, Kartik Audhkhasi, Shikhar Bharadwaj, Brian Farris, Tongzhou Chen, Bhuvana Ramabhadran, Sriram Ganapathy. 1-5 [doi]
- Capturing Rich Behavior Representations: A Dynamic Action Semantic-Aware Graph Transformer for Video CaptioningCaihua Liu, Xu Li, Wenjing Xue, Wei Tang, Xia Feng. 1-5 [doi]
- Grey Wolf Optimizer Algorithm Based Active Noise Control Without Secondary Path IdentificationZhehua Duan, Tian Zhang, Fei Xu, Chenlin Lu. 1-5 [doi]
- Robust Hybrid Convolutional Beamspace for High Resolution DOA EstimationRubin Jose Peter, Sooraj K. Ambat. 1-5 [doi]
- Audio Diffusion with Large Language ModelsYinghui Huang, Kyle Kastner, Kartik Audhkhasi, Bhuvana Ramabhadran, Andrew Rosenberg. 1-5 [doi]
- Cohort-Sensitive Labeling: An Effective Approach for Enhancing ASR PerformanceJonghwan Na, Mark Hasegawa-Johnson, Bowon Lee. 1-5 [doi]
- ADD: A Detection Method for Image-Processing Adversarial DefensesDa Zhang, Jiazheng Sun, Chenxiao Xia, Ruinan Ma, Jun Zheng. 1-5 [doi]
- Avoiding Domain Drift and Constant Predictions with Diffusion Enhanced Vector-Quantized Autoencoders for Temperature PredictionsNina Lampl, João Machado de Freitas, Alexander Fuchs 0009, Benedikt Brezina, Michael Klitzsch, Franz Pernkopf. 1-5 [doi]
- Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio ReasoningChun-Yi Kuan, Hung-yi Lee. 1-5 [doi]
- De-confusing Hard Samples for Text Semantic HashingTian Huang, Jian Wang 0118, Yuqing Sun 0001. 1-5 [doi]
- Enhancing Cross-Domain Slot Filling with Joint LLM Data Generation and Data CurationPeijie Huang, Weizhen Li, Yuhong Xu, Junbao Huang. 1-5 [doi]
- A Novel Split Deep Unfolding Transformer for Pan-SharpeningJiannan Chen, Zhizhuo Jiang, Xueqian Wang 0002, Yaowen Li, Huajie Wang, Yu Liu 0005. 1-5 [doi]
- Cluster-Perceptive Graph Contrastive Learning for Community DetectionHongbin Li, Fanyu Han, Wei Wang 0033. 1-5 [doi]
- Community-entropy Based Graph Structure Learning for Topology-imbalanceXinyi Wang, Ling Guo, Hui Yan. 1-5 [doi]
- Performance Bounds for Single-Bit AOA-Based RF Source LocalizationShaunak Kubal, Anastasia Lavrenko, André Kokkeler. 1-5 [doi]
- EPE-P: Evidence-based Parameter-efficient Prompting for Multimodal Learning with Missing ModalitiesZhe Chen, Xun Lin, Yawen Cui, Zitong Yu. 1-5 [doi]
- Diagram Formalization Enhanced Multi-Modal Geometry Problem SolverZeren Zhang, Jo-Ku Cheng, Jingyang Deng, Lu Tian, Jinwen Ma, Ziran Qin, Xiaokai Zhang, Na Zhu, Tuo Leng. 1-5 [doi]
- Age of Gossip with the Push-Pull ProtocolArunabh Srivastava, Thomas Jacob Maranzatto, Sennur Ulukus. 1-5 [doi]
- Tip the Scales: Achieving Balance in Adversarial Examples Across ModalitiesZhenbo Shi, Zhidong Yu, Yuxuan Zhang 0007, Shuchang Wang, Xiaoman Liu, Wei Yang 0011, Liusheng Huang. 1-5 [doi]
- Weakly Supervised Phonological Features for Pathological Speech AnalysisJenthe Thienpondt, Geoffroy Vanderreydt, Abdessalem Hammami, Kris Demuynck. 1-5 [doi]
- A Self-supervised UAV Detection Method Based on Channel State InformationPengxuan Gao, Disheng Xiao, Ruiheng Zou, Kai Ying. 1-5 [doi]
- LossControl: Defending Membership Inference Attacks by Controlling the LossBo Yang, Hongwei Yang, Renhao Lu, Hui He, Weizhe Zhang, Haoyu He, Rahul Yadav. 1-5 [doi]
- EvaSR: Rethinking Efficient Visual Attention Design for Image Super-ResolutionZhijian Wu, Chenhan Zhang, Dingjiang Huang. 1-5 [doi]
- U2AD: A UAV-Assisted Autonomous Driving Framework for Enhancing Vehicle Risk Perception and Decision-Making CapabilitiesChuangxin Li, Yongqiang Gao, Rao Fu, Jia Chen. 1-5 [doi]
- Neural Ambisonic Encoding For Multi-Speaker Scenarios Using A Circular Microphone ArrayYue Qiao, Vinay Kothapally, Meng Yu 0003, Dong Yu 0001. 1-5 [doi]
- Group-CLIP Uncertainty Modeling for Group Re-IdentificationQingxin Zhang, Haoyan Wei, Yang Qian. 1-5 [doi]
- Unifying Within and Across: Intra-Modality Multi-View Fusion and Inter-Modality Alignment for Knowledge Graph CompletionZhen Li, Jibin Wang, Zhuo Chen, Kun Wu, Meng Ai, Leike An, Liqiang Wang, Haoxuan Li. 1-5 [doi]
- KAFQN: Kolmogorov-Arnold Fuzzy-guided Q-Network in Reinforcement LearningBo Zhao, Zhizhong Liu, Zhuo Tang. 1-5 [doi]
- AceParse: A Comprehensive Dataset with Diverse Structured Texts for Academic Literature ParsingHuawei Ji, Cheng Deng, Bo Xue, Zhouyang Jin, Jiaxin Ding 0001, Xiaoying Gan, Luoyi Fu, Xinbing Wang, Chenghu Zhou. 1-5 [doi]
- FruitMMBench: A Multi-modal Benchmark for Fruit Quality AssessmentJiawei Chen, Gong Huang, Liu Liu, Zhenbo Xu, Qinghong Yang. 1-5 [doi]
- Towards A Distribution Alignment Framework for Incomplete Data ClassificationLinqing Huang, Jinfu Fan, Shilin Wang, Gongshen Liu, Shouxuan Liu. 1-5 [doi]
- Using Emotionally Rich Speech Segments for Depression PredictionJiawei Yu, Heysem Kaya. 1-5 [doi]
- Implanting Robust Watermarks in Latent Diffusion Models for Video GenerationXiaoHang Liu, Heng Chang, Jinfu Wei, Lei Zhu, Li Liu, Likun Li, Shiji Zhou, Chengyuan Li, Di Xu, Wei Gao. 1-5 [doi]
- Dense Point Clouds Matter: Dust-GS for Scene Reconstruction from Sparse ViewpointsShen Chen, Jiale Zhou, Lei Li. 1-5 [doi]
- A Block Term Decomposition Model Based Algorithm for Tensor Completion of Multidimensional Harmonic SignalsLei Wang, Xiao-Feng Gong, Xi-Yuan Liu, Wei Feng, Qiu-Hua Lin. 1-5 [doi]
- Multi-Task Model Fusion via Adaptive MergingLuming Chen, Ziwei Xiang, Kai Lei, Xu-Yao Zhang. 1-5 [doi]
- Local Statistics for Generative Image DetectionYung Jer Wong, Teck Khim Ng. 1-5 [doi]
- Practical Radar Sensing Using Two Stage Neural Network for Denoising OTFS SignalsAshok S. Kumar, Sheetal Kalyani. 1-5 [doi]
- Robust and Efficient Text-based Speech Editing using Noise Conditioning and Rectified FlowHaowen Yin, Kai Wang, Hongli Yang, Hao Huang 0009, Wushour Silamu. 1-5 [doi]
- FiTGAN: Content Fusion with Style Transformation for Few-shot Image GenerationYingbo Zhou, Pengyu Zhang, Yutong Ye 0001, Zhihao Yue, Xian Wei, Mingsong Chen 0001. 1-5 [doi]
- Adaptive Gradient-Based Timesurface for Event-based DetectionZiling Wang, Ziming Wang, Shuang Lian, Rui Yan 0005, Huajin Tang. 1-5 [doi]
- IPNet: Interpretable Prototype Network for Multi-Source Domain AdaptationRui Chen, Haifeng Xia, Siyu Xia, Ming Shao, Zhengming Ding. 1-5 [doi]
- Dynamically Optimize MTD Strategy in Satellite Computing Systems Using A2C Reinforcement LearningLin Zhang, Yunchuan Guo, Shoukun Guo, FengHua Li, Faqun Jiang, Liang Fang. 1-5 [doi]
- Compgen: Synthesis and Generation of Faces From EdgemapsRuban Vishnu Pandian, Abhiram Rao Gorle, Karthick Krishna M, Nambi Seshadri, Ravinder David Koilpillai. 1-5 [doi]
- ForensiCam-215K: A Large Scale Image and Video Dataset for Forensic AnalysisSuwen Du, Pengpeng Yang 0001, Daniele Baracchi, Jinglian Jin, Dasara Shullani, Alessandro Piva. 1-5 [doi]
- Reduced Effectiveness of Kolmogorov-Arnold Networks on Functions with NoiseHaoran Shen, Chen Zeng, Jiahui Wang, Qiao Wang. 1-5 [doi]
- CJST: CTC Compressor based Joint Speech and Text Training for Decoder-Only ASRWei Zhou, Junteng Jia, Leda Sari, Jay Mahadeokar, Ozlem Kalinli. 1-5 [doi]
- Incorporating Improved Sinusoidal Threshold-based Semi-supervised Method and Diffusion Models for Osteoporosis DiagnosisWenchi Ke, Hu Chen, Yi Zhang, Xiong Deng. 1-5 [doi]
- Attention-Driven Causal Discovery: From Transformer Matrices to Granger Causal Graphs for Non-Stationary Time-series DataJiageng Zhu, Kehao Li, Zheda Mai, Hanchen Xie, Wael AbdAlmageed, Zubin Abraham. 1-5 [doi]
- Codar: Complex-valued Neural Network for Crossing-Floor Intrusion Detection via WiFiWeiting Ou, Yipeng Liu 0001, Zhijie Sun, Bing Li, Le Zhang, Ce Zhu. 1-5 [doi]
- Temporal Dynamics Decoupling with Inverse Processing for Enhancing Human Motion PredictionJiexin Wang, Yiju Guo, Bing Su 0001. 1-5 [doi]
- Earbuds Orientation Alignment Based on Markov Chain Monte Carlo SamplingXianghao Zhan, Nafiul Rashid, Ebrahim Nemati, Mohsin Y. Ahmed, Jilong Kuang. 1-5 [doi]
- Multi-modal Streaming ASR in Cross-talk Scenario for Smart GlassesYa Jiang, Hongbo Lan, Qing Wang, Shutong Niu. 1-5 [doi]
- A Risk Prediction Model for Real Estate Corporations Using High-Target Semantic BERT and Improved GRUXiaoyan Ma, Peng Zhu, Qinyuan Liu, Zidong Wang 0001. 1-5 [doi]
- Conditional Deep Canonical Time WarpingRan Eisenberg, Afek Steinberg, Ofir Lindenbaum. 1-5 [doi]
- Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion ModelsTornike Karchkhadze, Mohammad Rasool Izadi, Shlomo Dubnov. 1-5 [doi]
- Known-Plaintext Attacks to Thumbnail-Preservation Encryption Using Pix2pix Generative Adversarial NetworkZhiyang Li, Dong Xie 0005, Sanqiang Liu, Fulong Chen 0002, Peng Hu, Taochun Wang. 1-5 [doi]
- Scalable Speech Enhancement With Dynamic Channel PruningRiccardo Miccini, Clément Laroche, Tobias Piechowiak, Luca Pezzarossa. 1-5 [doi]
- CT Image Prediction Of PD-1 Gastric Cancer Patients Based On The PLSG FrameworkChaoyu Yuan, Nan Wang, Mohan Wang, Yingwei Xue. 1-5 [doi]
- Query-Aware Temporal Aggregation Network for Temporal Knowledge Graph ReasoningShihao Liu, Xiaofei Zhou. 1-5 [doi]
- MIFAE-Forensics: Masked Image-Frequency AutoEncoder for DeepFake DetectionHanyi Wang, Zihan Liu, Shilin Wang. 1-5 [doi]
- Domain-Independent Automatic Generation of Descriptive Texts for Time-Series DataKota Dohi, Aoi Ito, Harsh Purohit, Tomoya Nishida, Takashi Endo, Yohei Kawaguchi. 1-5 [doi]
- Power in Unity: Combining in-Domain and out-of-Domain Pre-Training Strategies for EEG-Based Person IdentificationChristos Garoufis, Marios Glytsos, Ioanna Chourdaki, Panagiotis Paraskevas Filntisis, Petros Maragos. 1-2 [doi]
- SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent SpaceZeren Zhang, Haibo Qin, Jiayu Huang, Jo-Ku Cheng, Yixin Li, Hui Lin, Yitao Duan, Jinwen Ma. 1-5 [doi]
- Keeping Your Eyes on the Fingertip: A Two-Stage In-Air Handwritten Recognition MethodZeyu Qiu, Weiqiang Wang. 1-5 [doi]
- Diversity Matters: Co-training for Semi-Supervised Change Detection in Remote Sensing ImagesZan Mao, Xin Li, Ze Luo, Yingchao Piao. 1-5 [doi]
- Low-Complexity Own Voice Reconstruction for Hearables with an In-Ear MicrophoneMattes Ohlenbusch, Christian Rollwage, Simon Doclo. 1-5 [doi]
- Regret Optimization Experience Replay in Off-Policy Reinforcement LearningJie Zhang, Yirong Yao, Wei He, Yiqun Niu, Chongjun Wang. 1-5 [doi]
- Tool Playgrounds: A Comprehensive and Analyzable Benchmark for LLM Tool InvocationZhiwei Dong, Ruihao Gong, Yang Yong, Shuo Wu, Yongqiang Yao, Song-Lu Chen, Xu-Cheng Yin. 1-5 [doi]
- Collaborative Inference Acceleration with Non-Penetrative Tensor PartitioningZhibang Liu, Chaonong Xu, Zhenjie Lv, Zhizhuo Liu, Suyu Zhao. 1-5 [doi]
- LV-ReID: Large Language-Vision Alignment Model for Text-based Person Re-identificationYinghui Xia, Chao Wang, Jinsong Yang. 1-5 [doi]
- Few-Shot Object Detection in Satellite Imagery with Feature Fusion Pyramid and Adaptive Region Proposal NetworksTuoyu Feng, Weiping Li, Zhijie Tan, Liwen Zhang, Xiang Yuan, Xu Chu 0001. 1-5 [doi]
- LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and TaggingShubhr Singh, Emmanouil Benetos, Huy Phan, Dan Stowell. 1-5 [doi]
- Automated Extraction of Spatio-Semantic Graphs for Identifying Cognitive ImpairmentSi Ioi Ng, Pranav S. Ambadi, Kimberly D. Mueller, Julie Liss, Visar Berisha. 1-5 [doi]
- Estimation of Doppler, Range, and Direction of Targets in Wideband Bistatic Automotive RadarAli Moussa, Wei Liu 0001. 1-5 [doi]
- Graph Neural Networks for Parkinson's Disease DetectionShakeel A. Sheikh, Yacouba Kaloga, Md. Sahidullah, Ina Kodrasi. 1-5 [doi]
- Boolean matrix compressed sensingQiang Liu, Mahdi Soleymani, Hessam Mahdavifar. 1-5 [doi]
- Out-of-Distribution Detectors: Not Yet Primed for Practical DeploymentChangshun Wu, Wendi Ding, Xiaowei Huang 0001, Saddek Bensalem. 1-5 [doi]
- Integrating Concept Associations for Query Focused Knowledge SummarizationJian Wang, Zhi Liu, Yuqing Sun, Xin Li. 1-5 [doi]
- FGU3R: Fine-Grained Fusion via Unified 3D Representation for Multimodal 3D Object DetectionGuoxin Zhang, Ziying Song, Lin Liu, Zhonghong Ou. 1-5 [doi]
- Using RLHF to align speech enhancement approaches to mean-opinion quality scoresAnurag Kumar, Andrew Perrault, Donald S. Williamson. 1-5 [doi]
- Leveraging Mixture of Experts for Improved Speech Deepfake DetectionViola Negroni, Davide Salvi, Alessandro Ilic Mezza, Paolo Bestagini, Stefano Tubaro. 1-5 [doi]
- Quaternion CNN With Salient Features for Color Image DenoisingYi Liu, Qiyu Jin, Jie Yang. 1-5 [doi]
- Accelerometer-Based Person-in-Bed Detection ChallengeLauren Mentzer, Ravi Kiran Raman, Atulya Yellepeddi. 1-2 [doi]
- The Conformer Encoder May Reverse the Time DimensionRobin Schmitt, Albert Zeyer, Mohammad Zeineldeen, Ralf Schlüter, Hermann Ney. 1-5 [doi]
- Generalizable Indoor Path Loss PredictionCheick Tidiani Cisse, Oumaya Baala, Valéry Guillet, François Spies, Alexandre Caminada. 1-2 [doi]
- First-frame Supervised Video Polyp Segmentation via Propagative and Semantic Dual-teacher NetworkQiang Hu, Mei Liu, Qiang Li 0018, Zhiwei Wang 0002. 1-5 [doi]
- A Domain Adaptation Framework for Speech Recognition Systems with Only Synthetic dataMinh Tran, Yutong Pang, Debjyoti Paul, Laxmi Pandey, Kevin Jiang, Jinxi Guo, Ke Li 0023, Shun Zhang, Xuedong Zhang, Xin Lei. 1-5 [doi]
- Binary Stochastic Flip Optimization for Training Binary Neural NetworksTatsukichi Shibuya, Nakamasa Inoue, Rei Kawakami, Ikuro Sato. 1-5 [doi]
- PointActionCLIP: Preventing Transfer Degradation in Point Cloud Action Recognition with a Triple-Path CLIPWei Tao, Shenglin He, Xiaoyang Qu, Jiguang Wan, Jianzong Wang. 1-5 [doi]
- FDDSGCN: Fractional Decoupling Dynamic Spatiotemporal Graph Convolutional Network for Traffic ForecastingJinpeng Xu, Chunna Zhao, Jing Yang, Yaqun Huang, Yaoyuan Yang, Lip Yee Por. 1-5 [doi]
- Multi-Grained Feature Pruning for Video-Based Human Pose EstimationZhigang Wang, Shaojing Fan, Zhenguang Liu, Zheqi Wu, Sifan Wu 0001, Yingying Jiao. 1-5 [doi]
- A Bayesian Perspective on Uncertainty Quantification for Estimated Graph SignalsLennard Rompelberg, Michael T. Schaub. 1-5 [doi]
- Self-supervised Contrastive Pre-training for Dry Electrode EEG Emotion Recognition via Cross Device Representation ConsistencyMeihong Zhang, Shaokai Zhao, Zhiguo Luo, Liang Xie 0012, Tiejun Liu, Dezhong Yao 0001, Ye Yan 0001, Erwei Yin. 1-5 [doi]
- HBRW: A Hardness-Based Re-Weighting Approach for Long-tailed Medical Image ClassificationYongheng Xu, Hanjiang Lai. 1-5 [doi]
- High-Fidelity Single-View Reconstruction of Indoor Scenes using 3D Shape Prior Template and Pixel-Aligned DeformationXiaohao Zhang, Xiaolin He, Jialin Wu, Xu Wang, Zhuo Tang, Ruihui Li. 1-5 [doi]
- FreeAlign: Superior Text-Image Alignment by Modulating Prompt AttentionYibo Zhang, Dahua Gao, Feng Xie, Minxi Yang, Wenlong Wang, Ruichao Liu. 1-5 [doi]
- ReCLAP: Improving Zero Shot Audio Classification by Describing SoundsSreyan Ghosh, Sonal Kumar, Chandra Kiran Reddy Evuru, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha. 1-5 [doi]
- Towards Robust Subject Identification From EEG Segments With Data Augmentation TechniquesMeka Nani, Supriyo Banerjea, Raghavan R. 1-2 [doi]
- Enhancing Image Editing with Chain-of-Thought Reasoning and Multimodal Large Language ModelsMengxue Kang, Xinyu Zhang, Fei Wei, Shuang Xu, Yuhe Liu. 1-5 [doi]
- Low Complexity Super Resolution for Resampling-based Video CodingChaoyi Lin, Yue Li 0015, Junru Li, Kai Zhang 0007, Li Zhang 0006. 1-5 [doi]
- TrackFusion: Enhancing Multi-Object Tracking With Temporal Trajectory Modeling and Frame-Integrated DetectionXinhao Zhang, Shuai Liu, Bingyang Wang, Jiaojiao Dai, Jinqing Qi, Huchuan Lu, You He. 1-5 [doi]
- Multi-scale Feature Interaction and Adaptive Experts for Panoptic Segmentation in Remote Sensing ImagesZhenkun Sun, Jia Liu 0020, Wenhua Zhang, Fang Liu, Jingxiang Yang, Liang Xiao 0001. 1-5 [doi]
- VisAgent: Narrative-Preserving Story Visualization FrameworkSeungkwon Kim, Gyutae Park, Sangyeon Kim, Seung-Hun Nam. 1-5 [doi]
- A Deformable-Based Source-Free Unsupervised Domain Adaptation Method for Cervical Cell DetectionQiao Pan, Yawen Xue, Bin Yang. 1-5 [doi]
- Extremum Encoding for Joint Baseband Signal Compression and Time-Delay Estimation for Distributed SystemsAmir Weiss, Yuval Kochman, Gregory W. Wornell. 1-5 [doi]
- Digital Operating Mode Classification of Real-World Amateur Radio TransmissionsMaximilian Bundscherer, Thomas H. Schmitt, Ilja Baumann, Tobias Bocklet. 1-5 [doi]
- SiQA: A Large Multi-Modal Question Answering Model for Structured Images Based on RAGJiawang Liu, Ye Tao 0002, Fei Wang, Hui Li 0010, Xiugong Qin. 1-5 [doi]
- FCoDT-Net: A Novel Framework for High-Precision Medical Image Segmentation Using Contextual Distillation TransformerYutao Qin, Sizhe Yang, Bang Hu, Wei Ren. 1-5 [doi]
- Cross-attention Inspired Selective State Space Models for Target Sound ExtractionDonghang Wu, Yiwen Wang, Xihong Wu, Tianshu Qu. 1-5 [doi]
- Efficient Localized Perception for Resource-Constrained Vision SystemsA. V. Subramanyam, Niyati Singal, Vinay K. Verma. 1-5 [doi]
- Prototypical Part Transformer for Interpretable Image RecognitionAnni Yu, Yu-Bin Yang. 1-5 [doi]
- DapPep: Domain Adaptive Peptide-agnostic Learning for Universal T-cell Receptor-antigen Binding Affinity PredictionJiangbin Zheng 0002, Qianhui Xu, Ruichen Xia, Stan Z. Li. 1-5 [doi]
- Low Complexity Rate Splitting Approach in RIS-Aided Systems Based on Channel StatisticsSadaf Syed, Michael Joham, Wolfgang Utschick. 1-5 [doi]
- NCL-CIR: Noise-aware Contrastive Learning for Composed Image RetrievalPeng Gao, Yujian Lee, Zailong Chen, Xubo Liu, Hui Zhang, Yiyang Hu, Guquan Jing. 1-5 [doi]
- SSM2Mel: State Space Model to Reconstruct Mel Spectrogram from the EEGCunhang Fan, Sheng Zhang, Jingjing Zhang, Zexu Pan, Zhao Lv. 1-5 [doi]
- Disparity-Guided Cross-View Transformer For Stereo Image Super-ResolutionBingting Li, Wenjing Shang, Yongshun Gong, Qiangchang Wang, Xinxin Zhang 0004, Yilong Yin. 1-5 [doi]
- Part in Part Embedding Network for Zero-Shot LearningZhexian Zhou, Liang Xiao, Guo-Sen Xie. 1-5 [doi]
- APTSniffer: Detecting APT Attack Traffic Using Retrieval-Augmented Large Language ModelsHongbo Xu, Chengxiang Si, Zhou Zhou 0007, Chenxu Wang, Peishuai Sun, Qingyun Liu 0001. 1-5 [doi]
- The Sound of Language: A Bilingual Analysis of Voice Conversion and Text-to-Speech SynthesisJeong-Eun Choi, Karla Schäfer, Martin Steinebach. 1-5 [doi]
- TA-V2A: Textually Assisted Video-to-Audio GenerationYuhuan You, Xihong Wu, Tianshu Qu. 1-5 [doi]
- Speech Emotion Recognition Based on Large-Scale Automatic Speech RecognizerRyo Fukuda, Takatomo Kano, Atsushi Ando, Atsunori Ogawa. 1-5 [doi]
- Collision-less and Balanced Sampling for Language-Queried Audio Source SeparationBinh Thien Nguyen, Daiki Takeuchi, Masahiro Yasuda, Daisuke Niizumi, Noboru Harada. 1-5 [doi]
- DocVideoQA: Towards Comprehensive Understanding of Document-Centric Videos through Question AnsweringHaochen Wang, Kai Hu, Liangcai Gao. 1-5 [doi]
- Realistic Real-Time Talking Head Synthesis with Grid Encoding and Progressive ConditioningZhiling Ye, Liang-Guo Zhang, Dingheng Zeng, Quan Lu, Ning Jiang. 1-5 [doi]
- DARN: An Attention-Based Neural Network Using Residual Blocks for Sleep Micro-Events DetectionFei Wang, Zhuorong Li, Jingcong Li. 1-5 [doi]
- CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMsJunlin Lv, Yuan Feng, Xike Xie, Xin Jia, Qirong Peng, Guiming Xie. 1-5 [doi]
- UMSSS: A Visual Scene Semantic Segmentation Dataset for Underground MinesJiawen Wang, Chenfei Liao, ZhongQi Zhao, Lianghui Li, Xuan Gao, Suna Pan, Fangzhen Shi, Yudong Wang, Weijie Zhou, Kehu Yang. 1-5 [doi]
- 3D Gaussian Splatting with Grouped Uncertainty for Unconstrained ImagesHao-Yu Hou, Chia-Chi Hsu, Yu-Chen Huang, Mu-Yi Shen, Wei-Fang Sun, Cheng Sun 0004, Chia-Che Chang, Yu-Lun Liu 0001, Chun-Yi Lee. 1-5 [doi]
- Speech Data Selection for Efficient ASR Fine-Tuning using Domain Classifier and Pseudo-Label FilteringPradeep Rangappa, Juan Zuluaga-Gomez, Srikanth R. Madikeri, Andrés Carofilis, Jeena Prakash, Sergio Burdisso, Shashi Kumar, Esaú Villatoro-Tello, Iuliia Nigmatulina, Petr Motlícek, Karthik Pandia, Aravind Ganapathiraju. 1-5 [doi]
- InfoHarmonizer Graph Contrastive ClusteringZhongyang Zhou, Haomin Wu, Zihao Feng, Feiyu Chen, Bin Tang. 1-5 [doi]
- Learning Diffusion Model from Noisy Measurement using Principled Expectation-Maximization MethodWeimin Bai, Weiheng Tang, Enze Ye, Siyi Chen, Wenzheng Chen, He Sun 0010. 1-5 [doi]
- SELMA: A Speech-Enabled Language Model for Virtual Assistant InteractionsDominik Wagner 0002, Alexander W. Churchill, Siddharth Sigtia, Erik Marchi. 1-5 [doi]
- Multi-Relational Geometric Regularization Framework for Multi-Modal Emotion Recognition in ConversationTao Zhang, Zhenhua Tan. 1-5 [doi]
- Sparse-to-Dense Body Surface Potential Mapping using A Structural Similarity-Enhanced Attention GANAyan Mukherjee, Sawon Pratiher, Oishee Mazumder, Aniruddha Sinha. 1-5 [doi]
- Breaking Through the Spike: Spike Window Decoding for Accelerated and Precise Automatic Speech RecognitionWei Zhang, Tian-Hao Zhang, Chao Luo, Hui Zhou, Chao Yang, Xinyuan Qian 0001, Xu-Cheng Yin. 1-5 [doi]
- Enhanced Control for Diffusion Bridge in Image RestorationConghan Yue, Zhengwei Peng, Junlong Ma, Dongyu Zhang 0002. 1-5 [doi]
- Semi-Supervised Knowledge Distillation Framework towards Lightweight Large Language Model for Spoken Language TranslationTonmoy Rajkhowa, Amartya Roy Chowdhury, Achyut Mani Tripathi, Sanjeev Sharma, Om Jee Pandey. 1-5 [doi]
- MAITFuse: Multi-Dimension Adaptive Interaction Transform Network For Infrared-visible Image FusionYabin Sun, Wentai Lei, Ziyi Zhang, Jiongchang Liu, Chenxu Li, Tao Zhang. 1-5 [doi]
- High-Fidelity Music Vocoder using Neural Audio CodecsLuca A. Lanzendörfer, Florian Grötschla, Michael Ungersböck, Roger Wattenhofer. 1-5 [doi]
- Unsupervised UAV 3D Trajectories Estimation with Sparse Point CloudsHanfang Liang, Yizhuo Yang, Jinming Hu, Jianfei Yang, Fen Liu, Shenghai Yuan. 1-5 [doi]
- Group-wise Semantic-enhanced Interaction Network for Remote Sensing Spatio-Temporal FusionBaoluo Zhu, Shenglong Hu, Huihui Song, Kaihua Zhang 0001. 1-5 [doi]
- Relaxing Distillation Constraints for Improved New Class Learning in Continual Semantic SegmentationZheng Ti, Xuze Hao, Renhai Chen. 1-5 [doi]
- Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTCJiawen Kang 0002, Lingwei Meng, Mingyu Cui, Yuejiao Wang, Xixin Wu, Xunying Liu, Helen Meng. 1-5 [doi]
- Dynamic Graph Convolutional Networks with Spatiotemporal Missing Pattern AwarenessBingheng Pang, Zhuoxuan Liang, Wei Li 0109, Xiangping Zheng, Rokia Abdein. 1-5 [doi]
- First-order State Space Model for Lightweight Image Super-resolutionYujie Zhu, Xinyi Zhang, Yekai Lu, Guang Yang, Faming Fang, Guixu Zhang. 1-5 [doi]
- DepMamba: Progressive Fusion Mamba for Multimodal Depression DetectionJiaxin Ye, Junping Zhang, Hongming Shan. 1-5 [doi]
- Dynamic Incentive Model for Federated Learning Model Trading via Evolutionary Game TheoryTianxiang Chen, Feng Wang, Wenjie Hou, ShaoTing Tang, Zhiming Zheng 0001. 1-5 [doi]
- Multi-Feature Audio Fusion for Nonverbal Vocalization ClassificationSiddhant Bikram Shah, Kristina T. Johnson. 1-5 [doi]
- Spectral Enhancement and Pseudo-Anchor Guidance for Infrared-Visible Person Re-IdentificationYiyuan Ge, Zhihao Chen, Ziyang Wang, Jiaju Kang, Mingya Zhang. 1-5 [doi]
- Passive Non-Line-of-Sight Imaging with Parallel EncoderXiaolong Du, Ruixu Geng, Jiarui Zhang, Yan Chen, Yang Hu. 1-5 [doi]
- Investigating F0 Estimation in Speech Synthesis from Real-time MRI Articulatory DataYuto Otani, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada. 1-5 [doi]
- RTF Estimation Using Riemannian Geometry for Speech Enhancement in the Presence of InterferencesOr Ronai, Yuval Sitton, Amitay Bar, Ronen Talmon. 1-5 [doi]
- Clinically Robust Polyp Segmentation: Enhanced Generalization and Perturbation ResistanceShanchuan Wang, Fang Yuan, Jian-Nan Su, Min Gan. 1-5 [doi]
- ECSNN: Spiking Neural Networks for Efficient Exposure Correction in Endoscopy ImagingJun Zhang, Zhuoran Zheng, Jingang Zhang, Wenqi Ren. 1-5 [doi]
- A Study of Multi-Scale Feature Learning From Pre-Trained Models on Speaker VerificationShengyu Peng, Wu Guo, Jie Zhang, Zuoliang Li, Yu Guan, Bin Gu, Yang Ai. 1-5 [doi]
- Robust Activity Detection for Massive Access using Covariance-based Matching Pursuit. Xinjue Wang, Esa Ollila, Sergiy A. Vorobyov. 1-5 [doi]
- GMCL: Graph-Enhanced Multimodal Contrastive Learning for Rumor Detection. Kun Lu 0006, Hongli Zhang 0001, Tianze Sun, Yuchen Yang, Chao Meng, Gongzhu Yin, Binxing Fang. 1-5 [doi]
- Synthetic Dataset Generation for String Ensemble Separation. Minju Kim, Joonhyeon Bae, Eunsik Shin, Kyogu Lee. 1-5 [doi]
- Language Models Can See Better: Visual Contrastive Decoding For LLM Multimodal Reasoning. Yuqi Pang, Bowen Yang, Haoqin Tu, Yun Cao, Zeyu Zhang. 1-5 [doi]
- Underwater Image Restoration via Polymorphic Large Kernel CNNs. Xiaojiao Guo, Yihang Dong, Xuhang Chen 0002, Weiwen Chen, Zimeng Li, Fuchen Zheng, Chi-Man Pun. 1-5 [doi]
- Enhanced Breast Cancer Molecular Biomarker Classification: A Novel Two-Stage Machine Learning Pipeline for Accurate Histological Analysis of Whole Slide Images. Ahmed Aboudessouki, Kh. M. Ali, Ahmed Alksas, M. Elsharkawy, M. ABO Rahma, M. Ghazal, Nagham E. Mekky, Eman El-Daydamony, Dibson D. Gondim, Ayman El-Baz. 1-5 [doi]
- A Novel Multimodal Method for Decoding Speech Perception from Brain Activities. Aoke Zhang, Bo Wang, Xihong Wu, Jing Chen 0019. 1-5 [doi]
- CycleFlow: Leveraging Cycle Consistency in Flow Matching for Speaker Style Adaptation. Ziqi Liang, Xulong Zhang 0001, Chang Liu, Xiaoyang Qu, Weifeng Zhao, Jianzong Wang. 1-5 [doi]
- HiLiteMamba: A Lightweight and High-Frequency Aware Network for Single Image Super-Resolution. Zijing Zhang, Jianfei Xiao, Bate Liu. 1-5 [doi]
- D2S: Towards Efficient Sparse 3D Object Detection via Dense to Sparse Knowledge Distillation. Yuqi Huang, Longjun Liu, Yingke Gao, Haonan Zhang, Haoteng Li. 1-5 [doi]
- MA-Det: A Discriminative Morphology-Aware Detector for Cervical Lesion Cell Clumps. Ziyang Yin, Qian Huang, Yulin Chen, Hao Lu. 1-5 [doi]
- Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization. Xiaoxue Gao, Chen Zhang, Yiming Chen, Huayun Zhang, Nancy F. Chen. 1-5 [doi]
- Robust Fixed-Filter Sound Zone Control with Audio-Based Position Tracking. Sankha Subhra Bhattacharjee, Andreas Jonas Fuglsig, Flemming Christensen, Jesper Rindom Jensen, Mads Græsbøll Christensen. 1-5 [doi]
- Towards Dynamic Neural Communication and Speech Neuroprosthesis Based on Viseme Decoding. Ji-Ha Park, Seo-Hyun Lee, Soowon Kim, Seong-Whan Lee. 1-5 [doi]
- Enhancing Prosody Transfer in Speech Synthesis by Using Prosodically-Aligned References. Lin Liu. 1-5 [doi]
- Automatic Geometric Quantification and Rupture Risk Evaluation of 3D Intracranial Aneurysms. Xudong Ru, Zeyao Zhang, Xingce Wang, Jing-Yi Liu, Yi-Cheng Zhu, Zhongke Wu. 1-5 [doi]
- Leave No Stone Unturned: Optimizing Subpattern Information Entropy for Coreset Selection. Haohao Song, Qiao Xiang, Jiwu Shu. 1-5 [doi]
- Exploiting Foundation Models for Label-Efficient Few-Shot Learning via Feature Coupling: A Case Study of cardiac CT Segmentation. Wei Chen, Chen Li, Wenjuan Zhou, Yuhang Li, Tianhang Guo, Yuhua Tang. 1-5 [doi]
- Hierarchical Expectation Propagation for Semi-Blind Channel Estimation in Cell-Free Networks. Zilu Zhao, Dirk Slock. 1-5 [doi]
- Tag-Aware Weakly-Supervised Online Hashing with Enhanced Joint Representation. Na Wang, Yu-Wei Zhan, Zhen-Duo Chen 0001, Yongxin Wang 0001, Xin Luo 0006, Xin-Shun Xu. 1-5 [doi]
- Federated Prototype Guided Adaption for Vision-Language Models. Youchao Liu, Dingjiang Huang. 1-5 [doi]
- An Adversarial Perturbation Generation Method for Image Anti-Forensics Based on Dual-Path Spatial Attention GAN. Yihong Lu, Jianyi Liu, Ru Zhang. 1-5 [doi]
- GRACED: A Plug-and-Play Solution for Certifiable Graph Classification. Xiaoyu Liang, Haohua Du, He Lu, Fei Shang. 1-5 [doi]
- Exploring Acoustic Foundations in Speech Production Assessment Models for Children with Cochlear Implants. Seonwoo Lee, SunHee Kim, Minhwa Chung. 1-5 [doi]
- Whisper-GPT: A Hybrid Generative LLM For Speech And Music. Prateek Verma. 1-5 [doi]
- DeepPEM-AFC: An Improved Prediction-Error-Method-based Adaptive Feedback Cancellation with Deep Learning for Hearing Aids. Xiaofan Zhan, Fengyuan Hao, Xiaodong Li 0002, Chengshi Zheng. 1-5 [doi]
- Dual Decoder for Fast Inference in Natural Language Generation. Wenbo Wang, Huiying Wang, Zhaoyang Wang, Shuailou Li, Yu Wen. 1-5 [doi]
- Network Games Induced Prior for Graph Topology Learning. Chenyue Zhang, Shangyuan Liu, Hoi-To Wai, Anthony Man-Cho So. 1-5 [doi]
- Dual Trajectory Revised Diffusion Model for Time Series Forecasting. Zilong Hu, Yan Qiao, Zidang Cai, Rongyao Hu, Junjie Wang, Meng Li 0018, Zhenchun Wei. 1-5 [doi]
- Optimum Power-Subcarrier Allocation and Time-Sharing in Multicarrier NOMA Uplink. Sagnik Bhattacharya, Kamyar Rajabalifardi, Muhammad Ahmed Mohsin, John M. Cioffi. 1-5 [doi]
- Semi-supervised Video Anomaly Detection With Compact Deformable 3D Convolution. Shibo Gao, Peipei Yang, LinLin Huang. 1-5 [doi]
- SPEAK: Speech-Driven Pose and Emotion-Adjustable Talking Head Generation. Changpeng Cai, Guinan Guo, Jiao Li, Junhao Su, Fei Shen, Chenghao He, Jing Xiao, Yuanxu Chen, Lei Dai, Feiyu Zhu. 1-5 [doi]
- MIMO Channel as a Neural Function: Implicit Neural Representations for Extreme CSI Compression. Haotian Wu, Maojun Zhang, Yulin Shao, Krystian Mikolajczyk, Deniz Gündüz. 1-5 [doi]
- Key Clues Guided Video Character Social Relationship Recognition Enhanced by LLM. Wenlong Dong, Qing Zhu, Qirong Mao. 1-5 [doi]
- "I've Heard of You!": Generate Spoken Named Entity Recognition Data for Unseen Entities. Jiawei Yu, Xiang Geng, Yuang Li, Mengxin Ren, Wei Tang, Jiahuan Li, Zhibin Lan, Min Zhang, Hao Yang, Shujian Huang, Jinsong Su. 1-5 [doi]
- Positional Differential Encoding for Distributed Learning. Leah Woldemariam, Anna Scaglione. 1-5 [doi]
- Large Language Models Are Efficient Learners as Zero-Shot Speech Translators. Chenxuan Liu, Liping Chen, Peiwang Tang, Weitai Zhang, Xiaoxi Li, Sreyan Ghosh, Zhongyi Ye, Mingjia Yu. 1-5 [doi]
- Decoupling While Coupling: Towards More Accurate Stereo Image Sand Removal Beyond Certainty. Bingcai Wei, Hui Liu, Chuang Qian 0001, Yi Jia, Wangyu Wu, Zhishan Li. 1-5 [doi]
- Generalizable Audio Deepfake Detection via Latent Space Refinement and Augmentation. Wen Huang 0004, Yanmei Gu, Zhiming Wang, Huijia Zhu, Yanmin Qian. 1-5 [doi]
- Lossless Phase Conversion Method for Object Wave-based Hologram Compression. Hiroki Kojima, Ryota Koiso, Ryosuke Watanabe, Keisuke Nonaka. 1-5 [doi]
- Multiclass Arrhythmia Classification using Smartwatch Photoplethysmography Signals Collected in Real-life Settings. Dong Han, Jihye Moon, Luís Roberto Mercado Díaz, Darren Chen, Devan Williams, Eric Y. Ding, Khanh-Van Tran, David D. McManus, Ki H. Chon. 1-5 [doi]
- Multi-Scale Denoising in the Feature Space for Low-Light Instance Segmentation. Joanne Lin, Nantheera Anantrasirichai, David Bull 0001. 1-5 [doi]
- Toward Forward-Secure End-to-End Data Sharing: An Attribute-Key-Free CP-ABE Scheme. Xinyi Shi, Yunchuan Guo, Wei Jin, Mingjie Yu, Daiyong Quan, Wenlong Kou, FengHua Li. 1-5 [doi]
- Semantic-oriented Visual Prompt Learning for Class Incremental Learning. Shuai Guo 0001, Yang Gu 0001, Yuan Ma, Yingwei Zhang 0002, Weining Weng, Jun Liu, Weiwei Dai, Yiqiang Chen 0001. 1-5 [doi]
- Reinforcement Learning-Based Multi-Teacher Knowledge Distillation for Enhancing Retrieval Ranking Consistency. Xiukang Yang, Jingguo Ge, Liangxiong Li, Bingzhen Wu. 1-5 [doi]
- Exploring Local Interpretable Model-Agnostic Explanations for Speech Emotion Recognition with Distribution-Shift. Maja J. Hjuler, Line H. Clemmensen, Sneha Das. 1-5 [doi]
- Multi-Head Auto-Correlation Attention Networks for Session-based Social Recommendation. Mengying Lu, Xingyu Lu, Hai-Tao Zheng, Zhao Wei, Yong Xu, Bingxu An. 1-5 [doi]
- Melody Structure Transfer Network: Generating Music with Separable Self-Attention. Junlin Wu, Ning Zhang, Cheng Zhong, Boan Chen, Huanxi Liu, Junchi Yan. 1-5 [doi]
- Segment Any Bone in CT with Partial Supervision. Tianyou Liang, Xiaoxu Li, Yu Peng, Min Xu 0001. 1-5 [doi]
- Targeted Data Poisoning for Black-Box Audio Datasets Ownership Verification. Wassim Bouaziz, El Mahdi El Mhamdi, Nicolas Usunier. 1-5 [doi]
- Dual-Domain Feature-Guided Task Alignment for Enhanced Small Object Detection. Fangrui Guo, Junwei Wu, Quan Zhang. 1-5 [doi]
- Audio Array-Based 3D UAV Trajectory Estimation with LiDAR Pseudo-Labeling. Allen H.-X. Lei, Tianchen Deng, Han Wang, Jianfei Yang, Shenghai Yuan. 1-5 [doi]
- Consensus Graph-Based Spectral Ensemble Clustering via Low-Rank Tensor Learning. Zhe Cao, Haonan Xin, Zihua Zhao, Jie Wang 0024, Rong Wang 0001. 1-5 [doi]
- A Small-footprint Acoustic Echo Cancellation Solution for Mobile Full-Duplex Speech Interactions. Yiheng Jiang, Biao Tian. 1-5 [doi]
- Improving Dialect Identification in Indian Languages Using Multimodal Features from Dialect Informed ASR. Amartyaveer, Saurabh Kumar, Sumit Sharma, Sathvik Udupa, Sandhya Badiger, Abhayjeet Singh, Deekshitha G, Jesuraja Bandekar, Savitha Murthy, Prasanta Kumar Ghosh. 1-5 [doi]
- Maximum Likelihood Estimation of Stable ARX Models using Randomized Coordinate Descent. Ozmen Erkin Kokten, Raviv Raich. 1-5 [doi]
- Uncertainty-Aware Dynamic Fusion for Multimodal Clinical Prediction Tasks. Jiayu Guo, Ying Cheng, Wen He, Yuejie Zhang, Rui Feng, Xiaobo Zhang. 1-5 [doi]
- Self-Optimization Training for Weakly Supervised Image Manipulation Localization. Zhangchen Zhu, Jiafeng Li, Ying Wen 0003. 1-5 [doi]
- Enhancing Task-Specific Feature Learning with LLMs for Multimodal Emotion and Intent Joint Understanding. Zhaoyang Li, Cheng Lu, Xiaolin Xu, Kaifei Zhang, Yujia Gu, Banghua Li, Yuan Zong, Wenming Zheng. 1-2 [doi]
- Enhancing Multivariate Time Series Forecasting with Multi-scale Moving Transformation. Wenjie Ou, Hongmin Du, Wenqiang Zhao, Dongyue Guo, Yi Lin 0006. 1-5 [doi]
- Transcribing and Translating, Fast and Slow: Joint Speech Translation and Recognition. Niko Moritz, Ruiming Xie, Yashesh Gaur, Ke Li 0023, Simone Merello, Zeeshan Ahmed, Frank Seide, Christian Fuegen. 1-5 [doi]
- SSFMamba: Spatial-Spectral Fusion State Space Model for Pansharpening. Mengting Ma, Mengjiao Zhao, Yizhen Jiang, Xiangdong Li, Wei Zhang 0243. 1-5 [doi]
- Single-Loop Variance-Reduced Stochastic Algorithm for Nonconvex-Concave Minimax Optimization. Xia Jiang, Linglingzhi Zhu, Taoli Zheng, Anthony Man-Cho So. 1-5 [doi]
- AER-LLM: Ambiguity-aware Emotion Recognition Leveraging Large Language Models. Xin Hong, Yuan Gong, Vidhyasaharan Sethu, Ting Dang. 1-5 [doi]
- On the Relation Between Speech Quality and Quantized Latent Representations of Neural Codecs. Mhd Modar Halimeh, Matteo Torcoli, Philipp Grundhuber, Emanuël A. P. Habets. 1-5 [doi]
- Leveraging Out-of-Domain Noise for Unsupervised Domain Adaptation in Speech Enhancement. Yu Liao, Haixin Guan, Shuang Wei, Yanhua Long. 1-5 [doi]
- Adaptive Contribution Modulation For Multi-Modal Manipulation Media Detection and Grounding. Yixiang Li, Biao Leng. 1-5 [doi]
- PRESS: Defending Privacy in Retrieval-Augmented Generation via Embedding Space Shifting. Jiaming He, Cheng Liu, Guanyu Hou, Wenbo Jiang, Jiachen Li. 1-5 [doi]
- Hop-level Direct Preference Optimization for Knowledge Graph Reasoning with Trees. Tiesunlong Shen, Jin Wang 0008, Xuejie Zhang 0002, Erik Cambria. 1-5 [doi]
- Efficient Quality Controllable Neural Image Compression based on QD-Model. Shaokang Wang, Guoqing Xiang, Jinchang Xu, Shanghang Zhang, Xiaodong Xie. 1-5 [doi]
- Spatial Annotation-free Training for Sound Event Localization and Detection. Masahiro Yasuda, Shoichiro Saito, Nao Sato, Noboru Harada. 1-5 [doi]
- RPPFL: Robust and Privacy-Preserving Federated Learning via Trusted Execution Environments. Xiaolei Zhang, Zhaoyu Chen, Guangpu Chen, Xinyu Feng 0002, Qingni Shen, Zhonghai Wu. 1-5 [doi]
- Classification of Eye-Tracking Data Based on Spatiotemporal Attention Encoding. Jiaju He, Chen Xia, Kuan Li, Tian Zhang. 1-5 [doi]
- I-KAN: Reconstructing Over-Range Inertial Signals. Yifeng Wang, Shu Zhang, Yi Zhao. 1-5 [doi]
- MSECG: Incorporating Mamba for Robust and Efficient ECG Super-Resolution. Jie Lin, I Chiu, Kuan-Chen Wang, Kai-Chun Liu, Hsin-Min Wang, Ping-Cheng Yeh, Yu Tsao 0001. 1-5 [doi]
- SX-Stitch: An Efficient VMS-UNet Based Framework for Intraoperative Scoliosis X-Ray Image Stitching. Yi Li, Heting Gao, Mingde He, Jinqian Liang, Jason Gu, Wei Liu. 1-5 [doi]
- Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features. Emiru Tsunoo, Yuki Saito 0001, Wataru Nakata, Hiroshi Saruwatari. 1-5 [doi]
- Improved Cross-Lingual Speaker Verification Using Speaker Sensitive Feature Guidance and Fine-grained Phonetic Information. Yongtai Ji, Guangxing Li, Hao Huang 0009, Yanbing Li, Wushour Silamu. 1-5 [doi]
- Deep Unfolding of Full Waveform Inversion for Quantitative Ultrasound Imaging. Niv Cohen, Yhonatan Kvich, Rui Guo, Yonina C. Eldar. 1-5 [doi]
- Predicting Biomechanical Risk Factors for Division - I Women's Basketball Athletes. Aayushi Shah, Vanaja Agarwal, Dhairya Shah, Harman Jani, Srishti U. Sharma, Tolga Kaya, Christopher Taber, Mehul S. Raval. 1-5 [doi]
- Incorporating Spatial Cues in Modular Speaker Diarization for Multi-channel Multi-party Meetings. Ruoyu Wang, Shutong Niu, Gaobin Yang, Jun Du 0002, Shuangqing Qian, Tian Gao, Jia Pan. 1-5 [doi]
- Speech Enhancement with MAP-based Training for Robust ASR. You-Jin Li, Rong Chao, Borching Su, Yu Tsao 0001. 1-5 [doi]
- Multi-level Conflict-Aware Network for Multi-modal Sentiment Analysis. Yubo Gao, Haotian Wu, Lei Zhang. 1-5 [doi]
- Stable Reduced-Rank VAR Estimators in Closed Forms. Xinhui Rong. 1-5 [doi]
- Psycholinguistic Features Predict Word Duration in Hindi Read Aloud Speech. Rajakrishnan Rajkumar, Sneha Raman, Aadya Ranjan, Mildred Pereira, Nagesh Nayak, Preeti Rao. 1-5 [doi]
- A Multi-Prior Fusion Network for Video-based Micro-Expression Recognition. Chuang Ma, Shaokai Zhao, Yu Pei, Liang Xie 0012, Erwei Yin, Ye Yan 0001. 1-5 [doi]
- LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation. Hieu-Thi Luong, Haoyang Li, Lin Zhang, Kong-Aik Lee, Eng Siong Chng. 1-5 [doi]
- Leveraging Registers in Vision Transformers for Robust Adaptation. Srikar Yellapragada, Kowshik Thopalli, Vivek Sivaraman Narayanaswamy, Wesam Sakla, Yang Liu, Yamen Mubarka, Dimitris Samaras, Jayaraman J. Thiagarajan. 1-5 [doi]
- Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models. Haoran Liao, Jidong Tian, Shaohua Hu, Zhihao Zhu, Hao He 0007, Yaohui Jin. 1-5 [doi]
- CAF-YOLO: A Robust Framework for Multi-Scale Lesion Detection in Biomedical Imagery. Zilin Chen, Shengnan Lu. 1-5 [doi]
- Beyond Jensen's Inequality: Speeding Up ML Estimation of Generalized Hyperbolic Distributions. Chenyu Gao, Ziping Zhao 0002. 1-5 [doi]
- Multiresolution Encoder-Decoder Convolutional Neural Network for Magnetic Resonance Image Segmentation. Kishore Kumar Tarafdar, Aaditya Meher, Mirat Shah, Qutubuddin Saifee, Dushyant Kumar, Anant V. Nimkar, Rama Jayasundar, Vikram M. Gadre. 1-5 [doi]
- Using Corrected ASR Projection to Improve AD Recognition Performance from Spontaneous Speech. Yunfan Zhang, Yun Jin, Guanlin Chen, Yong Ma, Maoshen Jia, Peng Song 0007. 1-5 [doi]
- Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap. Guanrou Yang, Fan Yu 0002, Ziyang Ma 0001, Zhihao Du, Zhifu Gao, Shiliang Zhang, Xie Chen 0001. 1-5 [doi]
- Radio Map Estimation via Latent-Domain Plug-and-Play Denoisers. Le Xu, Lei Cheng 0003, Junting Chen, Wenqiang Pu, Xiao Fu 0001. 1-5 [doi]
- DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance. Cong Wang, Jiaxi Gu, Panwen Hu, Yuanfan Guo, Xiao Dong, Hang Xu 0004, Xiaodan Liang. 1-5 [doi]
- Impact of Temporal Precision on Speech Synthesis Accuracy From Electrocorticographic Brain Signals. Qinwan Rabbani, Matthew S. Fifer, Nathan E. Crone, Laureano Moro-Velázquez. 1-5 [doi]
- Text-guided Device-realistic Sound Generation for Fiber-based Sound Event Classification. Wataru Kohno, Noriyuki Tonami, Jian Fang, Shaobo Han, Jingchen Sun, Ting Wang. 1-5 [doi]
- Model-based Online Millimeter-wave Channel Sensing with Learned Empirical Priors. Parthasarathi Khirwadkar, Bhaskar D. Rao, Piya Pal. 1-5 [doi]
- Spatio-Semantic Prompt guided Adaptive Segment Anything for Remote Sensing Change Detection. Shenglong Hu, Zhidong Han, Gang Dong, Lingyan Liang, Dongchao Wen, Kaihua Zhang 0001. 1-5 [doi]
- PGDGS: Improving Few-shot 3D Gaussian Splatting with Progressive Gaussian Densification. Haoyang Huang, Zhe Zhang, Guanhua Wu, Ronggang Wang. 1-5 [doi]
- Sequence Knowledge Enhancement Distillation Framework for Ultra-Fast Image Deraining. Jihao Li, Jincheng Hu, Ming Liu, Pengyu Fu, Jingjing Jiang, Yuanjian Zhang. 1-5 [doi]
- AgentPose: Progressive Distribution Alignment via Feature Agent for Human Pose Distillation. Feng Zhang, Jinwei Liu, Xiatian Zhu, Lei Chen. 1-5 [doi]
- Fresh-CL: Feature Realignment through Experts on Hypersphere in Continual Learning. Zhongyi Zhou, Yaxin Peng, Pin Yi, Minjie Zhu, Chaomin Shen 0001. 1-5 [doi]
- MHAD: Multimodal Home Activity Dataset with Multi-Angle Videos and Synchronized Physiological Signals. Lei Yu, Jintao Fei, Xinyi Liu, Yang Yao, Jun Zhao, Guoxin Wang, Xin Li. 1-5 [doi]
- Cued Speech Generation Leveraging a Pre-trained Audiovisual Text-to-Speech Model. Sanjana Sankar, Martin Lenglet, Gérard Bailly, Denis Beautemps, Thomas Hueber. 1-5 [doi]
- Influence of Oropharyngeal Esophageal Cavity Geometry and Beak Angle on Vocal Tract Resonance of Birds using Computational Modeling. Noumida Abdul Kareem, Rajeev Rajan. 1-5 [doi]
- Multi-Scale Attention-Based Dense Spatial-Temporal Model for Emotion Induction in Response to Olfactory Stimuli. Jian-ming Zhang, Wei-Bang Jiang, Wei-Long Zheng, Bao-Liang Lu. 1-5 [doi]
- Emotion information recovery potential of wav2vec2 network fine-tuned for speech recognition task. Tilak Purohit, Mathew Magimai-Doss. 1-5 [doi]
- Higher-Order Topological Directionality and Directed Simplicial Neural Networks. Manuel Lecha, Andrea Cavallo, Francesca Dominici, Elvin Isufi, Claudio Battiloro. 1-5 [doi]
- Adapting Large Language Model for Spatio-Temporal Understanding in Next Point-of-Interest Prediction. Qiuhan Han, Atsushi Yoshikawa 0002, Masayuki Yamamura. 1-5 [doi]
- Channel-Aware Domain-Adaptive Generative Adversarial Network for Robust Speech Recognition. Chien-Chun Wang, Li-Wei Chen, Cheng-Kang Chou, Hung-Shin Lee, Berlin Chen, Hsin-Min Wang. 1-5 [doi]
- OCLNet: Obfuscation feature Contrastive Learning Network for Weakly Supervised Semantic Segmentation on Ultrasound Images. Jie Gao, Xianzhi Zhang, Zijian Zhang, Xuewei Li, Mei Yu 0004, Ruiguo Yu, Zhiqiang Liu 0002. 1-5 [doi]
- To Learn Better Character Embeddings in Generative Models for Password Attack. Mingli Zheng, Haibo Cheng, Jiahong Yang, Wenbo Zhang, Ping Wang 0003. 1-5 [doi]
- Hypergraph-Based Dynamic Graph Node Classification. Xiaoxu Ma, Chen Zhao 0010, Minglai Shao 0001, Yujie Lin. 1-5 [doi]
- A Bayesian Interpretation of Adaptive Low-Rank Adaptation. Haolin Chen, Philip N. Garner. 1-5 [doi]
- Basket-Enhanced Heterogenous Hypergraph for Price-Sensitive Next Basket Recommendation. Yuening Zhou, Yulin Wang, Qian Cui, Xinyu Guan, Francisco Cisternas. 1-5 [doi]
- Retinex-Based Self-Conditioned Diffusion Model for Low-Light Image Enhancement. Jiawei Zhang, Ziwen Li, Jinpu Zhang, Yuehuan Wang. 1-5 [doi]
- PsSR: Hybrid Path Selection Mechanism for Efficient Image Super-Resolution. Cheng Ding, Zhongqiu Zhao, Yang Zhao. 1-5 [doi]
- LGNet: Linear Graph Representation for Efficient Cold-Start Recommendations. Zhenglong Li, Ruiqi Luo, Bangchao Wang, Lin Li, Xian Zhong. 1-5 [doi]
- RFEM: Remote Feature Enhancement Module for Target Detection. Feng Hu, Chuangye Wang, Jian Xiong 0005, Wenhua Zhang, Haolun Li 0001, Hao Gao 0005. 1-5 [doi]
- Symmetric Bi-branch Modality-search Aggregation Network for Multi-modal Liver Segmentation. Huaxiang Liu, Jie Yang, Jie Jin, Youyao Fu, Shiqing Zhang, Wenbin Ji, Dandan Wang, Jiangxiong Fang. 1-5 [doi]
- EM-MIAs: Enhancing Membership Inference Attacks in Large Language Models through Ensemble Modeling. Zichen Song, Sitan Huang, Zhongfeng Kang. 1-5 [doi]
- Facilitating Semi-Supervised Pedestrian Detection with Structurally Controllable Instance Synthesis. Tianyou Zhang, Wenhao Wu, Si Wu 0002, Rui Li 0045. 1-5 [doi]
- A Novel Single Continuous Shot Multiple Lesions Endoscopy Report Generation. Xinpan Yuan, Junhua Kuang, Liujie Hua, Guihu Zhao, Changhong Zhang, Jiabao Li. 1-5 [doi]
- 2-SAC: A Relaxation-and-Refinement SAC Agent for Stock Portfolio Trading. Xiaoyun Han, Jun Wang. 1-5 [doi]
- Dynamic Dictionary Design for Localization in Automotive Radar Systems. Farhan Bishe, Mohammed Saif, Jun Li 0091, Shahrokh Valaee. 1-5 [doi]
- Design and Optimization of Superdirective Beamforming and Post-Filtering for Speech Enhancement. Xiaoran Yang, Gongping Huang, Jilu Jin, Jingdong Chen, Jacob Benesty. 1-5 [doi]
- NanoVoice: Efficient Speaker-Adaptive Text-to-Speech for Multiple Speakers. Nohil Park, Heeseung Kim, Che Hyun Lee, Jooyoung Choi, Jiheum Yeom, Sungroh Yoon. 1-5 [doi]
- Aligning Noisy-Clean Speech Pairs at Feature and Embedding Levels for Learning Noise-Invariant Speaker Representations. Zuoliang Li, Yang Ai, Jie Zhang, Shengyu Peng, Yu Guan, Bin Gu, Wu Guo. 1-5 [doi]
- Integrating Adaptive Sampling for Optimal Learned Video Compression. Wuyang Cong, Yuzhuo Kong, Ming Lu, Lizhong Wang, Weijing Shi, Zhan Ma. 1-5 [doi]
- Membership Encoding for Black-Box Neural Network Watermarking. Hangwei Zhang, Fang-Qi Li, Shi-Lin Wang. 1-5 [doi]
- Spatial-Temporal Reconstruction Error for AIGC-based Forgery Image Detection. Chengji Shen, Zhenjiang Liu, Kaixuan Chen 0004, Jie Lei 0002, Mingli Song, Zunlei Feng. 1-5 [doi]
- AMSER: Accelerate Mobile Speech Emotion Recognition with Signal Compression. Yu Lu, Ran Wang, Dian Ding, Han Zhang, Liyun Zhang, Lanqing Yang, Yi-Chao Chen 0001, Guangtao Xue. 1-5 [doi]
- Reinforcement Learning for Charged Particle Beam Control to Minimize Injection Mismatch in Particle Accelerators. Thilina Balasooriya, Shinjae Yoo, Vincent Schoefer, Huan-Hsin Tseng, Yuan Gao, Weijian Lin, Chanaka De Silva. 1-5 [doi]
- Adversarial Speech-Text Pre-Training for Speech Translation. Chenxuan Liu, Liping Chen, Weitai Zhang, Xiaoxi Li, Peiwang Tang, Mingjia Yu, Sreyan Ghosh, Zhongyi Ye. 1-5 [doi]
- Massive MIMO-ISAC Beamforming Design Via Sensing Energy Maximization. Fang Li, Bin Liao. 1-5 [doi]
- Phoneme-Aware Acoustic Analysis of Natural Speech for Lung Function Assessment. Sejal Bhalla, Tien Han, Andrea Gershon, Robert Wu, Eyal de Lara, Alex Mariakakis. 1-5 [doi]
- Correlative3D: Inter-Object Correlation-Aware 3D Scene Understanding. Tingxuan Gao, Wenming Yang, Yang Wu 0001, Yehu Shen. 1-5 [doi]
- Quant-NeRF: Efficient End-to-End Quantization of Neural Radiance Fields with Low-Precision 3D Gaussian Representation. Ahmed Hasssan, Anupreetham Anupreetham, Jian Meng, Jae-sun Seo. 1-5 [doi]
- PAUSE: Privacy-Aware Active User Selection for Federated Learning. Ori Peleg, Natalie Lang, Stefano Rini, Nir Shlezinger, Kobi Cohen. 1-5 [doi]
- Fusion of Information in Multiple Particle Filtering in the Presence of Unknown Static Parameters. Xiaokun Zhao, Marija Iloska, Yousef El-Laham, Mónica F. Bugallo. 1-5 [doi]
- Jack of All Trades, Master of None: PMP-Guided Adaptive Multi-Teacher Distillation with Meta-Learning. Sisi Zhang, Zechao Lin, Xingbin Wang, Yulan Su, Yan Wang, Rui Hou 0001, Dan Meng. 1-5 [doi]
- Multi-view Subspace Classification: A Hierarchical Contrastive Approach and Low-rank Latent Representation. Deyu Zeng, Tengyu Zhang, Zongze Wu 0001, Wei Liu, Chris Ding. 1-5 [doi]
- Robust Detection Based on the K-Score Test. Koby Todros. 1-5 [doi]
- DN-DR: Discriminative Network with Dual Reconstruction for Image Anomaly Detection. Wen Li, Chune Zhang. 1-5 [doi]
- SBL Algorithms for the Multiple Measurement Vector Problem: New Modeling and Inference Methods. Vinay Kanakeri, Florian Meyer, Bhaskar D. Rao. 1-5 [doi]
- PHMamba: Preheating State Space Models with Context-Augmented Features for Medical Image Segmentation. Nuo Chen, Shaoyu Wang, Ran Lu, Wenxuan Li, Xiujin Shi. 1-5 [doi]
- Deeply Coupling EEG Signals and Eye Movements for Multi-Modal and Region-Aware Emotion Recognition. Yuepeng Chen, Yue Gao 0012, Xiaoling Fu, Hua He, Tianxiong Ouyang, Songling Chen, Xiangling Fu. 1-5 [doi]
- Correlated Multiple IHC Virtual Staining for Breast Histopathological Images. Xianchao Guan, Zheng Zhang, Yifeng Wang, Ye Zhang, DanLing Jiang, Yongbing Zhang. 1-5 [doi]
- Efficient Prototypical Classifier for Class-Incremental Learning. Wei Zhang, Jingyang Qiao, Yuan Xie, Zhizhong Zhang 0001, Xin Tan. 1-5 [doi]
- Bridging the Fairness Gap: Enhancing Pre-trained Models with LLM-Generated Sentences. Liu Yu 0001, Ludie Guo, Ping Kuang, Fan Zhou 0002. 1-5 [doi]
- INN-based Secure Steganography Using Lost Information as Adversarial Perturbations. Fei Shang, Weixiang Zhao, Xiangui Kang, Z. Jane Wang 0001. 1-5 [doi]
- Multiple Choice Learning for Efficient Speech Separation with Many Speakers. David Perera, François Derrida, Théo Mariotte, Gaël Richard, Slim Essid. 1-5 [doi]
- Online Learning With Non-convex Losses: New Condition To Achieve Small Dynamic Regret. Sumit Sah, B. N. Bharath 0001. 1-5 [doi]
- Accurate Hardware Trojan Detection for SGIN Device: A Prompt-Tuning and LangChain Approach. Duo Zhang, Ming Mao, Yunchuan Guo, FengHua Li, Daiyong Quan. 1-5 [doi]
- Optimizing Speech Multi-View Feature Fusion through Conditional Computation. Weiqiao Shan, Yuhao Zhang, Yuchen Han, Bei Li, Xiaofeng Zhao, Yuang Li, Min Zhang, Hao Yang, Tong Xiao, Jingbo Zhu. 1-5 [doi]
- Enhancing Stutter Detection using Long-Term Average Spectrum Values. Vamshiraghusimha Narasinga, Priyanka Kommagouni, Sridhar Vanga, Kowshik Siva Sai Motepalli, Sai Akarsh C, Purva Barche, Anil Vuppala. 1-5 [doi]
- Auxiliary Tasks Benefit Skeleton-based Action Recognition. Yuheng Yang, Haipeng Chen 0002. 1-5 [doi]
- Heart Sounds for High Blood Pressure Prediction. Erika Bondareva, Jing Han 0010, Katarzyna Szczurek, Dawid Szczepanek, Tomasz Jadczyk, Cecilia Mascolo. 1-5 [doi]
- DiffETM: Diffusion Process Enhanced Embedded Topic Model. Wei Shao 0009, Mingyang Liu, Linqi Song. 1-5 [doi]
- Investigating the Sensitivity of Pre-trained Audio Embeddings to Common Effects. Victor Deng, Changhong Wang, Gaël Richard, Brian McFee. 1-5 [doi]
- Cognitive MIMO Radar Beamforming for Target Tracking Using a BCRB-based Criterion. Helin Sun, Joseph Tabrikian, Hagit Messer, Hongyuan Gao. 1-5 [doi]
- Domain-Incremental Learning for Audio Classification. Manjunath Mulimani, Annamaria Mesaros. 1-5 [doi]
- Adaptive Central Frequencies Locally Competitive Algorithm for Speech. Soufiyan Bahadi, Eric Plourde, Jean Rouat. 1-5 [doi]
- Rethinking Mean Opinion Scores in Speech Quality Assessment: Score Aggregation through Quantized Distribution Fitting. Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko. 1-5 [doi]
- Co-training with Progressive Distribution Alignment and Uncertainty-Interactive Relabeling for Semi-Supervised Domain Adaptive Semantic Segmentation. Ruiguo Yu, Yida Wang, Xuewei Li 0001, Xuzhou Fu, Zijian Zhang, Yuan Tian, Jie Gao 0008. 1-5 [doi]
- High-Resolution Speech Restoration with Latent Diffusion Model. Tushar Dhyani, Florian Lux, Michele Mancusi, Giorgio Fabbro, Fritz Hohl, Ngoc Thang Vu. 1-5 [doi]
- Towards Efficient Deep Hashing Retrieval: Condensing Your Data via Feature-Embedding Matching. Tao Feng, Jie Zhang, Huashan Liu, Zhijie Wang, Shengyuan Pang. 1-5 [doi]
- How much to Dereverberate? Low-Latency Single-Channel Speech Enhancement in Distant Microphone Scenarios. Satvik Venkatesh, Philip Coleman, Arthur Benilov, Simon Brown, Selim Sheta, Frederic Roskam. 1-5 [doi]
- Multi-Modal Medical Image Fusion via 3D Manifold Fitting and Dual-Domain Cross-Attention. Zeyu Wang, Jiayu Wang, Haiyu Song, Jiawei Feng, Haoran Duan. 1-5 [doi]
- SiMBA-TS: Simplified Channel Mixing and Mamba for Long-term Time Series Forecasting. Badri Narayana Patro, Vijay Srinivas Agneeswaran. 1-5 [doi]
- Noise-Robust Speech Emotion Recognition Using Shared Self-Supervised Representations with Integrated Speech Enhancement. Jing-Tong Tzeng, Seong-Gyun Leem, Ali N. Salman, Chi-Chun Lee, Carlos Busso. 1-5 [doi]
- Exploiting Robust Model Watermarking Against the Model Fine-Tuning Attack via Flat Minima Aware Optimizers. Dongdong Lin, Yue Li, Bin Li 0011, Jiwu Huang. 1-5 [doi]
- Training-free Adapter for Multi-Modal Image Matching for All-Day Visual Place Recognition. Anuradha Uggi, Sumohana S. Channappayya. 1-5 [doi]
- Large Multimodal Model is a Better Comparator on Facial Beauty Prediction. Zhenyou Liu, Xuefeng Liang, Jian Lin. 1-5 [doi]
- Exploring the Robustness of In-Context Learning with Noisy Labels. Chen Cheng, Xinzhi Yu, Haodong Wen, Jingsong Sun, Guanzhang Yue, Yihao Zhang, Zeming Wei. 1-5 [doi]
- HyperDiff: Masked Diffusion Model with High-efficient Transformer for Hyperspectral Image Cross-Scene Classification. Pei Zhang, Dong Wang, Chanyue Wu, Jing Yang, Lei Kang, Zongwen Bai, Ying Li, Qiang Shen. 1-5 [doi]
- BiCG: Binaural Cue Generation from Unified HRTF Datasets. Xikun Lu, Yilei Wang, Jinqiu Sang, Chengshi Zheng. 1-5 [doi]
- ArtTwin: A Novel Concept of Developing Digital Twin of Human Arterial System. Akasapu Hemanthika, Dinu S. Chandran, Sandeep Kumar, Sitikantha Roy. 1-5 [doi]
- Re-Evaluating Privacy in Centralized and Decentralized Learning: An Information-Theoretical and Empirical Study. Changlong Ji, Richard Heusdens, Stephane Maag, Qiongxiu Li. 1-5 [doi]
- Towards Fully Test-Time Adaptation via Variance Balancing and Semantic Augmentation. Houcheng Su, Bingli Wang, Daixian Liu, Jiao Li, Chen-Bin Feng, Chi-Man Vong. 1-5 [doi]
- Explainable Adversarial Attacks on Coarse-to-Fine Classifiers. Akram Heidarizadeh, Connor Hatfield, Lorenzo Lazzarotto, HanQin Cai, George K. Atia. 1-5 [doi]
- Deep Model Pruning without Finetuning for Few Category Datasets. Yuchen Huang, Jie Xie, Cheng Wu. 1-5 [doi]
- FARE: A Deep Learning-Based Framework for Radar-Based Face Recognition and Out-of-Distribution Detection. Sabri Mustafa Kahya, Boran Hamdi Sivrikaya, Muhammet Sami Yavuz, Eckehard G. Steinbach. 1-5 [doi]
- Audio Codec Augmentation for Robust Collaborative Watermarking of Speech Synthesis. Lauri Juvela, Xin Wang. 1-5 [doi]
- SoundTRC: DNN-based Acoustic Target Region Control. Yuhang He, Andrew Markham, Okan Köpüklü. 1-5 [doi]
- Node Selection for Physical Layer Security Using Graph-Based Vertex-Frequency Representation. Venugopalachary Kotha, Vijay Kumar Chakka, Shaik Basheeruddin Shah, Nazar T. Ali. 1-5 [doi]
- Colored Point Cloud-based Mesh Registration for Enhancing Inter-frame Coding of Texture Video in V-DMC. Jianxiang Sun, Wenjie Zou, Jiapan Zhao, FuZheng Yang 0001. 1-5 [doi]
- Big-Moe: Bypassing Isolated Gating For Generalized Multimodal Face Anti-Spoofing. Yingjie Ma, Zitong Yu, Xun Lin, Weicheng Xie, LinLin Shen. 1-5 [doi]
- Stochastic-Aware Mamba Diffusion for Pedestrian Trajectory Prediction. Ziyang Ren, Ping Wei 0001, Haowen Tang, Huan Li, Jin Yang, Jialu Qin. 1-5 [doi]
- Distill To Detect: Amplifying Anomalies in Backdoor Models through Knowledge Distillation. Chang Hu, Xuyang Teng, Wenpeng Xing, Han Chen, Chenhao Ye, Meng Han. 1-5 [doi]
- ALIC: Adaptive Fusion Entropy Model for Learned Image Compression. Lingxue Li, Meiqin Liu 0002, Yifan Zhang, Qi Tang, Chao Yao, Yao Zhao 0001. 1-5 [doi]
- Improved Bounds For Online Convex Optimization. Tanvi S. Nayak, B. N. Bharath 0001. 1-5 [doi]
- PDSeg: Patch-Wise Distillation and Controllable Image Generation for Weakly-Supervised Histopathology Tissue Segmentation. Wei-Hua Li, Yu-Hsing Hsieh, Huei-Fang Yang, Chu-Song Chen. 1-5 [doi]
- DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech. Xin Qi, Ruibo Fu, Zhengqi Wen, Tao Wang 0074, Chunyu Qiang, Jianhua Tao 0001, Chenxing Li, Yi Lu, Shuchen Shi, Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Yukun Liu, Xuefei Liu, Guanjun Li. 1-5 [doi]
- Aligned Contrastive Learning for Text-to-Music Retrieval. Tatsuya Komatsu, Hokuto Munakata, Takuya Hasumi, Yusuke Fujita. 1-5 [doi]
- Hyperbolic Distance Based on EMD and Diffusion for Hyperspectral Imaging. Elad Lavi, Amir Bourvine, Ya-Wei Eileen Lin, Ronen Talmon. 1-5 [doi]
- Integrated Global-Local Gaussian Attention for Image Compression. Atefeh Khoshkhahtinat, Piyush M. Mehta. 1-5 [doi]
- VoxVietnam: a Large-Scale Multi-Genre Dataset for Vietnamese Speaker Recognition. Hoang Long Vu, Phuong Tuan Dat, Pham Thao Nhi, Nguyen Song Hao, Nguyen Thi Thu Trang. 1-5 [doi]
- Robust Multi-task Adversarial Attacks Using Min-max Optimization. Jiacheng Guo, Lei Li, Haochen Yang, Baocheng Geng, Hongkai Yu, Minghai Qin, Tianyun Zhang. 1-5 [doi]
- Col-OLHTR: A Novel Framework for Multimodal Online Handwritten Text Recognition. Chenyu Liu, Jinshui Hu, Baocai Yin, Jia Pan, Bing Yin, Jun Du, Qingfeng Liu. 1-5 [doi]
- EGAS: Enhanced Geometry-aware 3D Asset Generation Using Gaussian Splatting. Shengjie Hu, Xiaogang Zhang, Hua Chen 0008, Wenbin Yan. 1-5 [doi]
- Smooth-Foley: Creating Continuous Sound for Video-to-Audio Generation Under Semantic Guidance. Yaoyun Zhang, Xuenan Xu, Mengyue Wu. 1-5 [doi]
- Partially Observable Contextual Bandits With Linear Payoffs. Sihan Zeng, Sujay Bhatt, Alec Koppel, Sumitra Ganesh. 1-5 [doi]
- TSLA: A Multi-Task Time Series Language Model. Liri Fang, Yuncong Chen, Wenchao Yu, Yanchi Liu, Lu-An Tang, Vetle I. Torvik, Haifeng Chen. 1-5 [doi]
- FairAdapter: Detecting AI-generated Images with Improved Fairness. Feng Ding 0007, Jun Zhang, Xinan He, Jianfeng Xu. 1-5 [doi]
- SVRM: Composing Various Network Service Fuzzing Corpus with One Single Model. Wenfeng Lin, Zhiyuan Jiang, Fangliang Xu, Yunfei Su, Zhiwei Li, Lingchu Mao, Chaojing Tang. 1-5 [doi]
- A Novel Compressive Compound Word Encoding and Independent Word Attention for Symbolic Music Generation. Lianyu Zhou, Liang Yin, Yukun Qian. 1-5 [doi]
- MFDPonzi: Detecting Ethereum Ponzi Schemes Using Static Features from Novel Opcode Sequences. Longwei Cao, Jiwei Qin, Xuzi Zhang. 1-5 [doi]
- UPCS: Unbiased Persona Construction for Dialogue Generation. Kuiyun Chen, Yanbin Wei. 1-5 [doi]
- MHGNet: Multi-Heterogeneous Graph Neural Network for Traffic Prediction. Mei Wu, Yiqian Lin, Tianfan Jiang, Wenchao Weng. 1-5 [doi]
- Keeping the Best: The K-Best rule for Efficient Quickest Change Detection with Unknown Post-Change Distribution. James Zachary Hare, Lance M. Kaplan, Venugopal V. Veeravalli, Don Towsley. 1-5 [doi]
- Domain-Aware Knowledge Debiasing for Generalizable Video Understanding in CLIP. Qingmeng Zhu, Qihuan Wu, ZhiPeng Yu, Yi Li, Ziyin Gu, Hao He. 1-5 [doi]
- Generative Sensing: Pre-training LiDAR with Masked Autoencoders for Ultra-Frugal Perception. Sina Tayebati, Theja Tulabandhula, Amit Ranjan Trivedi. 1-5 [doi]
- Low-shot Image Classification Using Mixture of Experts. Zheng Zhang, Saket Sathe 0001. 1-5 [doi]
- A High-Precision Character Cartoon Style Transfer Method Based on VToonify and Diffusion Models. Weiting Wang, Weiqi Wang, Feilong Bao. 1-5 [doi]
- Diffusion-based Target Device Style Transfer for Robust Acoustic Scene Classification. Won-Gook Choi, Joon-Hyuk Chang. 1-5 [doi]
- Causality-Guided Context-Aware Multimodal Public Speaking Anxiety Detection for Out-of-Distribution Generalization. Tingting Zhang, Jiachen Tan, Zihua Xiong, Bin Wu 0001, Chunping Zheng. 1-5 [doi]
- Dual Path Unsupervised Real Image Denoising. Yao Li, Zhengjun Zha. 1-5 [doi]
- DDA: Distillation-Driven Acceleration of the Reverse Diffusion Process for Stochastic Multi-Ship Trajectory Prediction. Kun Ma, Qilong Han, Jingzheng Yao, Changmao Wu, Yuntao Zhang. 1-5 [doi]
- Predicting Compact Phrasal Rewrites with Large Language Models for ASR Post Editing. Hao Zhang, Felix Stahlberg, Shankar Kumar. 1-5 [doi]
- DPM-LVSN: A Diffusion Probabilistic Model-based Left Ventricular Segmentation Network. Jia Luo, Yuhao Zhong, Yi Ding, Chenghao Zhou, Minghui Pang. 1-5 [doi]
- MARS6: A Small and Robust Hierarchical-Codec Text-to-Speech Model. Matthew Baas, Pieter Scholtz, Arnav Mehta, Elliott Dyson, Akshat Prakash, Herman Kamper. 1-5 [doi]
- Unsupervised Learning for Gain-Phase Impairment Calibration in ISAC Systems. José Miguel Mateos-Ramos, Christian Häger, Musa Furkan Keskin, Luc Le Magoarou, Henk Wymeersch. 1-5 [doi]
- SCS: Spatially Consistent Self-Supervised approach for One-Shot Anatomical Landmark Detection. Lu Han, Boyu Chen, Zherui Zhang, Li Guo 0004, Shibiao Xu. 1-5 [doi]
- Diffusion Models are Good Unsupervised Class-agnostic Shape Part SegmentatorsZhongbin Jiang, Tianhao Shi, Hao Gao 0005, Jun Liu 0036, Ye Liu 0005. 1-5 [doi]
- M2PAIR: A High-Quality Acoustic Impulse Response Computation ModelZhiyu Li, Xinpei Zhao, Jing Wang 0037, Xinyuan Qian 0001, Xiang Xie. 1-5 [doi]
- Graph Signal Reconstruction via Koopman AutoencoderSivaram Krishnan, Jihong Park, Jinho Choi 0001. 1-5 [doi]
- LLFA: Fusing Global Illumination and Local Priors for Low-Light Face Image Enhancement with AdaptorZiqian Shao, Tao Wang 0052, Kaihao Zhang, Danhuai Zhao, Tong Lu. 1-5 [doi]
- Redefining Well Exposedness for Locally Adaptive Multi-Exposure FusionPrince Arya, Saurabh Kumar, Ashish Agarwal, Nutan Yenneti, Narasimha Pai. 1-5 [doi]
- Dual-Frequency Spatio-Temporal Phase UnwrappingShuo Du, Qin Zou 0001, Chi Chen 0002, Bisheng Yang. 1-5 [doi]
- DRSFANet: Dual-Path CNN with Residual and Frequency Attention for Image DenoisingManish Kumar, Suman Kumar Maji, Anirban Saha. 1-5 [doi]
- UAV-Mounted SIM: A Hybrid Optical-Electronic Neural Network for DoA EstimationShining Lin, Jiancheng An, Lu Gan 0003, Mérouane Debbah. 1-5 [doi]
- Quantum Multimodal Contrastive Learning FrameworkChi-Sheng Chen, Aidan Hung-Wen Tsai, Sheng-Chieh Huang. 1-5 [doi]
- CMGait: Enhancing Cross-Modality Gait Recognition between LiDAR and RGB through Contrastive Identity-consistent Feature AggregationYubo Wang, Bin Liu 0016, Zhiwei Zhao, Jixiang Niu, Qi Chu 0001, Nenghai Yu. 1-5 [doi]
- Speech Recognition Rescoring with Large Speech-Text Foundation ModelsPrashanth Gurunath Shivakumar, Jari Kolehmainen, Aditya Gourav, Yi Gu, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko. 1-5 [doi]
- Perceptual Audio Coding: A 40-Year Historical PerspectiveJürgen Herre, Schuyler Quackenbush, Minje Kim, Jan Skoglund. 1-5 [doi]
- Microtitre Plate Image Augmentation with Generative Adversarial NetworksRu Li 0002, Tingting Chai, Samaneh Kouchaki, David A. Clifton, Yang Yang 0125. 1-5 [doi]
- Identification and Correction of Permutation Errors in Compressed Sensing-Based Group TestingShuvayan Banerjee, Sudhansh Peddabomma, Radhendushka Srivastava, James Saunderson, Ajit Rajwade 0001. 1-5 [doi]
- Transfer Learning for Covert Speech Classification Using EEG Hilbert Envelope and Temporal Fine StructureSaravanakumar Duraisamy, Mateusz Dubiel, Maurice Rekrut, Luis A. Leiva. 1-5 [doi]
- Contrastive Pre-Training and Post-Tuning for Heterogeneous Graph LearningYulan Hu, Sheng Ouyang, Zhirui Yang, Yong Liu. 1-5 [doi]
- FeedbackFuzz: Fuzzing Processors via Intricate Program Generation with Feedback EngineJiashun Wang, Baojiang Cui, Renhai Dong, Rundi Zhai. 1-5 [doi]
- WTU-EVAL: A Whether-or-Not Tool Usage Evaluation Benchmark for Large Language ModelsJian Liu 0032, Kangyun Ning, Yisong Su, Wenjuan Han, Jinan Xu, Yuanzhe Zhang. 1-5 [doi]
- Zero-shot Micro-video Classification with Dual Alignment Topic ModelJialong Wang, Shilong Zhang, Zhiguo Gong. 1-5 [doi]
- TriDE-Net: Triple-Densely Extraction Network for Precise Skin Lesion SegmentationHuan Wan, Taona Deng, Wujian Xu, Xin Wei 0002, Jinshan Zeng. 1-5 [doi]
- Minimalistic Video Saliency Prediction via Efficient Decoder & Spatio Temporal Action CuesRohit Girmaji, Siddharth Jain, Bhav Beri, Sarthak Bansal, Vineet Gandhi. 1-5 [doi]
- Advancing Active Speaker Detection for Egocentric VideosJaesung Huh, Juan Azcarreta Ortiz, Anurag Kumar 0003, Ashutosh Pandey 0004, Ali Aroudi, Daniel D. E. Wong, Francesco Nesta, Buye Xu, Jacob Donley. 1-5 [doi]
- EPCPE: A Real-time End-to-End Pipeline for RGB-based Category-level 6D Pose EstimationXiaofeng Fan, Jie Guo, Shichao Kan, Yixiong Liang. 1-5 [doi]
- FUTGA-MIR: Enhancing Fine-grained and Temporally-aware Music Understanding with Music Information RetrievalJunda Wu, Zachary Novack, Amit Namburi, Hao-Wen Dong, Carol Chen, Jiaheng Dai, Julian J. McAuley. 1-5 [doi]
- Enhancing Vision-Language Tracking by Effectively Converting Textual Cues into Visual CuesXiaokun Feng, Dailing Zhang, Shiyu Hu, Xuchen Li, Meiqi Wu, Jing Zhang, Xiaotang Chen, Kaiqi Huang. 1-5 [doi]
- Extracting Sparse Specialist Models from Generalist ModelsTao Yu 0013, Xu Zhao 0013, Yongqi An, Guibo Zhu, Ming Tang 0001, Jinqiao Wang. 1-5 [doi]
- A Ranking Scheme for Trust Region Multi-agent Reinforcement LearningRuichen Gao, Yi Hu, Deqin Zheng, Mengxuan Shao, Haiqi Zhu, Chenyue Song, Wei Zhang, Feng Jiang. 1-5 [doi]
- Enhancing Session-Based Recommendation with Hypergraph Motifs and Contrastive LearningTingxuan Chen, Liu Yang, Zidong Wang 0005, Guohui Li, Jun Long. 1-5 [doi]
- Dual-Path Consistency Unsupervised Domain Adaptation for Nighttime Semantic SegmentationYuwu Lu, Jicong Lang, Meirong Ding. 1-5 [doi]
- Subjective Fidelity Assessment of Audio- and Video-Driven Talking Head Generation MethodsAnthony Trioux, Yusong Gao, Jiarun Song, Wenjie Wu, Faming Ma, Fuzheng Yang. 1-5 [doi]
- MST-HA: Multi-Modal Signal Fusion with Bayesian Optimization for Robust Industrial Robot Joint Health AssessmentHaoyu Wang 0011, Zilong Yin, Bin Chen, Xiyue Yan, Chenyu Zhou, Beibei Zhang, Xinyuan Li, Guangmeng Xue, Haichao Xu. 1-5 [doi]
- Frank-Wolfe Method with Proximal Regularization for Constrained Federated Learning with Non-iid DataRobin Francis, Sundeep Prabhakar Chepuri. 1-5 [doi]
- Towards Green VAE: A Light Pixel-weighting Technique to Enhance Variational AutoEncoderCheng Zhong, Junlin Wu, Ziming Feng, Boan Chen, Junchi Yan. 1-5 [doi]
- MusicGen-Stem: Multi-stem music generation and edition through autoregressive modelingSimon Rouard, Robin San Roman, Yossi Adi, Axel Roebel. 1-5 [doi]
- Hybrid Feature Global Attention Network for Noisy-reverberant Speech EnhancementZehua Zhang, Shiyun Xu, Yinghan Cao, Changjun He. 1-5 [doi]
- SEAL: Speaker Error Correction using Acoustic-conditioned Large Language ModelsAnurag Kumar, Rohit Paturi, Amber Afshan, Sundararajan Srinivasan. 1-5 [doi]
- A Multi-Label EEG Dataset for Mental Attention State Classification in Online LearningHuan Liu 0012, Yuzhe Zhang 0003, Guanjian Liu, Xinxin Du, Haochong Wang, Dalin Zhang 0001. 1-5 [doi]
- Latent Space Score-based Diffusion Model for Probabilistic Multivariate Time Series ImputationGuojun Liang, Najmeh Abiri, Atiye Sadat Hashemi, Jens Lundström, Stefan Byttner, Prayag Tiwari. 1-5 [doi]
- GPPT: Gaussian Process-infused Prompt Tuning for Vision-language ModelsShijing Si, Haixia Sun 0001, Jiawen Gu. 1-5 [doi]
- A Plug-and-Play Diffusion-Styled Conversion Model for Domain Discrepancies in Medical Image SegmentationDong Liu, Zhiyong Wang, Linlin Guo. 1-5 [doi]
- On the Design of Low-Rank Differential Beamformers with Nonuniform Linear Microphone ArraysHanchen Pei, Kang Chen, Gongping Huang, Jilu Jin, Jacob Benesty, Jingdong Chen. 1-5 [doi]
- LaTeXNet: A Specialized Model for Converting Visual Tables and Equations to LaTeX CodeRenqiu Xia, Hongbin Zhou, Ziming Feng, Huanxi Liu, Boan Chen, Bo Zhang, Junchi Yan. 1-5 [doi]
- A Scale-Adaptive and Background-Robust Method for Surface Defect DetectionJiahao Dong, Zuo Zuo, Zongze Wu 0001, Meiqin Liu. 1-5 [doi]
- Non-invasive Speaker-dependent Continuous Phoneme Recognition with a Radar-based Silent Speech InterfaceJoão Vítor Menezes, Christoph Wagner, Peter Steiner, Petr Schaffer, Dirk Plettemeier, Peter Birkholz. 1-5 [doi]
- Accelerating Computation for Large-Scale Wide-Band RF ImagingZiyu Zhou, Yiming Zhou, Wei Dai. 1-5 [doi]
- Mask-guided Multi-scale Spatial-Spectral Transformer for Snapshot Compressive ImagingHeyuan Yin, Jingxiang Yang, Jia Liu 0020, Liang Xiao 0001. 1-5 [doi]
- Improving Open-vocabulary Video Visual Relation Detection with Decomposed Prompt Learning and Relation AdjustmentMing Pei, Yi Tan 0001, Yanbin Hao, Hao Zhang 0047, Jinmeng Wu, Basura Fernando, Xun Yang 0001. 1-5 [doi]
- A Generalized Graph Signal Processing Framework for Multiple Hypothesis Testing over NetworksXingchao Jian, Martin Gölz, Feng Ji, Wee-Peng Tay, Abdelhak M. Zoubir. 1-5 [doi]
- Integrating Failures in Robot Skill Acquisition with Offline Action-Sequence Diffusion RLHecheng Wang, Lizhe Qi, Yunquan Sun. 1-5 [doi]
- Joint Energy-Based Optimization of Binary Offloading Decisions and Communication Resources in TDMA Systems, via Dynamic ProgrammingRuilin Ji, Sorina Dumitrescu. 1-5 [doi]
- Improving Sidescan Sonar Performance Using Array Upsampling Beamforming Synthetic ApertureWeibo Mao, Dongdong Zhao, Peng Chen, Shihui Liang, Yiran Li, Ronghua Liang. 1-5 [doi]
- Stream-TTS: A Low-Latency Text-to-Speech using Kolmogorov-Arnold Networks for Streaming Speech ApplicationsGiridhar Pamisetty, Riya Ann Easow, Kaustubh Gupta, K. Sri Rama Murty. 1-2 [doi]
- Dynamic SpikFormer: Low-Latency & Energy-Efficient Spiking Neural Networks with Dynamic Time Steps for Vision TransformersGourav Datta, Zeyu Liu 0003, Anni Li, Peter A. Beerel. 1-5 [doi]
- D2-MLP: Dynamic Decomposed MLP Mixer for Medical Image SegmentationJin Yang, Jing Yang, Xiaobing Yu, Peijie Qiu, Sunil Prajapat. 1-5 [doi]
- SSE: A Speaking Style Extractor Based on Fine-Grained Contrastive Learning between Speech and Descriptive TextZixing Zhang 0001, Yimeng Wu, Zhongren Dong, Wulong Xiang, Shengfan Shen, Björn W. Schuller. 1-5 [doi]
- Deep-Relative-Trust-Based Diffusion for Decentralized Deep LearningMuyun Li, Aaron Fainman, Stefan Vlaski. 1-5 [doi]
- Target Speaker ASR with WhisperAlexander Polok, Dominik Klement, Matthew Wiesner, Sanjeev Khudanpur, Jan Cernocký, Lukás Burget. 1-5 [doi]
- Filtering Resistant Large Language Model Watermarking via Style InjectionZhaojun Guo, Guobiao Li, Junqiang Huang, Xinpeng Zhang 0001, Zhenxing Qian, Sheng Li 0006. 1-5 [doi]
- Partial Inference in Structured PredictionChuyang Ke, Deepak Maurya, Jean Honorio. 1-5 [doi]
- ATP-TTS: Adaptive Thresholding Pseudo-Labeling for Low-Resource Multi-Speaker Text-to-SpeechFeng Li, Shen Chen, Hanjin Yang, Shupei Yuan. 1-5 [doi]
- AdaBoost-Based Channel Estimation in One-Bit Millimeter-Wave MIMOMajdoddin Esfandiari, Petteri Pulkkinen, Sergiy A. Vorobyov, Visa Koivunen. 1-5 [doi]
- CGDD: Contrastive Gaussian-Dirac Diffusion ModelChih-Chun Chen, Hsin-Yi Lin, Jen-Tzung Chien. 1-5 [doi]
- Improving Zero-Shot Chinese-English Code-Switching ASR with kNN-CTC and Gated Monolingual DatastoresJiaming Zhou, Shiwan Zhao, Hui Wang 0075, Tian-Hao Zhang, Haoqin Sun, Xuechen Wang, Yong Qin. 1-5 [doi]
- Panorama: An enabling technology for HearablesQiyu Rao, Zdenka Babic, Scott C. Douglas, Danilo P. Mandic. 1-5 [doi]
- ReTD: Reconstruction-Based Traceability Detection for Generated ImagesWeizhuo Chen, Fangfang Yuan, Cong Cao 0001, Kun Peng, Dakui Wang, Yanbing Liu. 1-5 [doi]
- Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging ModuleZhongjian Cui, Chenrui Cui, Tianrui Wang, Mengnan He, Hao Shi, Meng Ge, Caixia Gong, Longbiao Wang, Jianwu Dang 0001. 1-5 [doi]
- Multivariate Time Series Data Mining for Failure Prediction & Root Cause AnalysisNaman Agarwal, Gagan Raj Gupta, Vishwesh Jatala, Aniket Saha. 1-5 [doi]
- Unveiling Local Well-posedness Influence for Cross-modal Person Re-IdentificationYumeng Yang, Guannan Dong, Aichun Zhu, Mingcheng Ni, Yifeng Li 0002. 1-5 [doi]
- DuCol: Text-Tag Adaptive Colorization of Dual-Character Line ArtJun Liang 0002, Rui Luo, Yang Peng, Hai Su. 1-5 [doi]
- Enhanced Loudspeaker Membrane Excursion Control Method Using Low Latency Distortion Prediction and Efficient LSTM NetworksRishabh Gupta, Sandeep Agri, Yughendaran Palanivel, Omsrinath Chelamkuri, Raj Narayana Gadde. 1-5 [doi]
- Collaborative Dual-Branch Spatial-Frequency Enhancement Network for Low-Light ImagesTao He, Tiecheng Song, Yin Liu, Feng Yang, Ruiyuan Chen, Zhixin Li. 1-5 [doi]
- Relation-aware Semantic Alignment Network for Text-to-Image Person RetrievalYong Wu, Rongxi Zhou, Hongchao Li, Ze Zhou, Feifei Wei, Min Li, Guodui He. 1-5 [doi]
- Past, Present, and Future of Spatial Audio and Room AcousticsShoichi Koyama, Enzo De Sena, Prasanga N. Samarasinghe, Mark R. P. Thomas, Fabio Antonacci. 1-5 [doi]
- Punctuation Restoration: A Case Study of BERT-Based Models' Task-Specific ExcellenceQishuai Zhong, Aixin Sun. 1-5 [doi]
- Spiking Transformer with Spatial-Temporal Spiking Self-AttentionZhaokun Zhou, Jun Niu, Yang Zhang, Li Yuan 0007, Yuesheng Zhu. 1-5 [doi]
- FreeLesion: Synthetic Image-Mask Pairs for Fundus Lesion Segmentation via Curriculum Learning and Feature-Loss Guided FilteringPeilei Fu, Song Guo. 1-5 [doi]
- M2R-Whisper: Multi-stage and Multi-scale Retrieval Augmentation for Enhancing WhisperJiaming Zhou, Shiwan Zhao, Jiabei He 0001, Hui Wang 0075, Wenjia Zeng, Yong Chen, Haoqin Sun, Aobo Kong, Yong Qin. 1-5 [doi]
- U-SAM: Upgrade Segment Anything Model With Semantic-Aware and Memory-EfficientXiaofeng Jin, Jie Hu 0018, Jianghang Lin, Shengchuan Zhang, Liujuan Cao. 1-5 [doi]
- Speaker Embedding Informed Audiovisual Active Speaker Detection for Egocentric RecordingsJason Clarke, Yoshihiko Gotoh, Stefan Goetze. 1-5 [doi]
- Adapt and Feature Translation for Class-Incremental Learning with Pre-Trained ModelsHongfeng Li, GeMing Xia, Yuze Zhang, Hongcheng Li, Hongwei Huang, Jiawen Wu. 1-5 [doi]
- Dynamic Routing and Calibration for Few-Shot Object DetectionJiaQi Wu, Jie Lei 0002, Hao Tian, Xiaoqiang Liu, Zunlei Feng, Ronghua Liang. 1-5 [doi]
- SoundCollage: Automated Discovery of New Classes in Audio DatasetsRyuhaerang Choi, Soumyajit Chatterjee, Dimitris Spathis, Sung-Ju Lee, Fahim Kawsar, Mohammad Malekzadeh. 1-5 [doi]
- Preconditioned Sharpness-Aware Minimization: Unifying Analysis and a Novel Learning AlgorithmYilang Zhang, Bingcong Li, Georgios B. Giannakis. 1-5 [doi]
- Granularity-Aware Contrastive Learning for Fine-Grained Action RecognitionHailun Zhang, Xinrui Wang, Qijun Zhao. 1-5 [doi]
- Deep Learning-Based Perceptual Vibrotactile Codec with Rate ScalabilityLars Nockenberg, Wenxuan Wei, Mariam Navai, Eckehard G. Steinbach. 1-5 [doi]
- AudioCache: Accelerate Audio Generation With Training-Free Layer CachingQingyang Shi, Zhicheng Du, Jiasheng Lu, Yingshan Liang, Xinyu Zhang, Yiran Wang, Jing Peng, Kehong Yuan. 1-5 [doi]
- DeformAvatar: Point-Based Human Avatar Re-targeting and RenderingRenyi Zhan, Zhi-Qi Cheng, Junyao Chen, Xiaojiang Peng. 1-5 [doi]
- MpoxMamba: A Grouped Mamba-based Lightweight Hybrid Network for Mpox DetectionYubiao Yue, Jun Xue, Haihuang Liang, Zhenzhang Li, Yufeng Wang. 1-5 [doi]
- Do Less and Achieve More: Free Condition Video Outpainting with Diffusion ModelHaofan Huang, Yinlin Guo, Yening Lv, Sizhe Shan, Yan Zhang, Yuehai Wang. 1-5 [doi]
- Tracking Network Dynamics using Probabilistic State-Space ModelsVictor M. Tenorio, Elvin Isufi, Geert Leus, Antonio G. Marques. 1-5 [doi]
- Investigating Training Objectives for Generative Speech EnhancementJulius Richter, Danilo de Oliveira, Timo Gerkmann. 1-5 [doi]
- Multi-object detection and tracking algorithm for fry counting based on DV-YOLO and FryMOTHang Yuan, Zhibin Yu 0002, Qiusheng Li, Tianning Fu, Bing Zheng. 1-5 [doi]
- Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion ModelsSaksham Singh Kushwaha, Jianbo Ma, Mark R. P. Thomas, Yapeng Tian, Avery Bruni. 1-5 [doi]
- Explaining Representations in Correlation-based Deep Multiview Representation LearningMaurice Kuschel, Amr Alkhatib, Tanuj Hasija, Henrik Boström. 1-5 [doi]
- Chat-Driven 3D Human Pose and Shape Editing with Large Language ModelsFeng Zhou 0007, Chi Li, Ju Dai, Mengxiao Zhu 0004, YongMei Zhang, Yu-Kun Lai, Paul L. Rosin. 1-5 [doi]
- Multi-modal Dynamic Point Cloud Geometric Compression Based on Bidirectional Recurrent Scene FlowFangzhe Nan, Frederick W. B. Li, Zhuoyue Wang, Gary K. L. Tam, Zhaoyi Jiang, DongZheng DongZheng, Bailin Yang. 1-5 [doi]
- AdaptiveDrop: A Simple Adaptive Label Noise Filtering Scheme for Enhanced Self-supervised Speaker VerificationAbderrahim Fathan, Xiaolin Zhu, Jahangir Alam 0001. 1-5 [doi]
- Low-Rank Tensors for Multi-Dimensional Markov ModelsMadeline Navarro, Sergio Rozada, Antonio G. Marques, Santiago Segarra. 1-5 [doi]
- Explaining Speaker and Spoof Embeddings via ProbingXuechen Liu, Junichi Yamagishi, Md. Sahidullah, Tomi Kinnunen. 1-5 [doi]
- IEEE 802.11ad-Aided 5-D Sensing With a UAV Swarm in Urban EnvironmentAkanksha Sneh, Shobha Sundar Ram, Kumar Vijay Mishra. 1-5 [doi]
- KGD-GNN: A Knowledge-Guided Graph Neural Network for Myocardial Infarction Localization via 12-lead ECGLin Guo, Yingqi Wu, Nan Ma, Ying An. 1-5 [doi]
- Rethinking Dual-Stream Super-Resolution for Enhancing Remote Sensing Object DetectionAn Luo, Kai Hu, Kai Jiang. 1-5 [doi]
- Pyramid Attention Enhancement Network for Nighttime UAV TrackingXiaomin Huang, Zhenhua Wu, Ying Li 0017, Changjing Shang, Qiang Shen 0001. 1-5 [doi]
- ADC: Enhancing Function Calling Via Adversarial Datasets and Code Line-Level FeedbackWei Zhang, Yi Zhang, Li Zhu, Qianghuai Jia, Feijun Jiang, Hongcheng Guo, Zhoujun Li, Mengping Zhou. 1-5 [doi]
- Transducer-Llama: Integrating LLMs into Streamable Transducer-based Speech RecognitionKeqi Deng, Jinxi Guo, Yingyi Ma, Niko Moritz, Philip C. Woodland, Ozlem Kalinli, Mike Seltzer. 1-5 [doi]
- ZVEFusion: Zero-Shot Visual Enhancement Fusion for Infrared and Visible Images in Low LightDuo Liu, Yiqi Shi, Guoyin Zhang, Sizhao Li, Liguo Zhang. 1-5 [doi]
- Distributed IRSs Mitigate Spatial Wideband & Beam Split EffectsL. Yashvanth, Chandra R. Murthy, Bhaskar D. Rao. 1-5 [doi]
- Non-contact Quickest Abnormal Heart Rate Detection using MIMO RadarPeichao Wang, Qian He 0002. 1-5 [doi]
- Networked ISAC Beamforming Design with Capacity-Limited Fronthaul LinksKexin Zhang, Yanqing Xu, Tsung-Hui Chang. 1-5 [doi]
- UCIL: An Unsupervised Class Incremental Learning Approach for Sound Event DetectionYang Xiao, Rohan Kumar Das. 1-5 [doi]
- Spatio-Temporal Mixed Graph Neural Controlled Differential Equations with Adaptive Connection Sampling for Irregular Multivariate Time Series Anomaly DetectionXudong Jia, Wei Peng 0005, Chiran Shen, BaoKang Zhao, Peng Xun. 1-5 [doi]
- Variance-reduced Clipping for Non-convex OptimizationAmirhossein Reisizadeh, Haochuan Li, Subhro Das, Ali Jadbabaie. 1-5 [doi]
- A Cross-Modal Multi-Attitude Framework for the Generation of Space Target ISAR ImagesDerong Kong, Huaizhang Liao, Jingyuan Xia. 1-5 [doi]
- Towards Unbiased Evaluation of Time-series Anomaly DetectorDebarpan Bhattacharya, Sumanta Mukherjee, Chandramouli Kamanchi, Vijay Ekambaram, Arindam Jati, Pankaj Dayama 0001. 1-5 [doi]
- FDR Control for Complex-Valued Data with Application in Single Snapshot Multi-Source Detection and DOA EstimationFabian Scheidt, Jasin Machkour, Michael Muma. 1-5 [doi]
- Conformal Prediction for Manifold-based Source Localization with Gaussian ProcessesVadim Rozenfeld, Bracha Laufer-Goldshtein. 1-5 [doi]
- JSUnet: A New Hybrid U-shaped Network for Jamming SuppressionShuang Li, Ganggang Dong. 1-5 [doi]
- TIRPL: Tailored-Made Inverse Rendering for Point-Light ScenesZonglin Tian, Sicong Cheng, Junli Zhao, Fuqing Duan. 1-5 [doi]
- Bridging Modality Gap with Large Speech and Language Models for End-to-End Speech-to-Text TranslationWeitai Zhang, Simran Naagar, Zhongyi Ye, Peiwang Tang, Xinyuan Zhou, Junhua Liu, Lirong Dai 0001. 1-5 [doi]
- Span Attention for Entity-Consistent Task-Oriented Dialogue Response GenerationJiale Chen, Xuelian Dong, Wenxiu Xie, Tao Gong 0001, Fu Lee Wang, Tianyong Hao. 1-5 [doi]
- LLaVA-SG: Leveraging Scene Graphs as Visual Semantic Expression in Vision-Language ModelsJingyi Wang, Jianzhong Ju, Jian Luan 0001, Zhidong Deng. 1-5 [doi]
- Label Relationship Graph-Enhanced Class Hierarchy for Incremental Classification of Remote Sensing ImagesYang Chu 0001, Yuntao Qian. 1-5 [doi]
- Audio Decoding by Inverse Problem SolvingPedro J. Villasana T., Lars F. Villemoes, Janusz Klejsa, Per Hedelin. 1-5 [doi]
- V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified FlowJeongsoo Choi, Ji-Hoon Kim, Jinyu Li 0001, Joon Son Chung, Shujie Liu 0001. 1-5 [doi]
- LDG: Lightweight Deformable 3D Gaussians for Single-View Dynamic Scene ReconstructionYouhong Peng, Weixing Xie, Jinwen Li, Shaoqi Wu, Zefeng Wang, Bingbing Hu, Junfeng Yao. 1-5 [doi]
- Interactive Robot Action Replanning using Multimodal LLM Trained from Human Demonstration VideosChiori Hori, Motonari Kambara, Komei Sugiura, Kei Ota, Sameer Khurana, Siddarth Jain, Radu Corcodel, Devesh K. Jha, Diego Romeres, Jonathan Le Roux. 1-5 [doi]
- Learning Semantic Facial Descriptors for Accurate Face AnimationLei Zhu, Yuanqi Chen, XiaoHang Liu, Thomas H. Li, Ge Li. 1-5 [doi]
- Density-aware and Depth-aware Visual Representation for Zero-Shot Object CountingFang Nan, Feng Tian 0002, Ni Zhang, Nian Liu, Haonan Miao, Guang Dai, Mengmeng Wang 0005. 1-5 [doi]
- SelaFD: Seamless Adaptation of Vision Transformer Fine-tuning for Radar-based Human Activity RecognitionYijun Wang, Yong Wang, Chendong Xu, Shuai Yao 0002, Qisong Wu. 1-5 [doi]
- Translating Mental Imaginations into Characters with Codebooks and Dynamics-Enhanced DecodingJingyuan Li, Yansen Wang, Nie Lin, Dongsheng Li 0002. 1-5 [doi]
- A Multi-modal Approach to Dysarthria Detection and Severity Assessment Using Speech and Text InformationAnuprabha M, Krishna Gurugubelli, Kesavaraj V, Anil Kumar Vuppala. 1-5 [doi]
- BIAWDiff: Enhancing Low-Light Images with Bio-Inspired Attention and Wavelet DiffusionZeYu Li, Sheng Yang, Hanxiang Yang, Xiongxin Tang, Fengge Wu, Fanjiang Xu. 1-5 [doi]
- Cross-Talk Detection in the IVAS Stereo Codec Based on GCC-PHATVladimir Malenovsky, Tommy Vaillancourt, Milan Jelinek, Eleni Fotopoulou, Emmanuel Ravelli. 1-5 [doi]
- Achieving Robustness in Blind Modulo Analog-to-Digital ConversionAmir Weiss. 1-5 [doi]
- Autoregressive Density Estimation Transformers for Multivariate Time Series Anomaly DetectionMohammed Ayalew Belay, Adil Rasheed, Pierluigi Salvo Rossi. 1-5 [doi]
- FlowMAC: Conditional Flow Matching for Audio Coding at Low Bit RatesNicola Pia, Martin Strauss 0003, Markus Multrus, Bernd Edler. 1-5 [doi]
- Self-supervised Multimodal Speech Representations for the Assessment of Schizophrenia SymptomsGowtham Premananth, Carol Y. Espy-Wilson. 1-5 [doi]
- Found In The Distribution: Utilizing Latent Dirichlet Allocation Improves Long Context Comprehension of Large Language ModelsZhenyu Guan, Xun Liang 0001, Sensen Zhang. 1-5 [doi]
- SEF-PNet: Speaker Encoder-Free Personalized Speech Enhancement with Local and Global Contexts AggregationZiling Huang, Haixin Guan, Haoran Wei, Yanhua Long. 1-5 [doi]
- Easy, Interpretable, Effective: openSMILE for voice deepfake detectionOctavian Pascu, Dan Oneata, Horia Cucu, Nicolas M. Müller. 1-5 [doi]
- Geometric Feature-Driven Metric Learning for 3D Craniofacial SuperimpositionQingdong Long, Junli Zhao, Fuqing Duan, Chengyuan Wang, Xuesong Wang 0004, Lijie Geng, Zhenkuan Pan 0001, Mingquan Zhou. 1-5 [doi]
- COSMIC waveforms for Integrated Communication and ImagingMarco Manzoni, Francesco Linsalata, Maurizio Magarini, Stefano Tebaldini. 1-5 [doi]
- Speaking Without Sound: Multi-speaker Silent Speech Voicing with Facial Inputs OnlyJaejun Lee, Yoori Oh, Kyogu Lee. 1-5 [doi]
- GDDA: Semantic OOD Detection on Graphs under Covariate Shift via Score-Based Diffusion ModelsZhixia He, Chen Zhao 0010, Minglai Shao 0001, Yujie Lin, Dong Li 0034, Qin Tian. 1-5 [doi]
- Dual Multi-Scale GCN with Deformable Temporal Kernel for Skeleton-based Action RecognitionJianan Li 0003, Yangtao Zhou, Hua Chu, Han Wang, Zhifu Zhao, Fei Li, Qingshan Li. 1-5 [doi]
- Self-Supervised Localized Topology Consistency for Noise-Robust Hyperspectral Image ClassificationJie Wang, Liaoyuan Tang, Guanxiong He, Zhe Cao, Zheng Wang, Rong Wang. 1-5 [doi]
- Multi-layer Network Disintegration via Deep Reinforcement LearningZhenhua Liang, Xueqiong Li, Jun-Jie Huang, Nan Hu, Shaowu Yang, Hengzhu Liu. 1-5 [doi]
- TextureDiffusion: Target Prompt Disentangled Editing for Various Texture TransferZihan Su, Junhao Zhuang, Chun Yuan. 1-5 [doi]
- GaussianSlicer: Efficient Surface Reconstruction from Cross-sectional Slices with Gaussian SplattingYuhu Guo, Chenghao Qian, Yuhong Mo, Akkarit Sangpetch. 1-5 [doi]
- Integrating Audio Narrations to Strengthen Domain Generalization in Multimodal First-Person Action RecognitionCagri Gungor, Adriana Kovashka. 1-5 [doi]
- Robust Low-Light Human Pose Estimation through Illumination-Texture ModulationFeng Zhang, Ze Li, Xiatian Zhu, Lei Chen. 1-5 [doi]
- Knowledge Distillation for Image Restoration: Simultaneous Learning from Degraded and Clean ImagesYongheng Zhang 0003, Danfeng Yan. 1-5 [doi]
- Positive Enhanced Preference Alignment for Text-to-Image ModelsHaoyuan Sun, Bo Xia, Yifei Zhao, Yongzhe Chang, Xueqian Wang. 1-5 [doi]
- WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity VerificationJunzuo Zhou, Jiangyan Yi, Yong Ren, Jianhua Tao 0001, Tao Wang 0074, Chuyuan Zhang. 1-5 [doi]
- Enhancing Standard and Dialectal Frisian ASR: Multilingual Fine-tuning and Language Identification for Improved Low-resource PerformanceReihaneh Amooie, Wietse de Vries, Yun Hao, Jelske Dijkstra, Matt Coler, Martijn Wieling 0001. 1-5 [doi]
- MFT: Modal Fusion Transformer for Cross-Modal Fusion in 3D Object DetectionHaojie Cai, Dongfu Yin, Fei Yu, Siting Xiong. 1-5 [doi]
- MX-Font++: Mixture of Heterogeneous Aggregation Experts for Few-shot Font GenerationWeihang Wang 0011, Duolin Sun, Jielei Zhang, Longwen Gao. 1-5 [doi]
- Graph Topology Identification Based on Covariance MatchingYongsheng Han, Alberto Natali, Geert Leus. 1-5 [doi]
- From Voices to Beats: Enhancing Music Deepfake Detection by Identifying Forgeries in BackgroundZhaolin Wei, Dengpan Ye, Jiacheng Deng 0003, Yuhan Lin. 1-5 [doi]
- LINK: Adaptive Modality Interaction for Audio-Visual Video ParsingLangyu Wang, Bingke Zhu, Yingying Chen 0003, Jinqiao Wang. 1-5 [doi]
- An Intra- and Cross-frame Topological Consistency Scheme for Semi-supervised Atherosclerotic Coronary Plaque SegmentationZiheng Zhang, Zihan Li, Dandan Shan, Yuehui Qiu, Qingqi Hong, Qingqiang Wu 0001. 1-5 [doi]
- Revisiting Acoustic Features for Robust ASRMuhammad A. Shah, Bhiksha Raj. 1-5 [doi]
- Planetary gear vibration monitoring using synchronous demodulationRik Vaerenberg, Konstantinos Gryllias. 1-5 [doi]
- Diffusion-based Data Augmentation for Object Counting ProblemsZhen Wang, Yuelei Li, Jia Wan 0001, Nuno Vasconcelos. 1-5 [doi]
- Trainingless Adaptation of Pretrained Models for Environmental Sound ClassificationNoriyuki Tonami, Wataru Kohno, Keisuke Imoto, Yoshiyuki Yajima, Sakiko Mishima, Reishi Kondo, Tomoyuki Hino. 1-5 [doi]
- Black-Box Adversarial Defense Against Voice Conversion Using Latent Space PerturbationJie Gao, Haiyun Li, Zhisheng Zhang, Zhiyong Wu. 1-5 [doi]
- AGIAA-2K: A Fine-grained Dataset for Aesthetic and Alignment Evaluation of AI-Generated ImagesBo Hu 0008, Nanxiang Li, Lihuo He, Wen Lu, Leida Li, Xinbo Gao 0001. 1-5 [doi]
- SFFCE-CD: Spatial And Frequency Feature Cross Enhancement For Change DetectionYan Xing, Jiali Hu, Binbin Jiang, Qingyi Zhao, Longxi Feng, Rui Huang. 1-5 [doi]
- Simultaneous Diarization and Separation of Meetings through the Integration of Statistical Mixture ModelsTobias Cord-Landwehr, Christoph Boeddeker, Reinhold Haeb-Umbach. 1-5 [doi]
- LKConvPose: A Pose Estimation Model with Large Receptive FieldYing Huang 0003, Qiang Chen, Xiu-Xiu Zhan, Jianzhang Zhang, Chuang Liu 0001. 1-5 [doi]
- Joint Training Framework for Accent and Speech Recognition Based on Conformer Low-Rank AdaptationXuyi Zhuang, Yukun Qian, Shiyun Xu, Mingjiang Wang. 1-5 [doi]
- Extract Information from Hybrid Long Documents Leveraging LLMs: A Framework and DatasetChongjian Yue, Xinrun Xu, Xiaojun Ma, Lun Du, Zhiming Ding, Shi Han, Dongmei Zhang 0001, Qi Zhang 0066. 1-5 [doi]
- Enhancing Multimodal Sentiment Analysis for Missing Modality through Self-Distillation and Unified Modality Cross-AttentionYuzhe Weng, Haotian Wang, Tian Gao, Kewei Li, Shutong Niu, Jun Du. 1-5 [doi]
- Task Vector Arithmetic for Low-Resource ASRHaruki Nagasawa, Shinta Otake, Shinji Iwata. 1-5 [doi]
- A Counterfactual Ultrasound Anti-Interference Self-Supervised Network for B-mode Ultrasound Tongue ExtractionYan Jia 0001, Yuqing Cheng, Kele Xu, Yong Dou, Peng Qiao, Zhouyu He. 1-5 [doi]
- Hypothesis Clustering and Merging: Novel MultiTalker Speech Recognition with Speaker TokensYosuke Kashiwagi, Hayato Futami, Emiru Tsunoo, Siddhant Arora, Shinji Watanabe 0001. 1-5 [doi]
- Metadata-assisted Pose Correction for Immersive Audio Split RenderingRishabh Tyagi, Stefan Bruhn, Jeroen Breebaart. 1-5 [doi]
- Training Better Embedding With Perturbed Data Augmentation for Automatic Singing Quality AssessmentPo-Wei Chen, Von-Wun Soo. 1-5 [doi]
- HyperSMOTE: A Hypergraph-based Oversampling Approach for Imbalanced Node ClassificationsZiming Zhao 0010, Tiehua Zhang, Zijian Yi, Zhishu Shen. 1-5 [doi]
- GIST: Guided Interpretable Large Language Model Strategy Transfer for Multi-Task Reinforcement LearningBo Zhao, Zhuo Tang. 1-5 [doi]
- Plaintext-Free Deep Learning for Privacy-Preserving Medical Image Analysis through Frequency Information EmbeddingMengyu Sun, Ziyuan Yang, Maosong Ran, Zhiwen Wang, Hui Yu, Yi Zhang. 1-5 [doi]
- Self-Enhanced Reasoning Training: Activating Latent Reasoning in Small Models for Enhanced Reasoning DistillationYong Zhang, Bingyuan Zhang, Zhitao Li 0002, Ming Li, Ning Cheng 0001, Minchuan Chen, Tao Wei 0003, Jun Ma, Shaojun Wang, Jing Xiao. 1-5 [doi]
- A Domain Adversarial Learning Framework for Major Depression Disorder DiagnosisShaozhe Liu, Leike An, Ziyu Jia. 1-5 [doi]
- Intelligent Target Maneuverability in Presence of Tracking with Multiple RadarsBhavani Shankar Mysore Rama Rao, Jyoti Bhatia, Kunwar Pritiraj Rajput, Björn E. Ottersten. 1-5 [doi]
- What Are They Doing? Joint Audio-Speech Co-ReasoningYingzhi Wang 0002, Pooneh Mousavi, Artem Ploujnikov, Mirco Ravanelli. 1-5 [doi]
- ControlMol: Adding Substructure Control To Molecule Diffusion ModelsZhengyang Qi, Zijing Liu, Jiying Zhang, He Cao, XiaoHua Xu, Yu Li 0003. 1-5 [doi]
- LitePest: Real-Time and Efficient Detection of Agricultural Pests Using an Advanced Lightweight Deep Learning NetworkZhe Tang, Jiajia Lu, Wei Xiang, Wanyu Ling, Lingyan Zhang. 1-5 [doi]
- Learning-Aided Kalman Tracking in Biased Dynamic Systems: The Case of Cable-Driven Robots for SurgeryLinoy Ketashvili, Shachar Ashkenasy, Ilana Nisky, Nir Shlezinger. 1-5 [doi]
- Fine-grained Vital Sign Reconstruction through Machine Learning on Multi-channel Radar SignalsChangming Li, Cong Shi 0004, Athina P. Petropulu, Yingying Chen 0001. 1-5 [doi]
- STA-V2A: Video-to-Audio Generation with Semantic and Temporal AlignmentYong Ren, Chenxing Li, Manjie Xu, Wei Liang, Yu Gu, Rilin Chen, Dong Yu 0001. 1-5 [doi]
- MPAM-3DGS: Multi-Parametric Adversarial Manipulation for 3D Gaussian SplattingWenxiang Jiang 0002, Hanwei Zhang 0001, Weigang Wang, Zhongwen Guo, Tianao Zhang, Hao Wang 0003. 1-5 [doi]
- TextHair3D: Text-driven 3D Hair Editing with Generative PriorsXiaoxue Li, Huilong Pi, Yunchuan Qin, Ruihui Li, Kenli Li 0001. 1-5 [doi]
- Segmentation-Guided Sparse Transformer for Under-Display Camera Image RestorationJingyun Xue, Tao Wang, Pengwen Dai, Kaihao Zhang. 1-5 [doi]
- Multi-Stage Multimodal Distillation for Audio-Visual Speaker TrackingYidi Li, Wenkai Zhao, Zeyu Wang, Zhenhuan Xu, Bin Ren, Nicu Sebe. 1-5 [doi]
- Injecting Visual Features into Whisper for Parameter-Efficient Noise-Robust Audio-Visual Speech RecognitionZhao Yang, Yue Heng Yeo, Rui Jiang, Xiao Fu, Weiguang Chen, Wei Xi, Jizhong Zhao. 1-5 [doi]
- DGJA: Dependency Graph-enhanced Joint Attention Structure for Multimodal Sarcasm DetectionYiming Liu, Rui Song 0008, Lida Shi, Ling Gao, Hao Xu 0012. 1-5 [doi]
- High-Resolution Gait Micro-Doppler Synthesis from Videos Over Diverse TrajectoriesShubo Yang 0002, Soheil Hor, Jaeho Choi, Amin Arbabian. 1-5 [doi]
- Graph Refinement in Latent Space: A Hypergraph Convolution for Underwater Object DetectionMeghna Kapoor, Badri Narayan Subudhi, Ankur Bansal. 1-5 [doi]
- Camouflaged Object Detection via Neural Architecture SearchXin Li, Keren Fu, Qijun Zhao. 1-5 [doi]
- RIS-Enabled Self-Interference Elimination in Monostatic Full-Duplex DFRC SystemsLinlong Wu, Zichao Xiao, Bhavani Shankar M. R., Björn E. Ottersten. 1-5 [doi]
- TIDE-Net: A Physics-Based Graph Model for Predicting Tropical Cyclone Impacts on Estuarine SystemsGaowei Zhang, Wei Wang 0353, Yi Wang 0013. 1-5 [doi]
- Outage Analysis of IRS-Aided Wireless Energy Transfer Under Correlation and Imperfect CSIChandan Kumar, Salil Kashyap. 1-5 [doi]
- Generalizable Real-time Accelerated Dynamic MRISilpa Babu, Wahidul Alam, Rushdi Zahid Rusho, Sajan Goud Lingala, Namrata Vaswani. 1-5 [doi]
- Global Static Pruning via Adaptive Sample Complexity AwarenessMing Ma, Yue Wang, Taoli Du, Qinxu Gao, Ying Wang 0024, Wenhui Li 0002. 1-5 [doi]
- Debiased Training For Semi-supervised Sound Event DetectionShengchang Xiao, Xueshuai Zhang, Pengyuan Zhang, Yonghong Yan 0002. 1-5 [doi]
- Towards An Integrated Approach for Expressive Piano Performance Synthesis from Music ScoresJingjing Tang, Erica Cooper, Xin Wang 0037, Junichi Yamagishi, György Fazekas. 1-5 [doi]
- Compact Neural TTS Voices for AccessibilityKunal Jain, Eoin Murphy, Deepanshu Gupta, Jonathan Dyke, Saumya Shah, Vasilieios Tsiaras, Petko Nikolov Petkov, Alistair Conkie. 1-5 [doi]
- Near-Field FMCW SAR Imaging With Fast BPAZhengguang Xu, Xiaoxu Liu. 1-5 [doi]
- OSR: Toward Developing Efficient Federated Learning-based Human Activity Recognition using Optimal Server RepresentationsEnsieh Khazaei, Bilal Taha, Alireza Esmaeilzehi, Dimitrios Hatzinakos. 1-5 [doi]
- Voxel-sensitive Wavelet-based Approach for Structural Distortion in Low-Dose CT ImagesNaragoni Saidulu, Vinit Kumar, Priya Ranjan Muduli. 1-5 [doi]
- PriorSinger: Singing Voice Synthesis Model with Prior Condition Cross AttentionZehua Zhang, Bosong Yan, Yinghan Cao, Mingjiang Wang. 1-5 [doi]
- A Model Stealing Attack Against Multi-Exit NetworksPan Li, Peizhuo Lv, Kai Chen, Shengzhi Zhang, Yuling Cai, Fan Xiang. 1-5 [doi]
- Frequency Domain Information Integrated Network for Low-Light Image EnhancementNa Li, Xi Luo, Dunlu Peng, Zied Bouraoui. 1-5 [doi]
- Transforming Classification with Federated Learning on Blockchain: A Unique Model Integration ApproachZhihao Hao, Bob Zhang 0001, Haisheng Li 0002. 1-5 [doi]
- M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart GlassesYufeng Yang, Desh Raj, Ju Lin, Niko Moritz, Junteng Jia, Gil Keren, Egor Lakomkin, Yiteng Huang, Jacob Donley, Jay Mahadeokar, Ozlem Kalinli. 1-5 [doi]
- Multi-period Normalization for Long-term Time Series ForecastingJiayu Zhang, Yuantong Dong. 1-5 [doi]
- Situation-aware Space-time Waveform Design for Automotive MIMO RadarsEdoardo Focante, Nitin Jonathan Myers, Geethu Joseph, Ashish Pandharipande. 1-5 [doi]
- Generalizable Articulated Object Perception with SuperpointsQiaojun Yu, Ce Hao, Xibin Yuan, Li Zhang 0104, Liu Liu 0012, Yukang Huo, Rohit Agarwal, Cewu Lu. 1-5 [doi]
- SML: A Backdoor Defense for Non-Intrusive Speech Quality Assessment via Semi-Supervised and Multi-Task LearningYing Ren, Wenjie Zhang, Jiahong Ye, Jie Li, Diqun Yan, Bin Ma 0003. 1-5 [doi]
- FA-GAN: Defense Against Adversarial Attacks in Automatic Modulation RecognitionShilong Zhang, Yu Song, Shubin Wang. 1-5 [doi]
- FedFLD: Heterogeneous Federated Learning via Forget-Less DistillationXiaoyang Yi, Jian Zhang, Jing Chen, Yuru Bao, Lingkai Xing. 1-5 [doi]
- Band Prompting Aided SAR and Multi-Spectral Data Fusion Framework for Local Climate Zone ClassificationHaiyan Lan, Shujun Li, Mingjie Xie, Xuanjia Zhao, Hongning Liu, Pengming Feng, Dongli Xu, Guangjun He, Jian Guan 0001. 1-5 [doi]
- FDR-Controlled Portfolio Optimization for Sparse Financial Index TrackingJasin Machkour, Daniel P. Palomar, Michael Muma. 1-5 [doi]
- MVDC: A Multi-view Dental Completion Model Based on Contrastive LearningXunyu Yang, Qingxin Deng, Minghan Huang, Landu Jiang, Dian Zhang 0001. 1-5 [doi]
- Optimal One-hot Logistic Regression for Tree-based Distribution ClassificationBaptiste Schall, Rodolphe Anty, Lionel Fillatre. 1-5 [doi]
- Disentangle Heart Rate Signals for Improved Stress DetectionPin-Jhao Chen, Woan-Shiuan Chien, Chi-Chun Lee. 1-5 [doi]
- UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal PromptsXiang Li, Zhi-Qi Cheng, Jun-Yan He, Junyao Chen, Xiaomao Fan, Xiaojiang Peng, Alexander G. Hauptmann. 1-5 [doi]
- ChangeChat: An Interactive Model for Remote Sensing Change Analysis via Multimodal Instruction TuningPei Deng, Wenqian Zhou, Hanlin Wu. 1-5 [doi]
- Local Feature Alignment Prompt-Tuning for Few-shot Multimodal Aspect Sentiment AnalysisMeirong Ding, Chuang Zou, Hongyi Lin. 1-5 [doi]
- Towards Patronizing and Condescending Language in Chinese Videos: A Multimodal Dataset and DetectorHongbo Wang, Junyu Lu, Yan Han, Kai Ma, Liang Yang, Hongfei Lin. 1-5 [doi]
- Confusion-Aware Prototypical Contrastive Learning for Open-Vocabulary Object DetectionHongqiang Cheng, Le Jiang, Guoming Li, Xiaozhou Ye, Ye Ouyang. 1-5 [doi]
- ASANet: Scene Text Recognition With Alternate Self-AttentionWenting Xu, Elham Eli, Alimjan Aysa, Xuebin Xu, Kurban Ubul. 1-5 [doi]
- AVS3P10 Standard for Real-time Speech CodingWei Xiao, Weibei Dou, Wenlong Wang, Gaoxiong Yi, Jingxin Li, Shidong Shang. 1-5 [doi]
- A Novel Underwater Acoustic Signal Denoising Model Based on Complex Convolution Dual-branch Multi-scale Attention NetworkJianxun Tang, Zhe Chen, Mingsong Chen 0003. 1-5 [doi]
- AI-Generated Music Detection and its ChallengesDarius Afchar, Gabriel Meseguer-Brocal, Romain Hennequin. 1-5 [doi]
- Collaborative Personalized Federated Learning via Exponential Moving Average OptimizationYuqing Li, Jintao Liang, Peng Tang, Sen Su. 1-5 [doi]
- Event-Driven Prony: Towards Asynchronous Spectral EstimationRuiming Guo, Yuliang Zhu, Ayush Bhandari. 1-5 [doi]
- Cognitive Load Monitoring via Earable Acoustic SensingJiatao Quan, Khaldoon Al-Naimi, Xijia Wei, Yang Liu 0101, Fahim Kawsar, Alessandro Montanari, Ting Dang. 1-5 [doi]
- Large Language Model-Empowered Adversarial Fusion for Typhoon Track PredictionLei Luo 0008, Yang Lei, Jiahao Luan, Anudeep Vurity, Sumanth Sai Sriram, Jun Guo. 1-5 [doi]
- KANGAN-AVSS: Kolmogorov-Arnold Network Based Generative Adversarial Networks for Audio-Visual Speech SynthesisSubhayu Ghosh, Swapnil Saha, Nanda Dulal Jana. 1-5 [doi]
- DCD-MUSIC: Deep-Learning-Aided Cascaded Differentiable MUSIC Algorithm for Near-Field Localization of Multiple SourcesArad Gast, Luc Le Magoarou, Nir Shlezinger. 1-5 [doi]
- Deep Metamorphic Registration for Tumor-Affected Medical Image AlignmentWei-Jie Pan, Yi Hong. 1-5 [doi]
- A Metric for Predicting the Quality of Ambisonic Spatial Audio Reproduced Using Spatially Interpolated or Extrapolated Room Impulse ResponsesHualin Ren, Christian H. Ritz, Jiahong Zhao, Xiguang Zheng, Daeyoung Jang. 1-5 [doi]
- Effective and Efficient Mixed Precision Quantization of Speech Foundation ModelsHaoning Xu, Zhaoqing Li, Zengrui Jin, Huimeng Wang, Youjun Chen, Guinan Li, Mengzhe Geng, Shujie Hu, Jiajun Deng, Xunying Liu. 1-5 [doi]
- AKI360: Enabling Highly Interactive 360-degree Video Streaming by Adaptive Keyframe IntervalHaitao Liu, Xinyi Zhang 0004, Chuanmin Jia, Yanbiao Li, Gaogang Xie. 1-5 [doi]
- Multi-Scale Parallel Hybrid Network for Palmprint RecognitionHao Yang, Shuyi Li, Bob Zhang 0001, Yuqi Wang. 1-5 [doi]
- RestorMamba: An Enhanced Synergistic State Space Model for Image RestorationZeyu Wang, Chen Li 0025, Huiying Xu, Xinzhong Zhu, Xiao Huang, Hongbo Li. 1-5 [doi]
- BCG data imputation via multimodal feature alignment and semantic sequence predictionJiafeng Qiu, Huadan Wang, Peihan Yao, Gang Shen. 1-5 [doi]
- AudioEditor: A Training-Free Diffusion-Based Audio Editing FrameworkYuhang Jia, Yang Chen 0034, Jinghua Zhao 0004, Shiwan Zhao, Wenjia Zeng, Yong Chen, Yong Qin. 1-5 [doi]
- MSTBI: Head CT Detection and Prognostic Assessment of Traumatic Brain Injury DatasetYang Xu, Menghao Fang, Qiuyu Fu, Mengqi Qu, Junyao Chen, Zexian Xie, Shike Hou, Lu Lu. 1-5 [doi]
- Information-Theoretic Minimax Regret Bounds for Reinforcement Learning based on DualityRaghav Bongole, Amaury Gouverneur, Borja Rodríguez Gálvez, Tobias J. Oechtering, Mikael Skoglund. 1-5 [doi]
- 2Former: Gated Feature Selection and Expert Modeling in Multimodal Emotion RecognitionWeixiang Xu, Zhongren Dong, Runming Wang, Xinzhou Xu, Zixing Zhang 0001. 1-5 [doi]
- Rhythmic Foley: A Framework For Seamless Audio-Visual Alignment In Video-to-Audio SynthesisZhiqi Huang 0005, Dan Luo, Jun Wang, Huan Liao, Zhiheng Li, Zhiyong Wu 0001. 1-5 [doi]
- Smr-Awarenet: An Adaptive Smr-Aware Neural Network for EEG Auditory Attention Guided Target Speech ExtractionXuefei Wang, Yuting Ding, Lei Wang, Fei Chen. 1-5 [doi]
- Multi-domain fusion network for underwater image enhancementJunbin Zhuang 0001, Jiajia Zhou, Yan Zheng, Yasheng Chang, Suleman Mazhar. 1-5 [doi]
- What Affects the Performance of Fake Audio Detection? Analyzing Factors in a Continual Learning SettingYixuan Xiao, Ngoc Thang Vu. 1-5 [doi]
- Efficient Co-Approximate Parallel Compressive Depth Reconstruction on FPGAYun Wu 0003, John McAllister. 1-5 [doi]
- Carver: Learning to Reconstruct Right Ventricle from Sparse Multi-View 2D EchocardiogramsYida Li, Jun Shi 0007, Zhaohui Wang, Tiantong Wang, Ziqi Zhu, Minfan Zhao, Junshi Chen, Hong An. 1-5 [doi]
- Optimizing Biomarkers from Earbud Ballistocardiogram: Calibration and Calibration-Free Algorithms for Accelerometer Axis Selection and FusionYunzhi Li, Md. Mahbubur Rahman, Mehrab Bin Morshed, Md Saiful Islam, Hao Zhou, Weinan Wang, Holland Ernst, Li Zhu 0004, Jilong Kuang. 1-5 [doi]
- ClingTP: Curriculum Learning based Multi-style Title Prefix GenerationYusong Wang, Dongyuan Li, Jialun Shen, Yicheng Xu, Shuai Zhong, Mingkun Xu. 1-5 [doi]
- Fourth-Order Cumulant Based 3-D Near-Field Underdetermined Parameter Estimation With Exact Spatial Propagation ModelLongsheng Jin, Hua Chen 0004, Jiaxiong Fang, Wei Liu 0001, Ye Tian 0014, Gang Wang 0007. 1-5 [doi]
- Improving Acoustic Scene Classification in Low-Resource ConditionsZhi Chen, Yun-Fei Shao, Yong Ma, Mingsheng Wei, Le Zhang, Wei-Qiang Zhang. 1-5 [doi]
- Comparing Self-Supervised Learning Models Pre-Trained on Human Speech and Animal Vocalizations for Bioacoustics ProcessingEklavya Sarkar, Mathew Magimai-Doss. 1-5 [doi]
- Short-time quantum Fourier transform processingSreeraj Rajindran Nair, Benjamin J. Southwell, Christopher Ferrie. 1-5 [doi]
- Diffusion Model with Multi-layer Wavelet Transform for Low-Light Image EnhancementHaiyan Jin, Jing Wang, Fengyuan Zuo, Haonan Su, Zhaolin Xiao, Bin Wang 0046, Yuanlin Zhang 0003. 1-5 [doi]
- E-URES 2.0: Efficient User-Centric Residual-Echo Suppression with a Lightweight Neural NetworkAmir Ivry, Israel Cohen. 1-5 [doi]
- Revelio: A Real-World Screen-Camera Communication System with Visually Imperceptible Data EmbeddingAbbaas Alif Mohamed Nishar, Shrinivas Kudekar, Bernard Kintzing, Ashwin Ashok. 1-5 [doi]
- Evaluation of Wearable Head BCG for PTT Measurement in Blood Pressure InterventionWeinan Wang, Li Zhu 0004, Mehrab Bin Morshed, Md. Mahbubur Rahman, Jungmok Bae, Jilong Kuang. 1-5 [doi]
- Evaluating the Impact of Discriminative and Generative E2E Speech Enhancement Models on Syllable Stress PreservationRangavajjala Sankara Bharadwaj, Jhansi Mallela, Sai Harshitha Aluru, Chiranjeevi Yarra. 1-5 [doi]
- Detecting OOD Samples via Optimal Transport Scoring FunctionHeng Gao, Zhuolin He, Jian Pu. 1-5 [doi]
- Soft Augmentation for Graph ClassificationWeihuang Zheng, Xiaotong Zhang, Rui Dong, Youyong Kong. 1-5 [doi]
- Zero-shot Document Retrieval with Hybrid Pseudo-document RetrieverDong Sun, Wenya Guo, Xumeng Liu, Ying Zhang, Zhaoxiang Hou, Zengxiang Li. 1-5 [doi]
- Dual-Path Contrastive Short Text Clustering with High-order Random WalkZhengzhong Zhu, Binjie Sun, Xuejie Zhang, Jin Wang, Xiaobing Zhou. 1-5 [doi]
- AIDC: Benchmark for Analytical Learning in Incremental Disease ClassificationRongchang Zhao, Jianyu Qi, Rui Li, Zhijie Zheng, Jian Zhang, Jiaxu Li. 1-5 [doi]
- EEG-Music Emotion Recognition: Challenge OverviewSalvatore Calcagno 0002, Simone Carnemolla, Isaak Kavasidis, Simone Palazzo, Daniela Giordano, Concetto Spampinato. 1-3 [doi]
- ERGNN: Spectral Graph Neural Network With Explicitly-Optimized Rational Graph FiltersGuoming Li, Jian Yang 0035, Shangsong Liang. 1-5 [doi]
- SCF-Stega: Controllable Linguistic Steganography Based on Semantic Communications FrameworkYilin Long, Zhongliang Yang, Zhuang Wang, Zhili Zhou, Yongfeng Huang 0001, Linna Zhou. 1-5 [doi]
- Dynamic Graph Multi-granularity Attribute Scene Evolution Sequence RecommendationLongtao Wang, Qingtian Zeng, Guiyuan Yuan, Hua Duan, Cheng Cheng, Kai Jiang. 1-5 [doi]
- Towards Context-aware EEG-based Emotion Recognition Models: Personality and Emotional Intelligence as ContextKannadasan Kalidasan, Nikita Rajesh Verma, Jainendra Shukla. 1-5 [doi]
- Implementing Finite Impulse Response Filters on Quantum ComputersAishwarya Majumdar, Bojko N. Bakalov, Dror Baron, Yuan Liu. 1-5 [doi]
- Preventing output saturation in active noise control: An output-constrained Kalman filter approachJunwei Ji, Dongyuan Shi, Boxiang Wang, Xiaoyi Shen, Zhengding Luo, Woon-Seng Gan. 1-5 [doi]
- Low-rank Adaptation Method for Respiratory Sound Classification: A necessary road towards Large ModelsGaoyang Dong, Yufei Shen, Jianhong Wang, Shunwang Xie, Minghui Zhang, Ping Sun. 1-5 [doi]
- A Teacher Action Quality Assessment Method Based on Label Constraint StrategyMing Fang, Yunpeng Zhou, Jianping Ren, Chunsheng Qin, Shuhua Liu. 1-5 [doi]
- Global Enhanced Frame Prompt Tuning for Sound Event DetectionShiyu Yu, Lijian Gao, Qirong Mao. 1-5 [doi]
- An Explainable Probabilistic Attribute Embedding Approach for Spoofed Speech CharacterizationManasi Chhibber, Jagabandhu Mishra, Hye-jin Shim, Tomi H. Kinnunen. 1-5 [doi]
- MorphFader: Enabling Fine-grained Controllable Morphing with Text-to-Audio ModelsPurnima Kamath, Chitralekha Gupta, Suranga Nanayakkara. 1-5 [doi]
- Maximum Likelihood Estimation for Bivariate Joint Distribution Recovery from Max-Aggregated DataTianjian Zhang, Feng Yin, Yue Sun, Qi Yan. 1-5 [doi]
- GEMD-UNet: Graph Structure Enhanced Multi-dimensional Learning Unet for Cloud DetectionJianing Chen, Chuhao Chen 0002, Junze Yang, Wei Li 0109, Rahul Yadav, Wenqi Zheng. 1-5 [doi]
- Generate E-commerce Product Background by Integrating Category Commonality and Personalized StyleHaohan Wang, Wei Feng, Yaoyu Li, Zheng Zhang, Jingjing Lv, Junjie Shen 0008, Zhangang Lin, JingPing Shao. 1-5 [doi]
- Prompt-to-Correct: Automated Test-Time Pronunciation Correction with Voice PromptsAyan Kashyap, Neil Kumar Shah, Vineet Gandhi. 1-5 [doi]
- Self-Supervised Monocular Depth Estimation from Videos via Pose-Adaptive ReconstructionXin Sun, Boqian Liu, Xinchen Ye, Rui Xu, Haojie Li. 1-5 [doi]
- TSP: Task-Specific Pruning for Personalized Image Classification on Edge DevicesYanting Wang, Bojie Shi, Han Zhang. 1-5 [doi]
- Harnessing the Potential of Omnidirectional UAVs in RIS-Enabled Wireless NetworksAbdoul Karim A. H. Saliah, Hajar El Hammouti, Daniel Bonilla Licea. 1-5 [doi]
- BRDIA: Bidirectional Reasoning with Dynamic Instruction Adjustment for Multi-hop KGQAChuanyang Gong, Zhihua Wei 0001. 1-5 [doi]
- Boolean Matrix Tri-FactorizationChristos Kolomvakis, Arnaud Vandaele, Nicolas Gillis. 1-5 [doi]
- Identifying Adversarial Attacks in Crowdsourcing via Dense Subgraph DetectionAbdullah Karaaslanli, Panagiotis A. Traganitis, Aritra Konar. 1-5 [doi]
- RemoteTrimmer: Adaptive Structural Pruning for Remote Sensing Image ClassificationGuangwenjie Zou, Liang Yao, Fan Liu, Chuanyi Zhang, Xin Li, Ning Chen, Shengxiang Xu, Jun Zhou. 1-5 [doi]
- Diffusion Models are Zero-Shot Generative Text-Vision RetrieversBao Li, Zeke Xie, Xiaomei Zhang, Xiangyu Zhu 0001, Zhen Lei 0001. 1-5 [doi]
- Large Covariance Matrix Estimation for Groups of Highly Correlated Variables via Nonconvex OptimizationShanshan Zou, Ziping Zhao. 1-5 [doi]
- Explicit Mutual Information Maximization for Self-Supervised LearningLele Chang, Peilin Liu, Qinghai Guo, Fei Wen. 1-5 [doi]
- Subspace-Based Range-Angle Tracking for Coherent FDA RadarYan Sun, Wen-Qin Wang, Maria Sabrina Greco, Fulvio Gini. 1-5 [doi]
- Target parameter estimation using the Capon method in a MIMO OFDM DFRC systemSatwika Bhogavalli, Eric J. Grivel, Vincent Corretja. 1-5 [doi]
- Enhancing Zero-Shot Emotional Voice Conversion via Speaker Adaptation and Duration PredictionShiyan Wang, Tianhua Qi, Cheng Lu, Zhaojie Luo, Wenming Zheng. 1-5 [doi]
- Textual and Visual Prompt Fusion for Image Editing via Step-Wise AlignmentZhanbo Feng, Zenan Ling, Xinyu Lu, Ci Gong, Feng Zhou 0011, Wugedele Bao, Jie Li 0002, Fan Yang 0087, Robert C. Qiu. 1-5 [doi]
- Boosting Movie and TV Tag Accuracy with Knowledge GraphsHongxun Jiang, Lin Zhang 0009, Lifeng Zhang. 1-5 [doi]
- Lunar Tracking: A New Benchmark For Nighttime Tiny Object TrackingMohammed Leo, Ding Zhang, Hai-Tao Zheng, Haiye Lin. 1-5 [doi]
- Sound-Based Recognition of Touch Gestures and Emotions for Enhanced Human-Robot InteractionYuanbo Hou, Qiaoqiao Ren, Wenwu Wang 0001, Dick Botteldooren. 1-5 [doi]
- Follow-Your-MultiPose: Tuning-Free Multi-Character Text-to-Video Generation via Pose GuidanceBeiyuan Zhang, Yue Ma, Chunlei Fu, Xinyang Song, Zhenan Sun, Ziqiang Li. 1-5 [doi]
- Collaborative Automotive Radar Sensing via Mixed-Precision Distributed Array CompletionArian Eamaz, Farhang Yeganegi, Yunqiao Hu, Mojtaba Soltanalian, Shunqiao Sun. 1-5 [doi]
- Self-supervised Prosody Learning at Phoneme-level with Momentum Contrast for Speech SynthesisZhaoci Liu, Ya-Jun Hu, Liping Chen, Zhen-Hua Ling. 1-5 [doi]
- DiffGAP: A Lightweight Diffusion Module in Contrastive Space for Bridging Cross-Model GapShentong Mo, Zehua Chen, Fan Bao, Jun Zhu. 1-5 [doi]
- InsectMamba: State Space Model with Adaptive Composite Features for Insect RecognitionQianning Wang, Chenglin Wang, Zhixin Lai, Yucheng Zhou. 1-5 [doi]
- The Potential of Speech Features to Discriminate between Original and Machine-Translated TextsYongjian Chen, Mireia Farrús, Antonio Toral. 1-5 [doi]
- Archetypal Analysis for Binary DataAnna Emilie J. Wedenborg, Morten Mørup. 1-5 [doi]
- FUVAS: Few-shot Unsupervised Video Anomaly Segmentation via Low-Rank Factorization of Spatio-Temporal FeaturesJiaxiang Jiang, Ibrahima J. Ndiour, Mahesh Subedar, Omesh Tickoo. 1-5 [doi]
- Self-Distillation Prototypes Network: Learning Robust Speaker Representations without SupervisionYafeng Chen, Siqi Zheng, Hui Wang 0030, Luyao Cheng, Qian Chen 0003, Chong Deng, Shiliang Zhang, Wen Wang 0001. 1-5 [doi]
- ES-NeRF: Enhancing Segmentation in NeRF with CLIPChong Zhao, Pengcheng Hou, Yan Zhai, Xing Wei, Fan Yang, Xiang Bi. 1-5 [doi]
- Near-Field ISAC in 6G: Addressing Phase Nonlinearity via Lifted Super-ResolutionSajad Daei, Amirreza Zamani, Saikat Chatterjee, Mikael Skoglund, Gábor Fodor 0001. 1-5 [doi]
- HCoTT: Hierarchical Chain-of-Thought DistillationZhichang Wang, Xianwei Zhuang, Zhihong Zhu, Yuexian Zou. 1-5 [doi]
- Seek and Solve Reasoning for Table Question AnsweringRuya Jiang, Chun Wang, Weihong Deng. 1-5 [doi]
- VN-GT: Optimizing Virtual Network Deployment via Game TheoryWeijie Wang, Yan Wang 0081, Guokun Xu, Zuxin Chen, Siyuan Li, Min Yu, Weiqing Huang, Degang Sun. 1-5 [doi]
- Explainable Detection of Alzheimer's Disease Through Analysis of Human Behavior in VideoBao-Hsuan Huang, Po-Chih Kuo, Likai Huang, Chaur-Jong Hu, Cheng-Yu Chen. 1-5 [doi]
- ValSub: Subsampling Validation Data to Mitigate Forgetting during ASR PersonalizationHaaris Mehmood, Karthikeyan Saravanan, Pablo Peso Parada, David Tuckey, Mete Ozay, Gil Ho Lee, Jungin Lee, Seokyeong Jung. 1-5 [doi]
- Hyper-adapter for Parameter-Efficient Multilingual ASR AdaptationZejiang Hou, Daniel Garcia-Romero, Kyu J. Han. 1-5 [doi]
- Maintaining Prosodic Consistency in Automatic Dubbing for Better IsochronyParnia Bahar, Alejandro Pérez, Javier Iranzo-Sánchez. 1-5 [doi]
- Modular Prompt Learning Improves Vision-Language ModelsZhenhan Huang, Tejaswini Pedapati, Pin-Yu Chen, Jianxi Gao. 1-5 [doi]
- A Dynamic Edge-Selection Mechanism in HRV Hypergraph Learning for Improved Stress DetectionJing-Chun Wang, Woan-Shiuan Chien, Chi-Chun Lee. 1-5 [doi]
- Retrieval-Augmented Neural Field for HRTF Upsampling and PersonalizationYoshiki Masuyama, Gordon Wichern, François G. Germain, Christopher Ick, Jonathan Le Roux. 1-5 [doi]
- Uncertainty prediction for prominence classification with chroma featuresJulian Linke, Sophie Steger, Philipp Steinwender, Gernot Kubin, Franz Pernkopf, Barbara Schuppler. 1-5 [doi]
- LSTM-QGAN: Scalable NISQ Generative Adversarial NetworkCheng Chu, Aishwarya Hastak, Fan Chen 0001. 1-5 [doi]
- Towards Dynamic Skeleton-based Handshape Subunits for Sign Language AssessmentSandrine Tornay, Mathew Magimai-Doss. 1-5 [doi]
- Dual-Triple Transformer Networks for Accurate CT Pleural Effusion SegmentationJianwei Yang, Wenkang Fan, Hao Fang, Zirui Zhu, Xiongbiao Luo. 1-5 [doi]
- Sparse Bayesian Network for Fast Micro-Doppler AnalysisJiongge Zhang, Hang Dong, Long Tian, Xiongpeng He, Huimin Sun, Yuan Liu. 1-5 [doi]
- GoLoColor: Towards Global-Local Semantic Aware Image ColorizationTianai Yue, Xiangcheng Du, Jing Liu, Zhongli Fang. 1-5 [doi]
- Metadata-assisted spatial audio coding in IVAS codecAdriana Vasilache, Tapani Pihlajakuja, Mikko-Ville Laitinen. 1-5 [doi]
- A Hybrid Probabilistic-Deterministic Model Recursively Enhancing SpeechTomohiro Nakatani, Naoyuki Kamo, Marc Delcroix, Shoko Araki. 1-5 [doi]
- Proximity Detection and Trajectory Recognition with Machine Learning for UHF RFID SystemsThomas M. Pohl, Christoph F. Mecklenbräuker, Holger Arthaber. 1-4 [doi]
- Regularized Domain Adaptation for Estimation Tasks in Partially Observed Target DomainsVarun Kelkar, H. S. Melihcan Erol, Muhammad Aneeq uz Zaman, Omer Tanovic, Ravi Kiran Raman. 1-5 [doi]
- Event Masked Autoencoder: Point-wise Action Recognition with Event-Based CamerasJingkai Sun, Qiang Zhang, Jiaxu Wang, Jiahang Cao, Hao Cheng, Renjing Xu. 1-5 [doi]
- Self-Convolutional Attention-Based Uncertainty-Aware Network for Single-Image Super-ResolutionJinbin Wang, Aiping Yang, Zihao Wei, Qinghua Hu. 1-5 [doi]
- Self-Supervised Graph Representation Learning for In-The-Wild Wearable and Smartphone based Emotion RecognitionIoannis Ziogas, Leontios J. Hadjileontiadis, Ahsan H. Khandoker, Aamna Al Shehhi. 1-5 [doi]
- KVPruner: Structural Pruning for Faster and Memory-Efficient Large Language ModelsBo Lv, Quan Zhou, Xuanang Ding, Yan Wang, Zeming Ma. 1-5 [doi]
- Efficient Streaming LLM for Speech RecognitionJunteng Jia, Gil Keren, Wei Zhou, Egor Lakomkin, Xiaohui Zhang 0007, Chunyang Wu, Frank Seide, Jay Mahadeokar, Ozlem Kalinli. 1-5 [doi]
- Joint-Wise Distributed Perception Graph Convolutional Network for Skeleton-Based Action RecognitionQian Huang, Qiang Geng, Zhaoyu Chen, Xin Li, Yangyang Li, Xing Li. 1-5 [doi]
- Do Multimodal Language Models Really Understand Direction? A Benchmark for Compass Direction ReasoningHang Yin 0007, Zhifeng Lin, Xin Liu, Bin Sun 0004, Kan Li 0001. 1-5 [doi]
- Collusion-resistant Black-box Watermarking in Federated Learning through Weight Relevance AnalysisElena Rodríguez Lois, Fernando Pérez-González. 1-5 [doi]
- Integrated Interpolation and Matrix Completion for Radio Map Estimation: A Convex Optimization ApproachHongcheng Dong, Wenqiang Pu, Rui Zhou, Xiao Fu 0001, Feng Yin. 1-5 [doi]
- Enhancing Multi-Channel Speech with Limited Microphones via Spherical Harmonic TransformJiahui Pan, Hui Zhang 0031, Xueliang Zhang 0001. 1-5 [doi]
- Multi-modal Entity Alignment under Imbalanced Visual Modality InformationXin Zhang, Yu Liu, Shimin Shan. 1-5 [doi]
- On Momentum Acceleration for Randomized Coordinate Descent in Matrix CompletionMatthew Callahan, Trung Vu 0001, Raviv Raich. 1-5 [doi]
- Self-supervised Learning for Acoustic Few-Shot ClassificationJingyong Liang, Bernd Meyer 0001, Issac Ning Lee, Thanh-Toan Do. 1-5 [doi]
- E1 TTS: Simple and Fast Non-Autoregressive TTSZhijun Liu, Shuai Wang, Pengcheng Zhu 0004, Mengxiao Bi, Haizhou Li 0001. 1-5 [doi]
- VisTa: Visual-contextual and Text-augmented Zero-shot Object-level OOD DetectionBin Zhang, Xiaoyang Qu, Guokuan Li, Jiguang Wan, Jianzong Wang. 1-5 [doi]
- Text-Infused Audio-Visual Video Parsing with Semantic-Aware Multimodal Contrastive LearningPengcheng Zhao, Yanxiang Chen, Dan Guo 0001, Yuanzhi Yao. 1-5 [doi]
- Automatic Numbering and Pathological Recognition of Pediatric Teeth Using CNN and Attention MechanismsHongzhou Zhu, Yuhao Qiu, Renjie Hu, Ang Li, Shengji Zhu, Lei Wang. 1-5 [doi]
- Semantic Attention and LLM-based Layout Guidance for Text-to-Image GenerationYuxiang Song, Zhaoguang Long, Man Lan, Changzhi Sun, Aimin Zhou, Yuefeng Chen, Hao Yuan, Fei Cao. 1-5 [doi]
- Exploring the Differences between Deaf and Hearing Infant CriesEnjamamul Hoq, Ifeoma Nwogu. 1-5 [doi]
- A Federated Learning-Based Intrusion Detection System for Satellite-Terrestrial Integrated NetworksMengke Wan, Jiang Fang, Chen Guo, Liru Geng, YinLong Liu, Wei Ma, Chao Xu, Mohan Su. 1-5 [doi]
- Contextual ASR with Retrieval Augmented Large Language ModelCihan Xiao, Zejiang Hou, Daniel Garcia-Romero, Kyu J. Han. 1-5 [doi]
- FreqSense: Universal and Low-Latency Adversarial Example Detection for Speaker Recognition with Interpretability in Frequency DomainYihuan Huang, Yuanzhe Li, Yanzhen Ren, Weiping Tu, Yuhong Yang 0001. 1-5 [doi]
- FBI-Net: Frequency Band Integration Network for Infrared Small Target SegmentationBiqiao Xin, Qiang Li, Qianchen Mao, Jinbao Wang, Bingshu Wang. 1-5 [doi]
- AMNS: Attention-Weighted Selective Mask and Noise Label Suppression for Text-to-Image Person RetrievalRunqing Zhang, Xue Zhou. 1-5 [doi]
- Learning Two-factor Representation for Magnetic Resonance Image Super-resolutionWeifeng Wei, Heng Chen, Pengxiang Su. 1-5 [doi]
- MULiving: Towards Real-time Multi-User Survival State Monitoring Using Wearable RFID TagsShang Gao, Dawei Yan, Yubo Yan. 1-5 [doi]
- Training-Free Point Cloud Recognition Based on Geometric and Semantic Information FusionYan Chen, Di Huang, Zhichao Liao, Xi Cheng, Xinghui Li, Long Zeng 0001. 1-5 [doi]
- UniIVFT: Towards a Unified Framework for Infrared-Visible Fusion and TranslationHonglin Wu, Xueqiong Li, Shaowu Yang, Huibin Tan, Yuhua Tang, Tianrui Liu. 1-5 [doi]
- Kernel-Based Anomaly Detection Using Generalized Hyperbolic ProcessesPauline Bourigault, Danilo P. Mandic. 1-5 [doi]
- Multi-range Adaptive Perception Transformer for Iterative Homography EstimationTianming Li, Qing Zhu, Zhen Zhou, Jianqiao Luo, Yaonan Wang 0001. 1-5 [doi]
- Improving Irregular Text Recognition with Adaptive Feature CompressionYin Liu, Zhineng Chen. 1-5 [doi]
- Enhancing Robustness of Implicit Neural Representations Against Weight PerturbationsWenyong Zhou, Yuxin Cheng, Zhengwu Liu, Taiqiang Wu, Chen Zhang, Ngai Wong. 1-5 [doi]
- Frequency-Space Margin Perception for Open Set Knowledge DistillationLijun Liu, Lihua Jing, Rui Wang, Yuan Wang, Zhishen Wang. 1-5 [doi]
- SlotFusion: Object-Centric Audiovisual Feature Fusion with Slot Attention for Remote Sensing Scene RecognitionFangzhou Han, Tianyi Yu, Lamei Zhang, Lingyu Si, Yiqi Zhang. 1-5 [doi]
- Tessellated Linear Model for Age Prediction from VoiceDareen Alharthi, Mahsa Zamani, Bhiksha Raj, Rita Singh. 1-4 [doi]
- Transfer Learning via Functional Balancing in Reproducing Kernel Hilbert SpacesBoyan Gu, Sheng Zheng, Xiaojun Mao, Zhonglei Wang. 1-5 [doi]
- SACR: Self-training with Saliency-Augmented Consistency Regularization for Few-Shot LearnersYanyan Feng, Yue Zhou, Yun Xue, Fenghuan Li, Zehong Lin. 1-5 [doi]
- iDANSE: Iterative Data-driven Nonlinear State Estimation of Model-free Hidden SequencesHang Qin, Anubhab Ghosh, Saikat Chatterjee. 1-5 [doi]
- Assessing Robustness of Multi-Modal Large Language Models in Image Classification through Hierarchical WordNet-Based EvaluationChang Liu 0077, Hai Chen, Boxiang Wang, Shibao Zheng. 1-5 [doi]
- Cluster-Refined Optimal Transport for Unsupervised Action SegmentationShijie Wang, Jinrong Zhang, Yule Liu, Shiyao Li, Lin Feng 0001. 1-5 [doi]
- Hyper-Refinement for Low-Rank AdaptationSavas Özkan, Taha Ceritli, Jeongwon Min, Eunchung Noh, Jung Min Cho, Dookun Park, Mete Ozay. 1-5 [doi]
- Text Enhancement Network for Complex Multi-line Scene Text Image Super-resolutionYang Liu, Yuliang Huang, Yiming Liu. 1-5 [doi]
- AC-Mix: Self-Supervised Adaptation for Low-Resource Automatic Speech Recognition using Agnostic Contrastive MixupCarlos Carvalho, Alberto Abad. 1-5 [doi]
- An Abnormal Audio Generation Method for Fault Diagnosis of Power TransformersBen Niu 0011, Yangjie Wei, Ke Zhang, Zhuoran Yu. 1-5 [doi]
- A Convolutional Recurrent Mixer Network For Radar Meteorological Image Super-ResolutionRafael Gonçalves Pires, Daniel F. S. Santos, Roberto V. Calheiros, João Paulo Papa, Ik Hyun Lee, Sambit Bakshi, Khan Muhammad 0001. 1-5 [doi]
- Hierarchical Nash Equilibrium over Variational Equilibria via Fixed-point Set Expression of Quasi-nonexpansive OperatorShota Matsuo, Keita Kume, Isao Yamada. 1-5 [doi]
- Decision-Aided Progressive Symbol Phase Equalizer in Sweep Spread Carrier Underwater Acoustic CommunicationsAnoop R., Manju M. Raj, K. P. Arunkumar, Chandra R. Murthy. 1-5 [doi]
- SONIQUE: Video Background Music Generation Using Unpaired Audio-Visual DataLiqian Zhang, Magdalena Fuentes. 1-5 [doi]
- GBA-Net: A Method for 3D Brain Tumor Segmentation Based on Multi-scale Gaussian Boundary AttentionWei Wang, Longrun Wang, Xin Wang. 1-5 [doi]
- DiffSR: Learning Radar Reflectivity Synthesis via Diffusion Model from Satellite ObservationsXuming He, Zhiwang Zhou, Wenlong Zhang, Xiangyu Zhao, Hao Chen, Shiqi Chen, Lei Bai 0001. 1-5 [doi]
- Personalized Speech Enhancement without User Enrollment for Real-World Audio Replay ScenariosHaoran Wei, Shilin Wang, Yanhua Long. 1-5 [doi]
- High-Fidelity Editable Portrait Synthesis with 3D GAN InversionJindong Xie, Jiachen Liu, Yupei Lin, Jinbao Wang, Xianxu Hou, LinLin Shen. 1-5 [doi]
- EMMeTT: Efficient Multimodal Machine Translation TrainingPiotr Zelasko, Zhehuai Chen, Mengru Wang, Daniel Galvez, Oleksii Hrinchuk, Shuoyang Ding, Ke Hu, Jagadeesh Balam, Vitaly Lavrukhin, Boris Ginsburg. 1-5 [doi]
- Stable-TTS: Stable Speaker-Adaptive Text-to-Speech Synthesis via Prosody PromptingWooseok Han, Minki Kang, Changhun Kim, Eunho Yang. 1-5 [doi]
- Instance Segmentation of Airway Anatomies Using Mask R-CNN Prompt Adaptation-SAMYinzhou Ling, Jingjing Luo, Yuan Han, Wenxian Li, Hongbo Wang. 1-5 [doi]
- Event-based Video Person Re-identification via Cross-Modality and Temporal CollaborationRenkai Li, Xin Yuan, Wei Liu, Xin Xu. 1-5 [doi]
- Analysis and Calibration of Nonlinear Power Amplifiers in Wideband OFDM-Based LEO Satellite Communication SystemKai Ying, Linshan Zhao, Pengcheng Jia, Ming Zhou, Kai Kang. 1-5 [doi]
- Tracking Time-Varying Parameters in Massive MIMO IoT Networks: A Linear Coherent Decentralized ApproachKunwar Pritiraj Rajput, Linlong Wu, Bhavani Shankar M. R., Björn E. Ottersten, Pramod K. Varshney. 1-5 [doi]
- Reexamining the Efficacy of MetricGAN for Speech EnhancementHaibin Wu, Ali Aroudi, Buye Xu, Ashutosh Pandey 0004, Francesco Nesta, Anurag Kumar 0003, Alexander Reich, Ke Tan 0001. 1-5 [doi]
- KAN-HyperpointNet for Point Cloud Sequence-Based 3D Human Action RecognitionZhaoyu Chen, Xing Li, Qian Huang, Qiang Geng, Tianjin Yang, Shihao Han. 1-5 [doi]
- Mutual-View Contrastive Generative Framework for Attribute-Missing Graph ClusteringShijun Li, Changjian Wang, Kele Xu, Xiaojin Li, Gaojin He, Xu Liu. 1-5 [doi]
- Joint Automatic Speech Recognition And Structure Learning For Better Speech UnderstandingJiliang Hu, Zuchao Li, Mengjia Shen, Haojun Ai, Sheng Li 0010, Jun Zhang. 1-5 [doi]
- LLGS: Illuminating Gaussian Splatting via absorptance ModulationJianwen Gan, Wenxin Li, Bo Zheng, Chengliang Wang, Yingbo Wu. 1-5 [doi]
- Code Drift: Towards Idempotent Neural Audio CodecsPatrick O'Reilly, Prem Seetharaman, Jiaqu Su, Zeyu Jin, Bryan Pardo. 1-5 [doi]
- Effective Pre-Training of Audio Transformers for Sound Event DetectionFlorian Schmid, Tobias Morocutti, Francesco Foscarin, Jan Schlüter, Paul Primus, Gerhard Widmer. 1-5 [doi]
- M-MoE: Mixture of Mixture-of-Expert Model for CTC-based Streaming Multilingual ASRSongjun Cao, Xiong Wang, Yike Zhang, Xiaoming Zhang, Long Ma. 1-5 [doi]
- Evaluating Snippet Significance: A Framework for Audio and Text-Based Dialogue SummarizationAnderson de Lima Luiz, Raviteja Boddu, Munir Georges. 1-5 [doi]
- The Missing Piece in Model Editing: A Deep Dive into the Hidden Damage Brought By Model EditingJianchen Wang, Zhouhong Gu, Xiaoxuan Zhu, Lin Zhang, Haoning Ye, Zhuozhi Xiong, Sihang Jiang, Hongwei Feng, Yanghua Xiao. 1-5 [doi]
- DFNeRF: Disentangled Facial Neural Radiance Fields for Text-based Editing of Free-view Talking HeadBenwang Chen, Xiaoyu Li, Xuan Wang, Qi Zhang, Haoqian Wang. 1-5 [doi]
- ARM: nnU-Net with Arena Mechanism for Medical Image SegmentationHaoran Luo, Cong Guan, Tengfei Shao, Shenglei Li, Tomoji Kishi, Osamu Yoshie. 1-5 [doi]
- The CDC Problem: Distributed Spatial Sampling and Detection of Poisson ProcessesVanlalruata Ralte, Amitalok J. Budkuley, Stefano Rini. 1-5 [doi]
- Dynamic Graph Recommendation via Sparse Augmentation and Singular AdaptationZhen Tao, Yuehang Cao, Yang Fang, Yunhui Liu, Xiang Zhao, Tieke He. 1-5 [doi]
- FreeSVC: Towards Zero-shot Multilingual Singing Voice ConversionAlef Iury Siqueira Ferreira, Lucas Rafael Stefanel Gris, Augusto Seben da Rosa, Frederico Santos de Oliveira, Edresson Casanova, Rafael Teixeira Sousa, Arnaldo Candido Jr., Anderson da Silva Soares, Arlindo R. Galvão Filho. 1-5 [doi]
- CA-UAP: Content-Agnostic Universal Adversarial Perturbation for Enhanced GeneralizationRui Lu, Ziqiang He, Jingyang Wen, Xiangui Kang, Z. Jane Wang 0001. 1-5 [doi]
- MPOT: Manifold Preserving Optimal Transport for Visual Recognition Under Severe Distribution ShiftYou-Wei Luo, Zhi-Hao Li, Chuan-Xian Ren. 1-5 [doi]
- SCBot: Building Lightweight and Flexible C&C Based on Smart ContractChaoge Liu, Zhi Wang, Jie Yin, Yinsheng Liu, Kai Mao, Zhe Huang, Chumeng Deng. 1-5 [doi]
- SGRAND: Stochastic Graph Neural DiffusionKaihang Dou, Fan Li, Suixiang Gao. 1-5 [doi]
- A Hierarchical Reasoning Framework for Complex Question Answering over Knowledge Graph with Reinforcement LearningZhiqiang Zhang, Zhiyi Zhang, Yunxiao Zhang, Wen Zhao. 1-5 [doi]
- Diffusion Learning Over Adaptive Competing NetworksYike Zhao, Haoyuan Cai, Ali H. Sayed. 1-5 [doi]
- PANDA: Patch-Aware Graph Network with Dual Alignment for Time Series ForecastingChen Li, Hongyang Zhang, Saqlain Abbas, Chenyu Ma, Yinhao Liu, Xiaotong Tu. 1-5 [doi]
- Speech Recognition with LLMs Adapted to Disordered Speech Using Reinforcement LearningChirag Nagpal, Subhashini Venugopalan, Jimmy Tobin, Marilyn A. Ladewig, Katherine A. Heller, Katrin Tomanek. 1-5 [doi]
- Speech Retrieval-Augmented Generation without Automatic Speech RecognitionDo June Min, Karel Mundnich, Andy Lapastora, Erfan Soltanmohammadi, Srikanth Ronanki, Kyu J. Han. 1-5 [doi]
- Label Dependency Aware Loss for Reliable Multi-Label Medical Image ClassificationAditya Shankar Pal, Arkapal Panda, Utpal Garain. 1-5 [doi]
- Adaptive Sparse Feature Location Activation Strategy for Sparse Detectors on Drone ImagesYixuan Li, Yulong Xu, Pengnian Wu, Xuqi Yang, Meng Zhang. 1-5 [doi]
- Second-Order Wireless Federated Learning via Nonparametric Hessian EstimationShayan Mohajer Hamidi, Ali Bereyhi. 1-5 [doi]
- Multi-View Image Enhancement Inconsistency Decoupling Guided 3D Gaussian SplattingLu Xiao, Jiahao Wu, Zhanke Wang, Guanhua Wu, Runling Liu, Zhiyan Wang, Ronggang Wang. 1-5 [doi]
- Audio-Driven Reinforcement Learning for Head-Orientation in Naturalistic EnvironmentsWessel Ledder, Yuzhen Qin, Kiki van der Heijden. 1-5 [doi]
- Unified Audio Event DetectionYidi Jiang, Ruijie Tao, Wen Huang, Qian Chen, Wen Wang. 1-5 [doi]
- AnCoGen: Analysis, Control and Generation of Speech with a Masked AutoencoderSamir Sadok, Simon Leglaive, Laurent Girin, Gaël Richard, Xavier Alameda-Pineda. 1-5 [doi]
- A 3D Attenuation Coefficient based Degradation Estimation for Real Non-Homogeneous DehazingHanyu Jiang, Zemin Ren, Tao Tan. 1-5 [doi]
- Weak-to-Strong Generalization in Speech RecognitionSoheil Khorram, Qian Zhang, Rohit Prabhavalkar, Kartik Audhkhasi, Bhuvana Ramabhadran. 1-5 [doi]
- RAFDet: A Novel Camera-Radar Fusion Framework for Robust 3D Object Detection in Autonomous DrivingXingjian Cao, Ping Wang, Zhitao Zhang, Huizhao Tu, Yong Chen, Zhenbao Liang. 1-5 [doi]
- Automatic Speech Recognition and Spoken Language Understanding of Maritime Radio Communications: A case study with Singapore dataPhuong Dat, Jayakrishnan Melur Madhathil, Tran Huy Dat. 1-5 [doi]
- Modulo Sampling and Recovery with Unknown and Time-Varying Folding ParameterYhonatan Kvich, Alperen Yasar, Eyyup Tasci, Rabia Tugce Yazicigil, Yonina C. Eldar. 1-5 [doi]
- *Amir Leshem. 1-5 [doi]
- Hierarchical Similarity Loss Enhanced Depth and Structural Fidelity in Monocular RGB-to-Depth Mapping with Adversarial TrainingChangzeng Fu, Yikai Su, Kaifeng Su, Le Yang, Peng Shan, Xiaoyong Lv, Yuliang Zhao. 1-5 [doi]
- Aesthetic Perception Prompting for Interpretable Image Aesthetics Assessment with MLLMsLanjun Wang, Zheyu Qiao, Ruidong Chen, Jingqiu Li, Wenjie Wang, Xiaoqiong Wang, Wei Rao 0003, Shuai Chen, An-An Liu. 1-5 [doi]
- Efficient Extreme Large-Scale Speaker Verification: Dynamic Active Sub Fully-Connected Layers for Faster Training and Memory OptimizationFulin Zhang, Chenguang Hu, Yao Shen, Yingying Gao, Shilei Zhang, Junlan Feng. 1-5 [doi]
- COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio RepresentationsRuben Ciranni, Giorgio Mariani, Michele Mancusi, Emilian Postolache, Giorgio Fabbro, Emanuele Rodolà, Luca Cosmo. 1-5 [doi]
- Integrating Pause Information with Word Embeddings in Language Models for Alzheimer's Disease Detection from Spontaneous SpeechYu Pu, Wei-Qiang Zhang. 1-5 [doi]
- SpeechTaxi: On Multilingual Semantic Speech ClassificationLennart Keller, Goran Glavas. 1-5 [doi]
- Flow-TSVAD: Target-Speaker Voice Activity Detection via Latent Flow Matching for Speaker DiarizationZhengyang Chen, Bing Han 0008, Shuai Wang 0016, Yidi Jiang, Yanmin Qian. 1-5 [doi]
- Constraint-Awareness and Graph Reasoning for Temporal Question AnsweringZheng Sun, Kai Zhang, Xiulong Zhang, Jianting Liu. 1-5 [doi]
- Automatic Parkinson's disease detection from speech: Layer selection vs adaptation of foundation modelsTilak Purohit, Barbara Ruvolo, Juan Rafael Orozco-Arroyave, Mathew Magimai-Doss. 1-5 [doi]
- Hyperedge Representations with Hypergraph Wavelets: Applications to Spatial TranscriptomicsXingzhi Sun 0003, Charles Xu 0004, João F. Rocha, Chen Liu 0020, Benjamin Hollander-Bodie, Laney Goldman, Marcello DiStasio, Michael Perlmutter, Smita Krishnaswamy. 1-5 [doi]
- RAW Data: A Key Component for Effective Deepfake DetectionSahar Husseini, Jean-Luc Dugelay. 1-5 [doi]
- Self-Tuning Spectral Clustering for Speaker DiarizationNikhil Raghav, Avisek Gupta, Md. Sahidullah, Swagatam Das. 1-5 [doi]
- Multi-View Radar Detection Transformer with Differentiable Positional EncodingRyoma Yataka, Pu (Perry) Wang, Petros Boufounos, Ryuhei Takahashi. 1-5 [doi]
- GDFDNet: A Novel Graph-Based Dynamically Fused Dual-Stream Network for Accuracy Prohibited Items DetectionXiaomeng Li, Hongxia Gao, Yaobin Huang, Zhenming Guan, Litao Li. 1-5 [doi]
- A Grouping Strategy-Based Progressive Fusion Network for Hyperspectral Image Super-ResolutionGuohua Lv, Baodong Zhang, Yongbiao Gao, Guixin Zhao, Guotao Wang, Juncan Wang. 1-5 [doi]
- Steering Large Language Models for Vulnerability DetectionJiayuan Li, Lei Cui 0003, Jie Zhang, Haiqiang Fei, Yu Chen, Hongsong Zhu. 1-5 [doi]
- Online Contrastive Continual Learning with Hard Negative SamplesGuanglu Wang, Xinyue Liu 0002, Wentao Yang, Wenxin Liang, Linlin Zong, Xianchao Zhang 0001. 1-5 [doi]
- Fine-portraitist: Visualizing the Speaker's Face Portrait during Speech ListeningJinting Wang, Li Liu 0036, Jun Wang. 1-5 [doi]
- Revolutionizing Disease Diagnosis with simultaneous functional PET/MR and Deeply Integrated Brain Metabolic, Hemodynamic, and Perfusion NetworksLuoyu Wang, Yitian Tao, Qing Yang, Yan Liang, Siwei Liu, Hongcheng Shi, Dinggang Shen, Han Zhang. 1-5 [doi]
- ViolinDiff: Enhancing Expressive Violin Synthesis with Pitch Bend ConditioningDaewoong Kim, Hao-Wen Dong, Dasaem Jeong. 1-5 [doi]
- CTGDiff: A Conditional Diffusion Model for Cardiotocography Signal SynthesisXiaoqing Li 0005, Pufan Cai, Yu Lu 0001, Shijie Shi, Liangkun Ma, Xianghua Fu. 1-5 [doi]
- Generating Targeted Universal Adversarial Perturbation against Automatic Speech Recognition via Phoneme TailoringYujun Zhang, Yanqu Chen, Jiakai Wang, Jin Hu, Renshuai Tao, Xianglong Liu 0001. 1-5 [doi]
- Distilling Generative-Discriminative Representations for Very Low-Resolution Face RecognitionJunzheng Zhang, Weijia Guo, Bochao Liu, Ruixin Shi, Yong Li, Shiming Ge. 1-5 [doi]
- Two-in-One: Unified Multi-Person Interactive Motion Generation by Latent Diffusion TransformerBoyuan Li, Xihua Wang, Ruihua Song, Wenbing Huang 0001. 1-5 [doi]
- LipReading for Low-resource Languages by Language Dynamic LoRAShuai Zou, Xuefeng Liang, Yiyang Huang. 1-5 [doi]
- Reversible Data Hiding in Encrypted Images Based on Variational Lossless Compression and Dual EncryptionYaolin Yang, HongJie He. 1-5 [doi]
- EgoNet: An Unified Egocentric Active Speaker Detection Framework for both Camera Wearer and Visible CandidatesYongqian Li, Xin Zhou, Zheng He, Wei Yu, Yong Luo. 1-5 [doi]
- Deformable Attention-Based Edge-Aware Network for Single Image Super-ResolutionJu Zhang, Baojiang Zhong, Kai-Kuang Ma. 1-5 [doi]
- Estimation of Multi-Attribute Differential Graphs with Non-Convex PenaltiesJitendra K. Tugnait. 1-5 [doi]
- Accurate 3D Facial Paralysis Analysis Using Multi-View Infrared Structured Light SystemDi Wu, Yuping Ye, Jixin Liang, Shiyang Long, Zhan Song. 1-5 [doi]
- Variable Bitrate Residual Vector Quantization for Audio CodingYunkee Chae, Woosung Choi, Yuhta Takida, Junghyun Koo, Yukara Ikemiya, Zhi Zhong, Kin Wai Cheuk, Marco A. Martínez Ramírez, Kyogu Lee, Wei-Hsiang Liao 0001, Yuki Mitsufuji. 1-5 [doi]
- Zero-shot Quantization for Large-kernels via Shape-based Distribution and Diversity Self-distillationYao Li, Zhuozhen Yu, Xinrui Chen, Shunzhou Wang, Hang Yuan, Wei Gao. 1-5 [doi]
- TKA-MIL: Top-K Attention Multiple Instance Learning for Whole Slide Image Classification and Instance Probability DerivationSicheng Yu, Xingshu Chen, Fangzhou Cao, Ting Tian. 1-5 [doi]
- MAFD: Fine-Grained Motion Style Transfer with Adaptive Signal FusionZiyun Qian, Dingkang Yang, Mingcheng Li, Dongliang Kou, Lihua Zhang. 1-5 [doi]
- Analysis of Speech Temporal Dynamics in the Context of Speaker Verification and Voice AnonymizationNatalia A. Tomashenko, Emmanuel Vincent 0001, Marc Tommasi. 1-5 [doi]
- ROME: Radar Sparsity Improvement and Omnimodal Enhancement for 3D Object Detection in Bird's Eye ViewsYilong Guo, Junyin Wang, Chenghu Du, Shengwu Xiong 0001, Yaxiong Chen. 1-5 [doi]
- SCI-Gaussian: Optimizing 3D Gaussian Radiance Fields from a Snapshot Compressive ImageXiaoyue Li, Yunhao Li, Xiaodong Wang, Xin Yuan, Mark D. Butala, Gaoang Wang. 1-5 [doi]
- GradPFL: Gradient-Driven Adaptive Clustering in Personalized Federated LearningShiyu Song, Hao Zheng, Zhigang Hu, Meiguang Zheng, Liu Yang, Aikun Xu. 1-5 [doi]
- ITW-DehazeFormer: Imaging through Turbid Water Using Improved DehazeFormerQicong Wang, Xiaopin Zhong, Dajiang Lu, Yibin Tian. 1-5 [doi]
- Pan-protein Design Learning Enables Task-adaptive Generalization for Low-resource Enzyme DesignJiangbin Zheng 0002, Ge Wang, Han Zhang, Stan Z. Li. 1-5 [doi]
- Multi-scale Re-weighted Attention Feature Fusion for Non-Intrusive Load MonitoringLingxi Yang, Meijun Sun, Haowei Ran, Yipu Liu, Yan Zhou, Zheng Wang. 1-5 [doi]
- BP-GPT: Auditory Neural Decoding Using fMRI-prompted LLMXiaoyu Chen, Changde Du, Che Liu, Yizhe Wang, Huiguang He. 1-5 [doi]
- Joint Semantic Segmentation of Optical and SAR Image in Hazy Environments via Cross-modal Information Rectification and Cross-attention FusionXinyue Fan, Libao Zhang. 1-5 [doi]
- DETECLAP: Enhancing Audio-Visual Representation Learning with Object InformationShota Nakada, Taichi Nishimura, Hokuto Munakata, Masayoshi Kondo, Tatsuya Komatsu. 1-5 [doi]
- SSFSL: Self-Supervised and Few-Shot Learning for Cross-Domain Hyperspectral Image ClassificationGuohua Lv, Xiang Gao, Qiang Chi, Guixin Zhao, Aimei Dong, Wei Li 0032. 1-5 [doi]
- Training an Anti-KD Model that Cannot Teach Students via Similarity DisruptionZheming Liang, Qi Chu, Tao Gong, Bin Liu, Quanchen Zou, Deyue Zhang, Nenghai Yu. 1-5 [doi]
- Joint Space-Time Adaptive Processing and Beamforming Design for Cell-Free ISAC SystemsRang Liu, Ming Li 0011, Qian Liu 0001. 1-5 [doi]
- PaSTS: Parameter-affined Seasonal-Trend Synthesis for Multi-dimensional Long-Term Time Series Forecasting within LLMQuanfeng Lv, Jingguo Ge, Yifei Xu, Tong Li 0012, Liangxiong Li. 1-5 [doi]
- Frequency-Domain Guided Multiple Parallel Kernels Network for Low-Light Remote Sensing Image EnhancementJingxuan Zhou, Hao Li, JinLong Wang, Xiongxin Tang, Fanjiang Xu. 1-5 [doi]
- HFLR: Optimizing GNN Training via High-Fixed-Low-ResamplingChang Gong 0002, Boyu Yang 0003, Weiguo Zheng. 1-5 [doi]
- Loss-Aware Curriculum Learning for Chinese Grammatical Error CorrectionDing Zhang, Yangning Li, Lichen Bai, Hao Zhang, Yinghui Li, Haiye Lin, Hai-Tao Zheng, Xin Su, Zifei Shan. 1-5 [doi]
- Multi-modal Speech Enhancement with Limited Electromyography ChannelsFuyuan Feng, Longting Xu, Rohan Kumar Das. 1-5 [doi]
- Enhancing Convolutional Models for Indoor Radio Mapping via Ray MarchingMengfan Wu, Marco Skocaj, Mate Boban. 1-2 [doi]
- COAST: Contrastive Learning with Augmented Spatio-Temporal Encoding for Next POI RecommendationBada Xin, Xin Wan, Zhuojun Jiang, Faqiang Liu, Su Chen, Rong Yang, Qingyun Liu 0001. 1-5 [doi]
- Diffusion Features to Bridge Domain Gap for Semantic SegmentationYuxiang Ji, Boyong He, Chenyuan Qu, Zhuoyue Tan, Chuan Qin, Liaoni Wu. 1-5 [doi]
- A Modular-based Strategy for Mitigating Gradient Conflicts in Simultaneous Speech TranslationXiaoqian Liu, Yangfan Du, Jianjin Wang, Yuan Ge, Chen Xu, Tong Xiao, Guocheng Chen, Jingbo Zhu. 1-5 [doi]
- Density-Adaptive Fuzzy Clustering with Isolation KernelParitosh Tiwari, Rankit Kachroo, Punit Rathore. 1-5 [doi]
- MF-BERT: A Siamese Pre-training Framework for Motion ForecastingJianxin Shi, Jinhao Chen, Xiaolong Chen, Jun Ma, Tianyu Wo. 1-5 [doi]
- Representative Arm Identification: A fixed confidence approach to identify cluster representativesSarvesh Gharat, Aniket Yadav, Nikhil Karamchandani, Jayakrishnan Nair 0001. 1-5 [doi]
- Simplified one-sided Image-to-Image Translation with Reconstruction-Constrained Generative Adversarial NetworksShuocheng Wang, Qingfeng Wu, Mengyuan Ge, Yingdong Wang. 1-5 [doi]
- Efficient Data-Dependent Random Projection for Least Square RegressionsJacob Sturges, Luyuan Yang, Shayan Shafaei, Chao Lan. 1-5 [doi]
- Hazy Remote Sensing Image Semantic Segmentation with Weak Annotations via Pre-training Optimization and Co-trainingJunda Xu, Libao Zhang. 1-5 [doi]
- Structural Similarity-Aware Cross-domain Transformer for Improved Seismic Fault DetectionTiash Ghosh, Razeen A. Rasheed, Sanjai Kumar Singh, Mamata Jenamani, Aurobinda Routray. 1-5 [doi]
- Perceptual Noise-Masking with Music through Deep Spectral Envelope ShapingClémentine Berger, Roland Badeau, Slim Essid. 1-5 [doi]
- Reduced Spatial Dependency for More General Video-level Deepfake DetectionBeilin Chu, Xuan Xu, Yufei Zhang, Weike You, Linna Zhou. 1-5 [doi]
- A Mamba-based Network for Semi-supervised Singing Melody Extraction Using Confidence Binary RegularizationXiaoliang He, Kangjie Dong, Jingkai Cao, Shuai Yu 0002, Wei Li 0012, Yi Yu 0001. 1-5 [doi]
- MKD-YOLO: Multi-Scale and Knowledge-Distilling YOLO for Efficient PPE Compliance DetectionJuntao Zan, Yang Fang, Qilie Liu, Uswah Khairuddin, Yan Li, Kaiwei Sun. 1-5 [doi]
- Multi-User Non-Orthogonal Multiple Access in Power Line Communications Under Generalized Gaussian NoiseRoopesh Ramesh, Sanjeev Gurugopinath, R. Muralishankar. 1-5 [doi]
- Lightweight Clustered Federated Learning via Feature ExtractionGuanzhang Lao, Xinglin Zhang, Yun Li 0002, Yue-jiao Gong. 1-5 [doi]
- Principal Curvatures Estimation with Applications to Single Cell DataYanlei Zhang, Lydia Mezrag, Xingzhi Sun 0003, Charles Xu 0004, Kincaid MacDonald, Dhananjay Bhaskar, Smita Krishnaswamy, Guy Wolf, Bastian Rieck. 1-5 [doi]
- An Attentive Dual-Encoder Framework Leveraging Multimodal Visual and Semantic Information for Automatic OSAHS DiagnosisYingchen Wei, Xihe Qiu, Xiaoyu Tan, Jingjing Huang, Wei Chu, Yinghui Xu, Yuan Qi 0001. 1-5 [doi]
- How Machines Perceive Rooms - Regions of Relevance in Room Impulse ResponsesPrachi Sharma, Christian Kehling. 1-5 [doi]
- Enhancing Change Detection in Remote Sensing: Integrating Synthetic Data with Semi-Supervised LearningYafei Luo, Erik Meijering, Yang Song 0001. 1-5 [doi]
- LCE: A Framework for Explainability of Ultrasound Image Based on Concept DiscoveryWeiji Kong, Xun Gong, Juan Wang. 1-5 [doi]
- ARIG-GCN: Anatomical Relationship and Isomorphic Graph Approximation Guided Graph Convolutional Network for Automated ASPECTS Scoring on Non-Contrast CTNing Li, Zhe Qu, Jie Wang, Hulin Kuang. 1-5 [doi]
- Semi-Automatic Labeling for Action Recognition by Diversity Preserving SamplingRyuhei Ando, Takashi Shibata 0001, Toru Takahashi. 1-5 [doi]
- Enhancing Network Calibration for Low-Cost Gas Sensor Networks Through Adaptive Similarity SearchCheng Yang, Saikat Chatterjee, Tobias J. Oechtering. 1-5 [doi]
- DecoupledSynth: Enhancing Zero-Shot Text-to-Speech Via Factors DecouplingJingyuan Xing, Shuaiqi Chen, Xiangmin Xu, Xiaofen Xing. 1-5 [doi]
- FreeSegDiff: Annotation-free Saliency Segmentation with Diffusion ModelsChaofan Ma, Yuhuan Yang, Chen Ju, Yue Shi, Ya Zhang, Yanfeng Wang. 1-5 [doi]
- Improvised Performance Following in Real Time for Automatic AccompanimentJunyan Jiang, Akira Maezawa, Gus Xia. 1-5 [doi]
- Contrast-Unity for Partially-Supervised Temporal Sentence GroundingHaicheng Wang, Chen Ju, Weixiong Lin, Chaofan Ma, Shuai Xiao, Ya Zhang 0002, Yanfeng Wang 0001. 1-5 [doi]
- DuPI: Dual-resolution Pseudo-label Integration for Semi-supervised Instance SegmentationYue Ma, Jie Hu 0018, Chen Chen 0001, Shengchuan Zhang, Xianming Lin, Liujuan Cao. 1-5 [doi]
- Flexible Event-Driven Biological Imaging via Bayesian InferenceYayin Tan, Tianle Zhao, Zhiqin Chu. 1-5 [doi]
- A Noisy Label Filter based on GMM Binary Classification for Speaker VerificationJianglong Yao, Shenghui Lu, Pengyu Ren, Kaidi Wang, Qinyang Hong, Lin Li. 1-5 [doi]
- Single-View Reconstruction via Decoupled 3D Gaussian SplattingSheng Liu, Shiming Zhu, Huilong Pi, Yunchuan Qin, Zhuo Tang, Ruihui Li. 1-5 [doi]
- Know Your Heart Better: Multimodal Cardiac Output Monitoring using EarbudsHao Zhou, Md Mahbubur Rahman, Mehrab Bin Morshed, Yunzhi Li, Md Saiful Islam, Larry Zhang, Jungmok Bae, Christina Rosa, Wendy Berry Mendes, Jilong Kuang. 1-5 [doi]
- Exploiting the Relationship within the Unlabelled Samples by Set Matching for Generalized Category DiscoveryQiubo Ma, Hang Yu, Yuan Shan, Pinzhuo Tian. 1-5 [doi]
- Salmon: A Suite for Acoustic Language Model EvaluationGallil Maimon, Amit Roth, Yossi Adi. 1-5 [doi]
- Automatic Adaption of the Step Size in Gradient Descent TrainingAlbino Nogueiras Rodríguez, Ignasi Nogueiras-Marco. 1-5 [doi]
- Multi Queue for Unsupervised Person Re-identificationZhenyuan Lin, Shengyong Xie, Danhua Liu, Weikun Li, Ang Gao, Yubo Dong. 1-5 [doi]
- Deep Unfolded Approximate Message Passing for Quantitative Acoustic Microscopy Image ReconstructionOdysseas A. Pappas, Jonathan Mamou, Adrian Basarab, Denis Kouamé, Alin Achim. 1-5 [doi]
- SSDViT: Exploring Siamese and Self Distillation in ViTs for Generalizable Person Re-identificationJieru Jia, Jianchao Yang. 1-5 [doi]
- A Unified Joint Contrastive Triplet Loss with Temporal and Frequency Signal Fusion for Diagnosing Heart MurmursAyushi Pal, Arka Roy, Udit Satija. 1-5 [doi]
- Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching ModelJialong Zuo, Shengpeng Ji, Minghui Fang 0002, Ziyue Jiang 0001, Xize Cheng, Qian Yang 0006, Wenrui Liu 0003, Guangyan Zhang, Zehai Tu, Yiwen Guo, Zhou Zhao 0001. 1-5 [doi]
- Robust Deepfake Detection via Perturbation Domain AlignmentLin Lu, Yunhong Wang 0001, Liang Zhang, Yuanfang Guo. 1-5 [doi]
- Foundation Model and Temporal Priors-guided Transductive Few-shot Action RecognitionBach Vu, Hoang Nguyen, Quang Minh Nguyen, Duong Le, Hieu Pham, Phi-Le Nguyen, Lam M. Nguyen. 1-5 [doi]
- MuRL-DTI: A Multimodal Feature Fusion Reinforcement Learning Approach for Cold Start in Drug-Target InteractionsYao Liu, Xin Wang, Ye Liu, Dandan Dou. 1-5 [doi]
- Guiding Inter-domain Class Balancing With Salient Features For Domain Adaptive Object DetectionHaiming Peng, Dingkang Yang, Mingxu Wang, Weilong Lin, Xinhua Zeng. 1-5 [doi]
- Speaker-IPL: Unsupervised Learning of Speaker Characteristics with i-Vector based Pseudo-LabelsZakaria Aldeneh, Takuya Higuchi, Jee-weon Jung, Li-Wei Chen, Stephen Shum, Ahmed Hussen Abdelaziz, Shinji Watanabe 0001, Tatiana Likhomanenko, Barry-John Theobald. 1-5 [doi]
- Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative DecodingTan Dat Nguyen, Ji-Hoon Kim, Jeongsoo Choi, Shukjae Choi, Jinseok Park, Younglo Lee, Joon Son Chung. 1-5 [doi]
- MixSense: Mixture of Vision SenseJian Lin, Zhuoran Wang, Qibo Qiu, Jianzhong Chen, Zixian Ge, Weizhong Jin, Yuchao Yan, Li Yu. 1-5 [doi]
- Dual-PST: Dual-Branch SpatioTemporal-Planar Network for Video Forgery DetectionSiyu Liu, Zhida Zhang, Junxian Duan, Jie Cao 0002, Aihua Zheng. 1-5 [doi]
- Importance-Awareness Masking Network for Robust Document RetrievalJunping Liu, Jiaqi He, Xinrong Hu, Wangli Yang, Jie Yang 0009, Yi Guo 0001. 1-5 [doi]
- FlowSE: Flow Matching-based Speech EnhancementSeonggyu Lee, Sein Cheong, Sangwook Han, Jong Won Shin. 1-5 [doi]
- Fractional-Order Hyperbolic Tangent Based Adaptive Algorithm for Feedback Control in Hearing AidsVanitha Devi R, Vasundhara, Asutosh Kar, Mads Græsbøll Christensen. 1-5 [doi]
- Efficient Fine-tuning Strategies for Enhancing Face Recognition Performance in Challenging ScenariosYin Lin, Ziyang Wu, Qidong Huang, Xinran Liu, Baocai Yin, Jinshui Hu, Bing Yin, Zengfu Wang. 1-5 [doi]
- GPT-C: Generative PrompT CompressionLijun Liu, Rui Wang 0032, Lihua Jing, Feixiao Lv, Zixuan Zhu 0002. 1-5 [doi]
- Delayed Fusion: Integrating Large Language Models into First-Pass Decoding in End-to-end Speech RecognitionTakaaki Hori, Martin Kocour, Adnan Haider, Erik McDermott, Xiaodan Zhuang. 1-5 [doi]
- KMG-LL: Knowledge-enhanced Multimodal Graph for Dialogue GenerationYuezhou Dong, Tao He 0007, Qian Dong, Ke Qin. 1-5 [doi]
- DAEF-VS: An Efficient Universal VoIP Steganalysis Framework Based on Domain-Aware KnowledgeZhengyang Fang, Pengcheng Zhou, Zhongliang Yang, Zhili Zhou, Linna Zhou. 1-5 [doi]
- Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time SeriesYanjun Zhao, Tian Zhou 0004, Chao Chen, Liang Sun 0001, Yi Qian 0004, Rong Jin 0001. 1-5 [doi]
- Time-Space-Interlaced Spatiotemporal Graph Forecasting via Two-Stage Summarized AttentionZhaoyang Sun, Yudong Zhang 0005, Xuan Yu, Kai Wang 0036, Binwu Wang, Yang Wang 0015, Xu Wang 0029. 1-5 [doi]
- Multi-source Data Lossless Compression via Parallel Expansion Mapping and xLSTMHuidong Ma, Hui Sun 0002, Liping Yi, Xiaoguang Liu 0001, Gang Wang 0001. 1-5 [doi]
- RETAIN: Reliable Topology Augmentation for both Heterophilic and Homophilic GraphsZiyun Zou, Lian Shen, Yanhao Li, Yifan Lu, Juan Liu, Xiangrong Liu. 1-5 [doi]
- Dense-Sparse Dynamic Time Warping for Customizing Piano Concerto AccompanimentsTJ Tsai, Kavi Dey, Yigitcan Özer, Meinard Müller. 1-5 [doi]
- Class Semantic Prompts Enhanced Prototypical Fusion Method for Few-shot Named Entity RecognitionMei Yu 0004, Yuang Tao, Mankun Zhao, Tianyi Xu, Zechen Meng, Wenbin Zhang 0010, Jian Yu 0003. 1-5 [doi]
- Sub-band Domain Multi-Hypothesis Acoustic Echo Canceler Based Acoustic Scene AnalysisBenjamin J. Southwell, Yin-Lee Ho, David Gunawan. 1-5 [doi]
- Cooperative and Competitive Functional Connectivity Based on Improved Ising ModelGengqian Wei, Chuang Liang, Tülay Adali, Rongtao Jiang, Daoqiang Zhang, Vince D. Calhoun, Shile Qi. 1-5 [doi]
- Multimodal and Multiple Prompts for BiologyChonghuinan Wang, Xi Chen, Hongxun Yao. 1-5 [doi]
- Enhancing 6D Pose Estimation with Cross-modal Fusion Network and Density-peak Keypoint LocalizationLiming Zhang 0007, Qing Li, Zhenhong Chen, Chuan Yan, Xiaojiang Peng. 1-5 [doi]
- Contactless Nighttime Stress Monitoring with mmWave RadarXiaohan Xu, Dongheng Zhang, Zhi Lu, Jinbo Chen, Zhi Wu, Ruixu Geng, Qibin Sun, Yan Chen. 1-5 [doi]
- LoRATEE: A Secure and Efficient Inference Framework for Multi-Tenant LoRA LLMs Based on TEEZechao Lin, Sisi Zhang, Xingbin Wang, Yulan Su, Yan Wang, Rui Hou 0001, Dan Meng. 1-5 [doi]
- NAT3DSound: 3D Spatial Sound Field Synthesis with Multi-Modal Non-Autoregressive TransformerFuming You, Rongjie Huang 0001, Boyang Zhang, Yongqi Wang, Zhiqing Hong, Zhimeng Zhang, Zhou Zhao 0001. 1-5 [doi]
- Sharpness-Aware Minimization with Adaptive Regularization for Training Deep Neural NetworksJinping Zou, Xiaoge Deng, Tao Sun 0005. 1-5 [doi]
- NanoGen: A High-affinity Nanobody Generation Model with Guided DiffusionDezhi Wu, Xuejiao Liu, Yiming Qin, Stephanie M. Linker, Karin Hrovatin, Alexander V. Hopp, Feng Tan. 1-5 [doi]
- Graph Learning with Low-rank and Diagonal Structures: A Riemannian Geometric ApproachXiang Zhang, Qiao Wang. 1-5 [doi]
- DEP-SLAM: A Dynamic Environment Perception SLAM System with Large Language ModelsYing He 0006, E. Richard Yu, Fei Ma 0006, Ming Li 0011, Guang Zhou. 1-5 [doi]
- TDMF: Text-Guided Denoising and Interactive Medical Image FusionAimei Dong, Jingyuan Xu, Long Wang, Guohua Lv, Guixin Zhao, Jinyong Cheng. 1-5 [doi]
- MLNet: Mutual Learning Network to Improve Self-Supervised Representation for Fine-Grained Visual RecognitionPeipei Zhao, Jiaxuan Wang, Zixiang Lu, Qiguang Miao. 1-5 [doi]
- DDNet: Deformable Convolution and Dense FPN for Surface Defect Detection in Recycled BooksJun Yu, Wenjian Wang. 1-5 [doi]
- Unveiling the Pruning Risks on Privacy Vulnerabilities of Deep Neural NetworksWenxin Kuang, Qizhuang Liang, Peng Sun, Wei Fu, Qiao Hu, Yupeng Hu. 1-5 [doi]
- BloomCoreset: Fast Coreset Sampling using Bloom Filters for Fine-Grained Self-Supervised LearningPrajwal Singh, Gautam Vashishtha, Indra Deep Mastan, Shanmuganathan Raman. 1-5 [doi]
- Learning from Ambiguous Data with Hard LabelsZeke Xie, Zheng He, Nan Lu, Lichen Bai, Bao Li, Shuo Yang, Mingming Sun 0001, Ping Li 0001. 1-5 [doi]
- Joint Edge and Regional Depth Enhancement Network for Camouflaged Object DetectionMiao Qi, Zheng Wang, Cheng Liu, Yan Zhou, Meijun Sun. 1-5 [doi]
- Efficient Object Placement Via LLM and Diffusion ModelWei Liu, Liuan Wang, Jun Sun. 1-5 [doi]
- PiCNet: Physics-infused Convolution Network for Radar-Based Precipitation NowcastingZheng Wang 0059, Hanyi Zhang, Cong Bai. 1-5 [doi]
- Robust Fusion of Bone and Air-Conducted Sensors for Speech Enhancement with Adaptive Temporal-Frequency AttentionZhenglong Liu, Zhe Chen, Fuliang Yin. 1-5 [doi]
- LNLFace: Enhanced Blind Face Restoration With Local and Non-local LookupsWeidan Yan, Wenze Shao, Dengyin Zhang. 1-5 [doi]
- Low-Correlation OFDM Waveform Design With Optimally Coded Sub-Carriers for the Joint Sensing and CommunicationsChunxuan Shi, Yongzhe Li, Ran Tao 0003. 1-5 [doi]
- Causal Debiasing for Visual Commonsense ReasoningJiayi Zou, Gengyun Jia, Bing-Kun Bao. 1-5 [doi]
- Spiking Generative Models Based on Variational Autoencoder and Adversarial TrainingWenchuan Zhang, Jie Zhang, Ricky Yuen-Tan Hou, Weifeng Su, Wentao Fan. 1-5 [doi]
- Peer-to-Peer Learning Dynamics of Wide Neural NetworksShreyas Chaudhari, Srinivasa Pranav, Emile Anand, José M. F. Moura. 1-5 [doi]
- Self-Support Prototype-Aware For Few-Shot Semantic SegmentationJiaxiang Fang, Shiqiang Ma, Shengfeng He, Fei Guo 0001. 1-5 [doi]
- Enhancing Zero-Shot Cross-Lingual Event Argument Extraction with Language-Independent InformationXiruijie Yi, Xiaoxu Zhu, Peifeng Li. 1-5 [doi]
- Multi-view Feature Discrepancy Attack for Single Object TrackingZhiheng Li, ZhiMin Weng, Yuehuan Wang. 1-5 [doi]
- STGE-Former: Spatial-Temporal Graph-Enhanced Transformer for EEG-Based Major Depressive Disorder DetectionYu Chen, Chunfeng Yang. 1-5 [doi]
- Enhancing Continual Learning for Medical Imaging: Efficient Knowledge Transfer and Multi-Disease PredictionEnzhi Wang, Qicheng Li, Di Liu, Bo Yang. 1-5 [doi]
- A Spherical-Harmonic Domain Selective Spatial Active Noise Control System Based on Sound Field ReproductionHuawei Zhang, Huiyuan Sun, Jihui Aimee Zhang, Prasanga N. Samarasinghe, Yile Angela Zhang. 1-5 [doi]
- Enabling DMG Wi-Fi Sensing in Data Transmission Intervals by Exploiting Beam Training CodebookKareem M. Attiah, Pu Perry Wang, Hassan Mansour, Toshiaki Koike-Akino, Petros Boufounos. 1-5 [doi]
- Optical Authenticity in Pushbroom System for Spectral Information ProtectionPablo Gomez, Roman Jacome, Emmanuel Martínez 0002, Hans Garcia, Henry Arguello. 1-5 [doi]
- Partial Reconstruction Error for Deepfake DetectionYufei Zhang, Zheling Meng, Bo Peng, Jing Dong, Beilin Chu, Wei Wang. 1-5 [doi]
- Optimized Self-supervised Training with BEST-RQ for Speech RecognitionIlja Baumann, Dominik Wagner 0002, Korbinian Riedhammer, Tobias Bocklet. 1-5 [doi]
- Which Prosodic Features Matter Most for Pragmatics?Nigel G. Ward, Divette Marco, Olac Fuentes. 1-5 [doi]
- A Novel Self-Supervised Contrastive Learning Framework for Masked EEG Motor Imagery ModelingKunkun Zhang, Qianwei Zhou, Haigen Hu. 1-5 [doi]
- Audio-Visual Deepfake Detection With Local Temporal InconsistenciesMarcella Astrid, Enjie Ghorbel, Djamila Aouada. 1-5 [doi]
- ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial TrainingXinfa Zhu, Lei He 0005, Yujia Xiao, Xi Wang 0016, Xu Tan 0003, Sheng Zhao, Lei Xie 0001. 1-5 [doi]
- InstantSpeech: Instant Synchronous Text-to-Speech Synthesis for LLM-driven Voice ChatbotsMuyang Du, Chuan Liu, Junjie Lai. 1-5 [doi]
- Fooling the Forgers: A Multi-Stage Framework for Audio Deepfake DetectionGautam Siddharth Kashyap, Zohaib Hasan Siddiqui, Mohammad Anas Azeez, Rafiq Ali, Shantanu Kumar, Navin Kamuni, Jiechao Gao. 1-5 [doi]
- NeuroIncept Decoder for High-Fidelity Speech Reconstruction from Neural ActivityOwais Mujtaba Khanday, José L. Pérez-Córdoba, Mohd Yaqub Mir, Ashfaq Ahmad Najar, Jose A. Gonzalez-Lopez. 1-5 [doi]
- GAN-Based Speech Enhancement for Low SNR Using Latent Feature ConditioningShrishti Saha Shetu, Emanuël A. P. Habets, Andreas Brendel. 1-5 [doi]
- Exploring Antenna Placement Configurations with a Radar-based Silent Speech InterfaceJoão Vítor Menezes, Martin Schütze, Petr Schaffer, Dirk Plettemeier, Peter Birkholz. 1-5 [doi]
- Invariant Model Learning on Local-Aware Wasserstein Geodesic for Domain AdaptationYou-Wei Luo, Yi-ming Zhai, Chuan-Xian Ren. 1-5 [doi]
- Retrieval-Augmented Multilingual Citation GenerationXun Liang 0001, Simin Niu, Sensen Zhang, Zhiyu Li, Xuan Zhang 0009, Bo Wu 0026, Feiyu Xiong, Bo Tang, Hanyu Wang, Shichao Song, Mengwei Wang, Jiawei Yang. 1-5 [doi]
- Low-Rank Tucker Decomposition of Multi-Subject Complex-Valued fMRI DataBin-Hua Zhao, Qiu-Hua Lin, Yue Han, Jia-Yang Song, Yan-Wei Niu, Vince D. Calhoun. 1-5 [doi]
- Star Operation in Self-Attention for 3D Human Pose EstimationHao Wang, Xiaochuan Wang, Ruijun Liu, Xiaoming Chen, Haisheng Li. 1-5 [doi]
- Multiband Unlimited Sampling: Super-Nyquist or Sub-Nyquist, That is the Question!Ruiming Guo, Gal Shtendel, Ayush Bhandari. 1-5 [doi]
- Boosting Stereo Image Noise Removal by Learning Uncertainty and Enriched FeaturesBingcai Wei, Hui Liu, Chuang Qian, Xinyu Ren, Xintong Xu. 1-5 [doi]
- Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration With Improved IntelligibilityXiaoyu Liu 0003, Xu Li, Joan Serrà, Santiago Pascual. 1-5 [doi]
- Point Cloud Resampling With Learnable Heat DiffusionWenqiang Xu, Wenrui Dai, Duoduo Xue, Ziyang Zheng, Chenglin Li, Junni Zou, Hongkai Xiong. 1-5 [doi]
- A Bilinear Source Separation, Dereverberation, and Background Noise Suppression Algorithm for Augmented Reality ApplicationsAlon Nemirovsky, Gal Itzhak, Israel Cohen. 1-5 [doi]
- Topological Scattering over Product Cell ComplexesAyushman Raghuvanshi, Sravanthi Gurugubelli, Sundeep Prabhakar Chepuri. 1-5 [doi]
- Task-Aware Unified Source SeparationKohei Saijo, Janek Ebbers, François G. Germain, Gordon Wichern, Jonathan Le Roux. 1-5 [doi]
- Open Automatic Speech Recognition Models for Classical and Modern Standard ArabicLilit Grigoryan, Nikolay Karpov, Enas Albasiri, Vitaly Lavrukhin, Boris Ginsburg. 1-5 [doi]
- KARST: Multi-Kernel Kronecker Adaptation with Re-Scaling Transmission for Visual ClassificationYue Zhu 0012, Haiwen Diao, Shang Gao 0012, Long Chen 0016, Huchuan Lu. 1-5 [doi]
- Spatio-Temporal Mapping Generative Adversarial Network for Functional Connectivity Network Reconstruction across Brain AtlasesHongzheng Guan, Tao Jin, Li Xiao, Gang Qu, Yu-Ping Wang. 1-5 [doi]
- Optimization of Chirality Variation in Carbon Nanotube Field Effect Transistor Spiking NeuronsShelby Williams, Prosen Kirtonia, Kasem Khalil, Magdy A. Bayoumi. 1-4 [doi]
- Deepfake Detection of Singing Voices With Whisper EncodingsFalguni Sharma, Priyanka Gupta. 1-5 [doi]
- On Class Separability Pitfalls In Audio-Text Contrastive Zero-Shot LearningTiago Fernandes Tavares, Fabio José Ayres, Zhepei Wang, Paris Smaragdis. 1-5 [doi]
- Robust Multi-Pitch Estimation via Optimal Transport ClusteringAnton Björkman, Filip Elvander. 1-5 [doi]
- Pseudo-Labeling for Enhanced User Privacy in Approximate Machine UnlearningShunichi Watanabe. 1-5 [doi]
- Exploring the Interpretability of EEG-Inception Convolutional Neural Networks for Epilepsy PredictionGuanglong Zhang, Tianren Wang, Jinjie Guo, Zhiyuan Yang, YiLian Wu, Guixia Kang. 1-5 [doi]
- Spatiotemporal-Aware Visual Captioning using Vision-Language Pre-Training ModelShuai Wu, Weidong Yang, Shuyan Wu. 1-5 [doi]
- Lightweight Self-Supervised Monocular Depth Estimation for All-Day Scenes Using Generative Adversarial NetworkJunding Zhang, Di Rao, Youssef Akoudad, Wei Gao 0021, Jie Chen 0022. 1-5 [doi]
- Anchored Monotonic Alignment and Representation Substitution for Rare Spontaneous Behaviors in Spontaneous Speech SynthesisNing-Qian Wu, Ya-Jun Hu, Liping Chen, Zhen-Hua Ling. 1-5 [doi]
- SSCM: Self-Supervised Critical Model for Reducing Hallucinations in Chinese Financial Text GenerationKeyan Jin, Yapeng Wang, Leonel Santos, Tao Fang, Xu Yang 0010, Sio Kei Im. 1-5 [doi]
- VibeGait: Enhancing Structural-Vibration based Gait Recognition using VisionMainak Chakraborty, Chandan, Bodhibrata Mukhopadhyay, Sahil Anchal, Subrat Kar. 1-5 [doi]
- Adversarial Training and Cross-modal Feature Fusion in Multimodal Sentiment AnalysisJunhuai Li, Chuang Lin, Huaijun Wang, Yuxing Zhi, Jing Chen, Tao Huang. 1-5 [doi]
- TGCA: A Transformer GNN-based Approach with Cross-Attention Mechanism for Steganographic Text Detection in Social NetworksJunkai Lu, Zhongliang Yang, Kaibo Huang, Zhuang Wang, Zhili Zhou, Linna Zhou. 1-5 [doi]
- A Frequency-aware Augmentation Network for Mental Disorders Assessment from AudioShuanglin Li, Siyang Song, Rajesh Nair, Syed Mohsen Naqvi. 1-5 [doi]
- Enhancing Precision in Image-Guided Spine Surgery through the Prediction of Occluded Fiducials Utilizing ResNet ArchitectureDurga R, Darshan B, Pranav Hyagreev S, Abhilash Chakkaravarthy, Aparna Purayath, Vivek Maik, Manojkumar Lakshmanan, Mohanasankar Sivaprakasam. 1-5 [doi]
- 3Rec: Selective State Space Models with Mixture-of-Modality Experts for Multi-Modal Sequential RecommendationXu Guo, Tong Zhang, Yufei Xue, Chenxu Wang, Fuyun Wang, Zhen Cui. 1-5 [doi]
- Nonlinear Anisotropic Diffusion-Based Channel Estimation in 5G Wireless NetworksKanwardeep Singh Gahlot, Sandeep Joshi, Ke Wang. 1-5 [doi]
- Spectrum Blind Unlimited Sampling of Multi-Band SignalsRuiming Guo, Ayush Bhandari. 1-5 [doi]
- Addressing Emotion Ambiguity and Annotator Subjectivity for Enhanced Speech Emotion LabelingPooja Kumawat, Aurobinda Routray. 1-5 [doi]
- Quantum Run-length Encoding: Optimizing Data Compression on Quantum Computers with Exponential Resource EfficiencyJiale Zhang, Xilong Che, Shiyong Jin, Kaifan Pan, Shun Peng, Juncheng Hu. 1-5 [doi]
- Intrusion Detection for Intelligent Transportation Systems: A lightweight interpretable modelYuxi Zhou, Tao Feng, Yazhuo Gao, Yixuan Wu, Lin Yang, Jiaqi Lin. 1-5 [doi]
- Dynamic Object Queries for Transformer-based Incremental Object DetectionJichuan Zhang, Wei Li 0110, Shuang Cheng, Yali Li 0001, Shengjin Wang. 1-5 [doi]
- Reward Generation via Large Vision-Language Model in Offline Reinforcement LearningYounghwan Lee, Tung Minh Luu, Donghoon Lee, Chang D. Yoo. 1-5 [doi]
- Leveraging Chain of Thought towards Empathetic Spoken Dialogue without Corresponding Question-Answering DataJingran Xie, Shun Lei, Yue Yu, Yang Xiang, Hui Wang, Xixin Wu, Zhiyong Wu. 1-5 [doi]
- Knowledge Distillation Based Training of Unified Conformer CTC Models for Multi-form ASRTakashi Fukuda, Gakuto Kurata, George Saon. 1-5 [doi]
- Knowledge Transfer Across Modalities for Weakly Supervised Point Cloud Semantic SegmentationZihan Wang, Yunhang Shen, Mengtian Li, Ke Li, Xing Sun, Shaohui Lin, Lizhuang Ma. 1-5 [doi]
- AdaCS: Adaptive Normalization for Enhanced Code-Switching ASRThe Chuong Chu, Vu Tuan Dat Pham, Trung-Kien Dao, Ngoc Hoang Nguyen 0001, Steven Q. H. Truong. 1-5 [doi]
- MambaTrack: Exploiting Dual-Enhancement for Night UAV TrackingChunhui Zhang, Li Liu 0036, Hao Wen, Xi Zhou, Yanfeng Wang 0001. 1-5 [doi]
- MMCD: Memory-Based Multimodal Change DetectionLimeng Zhang, Zenghui Zhang, Juanping Wu, Weiwei Guo, Tao Zhang, Wenxian Yu. 1-5 [doi]
- Improving Adversarial Transferability through Channel-wise Scaling and Frequency-random DroppingPei Chen, Zhiyong Feng 0002, Meng-xing, Yiming Zhang, Jinqing Zheng. 1-5 [doi]
- Dual-Space Augmented Intrinsic-LoRA for Wind Turbine SegmentationShubh Singhal, Raül Pérez-Gonzalo, Andreas Espersen, Antonio Agudo. 1-5 [doi]
- Exploration of Sequence-wise Optimized Parameters for Low Complexity Enhancement Video Coding (LCEVC) on 4K ContentMartin Benjak, Jörn Ostermann. 1-5 [doi]
- MIB: Mixed Information Bottleneck for Out-of-Distribution Keyword SpottingYashas Malur Saidutta, Rakshith Sharma Srinivasa, Jaejin Cho, Ching Hua Lee, Chouchang Yang, Yilin Shen, Hongxia Jin. 1-5 [doi]
- Unsupervised Domain Adaptation Via Data PruningAndrea Napoli, Paul R. White. 1-5 [doi]
- AdaPPA: Adaptive Position Pre-Fill Jailbreak Attack Approach Targeting LLMsLijia Lv, Weigang Zhang, Xuehai Tang, Jie Wen 0007, Feng Liu, Jizhong Han, Songlin Hu 0001. 1-5 [doi]
- PB-UAP: Hybrid Universal Adversarial Attack for Image SegmentationYufei Song, Ziqi Zhou 0001, Minghui Li, Xianlong Wang 0001, Hangtao Zhang, Menghao Deng, Wei Wan, Shengshan Hu, Leo Yu Zhang. 1-5 [doi]
- Less Is More: Embracing Sparsity and Interpolation with Esiformer for Time Series ForecastingYangyang Guo, Yanjun Zhao, Sizhe Dang, Tian Zhou 0004, Liang Sun 0001, Yi Qian 0004. 1-5 [doi]
- Zero-Shot Cross-Domain Slot Filling with Retrieval Augmented In-Context LearningMengxiao Song, Tingwen Liu, Quangang Li, Duohe Ma, Ming Sun, Ling Tian. 1-5 [doi]
- Gaussian Difference: Find Any Change Instance in 3D ScenesBinbin Jiang, Rui Huang, Qingyi Zhao, Yuxiang Zhang. 1-5 [doi]
- Improving Knowledge Distillation via Cross-Modal Insights from CLIPJingtao Zhou, Hao Zheng, Wenkai Zhong, Zhiqiang Bao. 1-5 [doi]
- Learning Primitive Relations for Compositional Zero-Shot LearningInsu Lee, Jiseob Kim, Kyuhong Shim, Byonghyo Shim. 1-5 [doi]
- Real-time Adversarial Attack to Deep Learning-based Wi-Fi Human Activity RecognitionByungjun Kim, Amogh Panchagatti, Peter Gerstoft. 1-5 [doi]
- Synergistic Spotting and Recognition of Micro-Expression via Temporal State TransitionBochao Zou, Zizheng Guo, Wenfeng Qin, Xin Li 0034, Kangsheng Wang, Huimin Ma 0001. 1-5 [doi]
- DMKPN: Image Deblurring Under Multi-Factor Aliasing Diffusion DegradationYing Zhang, Xiongxin Tang, Hanxiang Yang, Qiao Chen, Fanjiang Xu. 1-5 [doi]
- Norm Augmented Graph AutoEncoders for Link PredictionYunhui Liu 0002, Huaisong Zhang, Xinyi Gao 0001, Liuye Guo, Zhen Tao, Tieke He. 1-5 [doi]
- SSCL-AMC: A Self-supervised Automatic Modulation Classification Method via Dynamic Augmentation and Ensemble LearningYilin Cai, Dingzhao Li, Sheng Wu, Mingyuan Shao, Shaohua Hong, Haixin Sun. 1-5 [doi]
- Aligning Text-to-Image Diffusion Models without Human FeedbackTao Liu, Huafeng Kuang, Xianming Lin. 1-5 [doi]
- Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo CancellationFei Zhao, Xueliang Zhang. 1-5 [doi]
- Multi-level Feature Adaptation and Embeddings Alignment for Zero-shot Anomaly DetectionYiqing Liu, Huilin Deng, Gang Zhao. 1-5 [doi]
- Brain MRI Segmentation with Language-Driven Detection and Context-Aware DescriptionsQiang Fu, XinYuan Xia, Yi Hong. 1-5 [doi]
- A Quality-Aware Sampling Framework for Efficient 3D Point Cloud TransmissionPuyue Hou, Qi Yang 0003, Yue Li 0015, Yujie Zhang, Jianchao Yang, Yiling Xu, Tiejun Huang 0001. 1-5 [doi]
- Map-Guided Few-Shot Audio-Visual Acoustics ModelingDiwei Huang, Kunyang Lin, Peihao Chen, Qing Du. 1-5 [doi]
- Uncertainty Estimation for Out-of-Distribution Detection of Whole Slide ImagesSaba Heidari Gheshlaghi, Nasim Yahya Soltani, Masoud Ganji. 1-5 [doi]
- 2Net: Multi-stage Mixed Feature Fusion Network For Remote Sensing Change DetectionBinhao Gu, Lei Song, Youyong Kong, Binjie Gu. 1-5 [doi]
- MS-UFAD: A Large-Scale Dataset for Real-world Unified Face Attack Detection with Text DescriptionsNing Jiang, Dingheng Zeng, Liang Gao, Sheng Chen, Zhifei Kong, Yanhong Liu, Jiao Li, Yue Feng, Tongtong Yuan, Weihong Deng, Quan Lu, Ying Li. 1-5 [doi]
- Dual-level AMR Injection for Prompt-based Event Argument ExtractionXiaojia Huang, Ruifang He, Fei Huang. 1-5 [doi]
- BeatKAN: An Efficient and Drum-Attuned Beat Tracking Method Using Kolmogorov-Arnold NetworksZhicheng Zhang, Ganghui Ru, Wei Li. 1-5 [doi]
- Unlocking Financial Statement Fraud Detection: Tracking Disclosure Changes via Representation LearningYue Yu, Zhen Wu, Yanni Han, Zhuoqun Li, Wenqi Wei. 1-5 [doi]
- Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion ModelsXin Jing, Kun Zhou 0003, Andreas Triantafyllopoulos, Björn W. Schuller. 1-5 [doi]
- An Efficient Pore Annotation Framework for Tight Sandstone Images with Segment Anything ModelDongsheng Li, Chunyan Zang, Huijie Zhang, Yiming Lin, Qiushi Xia. 1-5 [doi]
- Freeze and Learn: Continual Learning with Selective Freezing for Speech Deepfake DetectionDavide Salvi, Viola Negroni, Luca Bondi, Paolo Bestagini, Stefano Tubaro. 1-5 [doi]
- MADiff: Text-Guided Fashion Image Editing with Mask Prediction and Attention-Enhanced DiffusionZechao Zhan, Dehong Gao, Jinxia Zhang, Jiale Huang, Yang Hu, Xin Wang. 1-5 [doi]
- Hierarchical Proxy Learning for Cloth-Changing Person Re-IdentificationChenyang Yu, Xuehu Liu, Ju Dai, Pingping Zhang, Huchuan Lu. 1-5 [doi]
- ChunkFormer: Masked Chunking Conformer For Long-Form Speech TranscriptionKhanh Le, Tuan Vu Ho, Dung Tran, Duc Thanh Chau. 1-5 [doi]
- Online scalable Gaussian processes with conformal prediction for guaranteed coverageJinwen Xu, Qin Lu 0002, Georgios B. Giannakis. 1-5 [doi]
- Generalization Guarantee of Decentralized Learning with Heterogeneous DataHaoxiang Ye, Tao Sun 0005, Qing Ling 0001. 1-5 [doi]
- Deep Generic Representations for Domain-Generalized Anomalous Sound DetectionPhurich Saengthong, Takahiro Shinozaki. 1-5 [doi]
- Advanced Graph-MLPs Distillation based on Global and Local Hyperbolic Geometry LearningYao Guan, Wenzhu Yan, Ting Yuan, Yanmeng Li. 1-5 [doi]
- Macaque-Motion-Monitor Dataset: A New Benchmark for Macaque Action RecognitionJiawei Huang, Zhiyuan Chen, Wenxuan Fan, Xiaomei Zhang, Xibo Ma. 1-5 [doi]
- Mamba for Streaming ASR Combined with Unimodal AggregationYing Fang, Xiaofei Li. 1-5 [doi]
- Efficient Fusion of Computationally Diverse Modalities Using Chunking and Cross-AttentionChristian Flores, Lucas Goncalves, Carlos Busso. 1-5 [doi]
- PEPL: Precision-Enhanced Pseudo-Labeling for Fine-Grained Image Classification in Semi-Supervised LearningBowen Tian, Songning Lai, Lujundong Li, Zhihao Shuai, Runwei Guan, Tian Wu, Yutao Yue. 1-5 [doi]
- ETDE-Net: An End-to-End Time-Domain Enhancement Network for LPI Radar SignalsChen Cheng, Zhi Sun, Haonan Zhang, Zihao Xiao 0003, Guolong Cui. 1-5 [doi]
- HLTCOE Submission to the VoicePrivacy Attacker ChallengeHenry Li Xinyuan, Ashi Garg, Zexin Cai, Kevin Duh, Leibny Paola García-Perera, Sanjeev Khudanpur, Nicholas Andrews, Matthew Wiesner. 1-2 [doi]
- Multi-Class Dementia Detection Using Acoustic Features - ICASSP-2025 PROCESS ChallengeM. Abdullah Zafar, Xiangyu Zhang, Mostafa Shahin, Beena Ahmed. 1-2 [doi]
- An Optimized GPU-based Acceleration of CRYSTALS-DilithiumZhao Chen, Weimin He, Jiafei Wu, Jingjie Liu, Boqin Xu, Xiaoning Bian, Zhe Liu. 1-5 [doi]
- Through-the-Wall Multi-Person Localization using Translation and Rotation Synthetic Aperture RadarS. Sinha, A. Deshwal, A. Azizi, Divyanshu Pandey, Nishant Mehrotra, A. Pal, Ashutosh Sabharwal. 1-5 [doi]
- DU-PMVS: Learned Patchmatch Multi-View Stereo Based on Deformable Feature Pyramid and Uncertainty Awareness ModelingYutong Zheng, Hai Huang, Shan Yue, Hong Chen, Qing Wang. 1-5 [doi]
- MonTransformer: Self-Supervised Phonetic to Glyph Conversion Leveraging Positional Context for Traditional Mongolian TextsChenyang Zhou 0003, Monghjaya, Licheng Wu. 1-5 [doi]
- TELL ME: Tackle Electrocardiogram with Large Language Model EffectivelyTianyi Shi, Siyang Zheng, Zhu Meng, Zhe Cui, Jin Huang, Changrui Ren, Bo Zhang, Zhicheng Zhao. 1-5 [doi]
- Cramér-Rao Bounds for Wideband Near-Field SensingTong Wei, Kumar Vijay Mishra, Linlong Wu, Bhavani Shankar Mysore Rama Rao. 1-5 [doi]
- Learning Structured Compressed Sensing with Automatic Resource AllocationHan Wang, Eduardo Pérez, Iris A. M. Huijben, Hans Van Gorp, Ruud van Sloun, Florian Römer. 1-5 [doi]
- Fast Structured Orthogonal Dictionary Learning using Householder ReflectionsAnirudh Dash, Aditya Siripuram. 1-5 [doi]
- Zero-Shot Image Restoration via Few-Step Guidance of Consistency ModelsTomer Garber, Tom Tirer. 1-5 [doi]
- Identity-aware Feature Decoupling Learning for Clothing-change Person Re-identificationHaoxuan Xu, Bo Li, Guanglin Niu. 1-5 [doi]
- Error Feedback Approach for Quantization Noise Reduction of Distributed Graph FiltersXue Xian Zheng, Tareq Y. Al-Naffouri. 1-5 [doi]
- Learning with Coupled Noisy Labels for Visible-Infrared Person Re-identification via Graph ConsistencyZhen Wang, Wenxin Zhao, Yongfeng Dong. 1-5 [doi]
- Data-Efficient Low-Complexity Acoustic Scene Classification via Distilling and Progressive PruningBing Han, Wen Huang, Zhengyang Chen, Anbai Jiang, Pingyi Fan, Cheng Lu, Zhiqiang Lv, Jia Liu, Wei-Qiang Zhang 0001, Yanmin Qian. 1-5 [doi]
- 3DSignDiff: Towards 3D Sign Language Gesture GenerationRonghao Yu, Yun Liu, Xiyue Bai, Rui Yang, Yingna Wu. 1-5 [doi]
- Hierarchical Relation Distillation for Efficient 3D Visual GroundingJun Liu. 1-5 [doi]
- G-Depth: An Efficient Graph Method for Robust Depth CompletionZhongyu Huang, Yijun Chen, Aizierjiang Aiersilan, Li Li. 1-5 [doi]
- RADCI: A Synchronized Radar-RGBT Object Detecting-Tracking Dataset And A BenchmarkHeng Yu, Ruiheng Zhang, Haoyang Sun, Zhe Cao, Biwen Yang, Jin Zhang, Guanyu Liu. 1-5 [doi]
- CrossHash: Cross-scale Vision Transformer Hashing for Image RetrievalWeigang Wang, Zhongwen Guo, Wenxiang Jiang 0002, Yujun Lan, Wentao Ma. 1-5 [doi]
- Exploring Inter-Variate and Long-Term Dependencies to Boost Multivariate Time Series ForecastingXi Ding, Yifan He 0005, Shuigeng Zhou, Guiyang Liu, Qi Zhou. 1-5 [doi]
- Deep Learning Amplified Early Stopping Bias: Overestimating Performance on Small DatasetsNona Rajabi, Antônio H. Ribeiro, Miguel Vasco, Danica Kragic. 1-5 [doi]
- PromptSeg: Learning to Segment Medical Image via Visual PromptsMinfan Zhao, Ziqi Zhu, Jun Shi 0007, Zhaohui Wang, Junshi Chen, Hong An, Bing Yan. 1-5 [doi]
- Multimodal Emotion Recognition with Target Speaker-Based Facial EmbeddingsSerin Heo, Jehyun Kyung, Joon-Hyuk Chang. 1-5 [doi]
- PAIR: Complementarity-guided Disentanglement for Composed Image RetrievalZhiheng Fu, Zixu Li, Zhiwei Chen, Chunxiao Wang, Xuemeng Song, Yupeng Hu, Liqiang Nie. 1-5 [doi]
- Radar Jamming Recognition via Cross-Modality Contrast LearningGanggang Dong, Zixuan Wang. 1-5 [doi]
- VCSA: Video Copy Localization Via Single Frame AnnotationPeng Li, Shuguo Hu, Yang Yang, Huaiwen Zhang. 1-5 [doi]
- Investigation of perceptual music similarity focusing on each instrumental partYuka Hashizume, Tomoki Toda. 1-5 [doi]
- MP-DPCC: A Motion Proxy-Based Dynamic Point Cloud Compression FrameworkZhaoyi Jiang, Dong Han, Cao Song, Fangzhe Nan, Bailin Yang. 1-5 [doi]
- Under-Counted Matrix Completion Without Detection FeaturesTri Nguyen, Shahana Ibrahim, Rebecca A. Hutchinson, Xiao Fu 0001. 1-5 [doi]
- SSC: 106 bit/s Ultra-low Bitrate Semantic Speech CodingRenjie Jia, Zhiqiang He 0001, Kai Niu 0001, Zixuan Xiao, Yonghui Liu, Jianbing Liu. 1-5 [doi]
- Spatial-Frequency Information Interaction Diffusion for SAR ColorizationXupei Zhang, Hanlin Qin, Jingjing Li, Jinni Geng, Zihan Gao, Yue Yu. 1-5 [doi]
- Reconstruction of EEG and ECG from Single Channel Mixture using Branched Autoencoder based Separable RepresentationsShreyasi Datta, Jayavardhana Gubbi, Arpan Pal 0001. 1-5 [doi]
- Boosting Open-Vocabulary Object Detection Performance via Class-Agnostic Pseudo-Labels and MultiModal Hybrid KnowledgeZiyang Chen, Dongqin Liu, Jiao Dai, Songlin Hu 0001. 1-5 [doi]
- ChatCAD: An MLLM-Guided Framework for Zero-shot CAD Drawing RestorationJing Tang, Hongru Xiao, Xiang Li, Wei Wang, Zeyu Gong. 1-5 [doi]
- Near-field AoA estimation with Complex Convolutional Kolmogorov-Arnold NetworkJiayi Wang, Disheng Xiao, Yingkai Cao, Kai Ying, Ming Xiao. 1-5 [doi]
- TeXBLEU: Automatic Metric for Evaluate LaTeX FormatKyudan Jung, Nam-Joon Kim, Hyun Gon Ryu, Sieun Hyeon, Seung-Jun Lee, Hyuk-Jae Lee. 1-5 [doi]
- Knowledge Is Powerful: Art Knowledge-Driven Framework for Painting Style Classification Integrating Multimodal KnowledgeHaoyang Chen, Chongjun Wang, Yi Xin 0003, Lei Zhang 0086. 1-5 [doi]
- ThicknessVAE: Learning a Lateral Prior for Clothed Human Body ReconstructionXiaotao Wu, Zhaoxin Fan, Huiguang He, Dinggang Shen. 1-5 [doi]
- HDMoLE: Mixture of LoRA Experts with Hierarchical Routing and Dynamic Thresholds for Fine-Tuning LLM-based ASR ModelsBingshen Mu, Kun Wei, Qijie Shao, Yong Xu, Lei Xie. 1-5 [doi]
- Contrastive Learning via Randomly Generated Deep SupervisionShibo Wang, Zili Ma, Ka Hou Chan, Yue Liu, Tong Tong 0001, Qinquan Gao, Guangtao Zhai, Xiaohong Liu 0001, Tao Tan. 1-5 [doi]
- Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary DetectionHaoxuan Wang, Qingdong He, Jinlong Peng, Hao Yang, Mingmin Chi, Yabiao Wang. 1-5 [doi]
- Volatile MAB-based Configuration Selection for Offloading Video Analytics Tasks to EdgesYu Liang, Sheng Zhang, Jie Wu. 1-5 [doi]
- The Impact of Deployment Height on the Mean Delay in Large UAV NetworksS. Tayyaba, Sundaram Vanka. 1-5 [doi]
- Hierarchical Label Propagation: A Model-Size-Dependent Performance Booster for AudioSet TaggingLudovic Tuncay, Etienne Labbé, Thomas Pellegrini. 1-5 [doi]
- OF-AR Relation Aware Representation Learning for Lesion Image Segmentation and GradingXinpan Yuan, Siming Jin, Liujie Hua, Guihu Zhao, Changhong Zhang, Yuan Guo. 1-5 [doi]
- Automatic Text Pronunciation Correlation Generation and Application for Contextual BiasingGaofeng Cheng, Haitian Lu, Chengxu Yang, Xuyang Wang 0002, Ta Li, Yonghong Yan 0002. 1-5 [doi]
- DiffAttack: Imperceptible and Transferable Audio Adversarial Attack via Diffusion ModelJiayuan Chen, Yunshu Dai, Fangjun Huang. 1-5 [doi]
- SpecWav-Attack: Leveraging Spectrogram Resizing and Wav2Vec 2.0 for Attacking Anonymized SpeechYuqi Li, Yuanzhong Zheng, Zhongtian Guo, Yaoxuan Wang, Jianjun Yin, Haojun Fei. 1-2 [doi]
- Learning a Sparse Polynomial Approximation to the Transition Function of General State-Space ModelsBenjamin Cox, Émilie Chouzenoux, Víctor Elvira. 1-5 [doi]
- Imperceptible Transfer Attack on Large Vision-Language ModelsXiaowen Cai 0001, Daizong Liu, Runwei Guan, Pan Zhou 0001. 1-5 [doi]
- Conditional-Balanced Adversarial Delta Tuning for Cross-Domain Implicit Discourse Relation RecognitionChang Liu, Haowen Sun, Haodong Zhao, Ruifang He, Xu He, Bo Wang. 1-5 [doi]
- K-HashFed: Communication Efficient Federated Learning through Gradient Clustering and HashingAyshika Kapoor, Dheeraj Kumar. 1-5 [doi]
- Complementary Learning System Theory-based Active Learning for Audio ClassificationHui Geng, Zijian Gao, Tianjiao Wan, Dawei Feng, Changjian Wang, Kele Xu. 1-5 [doi]
- SmartExp: An Adaptive Data Expansion Strategy for Improving Handwritten Text RecognitionYiming Wang, Hongxi Wei, Shiwen Sun. 1-5 [doi]
- SigmoidGS: To Guide Depth More EffectivelyHo Ngai Chow, Licheng Shen, Lingyun Wang, Tong Zhang, Mengqiu Wang, Yuxing Han 0001. 1-5 [doi]
- Steerable Differential Polynomial BeamformingFeng Jiang, Huawei Chen. 1-5 [doi]
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion TransformerHelin Wang, Jiarui Hai, Yen-Ju Lu, Karan Thakkar, Mounya Elhilali, Najim Dehak. 1-5 [doi]
- GADACE: Graph Anomaly Detection Combining Attribute Contrast and Structure ReconstructionShihui Wang, Yulan Yang, Zixin Tan, Zhigao Zheng 0001, Jiawei Jiang 0001, Hao Huang 0001, Quanqing Xu, Chuanhui Yang. 1-5 [doi]
- Naturalistic Music Decoding from EEG Data via Latent Diffusion ModelsEmilian Postolache, Natalia Polouliakh, Hiroaki Kitano, Akima Connelly, Emanuele Rodolà, Luca Cosmo, Taketo Akama. 1-5 [doi]
- Cooperative ISAC for Localization and Velocity Estimation Using OFDM Waveforms in Cell-Free MIMO SystemsZihuan Wang, Vincent W. S. Wong 0001. 1-5 [doi]
- Enhancing Remote Sensing Vision-Language Models for Zero-Shot Scene ClassificationKarim El Khoury, Maxime Zanella, Benoît Gérin, Tiffanie Godelaine, Benoît Macq, Saïd Mahmoudi, Christophe De Vleeschouwer, Ismail Ben Ayed. 1-5 [doi]
- LFSRDiff: Light Field Image Super-Resolution via Diffusion ModelsWentao Chao, Junli Zhao, Fuqing Duan, Guanghui Wang 0001. 1-5 [doi]
- ConcealGS: Concealing Invisible Copyright Information in 3D Gaussian SplattingYifeng Yang, Hengyu Liu 0007, Chenxin Li, Yining Sun, Wuyang Li, Yifan Liu 0010, Yiyang Lin, Yixuan Yuan, Nanyang Ye 0001. 1-5 [doi]
- Improved Feature Extraction Network for Neuro-Oriented Target Speaker ExtractionCunhang Fan, Youdian Gao, Zexu Pan, Jingjing Zhang, Hongyu Zhang, Jie Zhang, Zhao Lv. 1-5 [doi]
- HYB-VITON: A Hybrid Approach to Virtual Try-On Combining Explicit and Implicit WarpingKosuke Takemoto, Takafumi Koshinaka. 1-5 [doi]
- Pathological Section Staining Transferring with Tailored Metric-based Model SelectionYiming Ji, Suyang Zhu, Dong Zhang, Shoushan Li. 1-5 [doi]
- VLIMNet: A Visible Light And Infrared Image Matching Network Based On Segment Anything Model And SuperPointZhongyuan Chen, Zhan Zhang, Decheng Zuo, Ning Wang, Liufeng Fan, Zhiwei Liu. 1-5 [doi]
- Multiple Consistency-guided Test-Time Adaptation for Contrastive Audio-Language Models with Unlabeled AudioGongyu Chen, Haomin Zhang, Chaofan Ding, Zihao Chen, Xinhan Di. 1-5 [doi]
- GLST-GCN: Global-Local Spatio-Temporal Graph Convolutional Network for Skeleton-based Hand Motion PredictionWenrui Yang, Xinchun Yu, Xiao-Ping Zhang 0002. 1-5 [doi]
- Harmonizing for defect visibility with Fine-Grained Hierarchical Interaction LearningZhongze Wu, Yitian Long, Xiu Su, Yueyi Luo, Shan You, Jun Long. 1-5 [doi]
- MambaMOT: State-Space Model as Motion Predictor for Multi-Object TrackingHsiang-Wei Huang, Cheng-Yen Yang, Wenhao Chai, Zhongyu Jiang, Jenq-Neng Hwang. 1-5 [doi]
- Improving 5G Positioning Through Signal-to-Noise Ratio Recognition TrainingWenqi Zheng, Jianing Chen, Junze Yang, Chuhao Chen 0002, Wei Li 0109, Rahul Yadav, Xiangxu Meng. 1-5 [doi]
- Private Semantic Communications with Separate Blind EncodersAmirreza Zamani, Mikael Skoglund. 1-5 [doi]
- ConPCO: Preserving Phoneme Characteristics For Automatic Pronunciation Assessment Leveraging Contrastive Ordinal RegularizationBi-Cheng Yan, Yi-Cheng Wang, Jiun-Ting Li, Meng-Shin Lin, Hsin-Wei Wang, Wei-Cheng Chao, Berlin Chen. 1-5 [doi]
- Guess What I Think: Streamlined EEG-to-Image Generation with Latent Diffusion ModelsEleonora Lopez, Luigi Sigillo, Federica Colonnese, Massimo Panella, Danilo Comminiello. 1-5 [doi]
- Bridging Speech and Text Foundation Models with ReShape AttentionTakatomo Kano, Atsunori Ogawa, Marc Delcroix, William Chen, Ryo Fukuda, Kohei Matsuura, Takanori Ashihara, Shinji Watanabe 0001. 1-5 [doi]
- MAP Image Recovery with Guarantees using Locally Convex Multi-Scale Energy (LC-MUSE) ModelJyothi Rikhab Chand, Mathews Jacob. 1-5 [doi]
- MLSDET: Multi-LLM Statistical Deep Ensemble for Chinese AI-Generated Text DetectionDianhui Mao, Denghui Zhang, Ao Zhang, Zhihua Zhao. 1-5 [doi]
- Exploring Temporal Constraints for Unsupervised Iris Motion Tracking in AS-OCT VideosLingxi Hu, Xiao Wu, Risa Higashita, Xiaoli Xing, Menglan Zhou, Song Lin, Xiaorong Li, Xiaoling Li, Jinming Duan 0001, Jiang Liu 0001. 1-5 [doi]
- Enhancing Text Annotation Through Rationale-Driven Collaborative Few-Shot PromptingJianfei Wu, Xubin Wang, Weijia Jia 0001. 1-5 [doi]
- TimeRAG: Boosting LLM Time Series Forecasting via Retrieval-Augmented GenerationSilin Yang, Dong Wang, Haoqi Zheng, Ruochun Jin. 1-5 [doi]
- Proud-SLAM: Neural Point-based Hybrid RGBD Monocular Dense SLAMWenzhi Guo, Lijun Chen 0006. 1-5 [doi]
- CASC-XVC: Zero-Shot Cross-Lingual Voice Conversion with Content Accordant and Speaker Contrastive LossesHan-jie Guo, Hui-Peng Du, Zheng-Yan Sheng, Li-Ping Chen, Yang Ai, Zhen-Hua Ling. 1-5 [doi]
- Enhancing Emotion Recognition in Incomplete Data: A Novel Cross-Modal Alignment, Reconstruction, and Refinement FrameworkHaoqin Sun, Shiwan Zhao, Shaokai Li, Xiangyu Kong, Xuechen Wang, Jiaming Zhou, Aobo Kong, Yong Chen, Wenjia Zeng, Yong Qin. 1-5 [doi]
- Intra-modal Relation and Emotional Incongruity Learning using Graph Attention Networks for Multimodal Sarcasm DetectionDevraj Raghuvanshi, Xiyuan Gao, Zhu Li, Shubhi Bansal, Matt Coler, Nagendra Kumar 0001, Shekhar Nayak. 1-5 [doi]
- MTPareto: A MultiModal Targeted Pareto Framework for Fake News DetectionKaiying Yan, Moyang Liu, Yukun Liu, Ruibo Fu, Zhengqi Wen, Jianhua Tao 0001, Xuefei Liu, Guanjun Li. 1-5 [doi]
- TDOA Localization via Fixed-Point IterationYanbin Zou, Yangpeng Xiao, Zekai Zhang, Huaping Liu 0002. 1-5 [doi]
- NeRF-3DTalker: Neural Radiance Field with 3D Prior Aided Audio Disentanglement for Talking Head SynthesisXiaoxing Liu, Zhilei Liu, Chongke Bi. 1-5 [doi]
- Blind Spatial Impulse Response Generation from Separate Room- and Scene-Specific InformationFrancesc Lluís, Nils Meyer-Kahlen. 1-5 [doi]
- IdentityLock: An Identity-aware Backdoor strategy for Face Swapping DefenseZhitao Huang, Peipeng Yu, Zhihua Xia, Xi Yang, Run Lu. 1-5 [doi]
- Dynamically Causal-Enhanced Exercise Representations for Adaptive Knowledge TracingYanhong Bai, Jiabao Zhao, Tingjiang Wei, Jinxin Shi, Liang He 0001. 1-5 [doi]
- Geometry-Constrained EEG Channel Selection for Brain-Assisted Speech EnhancementKeying Zuo, Qingtian Xu, Jie Zhang 0042, Zhenhua Ling. 1-5 [doi]
- Multi-step Fusion of Relation Type Information and Multi-Task Decoding for Entity Relation Extraction in ancient ChineseChuanlong Tu, Jiali Zuo, Yiyu Hu, Jianyi Wan, Mingwen Wang 0001. 1-5 [doi]
- Human Action Recognition in Multi-Level Convolutional Temporal Attention NetworkQian Huang, Zhongqi Chen, Chang Li, Weiwen Qian. 1-5 [doi]
- MonoIR: Inpainting and Reconstruction for Monocular Endoscope Deformation ScenesZiteng Zhang, Wenyu Li, Sidun Liu, Peng Qiao, Yong Dou. 1-5 [doi]
- ExVC: Leveraging Mixture of Experts Models for Efficient Zero-shot Voice ConversionObed Irihose, Le Zhang. 1-5 [doi]
- RaLU-Net: Deep Unfolded Radar Localization of Humans for Precise Multi-Person Non-Contact Vital Signs MonitoringYonathan Eder, Yhonatan Kvich, Rui Guo, Yonina C. Eldar. 1-5 [doi]
- Incorporate Global Information from Entire Datasets for Knowledge Tracing via Mini-Batch InputHui Zhao, Tingyu Fu. 1-5 [doi]
- Globally Normalizing the Transducer for Streaming Speech RecognitionRogier C. van Dalen. 1-5 [doi]
- Diffusion Counterfactual-Based Anomaly Detection in Class-Imbalanced DataXinyun Shen, Min Li, Zhengmao Ye, Zhenyang Yu, Lei Duan. 1-5 [doi]
- Leveraging Multimodal Diffusion Models to Accelerate Imaging with Side InformationTimofey Efimov, Harry Dong, Megna Shah, Jeff P. Simmons, Sean Donegan, Yuejie Chi. 1-5 [doi]
- Sound-VECaps: Improving Audio Generation with Visually Enhanced CaptionsYi Yuan, Dongya Jia, Xiaobin Zhuang, Yuanzhe Chen, Zhuo Chen 0006, Yuping Wang 0005, Yuxuan Wang 0002, Xubo Liu 0001, Xiyuan Kang, Mark D. Plumbley, Wenwu Wang 0001. 1-5 [doi]
- INN-PAR: Invertible Neural Network for PPG to ABP ReconstructionSoumitra Kundu, Gargi Panda, Saumik Bhattacharya, Aurobinda Routray, Rajlakshmi Guha. 1-5 [doi]
- ID-RWKV: Image Deraining RWKVYong Yang, Jiaxuan Yang, Shuying Huang. 1-5 [doi]
- TS-Net: Assembling Task-specific Features from Multiple Feature Levels for Multi-task LearningChen Liu, Zhaolin Wan, Penghong Wang, Xingtao Wang, Xiaopeng Fan. 1-5 [doi]
- CFSum: A Transformer-Based Multi-Modal Video Summarization Framework With Coarse-Fine FusionYaowei Guo, Jiazheng Xing, Xiaojun Hou, Shuo Xin, Juntao Jiang, Demetri Terzopoulos, Chenfanfu Jiang, Yong Liu 0007. 1-5 [doi]
- BiMA: Bidimensional multi-level attention embedded network for single-frame infrared small target detectionHe Deng, Xiaojie Yin, Xianmin Lan. 1-5 [doi]
- Universal Training of Neural Networks to Achieve Bayes Optimal Classification AccuracyMohammadreza Tavasoli Naeini, Ali Bereyhi, Morteza Noshad, Ben Liang 0001, Alfred O. Hero III. 1-5 [doi]
- Meta-Learning for Finger Vein Recognition in Internet of Things Smart Home SecurityHengyi Ren, Lijuan Sun, Jinting Ren, Xing Li, Ying Cao. 1-5 [doi]
- Distance Based Single-Channel Target Speech ExtractionRunwu Shi, Benjamin Yen 0001, Kazuhiro Nakadai. 1-5 [doi]
- Robust Over-The-Air Federated Learning In Heterogeneous NetworksZubair Shaban, Nazreen Shah, Ranjitha Prasad. 1-5 [doi]
- Annealing Distillation Algorithm for Transferring Unsupervised Clustering Knowledge to Supervised Student ModelsJiaying Gao, Fausto Giunchiglia, Tongyu Zhao, Hao Xu 0012. 1-5 [doi]
- Exploring Facial Kinship Verification through Contactless Heart Activity AnalysisXiaoting Wu, Xiaoyi Feng, Constantino Álvarez Casado, Lili Liu, Miguel Bordallo López. 1-5 [doi]
- YOLO-KED: A Novel Framework for Rotated Object Detection in Complex EnvironmentsZhaoyu Zhuang, Penglei Liu, Dejia Xu, Jun Cheng 0002. 1-5 [doi]
- Multi-Task Learning for Ultrasonic Echo-based Depth Estimation with Audible Frequency RecoveryJunpei Honma, Akisato Kimura, Go Irie. 1-5 [doi]
- Low Complexity DoA-ToA Signature Estimation for Multi-Antenna Multi-Carrier SystemsChandrashekhar Rai, Debarati Sen. 1-5 [doi]
- Difference Bonds Consistency and Complementarity to Enhance Multimodal Representation LearningCongbing He, Sensen Song, Zhenhong Jia, Hui Zhao. 1-5 [doi]
- Emotion-Preserving Prosody Anonymization Network for Voice Privacy ProtectionJiabei He 0001, Shiwan Zhao, Jiaming Zhou, Haoqin Sun, Hui Wang 0075, Yong Qin. 1-5 [doi]
- SIP2Net: Situational-Aware Indoor Pathloss-Map Prediction Network for Radio Map GenerationWenlihan Lu, Ziyi Lu, Jia Yan, Shijian Gao. 1-2 [doi]
- LoVA: Long-form Video-to-Audio GenerationXin Cheng 0008, Xihua Wang, Yihan Wu, Yuyue Wang 0003, Ruihua Song. 1-5 [doi]
- On Noise-Sensitivity of Unlimited Sampling in Line Spectral EstimationWenlong Wang, Zai Yang. 1-5 [doi]
- Cross-Lingual Speech Emotion Recognition: Humans vs. Self-Supervised ModelsZhichen Han, Tianqi Geng, Hui Feng, Jiahong Yuan, Korin Richmond, Yuanchao Li. 1-5 [doi]
- Hybrid predictive and parametric stereo coding for voice and audio communicationsGuillaume Fuchs, Emmanuel Ravelli, Franz Reutelhuber, Eleni Fotopoulou. 1-5 [doi]
- Keep what you need: extracting efficient subnetworks from large audio representation modelsDavid Genova, Philippe Esling, Tom Hurlin. 1-5 [doi]
- Structured Random Model for Fast and Robust Phase RetrievalZhiyuan Hu, Julián Tachella, Michael Unser, Jonathan Dong. 1-5 [doi]
- ANASETC: Automatic Neural Architecture Search for Encrypted Traffic ClassificationHeng Zhang, Ziqian Chen, Wei Xia, Gang Xiong 0001, Gaopeng Gou, Zhen Li 0011, Guangyan Huang, Yunpeng Li. 1-5 [doi]
- DiffuseFIST: A Fast Image-guided Style Transfer Method for Adapting Large-scale Diffusion ModelsMiaomiao Dai, Qianyu Zhou 0001, Ran Yi, Lizhuang Ma. 1-5 [doi]
- Enhancing Complex Formula Recognition with Hierarchical Detail-Focused NetworkJiale Wang, Junhui Yu, Huanyong Liu, Chenanran Kong. 1-5 [doi]
- Enhanced Weakly Supervised Few-shot Classification & SegmentationSrinivasa Rao Nandam, Sara Atito 0001, Zhenhua Feng 0001, Josef Kittler, Muhammad Awais 0001. 1-5 [doi]
- Attention Augmented Structure-centric Bias Mitigation with Feature DisentanglementXuege Hou, Yali Li 0001, Shengjin Wang. 1-5 [doi]
- Source-free Domain Adaptation with Multiple Alignment for Efficient Image RetrievalJinghui Ni, Hui Cui 0004, Lihai Zhao, Fengling Li 0001, Xiaohui Han, Lijuan Xu 0001. 1-5 [doi]
- SPNet: Sparse-mask Prompt-learning Network for Cerebrovascular SegmentationWenqi Shan, Qiang Li, Zhiwei Wang. 1-5 [doi]
- CapT: A Hierarchical Capsule Representation Learning Approach for Class Continual LearningAnkita Chatterjee, Saransh Patel, Jayanta Mukhopadhyay, Partha Pratim Das 0001. 1-5 [doi]
- Matrix Decomposition By Additive and Subtractive FactorsTakumi Kobayashi 0001, Kenji Watanabe. 1-5 [doi]
- SINET: Sparsity-driven Interpretable Neural Network for Underwater Image EnhancementGargi Panda, Soumitra Kundu, Saumik Bhattacharya, Aurobinda Routray. 1-5 [doi]
- Deep Enhancement Spotting Network for Low-complexity Keyword Spotting in Noisy EnvironmentsYongqiang Chen, Qianhua He, Yanxiong Li, Zunxian Liu, Mingru Yang, Jinxin Huang. 1-5 [doi]
- An Adaptive Framework for Multi-View Clustering Leveraging Conditional Entropy OptimizationLijian Li 0003, Yuanpeng He, Chi-Man Pun. 1-5 [doi]
- A Weakly Supervised Semantic Segmentation Model with Enhanced CLIP Feature ExtractionFanxuan Kong, Jun Lu. 1-5 [doi]
- Disentangled Representation Learning for Chinese Handwriting RecognitionTianqi Zhao, Liangrui Peng, Gang Yao, Di Wu, Yao Tao. 1-5 [doi]
- GMM-Based Bootstrap Prototype-Aware Learning For Weakly Supervised Semantic SegmentationYahui Wu. 1-5 [doi]
- Perception-Enhanced Network for Accurate Human Pose EstimationXiaodi Sun, Baojiang Zhong, Kai-Kuang Ma. 1-5 [doi]
- Distributed ATC Particle Filters for Cooperative Quaternion TrackingClaudio J. Bordin, Marcelo G. S. Bruno, Stiven S. Dias. 1-5 [doi]
- F-StrIPE: Fast Structure-Informed Positional Encoding for Symbolic Music GenerationManvi Agarwal, Changhong Wang, Gaël Richard. 1-5 [doi]
- Non-Pharmacological Interventions: A Virtual Training Framework for Fine Motor LearningXiaohang Dong, Qicheng Li, Yawen Zhang. 1-5 [doi]
- Deep Feedback Cancellation for Hearing Aids with Improved System Stability and Sound QualityEleftheria Lydaki, Zheng-Hua Tan, Jesper Jensen 0001, Meng Guo 0001. 1-5 [doi]
- Benchmarking Music Generation Models and Metrics via Human Preference StudiesFlorian Grötschla, Ahmet Solak, Luca A. Lanzendörfer, Roger Wattenhofer. 1-5 [doi]
- Can Fairness and Robustness Be Simultaneously Achieved Under Byzantine Attacks?Huigan Zheng, Runhua Wang, Xiao Wang, Qing Ling. 1-5 [doi]
- Formula-Supervised Sound Event Detection: Pre-Training Without Real DataYuto Shibata, Keitaro Tanaka, Yoshiaki Bando, Keisuke Imoto, Hirokatsu Kataoka, Yoshimitsu Aoki. 1-5 [doi]
- Heterogeneous Graph Dual-structure Optimization Based Attribute-aware for RecommendationLongtao Wang, Qingtian Zeng, Guiyuan Yuan, Hua Duan, Cheng Cheng, Zilong Wang. 1-5 [doi]
- Trainable Adaptive Score Normalization for Automatic Speaker VerificationJeong Hwan Choi, Ju-Seok Seong, Ye-Rin Jeoung, Joon-Hyuk Chang. 1-5 [doi]
- Surface Defect Detection Algorithm for Strip Alloy Material Based on Improved YOLOv8Wei Yang, Jun Yang, Yajin Xia. 1-5 [doi]
- Joint Multi-Scale Contextual and Noise Suppression for Group Emotion RecognitionWangdong Guo, Qing Zhu, Qirong Mao. 1-5 [doi]
- Weakly-Supervised Video Highlight Detection by Characteristic and Commonality ModelingChengze Zhao, Zixuan Zhao, Xu Zhao 0001. 1-5 [doi]
- Indoor Airflow Imaging Using Physics-Informed Schlieren TomographyArjun Teh, Wael H. Ali, Joshua Rapp, Hassan Mansour. 1-5 [doi]
- Mean Delay in Poisson Wireless Networks with Optimal Packet FragmentationSundaram Vanka, Martin Haenggi. 1-4 [doi]
- Content and Salient Semantics Collaboration for Cloth-Changing Person Re-IdentificationQizao Wang, Xuelin Qian, Bin Li 0015, Lifeng Chen, Yanwei Fu 0001, Xiangyang Xue 0001. 1-5 [doi]
- Understanding Neural Networks in Profiled Side-Channel AnalysisYimeng Chen, Bo Wang, Changshan Su, Ao Li, Gen Li, YuXing Tang. 1-5 [doi]
- DFingerNet: Noise-Adaptive Speech Enhancement for Hearing AidsIosif Tsangko, Andreas Triantafyllopoulos, Michael Müller, Hendrik Schröter, Björn W. Schuller. 1-5 [doi]
- Rethinking Mamba in Speech Processing by Self-Supervised ModelsXiangyu Zhang, Jianbo Ma, Mostafa Shahin, Beena Ahmed, Julien Epps. 1-5 [doi]
- Efficient Gridless Wideband Direction-of-Arrival Estimation From Many FrequenciesYiming Zhou, Huayu Fu, Wei Dai. 1-5 [doi]
- Harnessing Contrastive Learning and Neural Transformation for Time Series Anomaly DetectionJiazhen Chen, Mingbin Feng, Tony S. Wirjanto. 1-5 [doi]
- Hybrid Precoding in mmWave Multiuser MIMO Systems with Delay Alignment Modulation (DAM)Priyanka Maity, Monali Chakraborty, Suraj Srivastava, Aditya K. Jagannatham. 1-5 [doi]
- Towards Detecting LLMs Hallucination via Markov Chain-based Multi-agent Debate FrameworkXiaoxi Sun, Jinpeng Li 0003, Yan Zhong, Dongyan Zhao 0001, Rui Yan 0001. 1-5 [doi]
- Electrocardiogram Report Generation and Question Answering via Retrieval-Augmented Self-Supervised ModelingJialu Tang, Tong Xia, Yuan Lu, Cecilia Mascolo, Aaqib Saeed. 1-5 [doi]
- FAF-Filt: Frequency-aware Fourier Filter for Sound Event DetectionSiyu Sun, Xiaohuai Le, Zhuangqi Chen, Xianjun Xia, Chuanzeng Huang. 1-5 [doi]
- Continual Self-supervised Learning Considering Medical Domain Knowledge in Chest CT ImagesRen Tasai, Guang Li 0008, Ren Togo, Minghui Tang, Takaaki Yoshimura, Hiroyuki Sugimori, Kenji Hirata, Takahiro Ogawa 0001, Kohsuke Kudo, Miki Haseyama. 1-5 [doi]
- CascadePAIE: Reallocating Relevance for Event Roles and Event Text in Event Argument ExtractionChunyu Yao, Yi Guo. 1-5 [doi]
- Target Localization With a Coprime Multistatic MIMO Radar via Coupled Canonical Polyadic Decomposition Based on Joint EVDGuozhao Liao, Xiao-Feng Gong, Wei Liu 0001, Hing-Cheung So. 1-5 [doi]
- HazeCLIP: Towards Language Guided Real-World Image DehazingRuiyi Wang, Wenhao Li, Xiaohong Liu 0001, Chunyi Li, Zicheng Zhang, Xiongkuo Min, Guangtao Zhai. 1-5 [doi]
- Shabdh: A multi lingual zero-shot voice cloning approach with speaker disentanglementSreeram Manghat, Sreeja Manghat, Tanja Schultz. 1-2 [doi]
- CR-CLIP: Image-Text Contrastive Regression for Generalized Gaze EstimationYitong Zhu, Xurong Xie, Naiming Yao, Hui Chen 0020, Feng Tian 0001. 1-5 [doi]
- HYMAN: Hybrid Memory and Attention Network for Unsupervised Anomaly DetectionJiahao Li, Yiqiang Chen 0001, Yunbing Xing, Yang Gu 0001, Xiangyuan Lan. 1-5 [doi]
- OCTAMamba: A State-Space Model Approach for Precision OCTA Vasculature SegmentationShun Zou, Zhuo Zhang, Guangwei Gao. 1-5 [doi]
- Semi-supervised Iterative Learning Network for Camouflaged Object DetectionGuowen Yue, Ge Jiao, Jiahao Xiang. 1-5 [doi]
- Basis Function Learning for Variable-Length and Continuous-Indexed SignalsSiyuan Li, Lei Cheng, Feng Yin, Jianlong Li, Peter Gerstoft. 1-5 [doi]
- Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech SynthesisYe-Xin Lu, Hui-Peng Du, Zheng-Yan Sheng, Yang Ai, Zhen-Hua Ling. 1-5 [doi]
- SPTU-Lite: An Efficient Auxiliary Diagnostic Approach Combining Spiking Neural Networks And Large Kernel Extractors For ECG SignalsChuxuan Shan, Xiaohua Wang, Hang Qi, Qingxu Meng, Jiabao Wang, Weijiang Wang, Yueting Shi. 1-5 [doi]
- Foundation Models Boost Low-Level Perceptual Similarity MetricsAbhijay Ghildyal, Nabajeet Barman, Saman Zadtootaghaj. 1-5 [doi]
- Complete Reconstruction of the Tongue Contour Through Acoustic to Articulatory Inversion Using Real-Time MRI DataSofiane Azzouz, Pierre-André Vuissoz, Yves Laprie. 1-5 [doi]
- How Redundant Is the Transformer Stack in Speech Representation Models?Teresa Dorszewski, Albert Kjøller Jacobsen, Lenka Tetková, Lars Kai Hansen. 1-5 [doi]
- Modality-Invariant Bidirectional Temporal Representation Distillation Network for Missing Multimodal Sentiment AnalysisXincheng Wang, Liejun Wang, Yinfeng Yu, Xinxin Jiao. 1-5 [doi]
- MusicLIME: Explainable Multimodal Music UnderstandingTheodoros Sotirou, Vassilis Lyberatos, Orfeas Menis-Mastromichalakis, Giorgos Stamou. 1-5 [doi]
- HieClip: Hierarchical CLIP with Explicit Alignment for Zero-Shot Anomaly DetectionLiujie Hua, Xiu Su, Yueyi Luo, Shan You, Jun Long. 1-5 [doi]
- A Domain-Specific Multilingual Speech Translation Corpus via Simultaneous InterpretationSeunghee Han, Gary Geunbae Lee, Hung Soon Kim, SunHee Kim, Minhwa Chung. 1-5 [doi]
- Unveiling Deepfakes with Latent Diffusion Counterfactual ExplanationsChen Yang, Bo Peng, Jing Dong, Xiaoyu Zhang. 1-5 [doi]
- On-device hand model for robust gesture detectionIvan Bondarets, Veronika Prokhorchuk, Andrii Kozyr, Oleksandr Trunov. 1-5 [doi]
- MAJoR: Visual Emotion Analysis via Multi-Attribute Joint ReasoningYuxin Fei, Jinlan Xu, Maoying Qiao, Fei Gao 0006. 1-5 [doi]
- Transfer Learning with Transformer and LSTM for Digital Pre-distortion of Terahertz/mmWave TransceiverGouheng Zhao, Kai Ying, Qingsong Wen, Junwen Zhang, Lin Gui 0001. 1-5 [doi]
- Can Large Language Models Grasp Event Signals? Exploring Pure Zero-Shot Event-based RecognitionZongyou Yu, Qiang Qu, Xiaoming Chen, Chen Wang. 1-5 [doi]
- DLM-VMTL: A Double Layer Mapper For Heterogeneous Data Video Multi-Task Prompt LearningZeyi Bo, Ye Jin, Wuxi Sun. 1-5 [doi]
- TSCheater: Generating High-Quality Tibetan Adversarial Texts via Visual SimilarityXi Cao, Quzong Gesang, Yuan Sun, Nuo Qun, Tashi Nyima. 1-5 [doi]
- Retention Enhanced Cross-modal Attention for Multi-Hop VQAZijie Zhu, Feng Ding, Chenglong Chu, Fangming Zhong. 1-5 [doi]
- Music2Latent2: Audio Compression with Summary Embeddings and Autoregressive DecodingMarco Pasini, Stefan Lattner, György Fazekas. 1-5 [doi]
- Debiased Estimation for Cross-Domain Cold Start RecommendationFengxin Li, Hongyan Liu 0002, Jun He 0008, Xiaoyong Du 0001. 1-5 [doi]
- Supervisor Alignment Framework: Enhancing LLM Alignment with Query-Ignoring Strategy and Multi-Agent InteractionZiqun Bao, Yu Ji, Wen Wu 0006, Xi Chen, Liang He 0001. 1-5 [doi]
- Unbiased Multimodal Audio-to-Intent RecognitionQian Dong, Yuezhou Dong, Ke Qin, Guiduo Duan, Tao He 0007. 1-5 [doi]
- Dual Position Attention Time-Frequency Network for Binaural Audio SynthesisChangjun He, Weiping Chen, Mingjiang Wang. 1-5 [doi]
- Advancing Non-intrusive Suppression on Enhancement Distortion for Noise Robust ASRWei Wang 0374, Siyi Zhao, Yanmin Qian. 1-5 [doi]
- Stealthy Backdoor Attack against Video Recognition ModelsJiale Yan, Bo Zhao, Chunyu Yang. 1-5 [doi]
- No Data Required: Zero-Shot Domain Adaptation for Automatic Music TranscriptionAndrew Mcleod. 1-5 [doi]
- MDDNet: Multilevel Difference-Enhanced Denoise Network for Unsupervised Change Detection in SAR ImagesHe Zong, Erlei Zhang, Xinyu Li 0013, Hongming Zhang 0002, Jinchang Ren. 1-5 [doi]
- Gram: A Large-Scale General EEG Model for Raw Data Classification and Restoration TasksZiyi Li 0003, Wei-Long Zheng, Bao-Liang Lu. 1-5 [doi]
- Exploring Prediction Targets in Masked Pre-Training for Speech Foundation ModelsLi-Wei Chen, Takuya Higuchi, He Bai 0013, Ahmed Hussen Abdelaziz, Shinji Watanabe 0001, Alexander Rudnicky, Tatiana Likhomanenko, Barry-John Theobald, Zakaria Aldeneh. 1-5 [doi]
- RoRA: Efficient Fine-Tuning of LLM with Reliability Optimization for Rank AdaptationJun Liu, Zhenglun Kong, Peiyan Dong, Xuan Shen, Pu Zhao 0001, Hao Tang 0005, Geng Yuan, Wei Niu 0002, Wenbin Zhang 0002, Xue Lin 0001, Dong Huang, Yanzhi Wang. 1-5 [doi]
- Classifier-Guided Captioning Across ModalitiesAriel Shaulov, Tal Shaharabany, Eitan Shaar, Gal Chechik, Lior Wolf. 1-5 [doi]
- Individual Fairness for Fuzzy C-Means ClusteringZhijing Yang, Boyang Yan, Junjie Zheng, Yiding Tang, Chuan Qian, Hui Zhang 0055. 1-5 [doi]
- DDNet: Exploring Dual Dependencies for Long-Term Time Series ForecastingZihang Guo, Zijian Li, Zhiyong Yang, Zhenping Mou, Jieru Guo. 1-5 [doi]
- Decoding the Unintelligible: Neural Speech Tracking in Low Signal-to-Noise RatiosXiaomin He, Vinay S. Raghavan, Nima Mesgarani. 1-5 [doi]
- Adversarial Training and Gradient Optimization for Partially Deepfake Audio LocalizationSiding Zeng, Jiangyan Yi, Jianhua Tao 0001, Jiayi He, Zheng Lian, Shan Liang, Chuyuan Zhang, Yujie Chen, Xiaohui Zhang 0006. 1-5 [doi]
- Virtual Leader-based Safe Formation-Switching Control for Dense EnvironmentsAamna Zahid Piracha, Bernhard Rinner. 1-5 [doi]
- SSVEP-BiMA: Bifocal Masking Attention Leveraging Native and Symmetric-Antisymmetric Components for Robust SSVEP DecodingYuxin Liu, Zhenxi Song, Guoyang Xu, Zirui Wang, Feng Wan, Yong Hu, Min Zhang, Zhiguo Zhang 0001. 1-5 [doi]
- Exploring Meta Evidence for Prompt OptimizationYao He, Jiajun Liu, Wenjun Ke, Peng Wang, Congda Xiao, Yining Li. 1-5 [doi]
- Evaluating the security of public surrogate watermark detectorsChloé Imadache, Eva Giboulot, Teddy Furon. 1-5 [doi]
- Self-Supervised Learning for Detecting AI-Generated Faces as AnomaliesMian Zou, Baosheng Yu, Yibing Zhan, Kede Ma. 1-5 [doi]
- Sparse PCA with Oracle Rate in High DimensionsWenfu Zhong, Ziping Zhao 0002. 1-5 [doi]
- Multi-Relational Variational Contrastive Learning for Next POI RecommendationLin Chen, Peipei Wang, Xiaohui Han, Lijuan Xu 0001. 1-5 [doi]
- The Misspecified Cramér-Rao Bound for DOA Estimation with One-Bit Quantized DataNadav E. Rosenthal, Joseph Tabrikian. 1-5 [doi]
- Federated Smoothing ADMM for Quantile Regression with Non-Convex Sparse PenaltiesReza Mirzaeifard, Stefan Werner 0001. 1-5 [doi]
- Multi-Objective Representation based Dynamic Prototype Learning for Unsupervised DCE-MRI Breast Tumor SegmentationLi Wang, Lihui Wang, Lei Tang, Zi-Xiang Kuai, Jian Zhang, Hongjiang Wei. 1-5 [doi]
- Noise-Agnostic Multitask Whisper Training for Reducing False Alarm Errors in Call-for-Help DetectionMyeonghoon Ryu, June-Woo Kim, Minseok Oh, Suji Lee, Han Park. 1-5 [doi]
- Cross-lingual Evaluation Of Hypernasality Using Wav2Vec2 FeaturesKrupaben Kothadia, Vikram C. M., Ajish K. Abraham, Pushpavathi M, S. R. Mahadeva Prasanna, Nancy Scherer, Kathy Chapman, Julie Liss, Visar Berisha. 1-5 [doi]
- Flare-Aware RWKV for Flare RemovalWanying Zhang, Wei Shang 0001, Dongwei Ren, Wangmeng Zuo. 1-5 [doi]
- Sequential Contrastive Audio-Visual LearningIoannis Tsiamas, Santiago Pascual, Chunghsin Yeh, Joan Serrà. 1-5 [doi]
- Sound Source Distance Estimation Utilizing Physics-informed Prior for Sound Event Localization and DetectionNao Sato, Masahiro Yasuda, Shoichiro Saito, Noboru Harada. 1-5 [doi]
- MSEMG: Surface Electromyography Denoising with a Mamba-based Efficient NetworkYu-Tung Liu, Kuan-Chen Wang, Rong Chao, Sabato Marco Siniscalchi, Ping-Cheng Yeh, Yu Tsao 0001. 1-5 [doi]
- Instantaneous Trajectory Prediction via Latent Bidirectional Cooperative DiffusionKun Ma, Qilong Han, Jingzheng Yao, Changmao Wu, Chunrui Na. 1-5 [doi]
- On the Restricted Isometry Property of Kronecker-structured MatricesYanbin He, Geethu Joseph. 1-5 [doi]
- PlantPCC: Dual Sampling and Multi-level Geometry-aware Contrastive Regularization for Plant Point Cloud CompletionXiaomeng Li, Wenxu Wang, Haoxiang Sun, Yanhao Ding, Zhenbo Li. 1-5 [doi]
- Directional Source Separation for Robust Speech Recognition on Smart GlassesTianTian Feng, Ju Lin, Yiteng Huang, Weipeng He, Kaustubh Kalgaonkar, Niko Moritz, Li Wan, Xin Lei, Ming Sun, Frank Seide. 1-5 [doi]
- BDCKD: Unlocking the Power of Brownian Distance Covariance in Knowledge DistillationGuoming Lu, Heng Yin, Zhiyong Shu, Jielei Wang, Guangchun Luo. 1-5 [doi]
- Multi-Agent Decision Transformer for Power Control in Wireless NetworksYiming Zhang, Kun Yang, Cong Shen, Dongning Guo. 1-5 [doi]
- Fine-Grained Global Modeling Learning for Personalized Federated Sequential RecommenderYicheng Di, Hongjian Shi, Ruhui Ma, Yuan Liu, Weiyu Wang. 1-5 [doi]
- An Efficient Sample Utilization Method for Deep Learning Based on Class UncertaintyJinxin Huang, Qianhua He, Jiezhi Xu, Sam Kwong, Mingru Yang. 1-5 [doi]
- CAT-Net: A Co-Adaptive Transfer Learning Network for BCI-Assisted NeurorehabilitationShuailei Zhang, Yi Ding 0012, Muyun Jiang, Ning Tang, Effie Chew, Kai Keng Ang, Cuntai Guan. 1-5 [doi]
- Don't Lose Yourself: Boosting Multimodal Recommendation via Reducing Node-neighbor Discrepancy in Graph Convolutional NetworkZheyu Chen 0003, Jinfeng Xu 0003, Haibo Hu. 1-5 [doi]
- DCASI: A Sequence-based Attack Investigation Method Using DTW Contrastive LearningYun Li, Wei Qiao, Yan Zhu, Yunxiang Wang, Bo Jiang 0013, Zhigang Lu 0002. 1-5 [doi]
- DS-BTIAN: A Novel Deep-Shallow Bidirectional Transformer Interactive Attention Network for Multimodal Emotion RecognitionZengzhao Chen, Chuanxu Zhao, Zhifeng Wang 0001, Chuan Liu, Qiuyu Zheng, Cheng Zou. 1-5 [doi]
- A Dual-Perspective Metaphor Detection Framework Using Large Language ModelsYujie Lin, Jingyao Liu, Yan Gao, Ante Wang, Jinsong Su. 1-5 [doi]
- Mixed Spiking NeRF: Towards a More Efficient Neural Radiance FieldsKaian Wang, Longhao Zou. 1-5 [doi]
- Intra- and Inter-modal Context Interaction Modeling for Conversational Speech SynthesisZhenqi Jia, Rui Liu. 1-5 [doi]
- N3C: Towards Replay-based Novelty Continual Clustering with Class-OverlappingYan Zhang, Guoqiang Wu, Bingzheng Wang, Teng Pang, Yilong Yin. 1-5 [doi]
- Lenna: Language Enhanced Reasoning Detection AssistantFei Wei, Xinyu Zhang 0015, Ailing Zhang, Bo Zhang 0046, Xiangxiang Chu. 1-5 [doi]
- persoDA: Personalized Data Augmentation for Personalized ASRPablo Peso Parada, Spyros Fontalis, Md Asif Jalal, Karthikeyan Saravanan, Anastasios Drosou, Mete Ozay, Gil Ho Lee, Jungin Lee, Seokyeong Jung. 1-5 [doi]
- CroPrompt: Cross-task Interactive Prompting for Zero-shot Spoken Language UnderstandingLibo Qin 0001, Fuxuan Wei, Qiguang Chen, Jingxuan Zhou, Shijue Huang, Jiasheng Si, Wenpeng Lu, Wanxiang Che. 1-5 [doi]
- L2G: Head Gesture Animation Using an Emotion Guided Language ModelRishabh Agrawal, Apurva Narayan. 1-5 [doi]
- HyperSDT: HyperNetwork Slide Decision Tree for Interpretable Tabular LearningNan Hu, Xueqiong Li, Jun-Jie Huang, Zhenhua Liang, Shaowu Yang, Ji Wang. 1-5 [doi]
- Dynamic Language Group-based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical RoutingHukai Huang, Shenghui Lu, Yahui Shan, He Qu, Fengrun Zhang, Wenhao Guan, Qingyang Hong, Lin Li. 1-5 [doi]
- TGDrag: Adding Semantic Control into Point-based Image Editing via Text GuidanceChenhao Lin, Yanjie Zhu, Yingmao Miao, Zhengyu Zhao 0001, Shuai Liu 0016, Chao Shen 0001. 1-5 [doi]
- RAOCSL: A BERT-Based Strategy for Identifying Learner Confusion under Class ImbalanceTongyu Zhao, Jiaying Gao, Yu Feng, Yatong Zu, Adriano Tavares, Tiago Gomes, Sandro Pinto 0001, Hao Xu 0012. 1-5 [doi]
- FedDiffRec: A Module-wise Training Approach for Diffusion-Based Recommendation in Federated LearningGuohui Li, Lu Zhang, Qian Rong, Xuanang Ding, Ling Yuan. 1-5 [doi]
- PD-SDF: Dynamic Surface Reconstruction Based on Plane Decomposition for Single View RGB-D VideosJinwen Li, Weixing Xie, Junfeng Yao, Shaoqi Wu, Youhong Peng, Mengyuan Ge, Xiao Dong. 1-5 [doi]
- Joint Feature and Kernel Fusion for Improved Depth-Aware Panoptic SegmentationYulong Bai, Shu Tian, Xin Zhao, Xu-Cheng Yin. 1-5 [doi]
- A Prompt Learning Framework with Large Language Model Augmentation for Few-shot Multi-label Intent DetectionNing Zhuang, Xiao Wei, Junlei Li, Xiaobao Wang, Chenyang Wang, Longbiao Wang, Jianwu Dang 0001. 1-5 [doi]
- Automatic Labelling & Semantic Segmentation with 4D Radar TensorsBotao Sun, Ignacio Roldan, Francesco Fioranelli. 1-5 [doi]
- Translational Generative Retrieval via Potential Query GenerationYihan Guo, Tingwen Liu, Jiawei Sheng, Duohe Ma, Ming Sun, Ling Tian. 1-5 [doi]
- 2ViT: Finetuning-free Token Reduction for Dense Prediction Through a Refinement-Reactivation ArchitectureHaipeng Fang, Ziheng Wu, Xinyi Zou, Jun Huang, Juan Cao, Sheng Tang. 1-5 [doi]
- Enhancing Large Language Models on Domain-specific Tasks: A Novel Training Strategy via Domain Adaptation and Preference AlignmentJingyang Deng, Zeren Zhang, Jo-Ku Cheng, Jinwen Ma. 1-5 [doi]
- FashionFAE: Fine-grained Attributes Enhanced Fashion Vision-Language Pre-trainingJiale Huang, Dehong Gao, Jinxia Zhang, Zechao Zhan, Yang Hu, Xin Wang. 1-5 [doi]
- Towards Interactive Deepfake AnalysisLixiong Qin, Ning Jiang, Yang Zhang, Yuhan Qiu, Dingheng Zeng, Jiani Hu, Weihong Deng. 1-5 [doi]
- InjectTST: Injecting Global Information into Independent Channels for Long Time Series ForecastingCe Chi, Xing Wang, Kexin Yang, Zhiyan Song, Di Jin, Lin Zhu, Chao Deng, Junlan Feng. 1-5 [doi]
- WeightedKV: Attention Scores Weighted Key-Value Cache Merging for Large Language ModelsJian Yuan, Ziwei He, Haoli Bai, Jingwen Leng, Bo Jiang. 1-5 [doi]
- ViCo: A Multitask Video-enhanced and Cognition-preserving Modality Alignment Training FrameworkZhenda Yu, Jin Chen, Jiayu Shen, Lanxiang Zhou, Han Fang, Xianghao Zang, Chao Ban, Jingfeng Chen, Zhongjiang He, Hao Sun, Zerui Li, Yanmei Kang. 1-5 [doi]
- Efficient Co-clustering via Anchor-refined Label SpreadingFangyuan Xie, Feiping Nie 0001, Weizhong Yu, Xuelong Li 0001. 1-5 [doi]
- Fusion-OSR: Cross-Domain Contrastive Learning with Weibull Calibration for Time Series Open Set RecognitionShuguo Hu, Xudong Zhao, Shuwei Hu, Xuan Gao. 1-5 [doi]
- Leveraging Visual Captions for Enhanced Zero-Shot HOI DetectionYanqing Zeng, Yunyao Mao, Zhenbo Lu, Wengang Zhou 0001, Houqiang Li. 1-5 [doi]
- YOLO-TCT: An Effective Network For Long-Tailed Cervical Cell DetectionDi Lv, Lin Yi, Li Liu, Yuze Chen, Xin Chen, Ran Liu 0006. 1-5 [doi]
- MQAD: A Large-Scale Question Answering Dataset for Training Music Large Language ModelsZhihao Ouyang, Ju-Chiang Wang, Daiyu Zhang, Bin Chen, Shangjie Li, Quan Lin. 1-5 [doi]
- On the Design of Weakly-Convex Regularizers for Solving Linear Inverse ProblemsAbijith Jagannath Kamath, Abhishek Shreekant Bhandiwad, Chandra Sekhar Seelamantula. 1-5 [doi]
- Continuously Learning New Words in Automatic Speech RecognitionChristian Huber, Alexander Waibel. 1-5 [doi]
- HRTF Estimation using a Score-based PriorEtienne Thuillier, Jean-Marie Lemercier, Eloi Moliner, Timo Gerkmann, Vesa Välimäki. 1-5 [doi]
- Frequency-Domain Popularity Forecasting with Shape-Based RetrievalCanhua Guan, Zongxia Xie, Haoyu Wang, Haoyu Xing. 1-5 [doi]
- Generating Editable Head Avatars with 3D Gaussian GANsGuohao Li 0010, Hongyu Yang, Yifang Men, Di Huang 0001, Weixin Li 0001, Ruijie Yang, Yunhong Wang 0001. 1-5 [doi]
- PNP-RKD: A Positive-Negative Pair based Relational Knowledge Distillation Method for Cross-Domain Speaker VerificationQing Gu 0002, Yan Song 0001, Nan Jiang 0022, Pengfei Cai, Ian McLoughlin 0001. 1-5 [doi]
- Classification of Zhuang Dialect combined with Bert and SimAMMin Huang, Xuejun Zhang, Wenkang Chen. 1-5 [doi]
- Data Glove-based Personalized Continuous Gesture SegmentationLiufeng Fan, Zhan Zhang, Yiwei Wang, Decheng Zuo, Yinran Wang, Zhongyuan Chen. 1-5 [doi]
- Rethinking Camouflaged Object Detection via Foreground-Background Interactive LearningChenxi Zhang, Qing Zhang, Jiayun Wu. 1-5 [doi]
- Exploiting Beam-Split in IRS-aided Systems via OFDMAP. Siddhartha, L. Yashvanth, Chandra R. Murthy. 1-5 [doi]
- Instance-wise Feature Acquisition with Classifier Selection Option for Structured Data InstancesSachini Piyoni Ekanayake, Daphney-Stavroula Zois. 1-5 [doi]
- Harnessing Light Field Angular Cues and Spatial Geometries for Semantic SegmentationChen Jia, Fan Shi, Xu Cheng. 1-5 [doi]
- Open-Vocabulary Saliency-Guided Progressive Refinement Network for Unsupervised Video Object SegmentationZhidong Han, Shenglong Hu, Huihui Song, Kaihua Zhang 0001. 1-5 [doi]
- Data-Driven Mispronunciation Pattern Discovery for Robust Speech RecognitionAnna Seo Gyeong Choi, Jonghyeon Park, Myungwoo Oh. 1-5 [doi]
- Pre-training, Fine-tuning and Re-ranking: A Three-Stage Framework for Legal Question AnsweringShiwen Ni, Hao Cheng, Min Yang 0007. 1-5 [doi]
- Harnessing Dimensional Contrast and Information Compensation for Sentence Embedding EnhancementKang He, Yuzhe Ding, Bobo Li, Haining Wang, Fei Li 0021, Chong Teng, Donghong Ji. 1-5 [doi]
- HyperSF: A Hypergraph Representation Learning Method Based on Structural FusionXiangfei Fang, Chengying Huan, Boying Wang, Shaonan Ma, Heng Zhang, Chen Zhao. 1-5 [doi]
- Knowledge-Enhanced Poetry-Image Synthesis with Large Language ModelSimin Yang, Yuqing Li, Bin Wu. 1-5 [doi]
- Adaptive Canonical Correlation Analysis With Application to Time Synchronization for Signal AlignmentSpyridon Peppas, Nicholas D. Sidiropoulos. 1-5 [doi]
- Exploring Graph-aware Reasoning and Bidirectional Selection for Vision-Language NavigationDongming Zhou, Jinsheng Deng, Zhengbin Pang, Wei Li. 1-5 [doi]
- Audiogram-Informed End-to-End Noise Reduction and Wide Dynamic Range Compression for Hearing AidsHuiyong Zhang, Brian C. J. Moore, Lingling Dai, Fengyuan Hao, Xiaodong Li 0002, Chengshi Zheng. 1-5 [doi]
- Long-tailed Oracle Character Recognition Based on Convolutional Neural Networks and Vision TransformersZhongyuan Yang, Zhiwang Han, Alimjan Aysa, Ghalip Ibrahim, Kurban Ubul. 1-5 [doi]
- Robust Hybrid Beamforming for Integrated Sensing and Communications via Learned OptimizationLei Wang, Sergiy A. Vorobyov, Esa Ollila. 1-5 [doi]
- Signal Processing Challenges in Automotive RadarSandeep Rao, Rajan Narasimha, Shunqiao Sun. 1-5 [doi]
- GSMM: Efficient Global Sparsification for Resource-Conscious Multimodal ModelsWenlun Zhang, Haoran Pang, Yucai Zhou, Shixiao Wang, Luking Li. 1-5 [doi]
- Zero-resource Speech Translation and Recognition with LLMsKarel Mundnich, Xing Niu 0001, Prashant Mathur, Srikanth Ronanki, Brady Houston, Veera Raghavendra Elluru, Nilaksh Das, Zejiang Hou, Goeric Huybrechts, Anshu Bhatia, Daniel Garcia-Romero, Kyu J. Han, Katrin Kirchhoff. 1-5 [doi]
- Voice Conversion for Low-Resource Languages via Knowledge Transfer and Domain-Adversarial TrainingHuu Tuong Tu, Luong Thanh Long, Vu Huan, Nguyen Thi Phuong Thao, Nguyen Van Thang, Nguyen Tien Cuong, Nguyen Thi Thu Trang. 1-5 [doi]
- Trick-GS: A Balanced Bag of Tricks for Efficient Gaussian SplattingAnil Armagan, Albert Saà-Garriga, Bruno Manganelli, Mateusz Nowak 0002, Mehmet Kerim Yucel. 1-5 [doi]
- Audio Sparse-Transformer for Speech ClassificationHassan Salami Kavaki, Michael I. Mandel. 1-5 [doi]
- Fast Word Error Rate Estimation Using Self-Supervised Representations for Speech and TextChanho Park, Chengsong Lu, Mingjie Chen, Thomas Hain. 1-5 [doi]
- Polygon Pixel IoU: Similarity Metric between Polygons with Different Number of Vertices for Arbitrary-Shaped Text SpottingRei Endo, Taro Miyazaki, Takahiro Mochizuki, Yoshihiko Kawai. 1-5 [doi]
- Dysarthric Speech Conformer: Adaptation for Sequence-to-Sequence Dysarthric Speech RecognitionQianli Wang, Zihan Zhong, Satwinder Singh, Clarion Mendes, Mark Hasegawa-Johnson, Waleed Abdulla, Seyed Reza Shahamiri. 1-5 [doi]
- A decade of DCASE: Achievements, practices, evaluations and future challengesAnnamaria Mesaros, Romain Serizel, Toni Heittola, Tuomas Virtanen, Mark D. Plumbley. 1-5 [doi]
- Enhancing Teacher Classroom Behavior Descriptions: A Spatio-Temporal Graph-Based Method for Video CaptioningTing Cai, Chengyang He, Yu Xiong. 1-5 [doi]
- Investigating Factors Related to the Naturalness of Synthesized Unison SingingKaito Nishizawa, Ryuichi Yamamoto, Wen-Chin Huang, Tomoki Toda. 1-5 [doi]
- MTE: Multi Transformation of Entities in Quaternion Vector Space for Temporal Knowledge Graph CompletionRuiguo Yu, Jiazheng Guo, Mankun Zhao, Tianyi Xu, Jiujiang Guo, Wenbin Zhang 0010, Mei Yu 0004. 1-5 [doi]
- Inter-Frame Skip Coding Mode For Point Cloud Geometry Compression in Solid G-PCCRen Huang, Wei Zhang, Jiangxue Han, Fuzheng Yang. 1-5 [doi]
- Hybrid Content Caching Empowered By AIGC in Wireless NetworksDing Xu 0001, Lingjie Duan, Hongbo Zhu 0002. 1-5 [doi]
- Adaptive Aspect Ratios with Patch-Mixup-ViT-based Vehicle ReIDMei Qiu, Lauren Ann Christopher, Stanley Y. P. Chien, Lingxi Li 0001. 1-5 [doi]
- Curriculum Learning aided Audio-Visual Speech Recognition with Arbitrary Speaker NumberYuxiao Lin, Tao Jin 0004, Xize Cheng, Zhou Zhao 0001, Fei Wu 0001. 1-5 [doi]
- Integrating Potential Pronunciations for Enhanced Mispronunciation Detection and Diagnosis Ability in LLMsMinglin Wu, Jing Xu, Xueyuan Chen, Helen Meng. 1-5 [doi]
- Two-Stream Spiking Neural Network for Event-based Action RecognitionShuang Lian, Qianhui Liu, Ziling Wang, Jia Su, ZhiBin Zuo, Yi Zhang, Rui Yan, Huajin Tang. 1-5 [doi]
- MAP: Supporting Multimodal Knowledge Graph Completion via Augmented Modality Alignment and Instance PreservingYi Li, Qingmeng Zhu, Fei Song, Changwen Zheng, Jiangmeng Li. 1-5 [doi]
- SEHAP: Secure and Efficient Handover Authentication Protocol in LEO Satellite Non-Terrestrial NetworksYunchuan Guo, Jing Wang, Kui Geng, Zifu Li, FengHua Li, Liang Fang. 1-5 [doi]
- Evaluating Contrastive Methodologies for Music Representation Learning Using Playlist DataGregor Meehan, Johan Pauwels. 1-5 [doi]
- Semantic Residual for Multimodal Unified Discrete RepresentationHai Huang 0013, Shulei Wang, Yan Xia 0006. 1-5 [doi]
- Reenvisioning Skeleton-based Action Recognition Through the Lens of NLPLong Cao, Shuo Huai, Jingyao Gai. 1-5 [doi]
- Multi-Point Positional Insertion Tuning for Small Object DetectionKanoko Goto, Takumi Karasawa, Takumi Hirose, Rei Kawakami, Nakamasa Inoue. 1-5 [doi]
- Metric Learning with Progressive Self-Distillation for Audio-Visual Embedding LearningDonghuo Zeng, Kazushi Ikeda. 1-5 [doi]
- Fast Adaptation of Pretrained Speaker Verification System for Source Speaker TrackingXiang Lyu, Yuxuan Wang, Tianyu Zhao, Huadai Liu. 1-2 [doi]
- XLSR-Transducer: Streaming ASR for Self-Supervised Pretrained ModelsShashi Kumar, Srikanth R. Madikeri, Juan Zuluaga-Gomez, Esaú Villatoro-Tello, Iuliia Thorbecke, Petr Motlícek, Manjunath K. E, Aravind Ganapathiraju. 1-5 [doi]
- Multi-Source Multi-Target Domain Similarity Network for Cross-Cultural EEG Emotion RecognitionHaiqing Hu, Hanwen Shi, Bao-Liang Lu, Wei-Long Zheng. 1-5 [doi]
- Fast Sparse Learning from Streaming Data with LASSOMarija Iloska, Petar M. Djuric, Mónica F. Bugallo. 1-5 [doi]
- Towards Precision Characterization of Communication Disorders using Models of Perceived Pragmatic SimilarityNigel G. Ward, Andres Segura, Georgina Bugarini, Heike Lehnert-LeHouillier, Dancheng Liu, Jinjun Xiong, Olac Fuentes. 1-5 [doi]
- Wavelet Scattering Network Features for Intensity Category Classification and Prediction of SPL from SpeechManila Kodali, Sudarsana Reddy Kadiri, Shrikanth Narayanan, Paavo Alku. 1-5 [doi]
- Bootstrapping LLM-based Fact-checking via Iterative Rationalization FinetuningXiucheng Lyu, Chengyu Cao, Mingwei Sun, Bin Liang 0004, Liang Yao, Ruifeng Xu 0001. 1-5 [doi]
- Temporally Aligned Audio for Video with AutoregressionIlpo Viertola, Vladimir Iashin, Esa Rahtu. 1-5 [doi]
- SSRMamba: Efficient Visual State Space Model for Spectral Super-ResolutionBaisong Li, Xingwang Wang, Haixiao Xu. 1-5 [doi]
- MAID: Model Attribution via Inverse DiffusionLuyu Zhu, Kai Ye, Jiayu Yao, Chenxi Li, Luwen Zhao, Yuxin Cao, Derui Wang, Jie Hao 0001. 1-5 [doi]
- MetricGAN+KAN: Kolmogorov-Arnold Networks in Metric-Driven Speech Enhancement SystemsYemin Mai, Stefan Goetze. 1-5 [doi]
- A Clinical Knowledge-Driven Fine-Tuning Strategy for Applying Foundation Model to Fully Automatic Acute Ischemic Stroke Lesion Segmentation on Non-Contrast CT ScansXianzhen Tan, Zhe Qu, Jie Wang 0067, Hulin Kuang. 1-5 [doi]
- †Jiangnan Xia, Qilong Wu, Yanyin Guo, Yi Li, Jianghan Cheng, Junwei Li, Zhiyuan Zhang. 1-5 [doi]
- Same Semantics of the Signal - What Do We Cluster with what RepresentationAlexander Barnhill, Oliver Traub, Andreas K. Maier, Elmar Nöth, Christian Bergler. 1-5 [doi]
- Try Before You Buy: Solving Multi-Model Complex Tasks by Model CompetitionsYongqiang Zhao, Zhenyu Li, Zhi Jin, Feng Zhang, Lianwei Wu, Xinhai Xu, Donghong Liu. 1-5 [doi]
- Semantic Hierarchical Prompt Tuning for Parameter-Efficient Fine-TuningHaowei Zhu, Fangyuan Zhang, Rui Qin, Tianxiang Pan, Jun-Hai Yong, Bin Wang 0021. 1-5 [doi]
- Semantic Data Augmentation for Few-Shot Biomedical Named Entity RecognitionYing Zhang, Weihua Wang. 1-5 [doi]
- Generative Adversarial Network with Structured Semantic Prompts Constrainting Clip for Text-to-ImageShuheng Ge, Li Zhang, Haoyu Xing, Xiangqian Wu. 1-5 [doi]
- Prompt Crossing: Evaluating Whether LLM Response Stem from Jailbreak or Normal PromptKyungHo Kim, Jaejin Seo, Seongmin Park 0004, Jihwa Lee. 1-5 [doi]
- Functional Near-Infrared Spectroscopy Feature Extraction with Application in Workload EstimationElisabeth R. M. Heremans, David Johnston, Dimitra Emmanouilidou, Andre Golard, Ivan Tashev, Ryen White. 1-5 [doi]
- Exploring Text-Queried Sound Event Detection with Audio Source SeparationHan Yin, Jisheng Bai, Yang Xiao, Hui Wang 0030, Siqi Zheng, Yafeng Chen, Rohan Kumar Das, Chong Deng, Jianfeng Chen. 1-5 [doi]
- Security-Enhanced Data Transmission Scheme for IoT-Based Healthcare in Remote AreasZhenbin Guo, Yuchuan Luo, Shaojing Fu, Ming Xu 0002. 1-5 [doi]
- Self-Prompting Polyp Segmentation in Colonoscopy Using Hybrid YOLO-SAM2 ModelMobina Mansoori, Sajjad Shahabodini, Jamshid Abouei, Konstantinos N. Plataniotis, Arash Mohammadi 0001. 1-5 [doi]
- Exploring Kolmogorov-Arnold networks for realistic image sharpness assessmentShaode Yu, Ze Chen, Zhimu Yang, Jiacheng Gu, Bizu Feng, Qiurui Sun. 1-5 [doi]
- Online Learning from Strategic Human Feedback in LLM Fine-TuningShugang Hao, Lingjie Duan. 1-5 [doi]
- Dual Encoders for Diffusion-based Image InpaintingDezhi Zheng, Kaijun Deng, Jinbao Wang, LinLin Shen. 1-5 [doi]
- Convolutional Sparse Coding with Multipath Orthogonal Matching PursuitYanis Gomes, Charles Truong, Jean-Philippe Saut, Fikri Hafid, Pascale Prieur, Laurent Oudre. 1-5 [doi]
- Curriculum Contrastive Learning for Aspect-based Sentiment AnalysisZhongQuan Jian, Daihang Wu, Xiangjian Zeng, Junfeng Yao, Meihong Wang, Qingqiang Wu 0001. 1-5 [doi]
- Transformer Based Multi-view Learning for Integrating Static and Dynamic Complementarity of Brain FunctionShengbing Pei, Yan Wang, Zhao Lv, Chao Zhang. 1-5 [doi]
- ASCDomain: Domain Invariant Device-Adversarial Isotropic Knowledge Distillation Convolutional Neural ArchitectureHubert Truchan, Tien Hung Ngo, Zahra Ahmadi. 1-5 [doi]
- ConvexECG: Lightweight and Explainable Neural Networks for Personalized, Continuous Cardiac MonitoringRayan Ansari, John Cao, Sabyasachi Bandyopadhyay, Sanjiv M. Narayan, Albert J. Rogers, Mert Pilanci. 1-5 [doi]
- Efficient Multi-branch Black-box Semantic-aware Targeted Attack Against Deep Hashing RetrievalChihan Huang, Xiaobo Shen 0001. 1-5 [doi]
- Global Context MambaVision for EEG-based Emotion RecognitionHao Wang, Li Xu, Yuntao Yu, Weiyue Ding, Yiming Xu. 1-5 [doi]
- Linear Time Complexity Conformers with SummaryMixing for Streaming Speech RecognitionTitouan Parcollet, Rogier van Dalen, Shucong Zhang, Sourav Bhattacharya. 1-5 [doi]
- Mask-Weighted Spatial Likelihood Coding for Speaker-Independent Joint Localization and Mask EstimationJakob Kienegger, Alina Mannanova, Timo Gerkmann. 1-5 [doi]
- Stable Test-Time Training for Semantic Segmentation with Output Contrastive LossYunLong Zhang, Zhongyi Shui, Honglin Li 0001, Yuxuan Sun 0002, Chenglu Zhu, Lin Yang 0002. 1-5 [doi]
- Dynamic ROI Adaptation for Accurate Non-Contact Heart Rate Estimation Using VGG-13 based Encoder-Decoder Model and Facial LandmarksAravind A Anil, Srinivasa Karthik, Mohanasankar Sivaprakasam, Jayaraj Joseph. 1-5 [doi]
- STR-Saliency: Decomposition-based Perturbations to Generate Saliency Maps for Temporal Black-box Model InterpretationRunkai Chen, Yuedi Chen, Haozhe Xu, Xiaopeng Guo, Jun Sun 0012. 1-5 [doi]
- Coarse-to-Fine Text-to-Music Latent DiffusionLuca A. Lanzendörfer, Tongyu Lu, Nathanaël Perraudin, Dorien Herremans, Roger Wattenhofer. 1-5 [doi]
- Enhancing Listened Speech Decoding from EEG via Parallel Phoneme Sequence PredictionJihwan Lee, TianTian Feng, Aditya Kommineni, Sudarsana Reddy Kadiri, Shrikanth Narayanan. 1-5 [doi]
- LMTalker: Sparse Landmark-guided Gaussian Splatting for High-fidelity Talking Head SynthesisZhifeng Xie, Zhiwen Jiang, Xuemin Lei, Mengtian Li. 1-5 [doi]
- Projection Valued-based Quantum Machine Learning Adapting to Differential Privacy Algorithm for Word-level LipreadingHang Chen 0001, Chang Wang, Jun Du 0002, Chao-Han Huck Yang, Jun Qi 0002. 1-5 [doi]
- ZipEnhancer: Dual-Path Down-Up Sampling-based Zipformer for Monaural Speech EnhancementHaoxu Wang, Biao Tian. 1-5 [doi]
- LLDB: Efficient Low-Light Image Enhancement with Diffusion BridgeJunlong Ma, Conghan Yue, Zhengwei Peng, Dongyu Zhang 0002. 1-5 [doi]
- DSDN-Net: An Effective Network for Semantic Segmentation in Open-Pit Coal Mining Areas for Land Cover RecognitionJiaqi Li, Ming Ma 0006. 1-5 [doi]
- Improvements of Discriminative Feature Space Training for Anomalous Sound Detection in Unlabeled ConditionsTakuya Fujimura, Ibuki Kuroyanagi, Tomoki Toda. 1-5 [doi]
- ASR Benchmarking: Need for a More Representative Conversational DatasetGaurav Maheshwari 0005, Dmitry Ivanov, Théo Johannet, Kevin El Haddad. 1-5 [doi]
- DDSP Guitar Amp: Interpretable Guitar Amplifier ModelingYen-Tung Yeh, Yu-Hua Chen, Yuan-Chiao Cheng, Jui-Te Wu, Jun-Jie Fu, Yi-Fan Yeh, Yi-Hsuan Yang. 1-5 [doi]
- Emotional Knowledge Self-Distillation in DialogueZhongQuan Jian, Yancheng Wang, Weichao Wu, Junfeng Yao, Meihong Wang, Qingqiang Wu 0001. 1-5 [doi]
- Active Learning for Long-Tailed AnnotationLin Geng, Ningzhong Liu, Han Sun, Jie Qin. 1-5 [doi]
- Semantic-Guided Gaussian Splatting with Deferred RenderingNan Wang, Xiaohan Yan, Xiaowei Song, Zhicheng Wang. 1-5 [doi]
- Sparse-View X-ray 3D Reconstruction using Hybrid Representation Neural Attenuation FieldsYanping Fu, Hao Geng, Zhuangzhuang Zhao, Shaojie Zhang, Haifeng Zhao 0001. 1-5 [doi]
- A Unified Hardware Accelerator for Fast Fourier Transform and Number Theoretic TransformRishabh Shrivastava, Chaitanya Prasad Ratnala, Durga Manasa Puli, Utsav Banerjee. 1-5 [doi]
- NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing TasksHe Huang, Taejin Park, Kunal Dhawan, Ivan Medennikov, Krishna C. Puvvada, Nithin Rao Koluguri, Weiqing Wang, Jagadeesh Balam, Boris Ginsburg. 1-5 [doi]
- MoWE-Audio: Multitask AudioLLMs with Mixture of Weak EncodersWenyu Zhang, Shuo Sun, Bin Wang 0040, Xunlong Zou, Zhuohan Liu, Yingxu He, Geyu Lin, Nancy F. Chen, Ai Ti Aw. 1-5 [doi]
- Acoustic Identification of Individual Animals with Hierarchical Contrastive LearningInês Nolasco, Ilyass Moummad, Dan Stowell, Emmanouil Benetos. 1-5 [doi]
- Design of Robust Differential Beamformers with Microphone Arrays of Arbitrary Planar GeometryKunlong Zhao, Xueqin Luo, Jilu Jin, Gongping Huang, Jingdong Chen, Jacob Benesty. 1-5 [doi]
- Robust Seizure Prediction Based on Riemannian Manifold Enhanced Denoising Adversarial AutoencoderPeizhen Peng, Yansheng Wu, Wanqi Yang, Wenxin Wei, Ming Yang. 1-5 [doi]
- FABLE: A Bundle Method For Federated Learning In Wireless SystemsDaniel Cederberg, Erik G. Larsson, Mikael Johansson 0001. 1-5 [doi]
- An Interactive Evaluation Framework for Empathetic Response GenerationXixi Lei, Changqun Li, Liang He 0001, Xin Lin 0001. 1-5 [doi]
- SepNet: Deep Convolutional Neural Network for Specific Emitter Identification with High AccuracyRong Wang 0006, Xu Zhuang, Weixi Zhou, Zhicheng Dong. 1-5 [doi]
- CrossSleep: Multi-Scale Attention with Cross-Time Learning for Single Channel EEG-Based Sleep StagingJingchuan Lu, Jinlong Yang 0002. 1-5 [doi]
- Efficient Dataset Distillation through Low-Rank Space SamplingHangyang Kong, Wenbo Zhou, Xuxiang He, Xiaotong Tu, Xinghao Ding. 1-5 [doi]
- Towards Adversarial Robustness And Backdoor Mitigation in SSLAryan Satpathy, Nilaksh Nilaksh, Dhruva Abhijit Rajwade, Somesh Kumar. 1-5 [doi]
- Promoting PLM Fine-Tuning through Consistency Adversarial TrainingJianqi Gao 0001, Jian Cao 0001, Jinghua Tang. 1-5 [doi]
- Exploring the Implicit Semantic Ability of Multimodal Large Language Models: A Pilot Study on Entity Set ExpansionHebin Wang, Yangning Li, Yinghui Li, Hai-Tao Zheng 0002, Wenhao Jiang, Hong-Gee Kim. 1-5 [doi]
- Enhancing Unsupervised Acoustic Word Embedding with Visual-Grounded Speech Model and Novel Word-level ABX Evaluation SchemesMau Nguyen, Shinobu Hasegawa, Sakriani Sakti. 1-5 [doi]
- Locally Correctable LatticesHaodong Yang, Venkata Gandikota. 1-5 [doi]
- Redesigning graph filter-based GNNs to relax the homophily assumptionSamuel Rey, Madeline Navarro, Victor M. Tenorio, Santiago Segarra, Antonio G. Marques. 1-5 [doi]
- Advancing High-Resolution and Efficient Automotive Radar Imaging through Domain-Informed 1D Deep LearningRuxin Zheng, Shunqiao Sun, Hongshan Liu, Holger Caesar, Honglei Chen, Jian Li. 1-5 [doi]
- Standardization Status of MPEG Video-based Dynamic Mesh Coding (V-DMC)Wenjie Zou, Shizhuo Zhang, FuZheng Yang 0001, Marius Preda. 1-5 [doi]
- L3D-Pose: Lifting Pose for 3D Avatars from a Single Camera in the WildSoumyaratna Debnath, Harish Katti, Shashikant Verma, Shanmuganathan Raman. 1-5 [doi]
- MMFN: Multi-Feature Multi-Modal Fusion Network for Diagnosis of Superficial Lymph Node DiseaseYuankun Wang, Cheng Zhao 0003, Yingxin Liu, Baiying Lei, Tianfu Wang 0001, Luyao Zhou. 1-5 [doi]
- Impairments are Clustered in Latents of Deep Neural Network-based Speech Quality ModelsFredrik Cumlin, Xinyu Liang, Victor Ungureanu, Chandan K. A. Reddy, Christian Schüldt, Saikat Chatterjee. 1-5 [doi]
- Efficient Supernet Training with Orthogonal Softmax for Scalable ASR Model CompressionJingjing Xu 0002, Eugen Beck, Zijian Yang, Ralf Schlüter. 1-5 [doi]
- MACA: Multi-Anchor Classification Approach for Unsupervised Domain AdaptationDexuan Zhao, Chong Zhao, Taizhang Hu, Xing Wei, Fan Yang, Yang Lu. 1-5 [doi]
- Time-independent Spiking Neuron via Membrane Potential Estimation for Efficient Spiking Neural NetworksHanqi Chen, Lixing Yu, Shaojie Zhan, Penghui Yao, Jiankun Shao. 1-5 [doi]
- Leveraging Heterophily in Spatial-Temporal Graphs for Multivariate Time-Series ForecastingYuxin Chen, Fangru Lin, Jingyi Huo, Hui Yan. 1-5 [doi]
- DiGradPatch: Black-Box Patch Attacks via Diffusion-Based Double Gradient and Sensitive Distribution GuidanceYang Wu, Jing Liu 0003. 1-5 [doi]
- LIMMITS'25: Multilingual Streaming TTS With Neural Codecs for Indian LanguagesPhilipp Olbrich, Hema A. Murthy, Pranaw Kumar, Shinji Watanabe, Sheng Zhao, Mark Hasegawa-Johnson. 1-2 [doi]
- DCCT-Net: A Network Combined Dynamic CNN and Transformer for Image Compressive SensingLijuan Xu 0001, Haixiao Mei, Fenghua Tong, Dawei Zhao 0001, Fuqiang Yu. 1-5 [doi]
- Conditional Convolutions for End-to-End Single-Stage Video Text DetectionXiaoge Song, Danhuai Zhao, Wei Zhu, Kang Zheng, Tong Lu. 1-5 [doi]
- Faster Speech-LLaMA Inference with Multi-token PredictionDesh Raj, Gil Keren, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli. 1-5 [doi]
- Image Compressive Sensing With Adaptive Sampling by Median FilteringYanfeng Wu, Chen hui, Ronghua Liao, Shaohui Liu, Debin Zhao. 1-5 [doi]
- Scaling A Simple Approach to Zero-Shot Speech RecognitionJinming Zhao, Vineel Pratap, Michael Auli. 1-5 [doi]
- Personalizing Keyword Spotting with Speaker InformationBeltrán Labrador, Pai Zhu, Guanlong Zhao, Angelo Scorza Scarpati, Quan Wang, Alicia Lozano-Diez, Ignacio López-Moreno. 1-5 [doi]
- LPBS: A RL-PPO Driven K8S Batch Processing Task SchedulerShengchao Yuan, Xiaoxin Bai. 1-5 [doi]
- InvGS: a Novel Real-Time Inverse Rendering Framework Utilizing 3D Gaussian SplattingYang Pu, Qingfeng Wu. 1-5 [doi]
- Gaze-Assisted Human-Centric Domain Adaptation for Cardiac Ultrasound Image SegmentationRuiyi Li, Yuting He, Rongjun Ge, Chong Wang, Daoqiang Zhang, Yang Chen 0008, Shuo Li 0001. 1-5 [doi]
- Serial Local Patterns and Irregular Dependencies Extract and Cascaded Fusion Network for Structural Crack SegmentationHui Liu, Chen Jia, Xu Cheng 0003, Xiufeng Liu, Fan Shi 0001. 1-5 [doi]
- Improving GAN Performance Using Confidence-Aware DiscriminationJinfeng Wu, Wu Shi. 1-5 [doi]
- The Sound of Water: Inferring Physical Properties from Pouring LiquidsPiyush Bagad, Makarand Tapaswi, Cees G. M. Snoek, Andrew Zisserman. 1-5 [doi]
- Cross-Domain Few-Shot Open-Set Keyword Spotting Using Keyword Adaptation and Prototype ReprojectionMingru Yang, Qianhua He, Jinxin Huang, Yongqiang Chen, Zunxian Liu, Yanxiong Li. 1-5 [doi]
- DRDM: A Disentangled Representations Diffusion Model for Synthesizing Realistic Person ImagesEnbo Huang, Yuan Zhang, Faliang Huang, Guangyu Zhang, Yang Liu. 1-5 [doi]
- Sketch-based Point Cloud Generation with Diffusion Model and Pre-training EnhancementYangdong Chen, Mohan Chen 0001, Yuejie Zhang, Rui Feng 0001, Tao Zhang, Shang Gao. 1-5 [doi]
- OpenACE: An Open Benchmark for Evaluating Audio Coding PerformanceJozef Coldenhoff, Niclas Granqvist, Milos Cernak. 1-5 [doi]
- BrainVis: Exploring the Bridge between Brain and Visual Signals via Image ReconstructionHonghao Fu, Hao Wang 0094, Jing Jih Chin, Zhiqi Shen 0001. 1-5 [doi]
- Generalized Graph Signal Reconstruction via the Uncertainty PrincipleYanan Zhao, Xingchao Jian, Feng Ji, Wee-Peng Tay, Antonio Ortega. 1-5 [doi]
- LLM based Text Generation for Improved Low-resource Speech Recognition ModelsTohru Nagano, Gakuto Kurata, Samuel Thomas 0001, Hong-Kwang Jeff Kuo, Daniel Bolaños, Hyun Jung, George Saon. 1-5 [doi]
- Bernoulli-Gaussian Scale Mixture Model and BP Method for Multi-Snapshot Sparse Signal RecoveryShaoxiu Wei, Mingchao Liang, Bhaskar Rao, Florian Meyer. 1-5 [doi]
- A Weighted Cross-entropy Loss for Mitigating LLM Hallucinations in Cross-lingual Continual PretrainingYuantao Fan, Ruifan Li, Guangwei Zhang 0003, Chuan Shi, Xiaojie Wang 0006. 1-5 [doi]
- Convergence Analysis of alpha-SVRG under Strong ConvexitySean Xiao, Sangwoo Park, Stefan Vlaski. 1-5 [doi]
- Multi-label Recognition under Noisy Supervision: A Confusion Mixture Modeling ApproachDiego Linares Gonzalez, Shahana Ibrahim. 1-5 [doi]
- TransPathNet: A Novel Two-Stage Framework for Indoor Radio Map PredictionXin Li, Ran Liu, Saihua Xu, Sirajudeen Gulam Razul, Chau Yuen. 1-2 [doi]
- 3D Mesh Saliency Based on Dictionary Learning with Multi-Level Laplacian-Beltrami OperatorYu Wang, Xingce Wang, Zhongke Wu, Haichuan Zhao. 1-5 [doi]
- Can Automated Speech Recognition Errors Provide Valuable Clues for Alzheimer's Disease Detection?Yin-Long Liu, Rui Feng, Ye-Xin Lu, Jia-xin Chen, Yang Ai, Jia-Hong Yuan, Zhen-Hua Ling. 1-5 [doi]
- Towards Sub-millisecond Latency Real-Time Speech Enhancement Models on HearablesArtem Dementyev, Chandan K. A. Reddy, Scott Wisdom, Navin Chatlani, John R. Hershey, Richard F. Lyon. 1-5 [doi]
- Label-constrained Unsupervised Domain Adaptation for Semantic Segmentation with Diffusion ModelsAlexandre Stenger, Étienne Baudrier, Benoît Naegel, Nicolas Passat. 1-5 [doi]
- USV-AUV Collaboration Framework for Underwater Tasks under Extreme Sea ConditionsJingzehua Xu, Guanwen Xie, Xinqi Wang, Yimian Ding, Shuai Zhang. 1-5 [doi]
- Topology-Informed Pre-training of Graph Neural NetworksPeiyu Liang, Yulia R. Gel, Yuzhou Chen. 1-5 [doi]
- Evaluating the Posterior Sampling Ability of Plug&Play Diffusion Methods in Sparse-View CTLiam Moroy, Guillaume Bourmaud, Frédéric Champagnat, Jean-François Giovannelli. 1-5 [doi]
- LipGen: Viseme-Guided Lip Video Generation for Enhancing Visual Speech RecognitionBowen Hao, Dongliang Zhou, Xiaojie Li, Xingyu Zhang, Liang Xie 0012, Jianlong Wu, Erwei Yin. 1-5 [doi]
- SPED: A Sight-singing Dataset for Performance EvaluationYan Zhang, Jie Luo, Tianrui Li, Wei Xu. 1-5 [doi]
- One-Shot Face Avatar Generation in a Single Forward Pass with Identity PreservationYingmao Miao, Chenhao Lin, Zhengyu Zhao 0001, Hang Wang, Shuai Liu 0016, Chao Shen 0001, Xiaohong Guan. 1-5 [doi]
- Sociologically-Informed Graph Neural Network for Opinion PredictionFan Yang, Jie Bai, Linjing Li, Daniel Dajun Zeng. 1-5 [doi]
- Sign-Mamba: Advanced Mamba-Based Sign Language GenerationGuanwen Feng, Yilin Zhang, Yunan Li 0001, An Liu, Qiguang Miao. 1-5 [doi]
- Domain-aware Node Representation Learning for Graph Out-of-Distribution GeneralizationYi Qiao, Yang Liu, Qing He, Xiang Ao. 1-5 [doi]
- Watermarking Datasets for LLM Fine-tuningJing Qiu, Xi Yang, Shuai Li, Kejiang Chen, Weiming Zhang 0001, Nenghai Yu. 1-5 [doi]
- KIKE: Linguistic Steganalysis Based on Knowledge Infusion and Knowledge EncodingZhuang Wang, Xuekai Chen, Zhongliang Yang, Linna Zhou. 1-5 [doi]
- Decoupled Feature Matching for Few-shot Counting and LocalizationChao Zhai, Fan Zhang 0068, Wenyu Chen 0001, Malu Zhang, Fan Li, Xuanting Xie. 1-5 [doi]
- Semi-Supervised Multilingual Alignment with Lexical Memory for Massively Parallel Text MiningWeitai Zhang, Peiwang Tang, Chao Lin, Simran Naagar, Zhongyi Ye, Junhua Liu. 1-5 [doi]
- Multiple Sclerosis Detection with Reinforcement Learning and Differential EvolutionJing Yang, Jin Yang, Chenwei Wu, Gaozhe Jiang, Yaning Lv, Bingyan Liu, Feng Xu. 1-5 [doi]
- Some Intriguing Observations on the Learnt Matrices in Deep Unfolded NetworksKartheek Kumar Reddy Nareddy, Inbasekaran Perumal, Chandra Sekhar Seelamantula. 1-5 [doi]
- BlurPaint: Image Inpainting using Blurring Diffusion ModelsLinxu Chen, Zhiqing Guo, Liejun Wang, Ke Lu. 1-5 [doi]
- Comprehensive Perturbation Consistency for Semi-Supervised Change Detection in Remote Sensing ImagesZan Mao, Xin Li, Ze Luo, Yingjuan Tang, Dongmei Jiang. 1-5 [doi]
- SSAAD: A Multi-Scale Temporal-Frequency Graph Network for Binary Auditory Attention Detection with Self-Supervised LearningShuai Huang, Yongxiong Wang, Huan Luo, Shuwen Jia, Han Chen, Chendong Qin, Zhongcai He, Rui Luo. 1-5 [doi]
- LiRCDepth: Lightweight Radar-Camera Depth Estimation via Knowledge Distillation and Uncertainty GuidanceHuawei Sun, Nastassia Vysotskaya, Tobias Sukianto, Hao Feng, Julius Ott, Xiangyuan Peng, Lorenzo Servadei, Robert Wille. 1-5 [doi]
- APLASE: Compression using Adaptive Piecewise Linear Approximation and Sparse EncodingRavi Raj Saxena, Prabhakar T. V 0001, Joy Kuri, Chandra R. Murthy. 1-5 [doi]
- An Automatic Extrinsic Calibration Method for LiDAR-Camera Fusion via Combining Semantic and Geometric FeaturesMinqian Wang, Libo Weng, Fei Gao 0014. 1-5 [doi]
- MotionFlow: Joint Motion Priors and Appearance Enhancement for High-Accuracy Optical Flow EstimationZixu Wang, Congxuan Zhang, Zhen Chen 0004, Hongye Chen, Liyue Ge, Ke Lu 0002. 1-5 [doi]
- EagerLog: Active Learning Enhanced Retrieval Augmented Generation for Log-based Anomaly DetectionChiming Duan, Tong Jia, Yong Yang, Guiyang Liu, Jinbu Liu, Huxing Zhang, Qi Zhou, Ying Li 0012, Gang Huang 0001. 1-5 [doi]
- Radiation and Directivity Analysis of a Vibrating Dome-Shaped Radiator Mounted on an Infinite BaffleJunqing Zhang, Wen Zhang 0002, Jingdong Chen, Jacob Benesty. 1-5 [doi]
- Test-time Alignment-Enhanced Adapter for Vision-Language ModelsBaoshun Tong, Kaiyu Song, Hanjiang Lai. 1-5 [doi]
- Find Details in Long Videos: Tower-of-Thoughts and Self-Retrieval Augmented Generation for Video UnderstandingTong Yue, Mingrui Xiao, Dafeng Zhang, Xin Liu, Yali Li 0001, Shengjin Wang. 1-5 [doi]
- Segue: Side-information Guided Generative Unlearnable Examples for Facial Privacy Protection in Real WorldZhiling Zhang, Jie Zhang 0073, Kui Zhang, Wenbo Zhou, Ting Xu 0004, Daiheng Gao, Zixian Guo, Qinglang Guo, Weiming Zhang 0001, Nenghai Yu. 1-5 [doi]
- Boosting Code-Switching ASR with Mixture of Experts Enhanced Speech-Conditioned LLMFengrun Zhang, Wang Geng, Hukai Huang, Yahui Shan, Cheng Yi, He Qu. 1-5 [doi]
- Supervised Dimension Reduction Through Linear ProjectionBiao Chen 0001, Joshua Kortje. 1-5 [doi]
- Generating Gezi Opera Scores with a Large Language Model and a High-Quality DatasetZhen Lei, Ke Gu, Peng Bai, Xiaodong Shi. 1-5 [doi]
- Differentially Private and Communication-efficient Decentralized Learning Using Deep QuantizersRobin Francis, Sundeep Prabhakar Chepuri. 1-5 [doi]
- Addressing Pilot Contamination in Channel Estimation with Variational AutoencodersAmar Kasibovic, Benedikt Fesl, Michael Baur, Wolfgang Utschick. 1-5 [doi]
- Contrastive Lyrics Alignment with a Timestamp-Informed LossTimon Kick, Florian Grötschla, Luca A. Lanzendörfer, Roger Wattenhofer. 1-5 [doi]
- Beyond Speaker Identity: Text Guided Target Speech ExtractionMingyue Huo, Abhinav Jain, Cong Phuoc Huynh, Fanjie Kong, Pichao Wang, Zhu Liu, Vimal Bhat. 1-5 [doi]
- Towards Detecting Auditory Attention from in-Ear Muscle Contractions using Commodity EarbudsHarshvardhan Takawale, Yang Liu 0101, Khaldoon Al-Naimi, Fahim Kawsar, Alessandro Montanari. 1-5 [doi]
- A Practical Gated Recurrent Transformer Network Incorporating Multiple Fusions for Video DenoisingKai Guo, Seungwon Choi, Jongseong Choi, Lae-Hoon Kim. 1-5 [doi]
- High-Efficiency Modulation Classification With Temporal-Frequency Analysis Based on Multi-channel Filter BankYifan Dai, Xin Gao, Ke Jing, Bin Tian. 1-5 [doi]
- Text-Aware Adapter for Few-Shot Keyword SpottingYoungmoon Jung, Jinyoung Lee, Seungjin Lee, Myunghun Jung, Yong-Hyeok Lee, Hoon-Young Cho. 1-5 [doi]
- Time-Graph Frequency Representation with Singular Value Decomposition for Neural Speech EnhancementTingting Wang, Tianrui Wang, Meng Ge, Qiquan Zhang, Zirui Ge, Zhen Yang 0001. 1-5 [doi]
- Hierarchical Spatial-Temporal Enhancement Network For Continuous Sign Language RecognitionMin Xu, Sheng Liu, Yuan Feng, Yiheng Yu, Zhelun Jin, Xuhua Yang. 1-5 [doi]
- Spatially-Aware Cross-Modal Contrastive Learning for Low-Shot HSI ClassificationAkhil Vasim, Pankhi Kashyap, Shabnam Choudhury, Biplab Banerjee. 1-5 [doi]
- A Novel Decision-Making Model for Playing Board Game Combining Planning and Opponent BehaviorsJiajing Zhang, Jiamei Jiang, Linjing Li, Daniel Zeng 0001. 1-5 [doi]
- Improving Food Recognition with Retrieval-Augmented and Domain-Adaptive LVLMsDehua Ma, Zhenbo Xu, Tianshun Xing, Lu Yuan, Jinghan Yang, Huijia Wu, Ming Lei, Zhaofeng He. 1-5 [doi]
- HiE-VL: A Large Vision-Language Model with Hierarchical Adapter for Handwritten Mathematical Expression RecognitionHong Yu Guo, Fei Yin, Jian Xu, Cheng-Lin Liu 0001. 1-5 [doi]
- SelectiveFinetuning: Enhancing Transfer Learning In Sleep Staging Through Selective Domain AlignmentSiyuan Zhao, Chenyu Liu, Yi Ding, Xinliang Zhou. 1-5 [doi]
- Boosting Jailbreak Attack with MomentumYihao Zhang, Zeming Wei. 1-5 [doi]
- Combined object-based audio and MASA format for enhanced spatial mobile communicationMikko-Ville Laitinen, Adriana Vasilache, Anssi Rämö, Jouni Paulus, Vaclav Eksler. 1-5 [doi]
- Semi-Supervised Contrastive Learning for Controllable Video-to-Music RetrievalShanti Stewart, Gouthaman KV, Lie Lu, Andrea Fanelli. 1-5 [doi]
- DiffAttack: Diffusion-based Timbre-reserved Adversarial Attack in Speaker IdentificationQing Wang 0039, Jixun Yao, Zhaokai Sun, Pengcheng Guo, Lei Xie 0001, John H. L. Hansen. 1-5 [doi]
- Fusing Multimodality of Large Language Models and Satellite Imagery via Simplicial Contrastive Learning for Latent Urban Feature Identification and Environmental ApplicationYuzhou Chen, Jiue-An Yang, Hugo Kyo Lee, Calvin Tribby, Tarik Benmarhnia, Marta M. Jankowska, Yulia R. Gel. 1-5 [doi]
- Non-Autoregressive Multimodal Machine TranslationGuojing Liu, Xiangqian Ding, Huili Gong, Xiangyu Qu, Zhenyu Yang, Kai Yan. 1-5 [doi]
- Towards Differential Optimization: Rehearsal-Free Class-Incremental Learning with Slow Learners and Fast AdaptersYinghong Chen, Huanjia Zhu, Jieyi Cai, Huanyu Liu, Jun Liang, Bingzhi Chen. 1-5 [doi]
- Personalized Federated Class-Incremental Learning through Critical Parameter TransferFeng Wu, Siwei Feng, Yuanlu Chen, Libang Zhao. 1-5 [doi]
- PASTD: Progressive Augmentation and Spatiotemporal Decoupling Contrastive Learning for Skeleton-Based Action RecognitionQian Huang, Weiwen Qian, Chang Li, Gongyou Xu, Zhongqi Chen. 1-5 [doi]
- From Pixels to Voice: A Simple and Efficient End-to-End Spoken Image Description Approach via Vision Codec Language ModelsChung Tran, Sakriani Sakti. 1-5 [doi]
- Negative Learning and Dual Contrastive for Unsupervised Visible-Infrared Person Re-identificationJiajia Xu, Xuemiao Xu, Weiwei Cai. 1-5 [doi]
- GS-PT: Exploiting 3D Gaussian Splatting for Comprehensive Point Cloud Understanding via Self-supervised LearningKeyi Liu, Yeqi Luo, Weidong Yang, Jingyi Xu, Zhijun Li, Wen-Ming Chen, Ben Fei. 1-5 [doi]
- Contextualization of ASR with LLM using phonetic retrieval-based augmentationZhihong Lei, Xingyu Na, Mingbin Xu, Ernest Pusateri, Christophe Van Gysel, Yuanyuan Zhang, Shiyi Han, Zhen Huang 0001. 1-5 [doi]
- DreamHA: Towards High-Quality Human Animation with Image-to-Video Diffusion ModelsLongran Shao, Bonan Li, Congying Han, Wenzhao Liu, Tiande Guo, Tianchi Xing, Xinmin Qiu, Zicheng Zhang. 1-5 [doi]
- VP-YOLO: Robust Vehicle-Pedestrian Detection in Challenging Traffic Scenarios via A Human Visual Perception-Inspired NetworkWenbo Liu, Tao Deng, Fei Yan. 1-5 [doi]
- Adopting Whisper for Confidence EstimationVaibhav Aggarwal, Shabari S. Nair, Yash Verma, Yash Jogi. 1-5 [doi]
- Unsupervised Search for Ethnic Minorities' Medical Segmentation Training SetYixiao Chen, Yue Yao, Ruining Yang, Md. Zakir Hossain, Ashu Gupta, Tom Gedeon. 1-5 [doi]
- MRANet: An Encoder-Decoder Network with Multi-Scale Residual Atrous-Spatial Pyramid Pooling for Seismic Phase PickingFeng Jiang, Hongxi Wei, Yingyue Jing, Lingguo Meng. 1-5 [doi]
- OPFormer: Real-Time Optimal Power Flow with CNN-Based TransformerKaijie Xu, Xilin Dai, Lin Qiu. 1-5 [doi]
- Dilated Convolution for Time Series LearningWang Zhang, Subhro Das, Lam M. Nguyen, Luca Daniel. 1-5 [doi]
- PeT-KeyStAtion: Parameter-efficient Transformer with Keypoint-guided Spatial-temporal Aggregation for Video-based Person Re-identificationXingan Ma, Jinhui Yi, Juergen Gall. 1-5 [doi]
- Grouped Knowledge Distillation with Adaptive Logit Softening for Speaker RecognitionChong-Xin Gan, Youzhi Tu, Zezhong Jin, Man-Wai Mak, Kong-Aik Lee. 1-5 [doi]
- FKAN-GMFNet: Fourier Kolmogorov-Arnold-based Group Multi-scale Fusion Network for Aneurysm Image SegmentationShanchen Pang, Xue Zhao, Yulin Zhang, Yawu Zhao, Hengtao Ding, Zhiyuan Zhao 0003, Sibo Qiao. 1-5 [doi]
- Multi-scale Context Intertwining for Panoramic Renal Pathology SegmentationYe Zhang, Xianchao Guan, Hengrui Li, Xiangming Yan, Ziyue Wang 0005, Yongbing Zhang 0002. 1-5 [doi]
- Text Descriptions of Actions and Objects Improve Action AnticipationApoorva Beedu, Harish Haresamudram, Irfan Essa. 1-5 [doi]
- Contrastive Knowledge Distillation for Embedding Refinement in Personalized Speech EnhancementThomas Serre, Mathieu Fontaine 0002, Éric Benhaim, Slim Essid. 1-5 [doi]
- Physics-Informed Neural Networks for Ocean Acoustic Field Prediction with Envelope SmoothingYongsung Park, Peter Gerstoft, Seunghyun Yoon, Woojae Seong. 1-5 [doi]
- VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech SynthesisJaemin Jung, Junseok Ahn, Chaeyoung Jung, Tan Dat Nguyen, Youngjoon Jang 0001, Joon Son Chung. 1-5 [doi]
- Meta-UAD: A Meta-Learning Scheme for User-level Network Traffic Anomaly DetectionTongtong Feng, Qi Qi 0001, Lingqi Guo, Jingyu Wang. 1-5 [doi]
- DH-VTON: Deep Text-Driven Virtual Try-On via Hybrid Attention LearningJiabao Wei, Zhiyuan Ma. 1-5 [doi]
- Fading-Invariant Adversarial Attacks on Neural Modulation RecognitionXinze Zhang, Dengao Zhu, Xiyao Dong, Kun He 0001. 1-5 [doi]
- Multiview Canonical Correlation Analysis for Automatic Pathological Speech DetectionYacouba Kaloga, Shakeel A. Sheikh, Ina Kodrasi. 1-5 [doi]
- Contextual Speech Extraction: Leveraging Textual History as an Implicit Cue for Target Speech ExtractionMinsu Kim, Rodrigo Mira, Honglie Chen, Stavros Petridis, Maja Pantic. 1-5 [doi]
- Selective Attention Merging for low resource tasks: A case study of Child ASRNatarajan Balaji Shankar, Zilai Wang, Eray Eren, Abeer A. Alwan. 1-5 [doi]
- Evidential Deep Learning with Reweighted Margin Adjustment for Uncertainty-Driven Cervical OCT Image DiagnosisHanfeng Zhu, Yi Qu, Yutao Ma, Yuchen Pei. 1-5 [doi]
- PAFedMIS: Personalized Asynchronous Federated Learning for Medical Image SegmentationYi Li 0018, Yue Hua, Xin Zheng, Yanqing Guo, Bo Wang 0024. 1-5 [doi]
- Sequential Posterior Sampling with Diffusion ModelsTristan S. W. Stevens, Oisín Nolan, Jean-Luc Robert, Ruud J. G. van Sloun. 1-5 [doi]
- Symmetry and Fusion Data Augmentation for Semi-Supervised Medical Segmentation. Yishan Zhang, Wenxin Yu 0001, Zhiqiang Zhang, Jun Gong, Peng Chen, Chang Liu. 1-5 [doi]
- Graph-Enhanced Dual-Stream Feature Fusion with Pre-Trained Model for Acoustic Traffic Monitoring. Shitong Fan, Feiyang Xiao, Wenbo Wang, Shuhan Qi, Qiaoxi Zhu, Wenwu Wang 0001, Jian Guan 0001. 1-5 [doi]
- OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control. Yuzhong Huang, Fred Morstatter. 1-5 [doi]
- Early Dementia Detection Using Multiple Spontaneous Speech Prompts: The PROCESS Challenge. Fuxiang Tao, Bahman Mirheidari, Madhurananda Pahar, Sophie Young, Yao Xiao, Hend Elghazaly, Fritz Peters, Caitlin Illingworth, Dorota Braun, Ronan O'Malley, Simon Bell, Daniel Blackburn, Fasih Haider, Saturnino Luz, Heidi Christensen. 1-2 [doi]
- Learning Source Disentanglement in Neural Audio Codec. Xiaoyu Bie, Xubo Liu, Gaël Richard. 1-5 [doi]
- Diff4Steer: Steerable Diffusion Prior for Generative Music Retrieval with Semantic Guidance. Xuchan Bao, Judith Yue Li, Zhong Yi Wan, Kun Su, Timo I. Denk, Joonseok Lee, Dima Kuzmin, Fei Sha. 1-5 [doi]
- Towards Bringing Parity in Pretraining Datasets for Low-resource Indian Languages. Kaushal Santosh Bhogale, Deovrat Mehendale, Tahir Javed, Devbrat Anuragi, Sakshi Joshi, Sai Sundaresan, Aparna Ananthanarayanan, Sharmistha Dey, Sathish Kumar Reddy G, Anusha Srinivasan, Abhigyan Raman, Pratyush Kumar, Mitesh M. Khapra. 1-5 [doi]
- Improving Robustness of Diffusion-Based Zero-Shot Speech Synthesis via Stable Formant Generation. Changjin Han, Seokgi Lee, Gyuhyeon Nam, Gyeongsu Chae. 1-5 [doi]
- Leveraging Self-Supervised Learning for Speaker Diarization. Jiangyu Han, Federico Landini, Johan Rohdin, Anna Silnova, Mireia Díez, Lukás Burget. 1-5 [doi]
- SMCNet: Supervised Surface Material Classification Using mmWave Radar IQ Signals and Complex-valued CNNs. Stefan Hägele, Fabián Seguel, Driton Salihu, Adam Misik, Eckehard G. Steinbach. 1-5 [doi]
- UV-Mamba: A DCN-Enhanced State Space Model for Urban Village Boundary Identification in High-Resolution Remote Sensing Images. Lulin Li, Ben Chen, Xuechao Zou, Junliang Xing, Pin Tao. 1-5 [doi]
- CLAP-S: Support Set Based Adaptation for Downstream Fiber-optic Acoustic Recognition. Jingchen Sun, Shaobo Han, Wataru Kohno, Changyou Chen. 1-5 [doi]
- RAPID: Recognition of Any-Possible DrIver Distraction via Multi-view Pose Generation Models. Jingyu Lei, Shengyu Hao, Gaoang Wang, Der Horng Lee. 1-5 [doi]
- 2L: Dual Knowledge Distillation Dynamic Learning for sketch-based 3D shape retrieval. Yawen Su, Jing Bai 0004, Gan Lin. 1-5 [doi]
- Fast inter-frame coding for dynamic meshes via supervoxel-based shape matching. Xudong Jin, Jianfeng Xu, Kei Kawamura. 1-5 [doi]
- Sensitivity of Room Impulse Responses in Changing Acoustic Environment. Karolina Prawda. 1-5 [doi]
- EPIC: Error Pattern Informed Correction for Classroom ASR with Limited Labeled Data. Linzhao Jia, Han Sun, Yuang Wei, Changyong Qi, Xiaozhe Yang. 1-5 [doi]
- DiffDesign: A diffusion model using garment Knowledge-Enhanced for Fashion Design Synthesis. Shouhao Wu. 1-5 [doi]
- Diffusion-based Identity-Preserving Facial Privacy Protection. Dong Han, Salaheldin Mohamed, Yong Li, Joachim Denzler. 1-5 [doi]
- Quantum-Train with Tensor Network Mapping Model and Distributed Circuit Ansatz. Chen-yu Liu, Chu-Hsuan Abraham Lin, Kuan-Cheng Chen. 1-4 [doi]
- MathReader : Text-to-Speech for Mathematical Documents. Sieun Hyeon, Kyudan Jung, Nam-Joon Kim, Hyun Gon Ryu, Jaeyoung Do. 1-5 [doi]
- Uncovering the Visual Contribution in Audio-Visual Speech Recognition. Zhaofeng Lin, Naomi Harte. 1-5 [doi]
- Bootstrapping Language-Audio Pre-training for Music Captioning. Luca A. Lanzendörfer, Constantin Pinkl, Nathanaël Perraudin, Roger Wattenhofer. 1-5 [doi]
- Continuously Learning Video-level Object Tokens for Robust UAV tracking. Bin Chen, Shenglong Hu, Gang Dong, Lingyan Liang, Dongchao Wen, Kaihua Zhang 0001. 1-5 [doi]
- See In Detail: Enhancing Sparse-view 3D Gaussian Splatting with Local Depth and Semantic Regularization. Zongqi He, Zhe Xiao 0001, Kin-Chung Chan, Yushen Zuo, Jun Xiao 0010, Kin-Man Lam 0001. 1-5 [doi]
- DrLLM: Prompt-Enhanced Distributed Denial-of-Service Resistance Method with Large Language Models. Zhenyu Yin, Shang Liu, Guangyuan Xu. 1-5 [doi]
- Distributed-Robust Source Localization in Wireless Acoustic Sensor Networks. Xu Wang, De Hu, Qintuya Si. 1-5 [doi]
- XDGesture: An xLSTM-based Diffusion Model for Co-speech Gesture Generation. Zixing Zhang 0001, Jiajun Li, Bin Wang, Yiming Liu, Huan Zhao 0003, Björn W. Schuller. 1-5 [doi]
- Audio Texture Manipulation by Exemplar-Based Analogy. Kan Jen Cheng, Tingle Li, Gopala Anumanchipalli. 1-5 [doi]
- AudioTime: A Temporally-aligned Audio-text Benchmark Dataset. Zeyu Xie, Xuenan Xu, Zhizheng Wu 0001, Mengyue Wu. 1-5 [doi]
- Adaptive Fine-Grained Feature Mining and RoI Feature Interaction Network for Small Object Detection in Aerial Images. Fangmin Xie, Zhan Xiong, Zhuoyue Wang, Kun Yang, Guoqiang Xiao. 1-5 [doi]
- Language-based Audio Moment Retrieval. Hokuto Munakata, Taichi Nishimura, Shota Nakada, Tatsuya Komatsu. 1-5 [doi]
- Compositional Audio Representation Learning. Sripathi Sridhar, Mark Cartwright. 1-5 [doi]
- Active Listener: Continuous Generation of Listener's Head Motion Response in Dyadic Interactions. Bishal Ghosh, Emma Li 0001, Tanaya Guha. 1-5 [doi]
- Take Attention Inside: Neighbor Pair Graph Contrastive Learning. Bisheng Tang, Xiaojun Chen, Shaopu Wang, Yuexin Xuan, Zhendong Zhao. 1-5 [doi]
- Camouflaged Object Detection with CNN-Transformer Harmonization and Calibration. Yilin Zhao, Qing Zhang 0004, Yuetong Li. 1-5 [doi]
- Imitating Human Selective Attention Using Dual Policy Network for Scanpath Prediction. Kepei Zhang, Ge Tong, Xuetao Zhang. 1-5 [doi]
- Reliable Learning From LLM Features for Multimodal Emotion and Intent Joint Understanding. Xiaolin Xu, Cheng Lu, Zhaoyang Li, Yuyun Liu, Yinghao Ma, Jiahao Luo, Yuan Zong, Wenming Zheng. 1-2 [doi]
- Using Depth-Enhanced Spatial Transformation for Student Gaze Target Estimation in Dual-View Classroom Images. Haonan Miao, Peizheng Zhao, Yuqi Sun, Fang Nan, Xiaolong Zhang, Yaqiang Wu, Feng Tian 0002. 1-5 [doi]
- Build LLM-Based Zero-Shot Streaming TTS System with Cosyvoice. Xiang Lyu, Yuxuan Wang, Tianyu Zhao, Hao Wang, Huadai Liu, Zhihao Du. 1-2 [doi]
- Less is Enough: Relation Graph Guided Few-shot Learning for Multi-label Aspect Category Detection. Shiman Zhao, Wei Chen 0056, Tengjiao Wang 0003, Jiahui Yao, Dawei Lu, Jiabin Zheng. 1-5 [doi]
- DOA Estimation of Coherent Sources Using Residual Network-based Subspace Reconstruction. Tiange Wang, Lingyu Chen, Huanglin Zhang. 1-5 [doi]
- Multimodal ARMAX Model for Characterization of Human Body System. Huiling Li, Qian He, Zhao Jin. 1-5 [doi]
- Estimating Musical Surprisal in Audio. Mathias Rose Bjare, Giorgia Cantisani, Stefan Lattner, Gerhard Widmer. 1-5 [doi]
- EDSep: An Effective Diffusion-Based Method for Speech Source Separation. Jinwei Dong, Xinsheng Wang, Qirong Mao. 1-5 [doi]
- Enhancing Document-Level Relation Extraction through Entity-Pair-Level Interaction Modeling. Wanlong Liu, Dingyi Zeng, Li Zhou 0010, Yichen Xiao, Malu Zhang, Wenyu Chen 0001. 1-5 [doi]
- Regarding the Existence of the Internal Language Model in CTC-Based E2E ASR. Zeyu Zhao 0004, Peter Bell 0001. 1-5 [doi]
- LiSenNet: Lightweight Sub-band and Dual-Path Modeling for Real-Time Speech Enhancement. Haoyin Yan, Jie Zhang, Cunhang Fan, Yeping Zhou, Peiqi Liu. 1-5 [doi]
- A Robust Quality Evaluator for Panoramic Videos. Yi Wang, Jiabao Feng, Yu Zhou 0009, Lijuan Tang, Ruirui Chen 0001, Yanjing Sun, Jicun Ding. 1-5 [doi]
- FedSe: Group-Based Sequential Training Strategies for Mitigating Label Skew in Federated Learning. Ketu Qiao, Yi Wang, Baoquan Wang, Zhengdong Luo, Xi Zhou. 1-5 [doi]
- Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement. Junyu Wang, Zizhen Lin, Tianrui Wang, Meng Ge, Longbiao Wang, Jianwu Dang 0001. 1-5 [doi]
- ZCS-CDiff: A Zero-Shot Code-Switching TTS System with Conformer-Based Diffusion Model. Ke Chen, Zhihua Huang, Liang He, Yonghong Yan 0002. 1-5 [doi]
- FlowSep: Language-Queried Sound Separation with Rectified Flow Matching. Yi Yuan, Xubo Liu 0001, Haohe Liu, Mark D. Plumbley, Wenwu Wang 0001. 1-5 [doi]
- Adaptive Password Guessing Framework Using Various Datasets. Wenbo Zhang, Haibo Cheng, Mingli Zheng, Jiahong Yang, Ping Wang. 1-5 [doi]
- OTFS for Automotive Radars: Waveform Optimization and Ambiguity Function Analysis. Nazila Karimian Sichani, Mohammad Alaee Kerahroodi, Maria S. Greco, Fulvio Gini, Bhavani Shankar. 1-5 [doi]
- GLoG-CSUnet: Enhancing Vision Transformers with Adaptable Radiomic Features for Medical Image Segmentation. Niloufar Eghbali Zarch, Hassan Bagher-Ebadian, Tuka Alhanai, Mohammad M. Ghassemi. 1-5 [doi]
- Seg-diffusion: Text-to-Image Diffusion Model for Open-Vocabulary Semantic Segmentation. Shuo Zhang 0013, Jiaming Huang, Yan Wu, Tao Hu, Wenbing Tang 0001, Jing Liu 0012. 1-5 [doi]
- Efficient Finetuning for Dimensional Speech Emotion Recognition in the Age of Transformers. Aneesha Sampath, James Tavernor, Emily Mower Provost. 1-5 [doi]
- Text-Guided Editable 3D City Scene Generation. Yuchuan Feng, Jihang Jiang, Jie Ren, Wenrui Li, Ruotong Li, Xiaopeng Fan. 1-5 [doi]
- Faithful Self-Refinement in Mathematical Reasoning via Progressive Back-Translation. Haoran Liao, Zhihao Zhu, Shaohua Hu, Hao He 0007, Yaohui Jin. 1-5 [doi]
- Toward noise-robust whisper keyword spotting on headphones with in-earcup microphone and curriculum learning. Qiaoyu Yang, Shuo Zhang, Chuan-Che Jeff Huang. 1-5 [doi]
- Colorization Network Watermarking in the CIE-Lab Domain. Alessio Chiovelli, Nischay Purnekar, Benedetta Tondi, Mauro Barni. 1-5 [doi]
- PET: High-Frequency Temporal Self-Consistency Learning for Partially Deepfake Audio Localization. Jiayi He, Jiangyan Yi, Jianhua Tao 0001, Siding Zeng. 1-5 [doi]
- Enhancing Long-Term Capabilities of Large Language Models via Discourse Sub-graph Analysis. Zhenyu Guan, Xun Liang 0001, Sensen Zhang. 1-5 [doi]
- Unrolled Generative Compound Gaussian Network for Computer Tomography. Carter Lyons, Raghu G. Raj, Margaret Cheney. 1-5 [doi]
- Multimodal Emotion Recognition in Conversation via Possible Speaker's Audio and Visual Sequence Selection. Rahul Singh Maharjan, Niyati Rawal, Marta Romeo, Lorenzo Baraldi 0002, Rita Cucchiara, Angelo Cangelosi. 1-5 [doi]
- Exploring Acted Sleepy Speech to Advance Real-World Sleepiness Estimation and Cognitive Degradation Detection. Jihye Moon, Youngsun Kong, Yashvi Gupta, Ki H. Chon. 1-5 [doi]
- LogSI: A Benchmark for System-Incremental Log Analysis. Mingjie Zhou, Weidong Yang, Lipeng Ma, Sihang Jiang, Bo Xu 0023, Yanghua Xiao. 1-5 [doi]
- Hierarchical Perceptual Distillation Network for Lightweight Image Super-Resolution Reconstruction. Qingting Tang, Zhiqing Guo, Liejun Wang. 1-5 [doi]
- CPL: Curriculum Pseudo Labeling for Weakly Supervised Temporal Forgery Localization. Dijia Zhang, Mingqi Fang, Zhiying Lu, Hongtao Xie. 1-5 [doi]
- Adapting Large Language Models to Forecast in Frequency Domain. Yungeng Zhang, Yuan Chang, XiaoHou Shi, Yaqi Song, Feng Wang, Mingchuan Yang. 1-5 [doi]
- MAEM: A Multi-Aspect Extraction Model for Enhanced Embedding in RAG. Ningyuan Yi, Chen Liu, Yue Wang, Jianjun Yu. 1-5 [doi]
- Optimized Dynamic Watermarking for Audio DNNs with Adaptive Embedding and Boundary Sampling. Hao Fei, Hewang Nie, Siqi Sun, Songfeng Lu, Ting Luo, Ling Qian, Dunbo Cai, Zhiguo Huang, Runqing Zhang. 1-5 [doi]
- Trimformer: A Novel Sequence Compression Mechanism with Local Attention. Ran Dou, Liyang Ru, José C. Príncipe. 1-5 [doi]
- MambaNext: An Enhanced Backbone Network with Focus Linear Attention. Dafeng Zhang, Shizhuo Liu. 1-5 [doi]
- Dual Attention for Space-Time Video Super-Resolution. Jiakai Zheng, Jianping Luo. 1-5 [doi]