- HOICS: Zero-Shot HOI Detection via Compatibility Self-Learning. Miao Jiang, Min Li, Junxing Ren, Weiqing Huang. 1-5
- Small-Footprint Automatic Speech Recognition System using Two-Stage Transfer Learning based Symmetrized Ternary Weight Network. Xuanhao Zhang, Hui Kou, Chenjie Xia, Hao Cai, Bo Liu 0019. 1-5
- Incomplete Multi-View Representation Learning Through Anchor Graph-Based GCN and Information Bottleneck. Zhenjiao Liu, Xiao Wang, Xiaodi Huang, Guanlin Li, Ke Sun, Zhikui Chen. 1-5
- KC-Prompt: End-To-End Knowledge-Complementary Prompting for Rehearsal-Free Continual Learning. Yaowei Li, Yating Liu, Xuxin Cheng, Zhihong Zhu, Hongxiang Li, Bang Yang, Zhiqi Huang. 1-5
- Enhanced Deep Reinforcement Learning for Parcel Singulation in Non-Stationary Environments. Jiwei Shen, Hu Lu, Hao Zhang, Shujing Lyu, Yue Lu. 1-5
- Tracking Beyond the Unambiguous Range with Modulo Single-Photon Lidar. Samuel Fernández-Menduiña, J. Rapp, Hassan Mansour, M. Greiff, Kieran Parsons. 6-10
- Modulo Sampling and Recovery in Shift-Invariant Spaces. Yhonatan Kvich, Yonina C. Eldar. 11-15
- Text2Avatar: Text to 3D Human Avatar Generation with Codebook-Driven Body Controllable Attribute. Chaoqun Gong, Yuqin Dai, Ronghui Li, Achun Bao, Jun Li 0027, Jian Yang 0003, Yachao Zhang, Xiu Li 0001. 16-20
- The Joint Grid-Free DOA and Polarization Estimation Algorithm based on Atomic Norm Minimization. Tao Chen, Minxing Li, Ziming Liu. 21-25
- A Learning-Based System for Automatic Intentional Non-Adherence Detection from Dosing Videos. Shaolei Feng, Xiaoguang Lu, Deshana Kaushal Desai, Lei Guan. 26-30
- MaDE: Multi-Scale Decision Enhancement for Multi-Agent Reinforcement Learning. Jingqing Ruan, Runpeng Xie, Xuantang Xiong, Shuang Xu, Bo Xu 0002. 31-35
- Encoder-Minimal and Decoder-Minimal Framework for Remote Sensing Image Dehazing. Yuanbo Wen, Tao Gao 0001, Ziqi Li, Jing Zhang, Ting Chen 0003. 36-40
- An Error Self-Corrected DOA Estimation Model for Sparse Array Based on ANM. Tao Chen, Qi An, Minxing Li. 41-45
- UAV Operation Time Minimization for Wireless-Powered Data Collection. Yijia Zhang, Deepak Mishra 0001, Hassan Habibi Gharakheili, Derrick Wing Kwan Ng. 46-50
- Dicetrack: Lightweight Dice Classification on Resource-Constrained Platforms with Optimized Deep Learning Models. Christophe El Zeinaty, Glenn Herrou, Wassim Hamidouche, Daniel Ménard. 51-55
- MMCOUNT: Stationary Crowd Counting System Based on Commodity Millimeter-Wave Radar. Kaiyuan Hu, Hongjie Liao, Mingxiao Li, Fangxin Wang 0001. 56-60
- Crowd Modeling and Control Via Cooperative Adaptive Filtering. Zirui Wan, Saeid Sanei. 61-65
- Deep Learning AMR Model Inference Acceleration with CFU for Edge Systems. Pavlo Hilei, Marian Petruk, Ievgen Korotkyi, Oleg Farenyuk. 66-70
- Real-Time Stereo Speech Enhancement with Spatial-Cue Preservation Based on Dual-Path Structure. Masahito Togami, Jean-Marc Valin, Karim Helwani, Ritwik Giri, Umut Isik, Michael M. Goodwin. 71-75
- SERC-GCN: Speech Emotion Recognition In Conversation Using Graph Convolutional Networks. Deeksha Chandola, Enas Altarawneh, Michael Jenkin, Manos Papagelis. 76-80
- Sensing-Assisted Distributed User Scheduling and Beamforming in Multi-Cell mmWave Networks. Tenghao Cai, Lei Li 0030, Tsung-Hui Chang. 81-85
- Unsupervised Human Activity Recognition Via Large Language Models and Iterative Evolution. Jiayuan Gao, Yingwei Zhang, Yiqiang Chen, Tengxiang Zhang, Boshi Tang, Xiaoyu Wang. 91-95
- ANM-Based Source Localization Under Mixed Field. Tao Chen, Ziming Liu, Lei Zhan. 96-100
- Reinforcement Learning Compensated Filter for Multi-Agents Cooperative Localization. Ran Wang, Jing Sun, Cheng Xu, Ruixue Li, Shihong Duan, Xiaotong Zhang. 101-105
- Quantum Ranging Enhanced TDoA Localization. Entong He, Yuxiang Yang, Chenshu Wu. 106-110
- Contactless Radar Heart Rate Variability Monitoring Via Deep Spatio-Temporal Modeling. Haoyu Wang, Jinbo Chen, Dongheng Zhang, Zhi Lu, Changwei Wu, Yang Hu, Qibin Sun, Yan Chen 0007. 111-115
- Quantum Inspired Image Augmentation Applicable to Waveguides and Optical Image Transfer Via Anderson Localization. Nikolaos Palaiodimopoulos, Vítor Fortes Rey, Matthias Tschöpe, Christina Jörg, Paul Lukowicz, Maximilian Kiefer-Emmanouilidis. 116-120
- Political Tweet Sentiment Analysis for Public Opinion Polling. Anestis Kaimakamidis, Ioannis Pitas. 121-125
- Enhanced Axle-Based Vehicle Classification Using Angle-Based Micro-Doppler Signature. V. R. J. Deville, C. M. Lievers, Jonathan H. Manton. 126-130
- Applying Hybrid Quantum LSTM for Indoor Localization Based on RSSI. Su Fong Chien, David Chieng, Samuel Y. C. Chen, Charilaos C. Zarakovitis, Heng Siong Lim, Y. H. Xu. 131-135
- Optimizing Trading Strategies in Quantitative Markets Using Multi-Agent Reinforcement Learning. Hengxi Zhang, Zhendong Shi, Yuanquan Hu, Wenbo Ding, Ercan E. Kuruoglu, Xiao-Ping Zhang 0002. 136-140
- Motif-Matching Based Sub-Braingraph Level Networks for Noisy Resting-State fMRI Analysis. Yan Zhang, Xin Liu, Zuping Zhang. 141-145
- Detecting Continuous Gravitational Waves Using Generated Training Data. Judith Herrmann, Raphael Kunert, Ron Hachmon, Aviv Markus, Allison Gunby-Mann, Sarel Cohen, Tobias Friedrich 0001, Peter Chin 0001. 146-150
- Hardware-Limited Time Constant Estimation Using a Weighted Linear Regression. Titan Yuan, Filip Maksimovic, David C. Burnett, Kristofer S. J. Pister. 151-155
- Joint Transmit Precoders and Passive Reflection Beamformer Design in IRS-Aided IoT Networks. Kunwar Pritiraj Rajput, Linlong Wu, M. R. Bhavani Shankar, Pramod K. Varshney. 156-160
- RobustTSVar: A Robust Time Series Variance Estimation Algorithm. Zhiqiang Zhou, Linxiao Yang, Qingsong Wen, Liang Sun 0001. 161-165
- RoFi: Robust WiFi Intrusion Detection via Distribution Matching. Xu Wang, Dongheng Zhang, Fengquan Zhan, Xuecheng Xie, Pengcheng Huang, Yang Hu, Yan Chen. 166-170
- Digital Task-Oriented Communication with Hardware-Limited Task-Based Quantization. Wuxia Hu, Yang Yang 0057, Yonina C. Eldar, Chunyan Feng, Caili Guo. 171-175
- Automotive Radar Interference Mitigation Via SINR Maximization. Shuai Yang, Dongheng Zhang, Jinbo Chen, Fang Zhou, Guanzhong Wang, Qibin Sun, Yan Chen. 176-180
- A Low-Latency FFT-IFFT Cascade Architecture. Keshab K. Parhi. 181-185
- Cuffless Blood Pressure Estimation Using Magnetic Flux In A Ring Form Factor. Seyed Ali Ghazi Asgar, Kaan Sel, Anando Paul, Roderic I. Pettigrew, Roozbeh Jafari. 186-190
- UNeC: Unsupervised Exploring In Controllable Space. Xuantang Xiong, Linghui Meng 0001, Jingqing Ruan, Shuang Xu, Bo Xu 0002. 191-195
- MAML-Based 24-Hour Personalized Blood Pressure Estimation from Wrist Photoplethysmography Signals in Free-Living Context. Jia-Yu Yang, Chih-I Ho, Pei-Yun Tsai, Hung-Ju Lin, Tzung-Dau Wang. 196-200
- Aerial-IRS-Assisted Load Balancing In Downlink Networks. Shuyi Ren, Beichen Huang, Xiaoyang Li 0002, Kaiming Shen. 201-205
- Multi-Layer Relation Knowledge Distillation For Fingerprint Restoration. Yu-Min Chiu, Ching-Te Chiu, Dao-Heng Luo. 206-210
- A Concept for a SLAM Back End Hardware Accelerator. Toivo Henningson, Stefan Ingi Adalbjörnsson, Anders Berkeman, Carl Drougge, Xavante Erickson, Alexander Hunt. 211-215
- Practical Challenge and Solution for IRS-Aided Indoor Localization System. Ganlin Zhang, Dongheng Zhang, Hongyu Deng, Yun Wu, Fengquan Zhan, Yan Chen. 216-220
- SVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks. Qu Yang, Qianhui Liu, Nan Li, Meng Ge, Zeyang Song, Haizhou Li 0001. 221-225
- Spiking-Leaf: A Learnable Auditory Front-End for Spiking Neural Networks. Zeyang Song, Jibin Wu, Malu Zhang, Mike Zheng Shou, Haizhou Li 0001. 226-230
- Application of SNNs Model Based On Multi-Dimensional Attention In Drone Radio Frequency Signal Classification. Zheng Si, Chao Liu, Jianyu Liu, Yinhao Zhou. 231-235
- Differentiable Quantum Architecture Search For Job Shop Scheduling Problem. Yize Sun, Jiarui Liu, Yunpu Ma, Volker Tresp. 236-240
- Low-Complexity GLRT Based Quickest Detection With Unknown Parameters. Peichao Wang, Qian He. 241-245
- Towards Enabling DPOAE Estimation on Single-Speaker Earbuds. Irtaza Shahid, Khaldoon Al-Naimi, Ting Dang, Yang Liu, Fahim Kawsar, Alessandro Montanari. 246-250
- Efficient 3D Position Estimation in Badminton Scene. Bo Han, Liangjian Han. 251-255
- F1-EV score: Measuring The Likelihood of Estimating a Good Decision Threshold for Semi-Supervised Anomaly Detection. Kevin Wilkinghoff, Keisuke Imoto. 256-260
- SoundLoCD: An Efficient Conditional Discrete Contrastive Latent Diffusion Model for Text-to-Sound Generation. Xinlei Niu, Jing Zhang 0052, Christian Walder, Charles Patrick Martin. 261-265
- StofNet: Super-Resolution Time of Flight Network. Christopher Hahne, Michel Hayoz, Raphael Sznitman. 266-270
- Semi-Supervised Sound Event Detection with Local and Global Consistency Regularization. Yiming Li, Xiangdong Wang, Hong Liu 0007, Rui Tao, Long Yan, Kazushige Ouchi. 271-275
- Self-Supervised Learning for Anomalous Sound Detection. Kevin Wilkinghoff. 276-280
- "It is Okay to be Uncommon": Quantizing Sound Event Detection Networks on Hardware Accelerators with Uncommon Sub-Byte Support. Yushu Wu, Xiao Quan, Mohammad Rasool Izadi, Chuan-Che Jeff Huang. 281-285
- Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning. Shansong Liu, Atin Sakkeer Hussain, Chenshuo Sun, Ying Shan. 286-290
- CED: Consistent Ensemble Distillation for Audio Tagging. Heinrich Dinkel, Yongqing Wang, Zhiyong Yan, Junbo Zhang, Yujun Wang. 291-295
- Semi-Blind Estimation of Direct-to-Reverberant Energy Ratio Using Residual Energy Test Statistics. Ali Gökçe, Hüseyin Hacihabiboglu. 296-300
- DJCM: A Deep Joint Cascade Model for Singing Voice Separation and Vocal Pitch Estimation. Haojie Wei, Xueke Cao, Wenbo Xu, Tangpeng Dan, Yueguo Chen. 301-305
- Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users Using Intermediate ASR Features and Human Memory Models. Rhiannon Mogridge, George Close, Robert Sutherland, Thomas Hain, Jon Barker, Stefan Goetze, Anton Ragni. 306-310
- Vocal Fold Dynamics for Automatic Detection of Amyotrophic Lateral Sclerosis from Voice. Jiayi Zhang, Rita Singh. 311-315
- Improving Audio Captioning Models with Fine-Grained Audio Features, Text Embedding Supervision, and LLM Mix-Up Augmentation. Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-weon Jung, François G. Germain, Jonathan Le Roux, Shinji Watanabe 0001. 316-320
- Localizing Acoustic Energy in Sound Field Synthesis by Directionally Weighted Exterior Radiation Suppression. Yoshihide Tomita, Shoichi Koyama, Hiroshi Saruwatari. 321-325
- SPGM: Prioritizing Local Features for Enhanced Speech Separation Performance. Jia Qi Yip, Shengkui Zhao, Yukun Ma, Chongjia Ni, Chong Zhang, Hao Wang 0199, Trung Hieu Nguyen 0001, Kun Zhou 0003, Dianwen Ng, Eng Siong Chng, Bin Ma 0001. 326-330
- Voice Toxicity Detection Using Multi-Task Learning. Mahesh Kumar Nandwana, Yifan He, Joseph Liu, Xiao Yu, Charles Shang, Eloi du Bois, Morgan McGuire, Kiran Bhat. 331-335
- Natural Language Supervision For General-Purpose Audio Representations. Benjamin Elizalde, Soham Deshmukh, Huaming Wang. 336-340
- Enhancing Note-Level Singing Transcription Model with Unlabeled and Weakly Labeled Data. Yao Qiu, Jinchao Zhang, Yong Shan, Jie Zhou. 341-345
- Simultaneous Interior and Exterior Sound Field Synthesis Using Cylindrical and Spherical Loudspeaker Arrays. Yo Sasaki, Yasushige Nakayama. 346-350
- Multi-CMGAN+/+: Leveraging Multi-Objective Speech Quality Metric Prediction for Speech Enhancement. George Close, William Ravenscroft, Thomas Hain, Stefan Goetze. 351-355
- Soft Dynamic Time Warping with Variable Step Weights. Johannes Zeitler, Michael Krause 0002, Meinard Müller. 356-360
- ScoreDec: A Phase-Preserving High-Fidelity Audio Codec with a Generalized Score-Based Diffusion Post-Filter. Yi-Chiao Wu, Dejan Markovic, Steven Krenn, Israel D. Gebru, Alexander Richard. 361-365
- Learning Audio Concepts from Counterfactual Natural Language. Ali Vosoughi, Luca Bondi, Ho-Hsiang Wu, Chenliang Xu. 366-370
- Training Audio Captioning Models without Audio. Soham Deshmukh, Benjamin Elizalde, Dimitra Emmanouilidou, Bhiksha Raj, Rita Singh, Huaming Wang. 371-375
- CORN: Co-Trained Full- and No-Reference Speech Quality Assessment. Pranay Manocha, Donald Williamson, Adam Finkelstein. 376-380
- Multi-Channel MOSRA: Mean Opinion Score and Room Acoustics Estimation Using Simulated Data and A Teacher Model. Jozef Coldenhoff, Andrew Harper, Paul Kendrick, Tijana Stojkovic, Milos Cernak. 381-385
- Unsupervised Acoustic Scene Mapping Based on Acoustic Features and Dimensionality Reduction. Idan Cohen, Sharon Gannot, Ofir Lindenbaum. 386-390
- Bringing the Discussion of Minima Sharpness to the Audio Domain: A Filter-Normalised Evaluation for Acoustic Scene Classification. Manuel Milling, Andreas Triantafyllopoulos, Iosif Tsangko, Simon David Noel Rampp, Björn Wolfgang Schuller. 391-395
- BEAST: Online Joint Beat and Downbeat Tracking Based on Streaming Transformer. Chih-Cheng Chang, Li Su. 396-400
- Improving Acoustic Echo Cancellation by Exploring Speech and Echo Affinity with Multi-Head Attention. Yiqun Zhang, Xinmeng Xu, Weiping Tu. 401-405
- ASPED: An Audio Dataset for Detecting Pedestrians. Pavan Seshadri, ChaeYeon Han, Bon-Woo Koo, Noah Posner, Subhrajit Guhathakurta, Alexander Lerch 0001. 406-410
- Environmental Sound Synthesis from Vocal Imitations and Sound Event Labels. Yuki Okamoto, Keisuke Imoto, Shinnosuke Takamichi, Ryotaro Nagase, Takahiro Fukumori, Yoichi Yamashita. 411-415
- Multi-Microphone Noise Data Augmentation for DNN-Based Own Voice Reconstruction for Hearables in Noisy Environments. Mattes Ohlenbusch, Christian Rollwage, Simon Doclo. 416-420
- 3S-TSE: Efficient Three-Stage Target Speaker Extraction for Real-Time and Low-Resource Applications. Shulin He, Jinjiang Liu, Hao Li 0046, Yang Yang, Fei Chen, Xueliang Zhang 0001. 421-425
- Improving Music Source Separation with SIMO Stereo Band-Split RNN. Yi Luo 0004, Rongzhi Gu. 426-430
- A Study of Multichannel Spatiotemporal Features and Knowledge Distillation on Robust Target Speaker Extraction. Yichi Wang, Jie Zhang 0042, Shihao Chen, Weitai Zhang, Zhongyi Ye, Xinyuan Zhou, Lirong Dai 0001. 431-435
- Resource-Constrained Stereo Singing Voice Cancellation. Clara Borrelli, James Rae, Dogac Basaran, Matt McVicar, Mehrez Souden, Matthias Mauch. 436-440
- Unsupervised Learning Based End-to-End Delayless Generative Fixed-Filter Active Noise Control. Zhengding Luo, Dongyuan Shi, Xiaoyi Shen, Woon-Seng Gan. 441-445
- Boosting Unknown-Number Speaker Separation with Transformer Decoder-Based Attractor. Younglo Lee, Shukjae Choi, Byeong-Yeol Kim, Zhongqiu Wang, Shinji Watanabe 0001. 446-450
- SRCodec: Split-Residual Vector Quantization for Neural Speech Codec. Youqiang Zheng, Weiping Tu, Li Xiao, Xinmeng Xu. 451-455
- A Light-Weight State Detection Model for Kalman-Filter-Based Acoustic Feedback Cancellation with Rapid Recovery from Abrupt Path Changes. Haocheng Guo, Xiaohuai Le, Kai Chen, Jing Lu. 456-460
- FastMandarin: Efficient Local Modeling for Natural Mandarin Speech Synthesis. Chenglong Jiang, Ying Gao, Hao Jin, Linrong Pan, Wing W. Y. Ng. 461-465
- Mtdiffusion: Multi-Task Diffusion Model With Dual-UNet for Foley Sound Generation. Anbin Qi, Xiang Xie, Jing Wang. 461-465
- Ultra Low Complexity Deep Learning Based Noise Suppression. Shrishti Saha Shetu, Soumitro Chakrabarty, Oliver Thiergart, Edwin Mabande. 466-470
- Binaural Rendering of Heterogeneous Sound Sources with Extent. Carlotta Anemüller, Oliver Thiergart, Emanuël A. P. Habets. 471-475
- NOLACE: Improving Low-Complexity Speech Codec Enhancement Through Adaptive Temporal Shaping. Jan Büthe, Ahmed Mustafa, Jean-Marc Valin, Karim Helwani, Michael M. Goodwin. 476-480
- Music Source Separation With Band-Split RoPE Transformer. Wei Tsung Lu, Ju-Chiang Wang, Qiuqiang Kong, Yun-Ning Hung. 481-485
- Audio-Free Prompt Tuning for Language-Audio Models. Yiming Li, Xiangdong Wang, Hong Liu. 491-495
- RVAE-EM: Generative Speech Dereverberation Based On Recurrent Variational Auto-Encoder And Convolutive Transfer Function. Pengyu Wang, Xiaofei Li. 496-500
- A Practical Online Multichannel Dereverberation Approach with Data-Reuse Technique. Weilong Huang, Cheng Xue, Jinwei Feng, W. Bastiaan Kleijn. 501-505
- An Active Noise Control System Based On Soundfield Interpolation Using A Physics-Informed Neural Network. Yile Angela Zhang, Fei Ma, Thushara D. Abhayapala, Prasanga N. Samarasinghe, Amy Bastine. 506-510
- Directional Gain Based Noise Covariance Matrix Estimation for MVDR Beamforming. Fan Zhang, Chao Pan 0001, Jacob Benesty, Jingdong Chen. 511-515
- Noisy-ArcMix: Additive Noisy Angular Margin Loss Combined With Mixup For Anomalous Sound Detection. Soonhyeon Choi, Jung-Woo Choi. 516-520
- MERTech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model with Multi-Task Finetuning. Dichucheng Li, Yinghao Ma, Weixing Wei, Qiuqiang Kong, Yulun Wu, Mingjin Che, Fan Xia, Emmanouil Benetos, Wei Li 0012. 521-525
- Music Auto-Tagging with Robust Music Representation Learned via Domain Adversarial Training. Haesun Joung, Kyogu Lee. 526-530
- An Explainable Proxy Model for Multilabel Audio Segmentation. Théo Mariotte, Antonio Almudévar, Marie Tahon, Alfonso Ortega Giménez. 531-535
- Pre-Echo Reduction in Transform Audio Coding via Temporal Envelope Control with Machine Learning Based Estimation. Jae-Won Kim, Byeongho Jo, Seungkwon Beack, Hochong Park. 536-540
- Semantic Proximity Alignment: Towards Human Perception-Consistent Audio Tagging by Aligning with Label Text Description. Wuyang Liu, Yanzhen Ren. 541-545
- GASS: Generalizing Audio Source Separation with Large-Scale Data. Jordi Pons, Xiaoyu Liu, Santiago Pascual, Joan Serrà. 546-550
- GaP-Aug: Gamma Patch-Wise Correction Augmentation Method for Respiratory Sound Classification. An-Yan Chang, Jing-Tong Tzeng, Huan-Yu Chen, Chih-Wei Sung, Chun-Hsiang Huang, Edward Pei-Chuan Huang, Chi-Chun Lee. 551-555
- Conjugate Gradient Based Adaptive Algorithm for Nonlinear AEC. Srikanth Burra, Asutosh Kar, Mads Græsbøll Christensen. 556-560
- Online Target Sound Extraction with Knowledge Distillation from Partially Non-Causal Teacher. Keigo Wakayama, Tsubasa Ochiai, Marc Delcroix, Masahiro Yasuda, Shoichiro Saito, Shoko Araki, Akira Nakayama. 561-565
- SuperCodec: A Neural Speech Codec with Selective Back-Projection Network. Youqiang Zheng, Weiping Tu, Li Xiao, Xinmeng Xu. 566-570
- BAE-Net: a Low Complexity and High Fidelity Bandwidth-Adaptive Neural Network for Speech Super-Resolution. Guochen Yu, Xiguang Zheng, Nan Li, Runqiang Han, Chengshi Zheng, Chen Zhang, Chao Zhou, Qi Huang, Bing Yu. 571-575
- Array Geometry Optimization for Region-of-Interest Near-Field Beamforming. Ron Moisseev, Gal Itzhak, Israel Cohen. 576-580
- Retrieval-Augmented Text-to-Audio Generation. Yi Yuan, Haohe Liu, Xubo Liu, Qiushi Huang, Mark D. Plumbley, Wenwu Wang 0001. 581-585
- LightCodec: A High Fidelity Neural Audio Codec with Low Computation Complexity. Liang Xu, Jing Wang, Jianqian Zhang, Xiang Xie. 586-590
- FunCodec: A Fundamental, Reproducible and Integrable Open-Source Toolkit for Neural Speech Codec. Zhihao Du, Shiliang Zhang, Kai Hu, Siqi Zheng. 591-595
- VRDMG: Vocal Restoration via Diffusion Posterior Sampling with Multiple Guidance. Carlos Hernandez-Olivan, Koichi Saito, Naoki Murata, Chieh-Hsin Lai, Marco A. Martínez Ramírez, Wei-Hsiang Liao, Yuki Mitsufuji. 596-600
- Permutation-Alignment Method Using Manifold Optimization for Frequency-Domain Blind Source Separation. Satoru Emura. 601-605
- Two-Stage Acoustic Echo Cancellation Network with Dual-Path Alignment. Zhijian Jiang, Haoming Li, Nengheng Zheng. 606-610
- Real-Time Low-Latency Music Source Separation Using Hybrid Spectrogram-TasNet. Satvik Venkatesh, Arthur Benilov, Philip Coleman, Frederic Roskam. 611-615
- Binaural Room Transfer Function Interpolation Via System Inversion. Amal Emthyas, Sebastià V. Amengual Garí, Enzo De Sena. 616-620
- Leveraging Sound Localization to Improve Continuous Speaker Separation. Hassan Taherian, Ashutosh Pandey 0004, Daniel Wong, Buye Xu, DeLiang Wang. 621-625
- Single and Few-Step Diffusion for Generative Speech Enhancement. Bunlong Lay, Jean-Marie Lemercier, Julius Richter, Timo Gerkmann. 626-630
- Can Synthetic Data Boost the Training of Deep Acoustic Vehicle Counting Networks? Stefano Damiano, Luca Bondi, Shabnam Ghaffarzadegan, Andre Guntoro, Toon van Waterschoot. 631-635
- Zero- and Few-Shot Sound Event Localization and Detection. Kazuki Shimada, Kengo Uchida, Yuichiro Koyama, Takashi Shibuya 0001, Shusuke Takahashi, Yuki Mitsufuji, Tatsuya Kawahara. 636-640
- Active Noise Control Over A Large Region with Multiple Spherical Microphone Arrays In Wave Domain. Xiaoli Tang, Jihui Aimee Zhang, Thushara D. Abhayapala. 641-645
- U2R: Underwater Ultrasonic Reflection Wave Dataset Toward Pose-Invariant Material Recognition. Mayuka Kono, Yutaro Hirao, Monica Perusquía-Hernández, Naoya Isoyama, Hideaki Uchiyama, Nobuchika Sakata, Jun Takamatsu, Kiyoshi Kiyokawa. 646-650
- Sector-Based Interference Cancellation for Robust Keyword Spotting Applications Using an Informed MPDR Beamformer. Guendalina Milano, Oliver Thiergart, Emanuël A. P. Habets. 651-655
- Low-Latency Speech Enhancement via Speech Token Generation. Huaying Xue, Xiulian Peng, Yan Lu 0001. 661-665
- Ambisonics Networks - The Effect of Radial Functions Regularization. Bar Shaybet, Anurag Kumar 0003, Vladimir Tourbabin, Boaz Rafaely. 666-670
- On The Effect Of Data-Augmentation On Local Embedding Properties In The Contrastive Learning Of Music Audio Representations. Matthew C. McCallum, Matthew E. P. Davies, Florian Henkel, Jaehun Kim, Samuel E. Sandberg. 671-675
- Meta-AF Echo Cancellation for Improved Keyword Spotting. Jonah Casebeer, Junkai Wu, Paris Smaragdis. 676-680
- Binaural Speech Enhancement Using Deep Complex Convolutional Transformer Networks. Vikas Tokala, Eric Grinstein, Mike Brookes, Simon Doclo, Jesper Jensen 0001, Patrick A. Naylor. 681-685
- Similar but Faster: Manipulation of Tempo in Music Audio Embeddings for Tempo Prediction and Search. Matthew C. McCallum, Florian Henkel, Jaehun Kim, Samuel E. Sandberg, Matthew E. P. Davies. 686-690
- Multi-Modal Continual Pre-Training For Audio Encoders. Gyuhak Kim, Ho-Hsiang Wu, Luca Bondi, Bing Liu 0001. 691-695
- Multi-Dimensional Speech Quality Assessment in Crowdsourcing. Babak Naderi, Ross Cutler, Nicolae-Catalin Ristea. 696-700
- Neural Ambisonics Encoding For Compact Irregular Microphone Arrays. Mikko Heikkinen, Archontis Politis, Tuomas Virtanen. 701-705
- A Transformer Approach for Polyphonic Audio-to-Score Transcription. María Alfaro-Contreras, Antonio Ríos-Vila, Jose J. Valero-Mas, Jorge Calvo-Zaragoza. 706-710
- Advancing Acoustic Howling Suppression Through Recursive Training of Neural Networks. Hao Zhang, Yixuan Zhang, Meng Yu, Dong Yu. 711-715
- Multi-Level Graph Learning For Audio Event Classification And Human-Perceived Annoyance Rating Prediction. Yuanbo Hou, Qiaoqiao Ren, Siyang Song, Yuxin Song, Wenwu Wang 0001, Dick Botteldooren. 716-720
- Unsupervised Multi-Channel Separation And Adaptation. Cong Han, Kevin W. Wilson, Scott Wisdom, John R. Hershey. 721-725
- Quantifying The Effect Of Simulator-Based Data Augmentation For Speech Recognition On Augmented Reality Glasses. Riku Arakawa, Mathieu Parvaix, Chiong Lai, Hakan Erdogan, Alex Olwal. 726-730
- Comparison Of Frequency-Fusion Mechanisms For Binaural Direction-Of-Arrival Estimation For Multiple Speakers. Daniel Fejgin, Elior Hadad, Sharon Gannot, Zbynek Koldovský, Simon Doclo. 731-735
- Improving Acoustic Echo Cancellation for Voice Assistants Using Neural Echo Suppression and Multi-Microphone Noise Reduction. Jens Heitkaemper, Arun Narayanan, Turaj Zakizadeh Shabestary, Sankaran Panchapagesan, James Walker, Bhalchandra Gajare, Shlomi Regev, Ajay Dudani, Alexander Gruenstein. 736-740
- MDX-GAN: Enhancing Perceptual Quality in Multi-Class Source Separation Via Adversarial Training. Ke Chen, Jiaqi Su, Zeyu Jin. 741-745
- Quantifying Spatial Audio Quality Impairment. Karn N. Watcharasupat, Alexander Lerch 0001. 746-750
- A Closer Look at Wav2vec2 Embeddings for On-Device Single-Channel Speech Enhancement. Ravi Shankar, Ke Tan 0001, Buye Xu, Anurag Kumar 0003. 751-755
- A Computationally Efficient Semi-Blind Source Separation Approach for Nonlinear Echo Cancellation Based on an Element-Wise Iterative Source Steering. Kunxing Lu, Xianrui Wang, Tetsuya Ueda, Shoji Makino, Jingdong Chen. 756-760
- Resource-Efficient Separation Transformer. Luca Della Libera, Cem Subakan, Mirco Ravanelli, Samuele Cornell, Frédéric Lepoutre, François Grondin. 761-765
- Spiking Structured State Space Model for Monaural Speech Enhancement. Yu Du, Xu Liu, Yansong Chua. 766-770
- Learning from Taxonomy: Multi-Label Few-Shot Classification for Everyday Sound Recognition. Jinhua Liang, Huy Phan, Emmanouil Benetos. 771-775
- Differential Beamforming with Null Constraints for Spherical Microphone Arrays. Xudong Zhao, Xueqin Luo, Gongping Huang, Jingdong Chen, Jacob Benesty. 776-780
- A Deep Representation Learning-Based Speech Enhancement Method Using Complex Convolution Recurrent Variational Autoencoder. Yang Xiang, Jingguang Tian, Xinhui Hu, Xinkang Xu, Zhaohui Yin. 781-785
- Stereophonic Music Source Separation with Spatially-Informed Bridging Band-Split Network. Yichen Yang, Haowen Li, Xianrui Wang, Wen Zhang, Shoji Makino, Jingdong Chen. 786-790
- Ultra-Low Delay Lossless Compression of Higher Order Ambisonics. Mahmoud Namazi, Kenneth Rose. 791-795
- Stack-and-Delay: A New Codebook Pattern for Music Generation. Gaël Le Lan, Varun Nagaraja, Ernie Chang, David Kant, Zhaoheng Ni, Yangyang Shi, Forrest N. Iandola, Vikas Chandra. 796-800
- DDD: A Perceptually Superior Low-Response-Time DNN-Based Declipper. Jayeon Yi, Junghyun Koo, Kyogu Lee. 801-805
- Remixed2Remixed: Domain Adaptation for Speech Enhancement by Noise2Noise Learning with Remixing. Li Li 0063, Shogo Seki. 806-810
- Enhancing Violin Fingering Generation through Audio-Symbolic Fusion. Wei-Yang Lin, Yu-Chiang Frank Wang, Li Su. 811-815
- On HRTF Notch Frequency Prediction using Anthropometric Features and Neural Networks. Lior Arbel, Ishwarya Ananthabhotla, Zamir Ben-Hur, David Lou Alon, Boaz Rafaely. 816-820
- TF-SepNet: An Efficient 1D Kernel Design in CNNs for Low-Complexity Acoustic Scene Classification. Yiqiang Cai, Peihong Zhang, Shengchen Li. 821-825
- Enriching Music Descriptions with A Finetuned-LLM and Metadata for Text-to-Music Retrieval. Seungheon Doh, Minhee Lee, Dasaem Jeong, Juhan Nam. 826-830
- Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model. Ryandhimas E. Zezario, Bo-Ren Brian Bai, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao 0001. 831-835
- ODAQ: Open Dataset of Audio Quality. Matteo Torcoli, Chih-Wei Wu, Sascha Dick, Phillip A. Williams, Mhd Modar Halimeh, William Wolcott, Emanuël A. P. Habets. 836-840
- Generalized SpecAugment via Multi-Rectangle Inverse Masking For Acoustic Scene Classification. Pil Moo Byun, Joon-Hyuk Chang. 841-845
- A Flexible Online Framework for Projection-Based STFT Phase Retrieval. Tal Peer, Simon Welker, Johannes Kolhoff, Timo Gerkmann. 846-850
- Non-Intrusive Speech Quality Assessment with Multi-Task Learning Based on Tensor Network. Hanyue Liu, Miao Liu, Jing Wang, Xiang Xie, Lidong Yang. 851-855
- Blind Estimation of Audio Effects Using an Auto-Encoder Approach and Differentiable Digital Signal Processing. Côme Peladeau, Geoffroy Peeters. 856-860
- Intelligent Cardiac Auscultation for Murmur Detection via Parallel-Attentive Models with Uncertainty Estimation. Zixing Zhang 0001, Tao Pang, Jing Han 0010, Björn W. Schuller. 861-865
- Enhancing Audio Generation Diversity with Visual Information. Zeyu Xie, Baihan Li, Xuenan Xu, Mengyue Wu, Kai Yu 0004. 866-870
- Determined BSS by Combination of IVA and DNN via Proximal Average. Kazuki Matsumoto, Kohei Yatabe. 871-875
- MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction. Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, Tatsuya Kawahara. 876-880
- Cross-Triggering Issue in Audio Event Detection and Mitigation. Huy Phan, Byeonggeun Kim, Vu Nguyen, Andrew Bydlon, Qingming Tang, Chieh-Chi Kao, Chao Wang. 881-885
- Efficient Functional Link Adaptive Filters Based On Nearest Kronecker Product Decomposition. Alireza Nezamdoust, Mario Huemer, Aurelio Uncini, Danilo Comminiello. 886-890
- Active Learning for Sound Event Classification Using Bayesian Neural Networks with Gaussian Variational Posterior. Stepan Shishkin, Danilo Hollosi, Stefan Goetze, Simon Doclo. 896-900
- Snore Sound Features Based on Percussive Enhancing and Positional Encoding Combined with Multi-Task Learning for OSAHS Detection. Aolin Hu, Xueshuai Zhang, Shaoxing Zhang, Pengyuan Zhang, Yu Lu, Pengfei Ye, QingWei Zhao, Yonghong Yan 0002. 901-905
- On The Role of Room Acoustics in Audio Presentation Attack Detection. Nikolay D. Gaubitch, David Looney. 906-910
- Fine-Tune the Pretrained ATST Model for Sound Event Detection. Nian Shao, Xian Li, Xiaofei Li. 911-915
- Class-Incremental Learning for Multi-Label Audio Classification. Manjunath Mulimani, Annamaria Mesaros. 916-920
- Estimation of Impulse Responses for a Moving Source Using Optimal Transport Regularization. David Sundström, Filip Elvander, Andreas Jakobsson. 921-925
- SSL-Net: A Synergistic Spectral and Learning-Based Network for Efficient Bird Sound Classification. Yiyuan Yang, Kaichen Zhou, Niki Trigoni, Andrew Markham. 926-930
- One-Epoch Training with Single Test Sample in Test Time for Better Generalization of Cough-Based COVID-19 Detection Model. Jiakun Shen, Xueshuai Zhang, Pengyuan Zhang, Yonghong Yan 0002, QingWei Zhao, Ta Li, Yanfen Tang, Shaoxing Zhang. 931-935
- SyncFusion: Multimodal Onset-Synchronized Video-to-Audio Foley Synthesis. Marco Comunità, Riccardo F. Gramaccioni, Emilian Postolache, Emanuele Rodolà, Danilo Comminiello, Joshua D. Reiss. 936-940
- Multi-View Midivae: Fusing Track- and Bar-View Representations for Long Multi-Track Symbolic Music GenerationZhiwei Lin, Jun Chen, Boshi Tang, Binzhu Sha, Jing Yang, Yaolong Ju, Fan Fan, Shiyin Kang, Zhiyong Wu 0001, Helen Meng. 941-945 [doi]
- A Fully Differentiable Model for Unsupervised Singing Voice SeparationGaël Richard, Pierre Chouteau, Bernardo Torres. 946-950 [doi]
- Structure-Informed Positional Encoding for Music GenerationManvi Agarwal, Changhong Wang, Gaël Richard. 951-955 [doi]
- Adapting Pitch-Based Self Supervised Learning Models for Tempo EstimationAntonin Gagneré, Slim Essid, Geoffroy Peeters. 956-960 [doi]
- Consistent and Relevant: Rethink the Query Embedding in General Sound SeparationYuanyuan Wang, Hangting Chen, Dongchao Yang, Jianwei Yu, Chao Weng, Zhiyong Wu 0001, Helen Meng. 961-965 [doi]
- An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio DetectionYuankun Xie, Haonan Cheng, Yutian Wang, Long Ye. 966-970 [doi]
- GTCRN: A Speech Enhancement Model Requiring Ultralow Computational ResourcesXiaobin Rong, Tianchi Sun, Xu Zhang, Yuxiang Hu, Changbao Zhu, Jing Lu. 971-975 [doi]
- On The Choice of the Optimal Temporal Support for Audio Classification with Pre-Trained EmbeddingsAurian Quelennec, Michel Olvera, Geoffroy Peeters, Slim Essid. 976-980 [doi]
- Cognitive Virtual Sensing Technique for Feedforward Active Noise ControlRong Xie, Anqi Tu, Chuang Shi, Stephen Elliott, Huiyong Li, Le Zhang. 981-985 [doi]
- SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music SynthesisTeysir Baoueb, Haocheng Liu, Mathieu Fontaine 0002, Jonathan Le Roux, Gaël Richard. 986-990 [doi]
- Personalized Neural Speech CodecInseon Jang, Haici Yang, Wootaek Lim, Seungkwon Beack, Minje Kim. 991-995 [doi]
- A Unified Loss Function to Tackle Inter-Class and Intra-Class Data Imbalance in Sound Event DetectionYuliang Zhang, Roberto Togneri, David Huang. 996-1000 [doi]
- An Empirical Study on the Impact of Positional Encoding in Transformer-Based Monaural Speech EnhancementQiquan Zhang, Meng Ge, Hongxu Zhu, Eliathamby Ambikairajah, Qi Song, Zhaoheng Ni, Haizhou Li 0001. 1001-1005 [doi]
- Speech Enhancement in Hearing Aids Using Target Speech Presence Estimation Based on a Delayed Remote Microphone SignalVasudha Sathyapriyan, Michael Syskind Pedersen, Mike Brookes, Jan Østergaard, Patrick A. Naylor, Jesper Jensen 0001. 1006-1010 [doi]
- NOMAD: Unsupervised Learning of Perceptual Embeddings For Speech Enhancement and Non-Matching Reference Audio Quality AssessmentAlessandro Ragano, Jan Skoglund, Andrew Hines. 1011-1015 [doi]
- NIIRF: Neural IIR Filter Field for HRTF Upsampling and PersonalizationYoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux. 1016-1020 [doi]
- Contrastive Loss Based Frame-Wise Feature Disentanglement for Polyphonic Sound Event DetectionYadong Guan, Jiqing Han 0001, Hongwei Song, Wenjie Song 0003, Guibin Zheng, Tieran Zheng, Yongjun He. 1021-1025 [doi]
- Unrestricted Global Phase Bias-Aware Single-Channel Speech Enhancement with Conformer-Based Metric GANShiQi Zhang, Zheng Qiu, Daiki Takeuchi, Noboru Harada, Shoji Makino. 1026-1030 [doi]
- Low Bitrate Loss Resilience Scheme for a Speech Enhancing Neural CodecMihailo Kolundzija, Mathew Shaji Kavalekalam, Ivana Balic, Michelle Mao, Raúl Casas. 1031-1035 [doi]
- Unsupervised Pitch-Timbre Disentanglement of Musical Instruments Using a Jacobian Disentangled Sequential AutoencoderYin-Jyun Luo, Sebastian Ewert, Simon Dixon. 1036-1040 [doi]
- Three-Dimensional Sound Wave Propagation Reproduction by CE-FDTD Simulation Applying Actual Radiation CharacteristicsShota Okubo, Toshiharu Horiuchi. 1041-1045 [doi]
- A Steered Response Power Approach with Bilinear Prediction-Based Trade-Off Prewhitening for Speaker LocalizationZhiheng Wang, Hongsen He, Jingdong Chen, Jacob Benesty, Yi Yu 0002. 1046-1050 [doi]
- High Resolution Guitar Transcription Via Domain AdaptationXavier Riley, Drew Edwards, Simon Dixon. 1051-1055 [doi]
- Effect of Target Signals and Delays on Spatially Selective Active Noise Control for Open-Fitting HearablesTong Xiao, Simon Doclo. 1056-1060 [doi]
- Max-AST: Combining Convolution, Local and Global Self-Attentions for Audio Event ClassificationTony Alex, Sara Ahmed, Armin Mustafa, Muhammad Awais, Philip JB Jackson. 1061-1065 [doi]
- TIA: A Teaching Intonation Assessment Dataset in Real Teaching SituationsShuhua Liu, Chunyu Zhang, Binshuai Li, Niantong Qin, Huanting Cheng, Huayu Zhang. 1066-1070 [doi]
- A Scalable Sparse Transformer Model for Singing Melody ExtractionShuai Yu, Jun Liu, Yi Yu, Wei Li. 1071-1075 [doi]
- AudioSR: Versatile Audio Super-Resolution at ScaleHaohe Liu, Ke Chen 0021, Qiao Tian, Wenwu Wang 0001, Mark D. Plumbley. 1076-1080 [doi]
- Investigating Personalization Methods in Text to Music GenerationManos Plitsis, Theodoros Kouzelis, Georgios Paraskevopoulos, Vassilis Katsouros, Yannis Panagakis. 1081-1085 [doi]
- Learning Ontology Informed Representations with Constraints for Acoustic Event DetectionAkshay Raina, Sayeedul Islam Sheikh, Vipul Arora 0001. 1086-1090 [doi]
- A Detailed Audio-Text Data Simulation Pipeline Using Single-Event SoundsXuenan Xu, Xiaohang Xu 0004, Zeyu Xie, Pingyue Zhang, Mengyue Wu, Kai Yu 0004. 1091-1095 [doi]
- Performance and Energy Balance: A Comprehensive Study of State-of-the-Art Sound Event Detection SystemsFrancesca Ronchini, Romain Serizel. 1096-1100 [doi]
- Microphone Subset Selection for the Weighted Prediction Error Algorithm Using a Group Sparsity PenaltyAnselm Lohmann, Toon van Waterschoot, Jörg Bitzer, Simon Doclo. 1101-1105 [doi]
- HRTF Recommendation Based on the Predicted Binaural Colouration ModelNils Marggraf-Turley, Michael Lovedee-Turner, Enzo De Sena. 1106-1110 [doi]
- ByteHum: Fast and Accurate Query-by-Humming in the WildXingjian Du, Pei Zou, Mingyu Liu, Xia Liang, Minghang Chu, Bilei Zhu. 1111-1115 [doi]
- Adaptive Speech Emotion Representation Learning Based On Dynamic GraphYingxue Gao, Huan Zhao, Zixing Zhang 0001. 11116-11120 [doi]
- STEMGEN: A Music Generation Model That ListensJulian D. Parker, Janne Spijkervet, Katerina Kosta, Furkan Yesiler, Boris Kuznetsov, Ju-Chiang Wang, Matt Avent, Jitong Chen, Duc Le. 1116-1120 [doi]
- Perceptually-Motivated Spatial Audio Codec for Higher-Order Ambisonics CompressionChristoph Hold, Leo McCormack, Archontis Politis, Ville Pulkki. 1121-1125 [doi]
- Joint Music and Language Attention Models for Zero-Shot Music TaggingXingjian Du, Zhesong Yu, Jiaju Lin, Bilei Zhu, Qiuqiang Kong. 1126-1130 [doi]
- SPATIALCODEC: Neural Spatial Speech CodingZhongweiyang Xu, Yong Xu, Vinay Kothapally, Heming Wang, Muqiao Yang, Dong Yu. 1131-1135 [doi]
- AutoPrep: An Automatic Preprocessing Framework for In-The-Wild Speech DataJianwei Yu, Hangting Chen, Yanyao Bian, Xiang Li, Yi Luo, Jinchuan Tian, Mengyang Liu, Jiayi Jiang, Shuai Wang. 1136-1140 [doi]
- An Experimental Comparison of Multi-View Self-Supervised Methods for Music TaggingGabriel Meseguer-Brocal, Dorian Desblancs, Romain Hennequin. 1141-1145 [doi]
- Ainur: Harmonizing Speed and Quality in Deep Music Generation Through Lyrics-Audio EmbeddingsGiuseppe Concialdi, Alkis Koudounas, Eliana Pastor, Barbara Di Eugenio, Elena Baralis. 1146-1150 [doi]
- Parody Detection Using Source-Target Attention with Teacher-Forced LyricsTomoki Ariga, Yosuke Higuchi, Kazutoshi Hayasaka, Naoki Okamoto, Tetsuji Ogawa. 1151-1155 [doi]
- Generation or Replication: Auscultating Audio Latent Diffusion ModelsDimitrios Bralios, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux. 1156-1160 [doi]
- RECAP: Retrieval-Augmented Audio CaptioningSreyan Ghosh, Sonal Kumar, Chandra Kiran Reddy Evuru, Ramani Duraiswami, Dinesh Manocha. 1161-1165 [doi]
- Bass Accompaniment Generation Via Latent DiffusionMarco Pasini, Maarten Grachten, Stefan Lattner. 1166-1170 [doi]
- Learning Speaker-Listener Mutual Head Orientation by Leveraging HRTF and Voice Directivity on HeadphonesHarshvardhan C. Takawale, Nirupam Roy. 1171-1175 [doi]
- Unsupervised Harmonic Parameter Estimation Using Differentiable DSP and Spectral Optimal TransportBernardo Torres, Geoffroy Peeters, Gaël Richard. 1176-1180 [doi]
- Parameter Efficient Audio Captioning with Faithful Guidance Using Audio-Text Shared Latent RepresentationArvind Krishna Sridhar, Yinyi Guo, Erik Visser, Rehana Mahfuz. 1181-1185 [doi]
- Fine-Grained Engine Fault Sound Event Detection Using Multimodal SignalsDennis Fedorishin, Livio Forte, Philip Schneider, Srirangaraj Setlur, Venu Govindaraju. 1186-1190 [doi]
- Hyperbolic Distance-Based Speech SeparationDarius Petermann, Minje Kim. 1191-1195 [doi]
- DPM-TSE: A Diffusion Probabilistic Model for Target Sound ExtractionJiarui Hai, Helin Wang, Dongchao Yang, Karan Thakkar, Najim Dehak, Mounya Elhilali. 1196-1200 [doi]
- Binaural Angular Separation NetworkYang Yang 0010, George Sung, Shao-fu Shih, Hakan Erdogan, Chehung Lee, Matthias Grundmann. 1201-1205 [doi]
- MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup StrategiesKe Chen, Yusong Wu, Haohe Liu, Marianna Nezhurina, Taylor Berg-Kirkpatrick, Shlomo Dubnov. 1206-1210 [doi]
- Exploring Meta Information for Audio-Based Zero-Shot Bird ClassificationAlexander Gebhard, Andreas Triantafyllopoulos, Teresa Bez, Lukas Christ, Alexander Kathan, Björn W. Schuller. 1211-1215 [doi]
- Scalable and Efficient Speech Enhancement Using Modified Cold Diffusion: A Residual Learning ApproachMinje Kim, Trausti Kristjansson. 1216-1220 [doi]
- Spatial Scaper: A Library to Simulate and Augment Soundscapes for Sound Event Localization and Detection in Realistic RoomsIrán R. Román, Christopher Ick, Sivan Ding, Adrian S. Roman, Brian McFee, Juan Pablo Bello. 1221-1225 [doi]
- A Foundation Model for Music InformaticsMinz Won, Yun-Ning Hung, Duc Le. 1226-1230 [doi]
- From RIR to BRIR: A Sparse Recovery Beamforming Approach for Virtual Binaural Sound RenderingHuiyuan Sun, Howe Yuan Zhu, Minh T. D. Nguyen, Vincent Nguyen, Chin-Teng Lin, Craig T. Jin. 1231-1235 [doi]
- Active Noise Control Over 3D Space with A Dynamic Noise SourceHuiyuan Sun, Craig T. Jin, Thushara D. Abhayapala, Prasanga N. Samarasinghe. 1236-1240 [doi]
- Investigating Self-Supervised Deep Representations for EEG-Based Auditory Attention DecodingKaran Thakkar, Jiarui Hai, Mounya Elhilali. 1241-1245 [doi]
- Quantization Noise Masking in Perceptual Neural Audio CoderSeungmin Shin, Joon Byun, Jongmo Sung, Seungkwon Beack, Youngcheol Park. 1246-1250 [doi]
- Generative De-Quantization for Neural Speech Codec Via Latent DiffusionHaici Yang, Inseon Jang, Minje Kim. 1251-1255 [doi]
- Piano Transcription with Harmonic AttentionRuimin Wu, Xianke Wang, Yuqing Li, Wei Xu, Wenqing Cheng. 1256-1260 [doi]
- Dual-Path Minimum-Phase and All-Pass Decomposition Network for Single Channel Speech DereverberationXi Liu, Szu-Jui Chen, John H. L. Hansen. 1261-1265 [doi]
- A Dual-Path Framework with Frequency-and-Time Excited Network for Anomalous Sound DetectionYucong Zhang, Juan Liu, Yao Tian, Haifeng Liu, Ming Li. 1266-1270 [doi]
- First-Shot Unsupervised Anomalous Sound Detection with Unknown Anomalies Estimated by Metadata-Assisted Audio GenerationHejing Zhang, Qiaoxi Zhu, Jian Guan 0001, Haohe Liu, Feiyang Xiao, Jiantong Tian, Xinhao Mei, Xubo Liu, Wenwu Wang 0001. 1271-1275 [doi]
- SCNet: Sparse Compression Network for Music Source SeparationWeinan Tong, Jiaxu Zhu, Jun Chen, Shiyin Kang, Tao Jiang, Yang Li, Zhiyong Wu, Helen Meng. 1276-1280 [doi]
- Exploring Self-supervised Contrastive Learning of Spatial Sound Event RepresentationXilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani. 1281-1285 [doi]
- SynthTab: Leveraging Synthesized Data for Guitar Tablature TranscriptionYongyi Zang, Yi Zhong, Frank Cwitkowitz, Zhiyao Duan. 1286-1290 [doi]
- Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music TranscriptionFrank Cwitkowitz, Kin Wai Cheuk, Woosung Choi, Marco A. Martínez Ramírez, Keisuke Toyama 0002, Wei-Hsiang Liao, Yuki Mitsufuji. 1291-1295 [doi]
- A Comparative Analysis of Poetry Reading Audio: Singing, Narrating, or Somewhere in Between?Kahyun Choi, Minje Kim. 1296-1300 [doi]
- A Hybrid Deep-Online Learning Based Method for Active Noise Control in Wave DomainDonghang Wu, Xihong Wu, Tianshu Qu. 1301-1305 [doi]
- Towards High Resolution Weather Monitoring With Sound DataEnis Berk Çoban, Megan Perra, Michael I. Mandel. 1306-1310 [doi]
- Stealthy Backdoor Attack Towards Federated Automatic Speaker VerificationLongling Zhang, Lyqi Liu, Dan Meng, Jun Wang, Shengshan Hu. 1311-1315 [doi]
- Transferable Models for Bioacoustics with Human Language SupervisionDavid Robinson, Adelaide Robinson, Lily Akrapongpisak. 1316-1320 [doi]
- Robust DoA Estimation from Deep Acoustic ImagingAdrian S. Roman, Irán R. Román, Juan Pablo Bello. 1321-1325 [doi]
- Exploring Large Scale Pre-Trained Models for Robust Machine Anomalous Sound DetectionBing Han, Zhiqiang Lv, Anbai Jiang, Wen Huang, Zhengyang Chen, YuFeng Deng, Jiawei Ding, Cheng Lu 0007, Wei-Qiang Zhang 0001, Pingyi Fan, Jia Liu 0001, Yanmin Qian. 1326-1330 [doi]
- Adapting Frechet Audio Distance for Generative Music EvaluationAzalea Gui, Hannes Gamper, Sebastian Braun, Dimitra Emmanouilidou. 1331-1335 [doi]
- Sparse Sound Field Representation Using Complex Orthogonal Matching PursuitShaoheng Xu, Jihui Aimee Zhang, Thushara D. Abhayapala, Amy Bastine, Wei-Ting Lai, Prasanga N. Samarasinghe. 1336-1340 [doi]
- Attention Is All You Need For Blind Room Volume EstimationChunxi Wang, Maoshen Jia, Meiran Li, Changchun Bao, Wenyu Jin 0003. 1341-1345 [doi]
- Plug-and-Play MVDR Beamforming for Speech SeparationChengbo Chang, Ziye Yang, Jie Chen. 1346-1350 [doi]
- Broadband Personal Sound Zone Control in the Presence of NonlinearitiesSankha Subhra Bhattacharjee, Srikanth Burra, Jesper Rindom Jensen, Liming Shi, Guoli Ping, Jingkai Weng, Mads Græsbøll Christensen. 1351-1355 [doi]
- Tempo Estimation as Fully Self-Supervised Binary ClassificationFlorian Henkel, Jaehun Kim, Matthew C. McCallum, Samuel E. Sandberg, Matthew E. P. Davies. 1356-1360 [doi]
- AAT: Adapting Audio Transformer for Various Acoustics Recognition TasksYun Liang, Hai Lin, Shaojian Qiu, Yihang Zhang. 1361-1365 [doi]
- MIR-MLPop: A Multilingual Pop Music Dataset with Time-Aligned Lyrics and AudioJun-You Wang, Chung-Che Wang, Chon-In Leong, Jyh-Shing Roger Jang. 1366-1370 [doi]
- A Real-Time Lyrics Alignment System Using Chroma and Phonetic Features for Classical Vocal PerformanceJiyun Park, Sangeon Yong, Taegyun Kwon, Juhan Nam. 1371-1375 [doi]
- Improving Speech Attenuation in Headphones using Harmonic Model Decomposition and Multiple-Frequency ANCYurii Iotov, Sidsel Marie Nørholm, Peter John McCutcheon, Mads Græsbøll Christensen. 1376-1380 [doi]
- Noise-Aware Speech Separation with Contrastive LearningZizheng Zhang, Chen Chen, Hsin-Hung Chen, Xiang Liu, Yuchen Hu, Eng Siong Chng. 1381-1385 [doi]
- Efficient High-Performance Bark-Scale Neural Network for Residual Echo and Noise SuppressionErnst Seidel, Pejman Mowlaee, Tim Fingscheidt. 1386-1390 [doi]
- Few-Shot Anomalous Sound Detection Based on Anomaly Map Estimation Using Pseudo Abnormal DataRyosuke Tanaka, Satoshi Tamura. 1391-1395 [doi]
- Improving Target Sound Extraction with Timestamp Knowledge DistillationDail Kim, Min-Sang Baek, Yungyeo Kim, Joon-Hyuk Chang. 1396-1400 [doi]
- CLASS: Continual Learning Approach for Speech Super-ResolutionDonghyun Kim, Yungyeo Kim, Joon-Hyuk Chang. 1401-1405 [doi]
- Multi-Scale Permutation Entropy for Audio Deepfake DetectionChenglong Wang, Jiayi He, Jiangyan Yi, Jianhua Tao 0001, Chu-Yuan Zhang, Xiaohui Zhang 0006. 1406-1410 [doi]
- 6DoF SELD: Sound Event Localization and Detection Using Microphones and Motion Tracking Sensors on Self-Motioning HumanMasahiro Yasuda, Shoichiro Saito, Akira Nakayama, Noboru Harada. 1411-1415 [doi]
- From Coarse to Fine: Efficient Training for Audio Spectrogram TransformersJiu Feng, Mehmet Hamza Erol, Joon Son Chung, Arda Senocak. 1416-1420 [doi]
- Speech Foundation Models on Intelligibility Prediction for Hearing-Impaired ListenersSantiago Cuervo, Ricard Marxer. 1421-1425 [doi]
- Stethoscope-Guided Supervised Contrastive Learning for Cross-Domain Adaptation on Respiratory Sound ClassificationJune-Woo Kim, Sangmin Bae, Won-Yang Cho, Byungjo Lee, Ho-Young Jung. 1431-1435 [doi]
- Crowdsourced Multilingual Speech Intelligibility TestingLaura Lechler, Kamil Wójcicki. 1441-1445 [doi]
- Audio Prompt Tuning for Universal Sound SeparationYuzhuo Liu, Xubo Liu, Yan Zhao, Yuanyuan Wang, Rui Xia, Pingchuan Tain, Yuxuan Wang 0002. 1446-1450 [doi]
- Selecting N-Lowest Scores for Training MOS Prediction ModelsYuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko. 1451-1455 [doi]
- Audio Difference Learning for Audio CaptioningTatsuya Komatsu, Yusuke Fujita, Kazuya Takeda, Tomoki Toda. 1456-1460 [doi]
- Phase Reconstruction in Single Channel Speech Enhancement Based on Phase Gradients and Estimated Clean-Speech AmplitudesYanjue Song, Nilesh Madhu. 1461-1465 [doi]
- Anomalous Sound Detection by Feature-Level Anomaly SimulationVitjan Zavrtanik, Matija Marolt, Matej Kristan, Danijel Skocaj. 1466-1470 [doi]
- Generating Stereophonic Music with Single-Stage Language ModelsXingda Li, Fan Zhuo, Dan Luo, Jun Chen, Shiyin Kang, Zhiyong Wu, Tao Jiang, Yang Li, Han Fang, Yahui Zhou. 1471-1475 [doi]
- Reconstruction of Sound Field Through Diffusion ModelsFederico Miotello, Luca Comanducci, Mirco Pezzoli, Alberto Bernardini, Fabio Antonacci, Augusto Sarti. 1476-1480 [doi]
- A Lightweight Hybrid Multi-Channel Speech Extraction System with Directional Voice Activity DetectionTianchi Sun, Tong Lei, Xu Zhang, Yuxiang Hu, Changbao Zhu, Jing Lu. 1486-1490 [doi]
- String Sound Synthesizer On GPU-Accelerated Finite Difference SchemeJin Woo Lee, Min-Jun Choi, Kyogu Lee. 1491-1495 [doi]
- SMMA-Net: An Audio Clue-Based Target Speaker Extraction Network with Spectrogram Matching and Mutual AttentionYing Hu, Haitao Xu, Zhongcun Guo, Hao Huang, Liang He. 1496-1500 [doi]
- Mixed Informed Transformer for Few-Shot Medical Image SegmentationJiuqiang Li, Zheng Wang, Shilei Zhu. 1501-1505 [doi]
- Predict and Interpret Health Risk Using EHR Through Typical PatientsZhihao Yu, Chaohe Zhang, Yasha Wang, Wen Tang, Jiangtao Wang 0001, Liantao Ma. 1506-1510 [doi]
- BrainFC-CGAN: A Conditional Generative Adversarial Network for Brain Functional Connectivity Augmentation and Aging SynthesisYee Fan Tan, Junn Yong Loo, Chee-Ming Ting, Fuad Noman, Raphaël C.-W. Phan, Hernando Ombao. 1511-1515 [doi]
- Blind Inpainting with Object-Aware Discrimination for Artificial Marker RemovalXuechen Guo, Wenhao Hu 0002, Chiming Ni, Wenhao Chai, Shiyan Li, Gaoang Wang. 1516-1520 [doi]
- Embedded Feature Similarity Optimization with Specific Parameter Initialization for 2D/3D Medical Image RegistrationMinheng Chen, Zhirun Zhang, Shuheng Gu, Youyong Kong. 1521-1525 [doi]
- Predicting Adverse Events for Patients with Type-1 Diabetes Via Self-Supervised LearningXinzhe Zheng, Sijie Ji, Chenshu Wu. 1526-1530 [doi]
- Breaking the Barrier: Selective Uncertainty-Based Active Learning for Medical Image SegmentationSiteng Ma, Haochang Wu, Aonghus Lawlor, Ruihai Dong. 1531-1535 [doi]
- Symmetric Consistency with Cross-Domain Mixup for Cross-Modality Cardiac SegmentationZhuotong Cai, Jingmin Xin, Siyuan Dong, John A. Onofrey, Nanning Zheng 0001, James S. Duncan. 1536-1540 [doi]
- SSHNN: Semi-Supervised Hybrid NAS Network for Echocardiographic Image SegmentationRenqi Chen, Jingjing Luo, Fan Nian, Yuhui Cen, Yiheng Peng, Zekuan Yu. 1541-1545 [doi]
- Gland Instance Segmentation by Full Resolution Multi-Scale Dilation Residual NetworksMengjiao Yao, Xiang Gao. 1546-1550 [doi]
- Learning Hybrid Negative Probability Model for Weakly-Supervised Whole Slide Image RecognitionYining Qiu, Yuxi Li, Jiafu Wu, Zhenye Gan, Mingmin Chi, Yabiao Wang, Chengjie Wang, Pei Wang. 1551-1555 [doi]
- Dynamic Label Smoothing Strategy for Biosignal ClassificationPeiji Chen, Dian Li, Yifan Tang, Shunta Togo, Hiroshi Yokoi, Yinlai Jiang. 1556-1560 [doi]
- MutualReg: Mutual Learning for Unsupervised Medical Image RegistrationJun Liu, Wenyi Wang, Nuo Shen, Wei Wang, Kuanquan Wang, Qince Li, Yongfeng Yuan, Henggui Zhang, Gongning Luo. 1561-1565 [doi]
- Learning Multiscale Consistency for Self-Supervised Electron Microscopy Instance SegmentationYinda Chen, Wei Huang 0036, Xiaoyu Liu, Shiyu Deng, Qi Chen, Zhiwei Xiong. 1566-1570 [doi]
- A Graph Neural Network Based Fusion of MRI-Derived Brain Network and Clinical Data for Glioblastoma Survival PredictionXingcan Hu, Li Xiao, Yu-Ping Wang. 1571-1575 [doi]
- LK-UNet: Large Kernel Design for 3D Medical Image SegmentationJiang Shang, Sifan Zhou. 1576-1580 [doi]
- Stable Optimization for Large Vision Model Based Deep Image Prior in Cone-Beam CT ReconstructionMinghui Wu 0009, Yangdi Xu, Yingying Xu, Guangwei Wu, Qingqing Chen 0001, Hongxiang Lin. 1581-1585 [doi]
- Frequency Aware and Graph Fusion Network for Polyp SegmentationYan Li, Zhuoran Zheng, Wenqi Ren, Yunfeng Nie, Jingang Zhang, Xiuyi Jia. 1586-1590 [doi]
- TRLS: A Time Series Representation Learning Framework Via Spectrogram for Medical Signal ProcessingLuyuan Xie, Cong Li, Xin Zhang, Shengfang Zhai, Yuejian Fang, Qingni Shen, Zhonghai Wu. 1591-1595 [doi]
- SSR-GPCsT: Deep Learning Models Based on Functional Connectivity Maps in Autism ResearchJiacheng Hao, Junhai Xu, Mengting Liu, Jianguo Wei. 1596-1600 [doi]
- CALSeg: Improving Calibration of Medical Image Segmentation Via Variational Label SmoothingXutao Guo, Yanwu Yang, Chenfei Ye, Guoqing Cai, Ting Ma 0001. 1601-1605 [doi]
- Core Body Temperature and its Role in Detecting Acute Stress: A Feasibility StudyMehrab Bin Morshed, Md Mahbubur Rahman, Viswam Nathan, Li Zhu, Jungmok Bae, Christina Rosa, Wendy Berry Mendes, Jilong Kuang, Alex Gao 0001. 1606-1610 [doi]
- Weakly Semi-Supervised Tool Detection in Minimally Invasive Surgery VideosRyo Fujii, Ryo Hachiuma, Hideo Saito. 1611-1615 [doi]
- CHAT: Cascade Hole-Aware Transformers with Geometric Spatial Consistency for Accurate Monocular Endoscopic Depth EstimationMing Wu, Hao Qi, Wenkang Fan, Sunkui Ke, Hui-Qing Zeng, Yinran Chen, Xiongbiao Luo. 1616-1620 [doi]
- Early Diagnosing Parkinson's Disease Via a Deep Learning Model Based on Augmented Facial Expression DataYintao Zhou, Meng Pang, Wei Huang, Binghui Wang. 1621-1625 [doi]
- DDN-Net: Deep Residual Shrinkage Denoising Networks with Channel-Wise Adaptively Soft Thresholds for Automated Major Depressive Disorder IdentificationYan Zhang, Xin Liu, Zuping Zhang. 1626-1630 [doi]
- Multi-Source Domain Generalization for ECG-Based Cognitive Load Estimation: Adversarial Invariant and Plausible Uncertainty LearningJiyao Wang, Ange Wang, Haolong Hu, Kaishun Wu, Dengbo He. 1631-1635 [doi]
- I3FDM: IRIS Inpainting Via Inverse Fusion of Diffusion ModelsChenyang Li, Zhili Zhang, Peipei Li, Zhaofeng He. 1636-1640 [doi]
- MATPR-UNet: A Multi Attention Two-Path Residual UNet for Focal Cortical Dysplasia Lesions SegmentationWenjing Zhang, Hao Yu, Manli Zhang, Gongpeng Cao, Guixia Kang, Lixin Cai. 1641-1645 [doi]
- Normalization is All You Need: Robust Full-Range Contactless SpO2 Estimation Across UsersQijia Shao, Li Zhu, Mohsin Y. Ahmed, Korosh Vatanparvar, Migyeong Gwak, Nafiul Rashid, Jungmok Bae, Jilong Kuang, Alex Gao 0001. 1646-1650 [doi]
- Memory-Augmented Dual-Domain Unfolding Network for MRI ReconstructionJiawei Jiang, Jie Wu, Yueqian Quan, Jiacheng Chen, Jianwei Zheng 0001. 1651-1655 [doi]
- FedSODA: Federated Cross-Assessment and Dynamic Aggregation for Histopathology SegmentationYuan Zhang, Yaolei Qi, Xiaoming Qi, Lotfi Senhadji, Yongyue Wei, Feng Chen, Guanyu Yang. 1656-1660 [doi]
- Single-Source Domain Generalization in Fundus Image Segmentation Via Moderating and Interpolating Input Space AugmentationBoon Peng Yap, Beng-Koon Ng. 1661-1665 [doi]
- UNAD: Universal Anatomy-Initialized Noise Distribution Learning Framework Towards Low-Dose CT DenoisingLingrui Gu, Weijian Deng, Guoli Wang. 1671-1675 [doi]
- Deep Fusion of Shifted MLP and CNN for Medical Image SegmentationChengyu Yuan, Hao Xiong, Guoqing Shangguan, Hualei Shen, Dong Liu, Haojie Zhang, Zhonghua Liu, Kun Qian, Bin Hu, Björn W. Schuller, Yoshiharu Yamamoto, Shlomo Berkovsky. 1676-1680 [doi]
- In-The-Wild Physiological-Based Stress Detection Using Federated StrategyPo-Chen Lin, Jeng-Lin Li, Woan-Shiuan Chien, Chi-Chun Lee. 1681-1685 [doi]
- Freeze the Backbones: a Parameter-Efficient Contrastive Approach to Robust Medical Vision-Language Pre-TrainingJiuming Qin, Che Liu, Sibo Cheng, Yike Guo, Rossella Arcucci. 1686-1690 [doi]
- A Method for X-Ray Image Landmarks Localization using Cyclic Coordinate-Guided StrategyXianglong Wang, Xifeng An, Eric Rigall, Shu Zhang 0002, Hui Yu 0001, Junyu Dong. 1691-1695 [doi]
- FedMM: Federated Multi-Modal Learning with Modality Heterogeneity in Computational PathologyYuanzhe Peng, Jieming Bian, Jie Xu 0001. 1696-1700 [doi]
- Fall Prediction by a Spatio-Temporal Multi-Channel Causal Model from Wearable Sensors DataGuorui Liao, Jiawei Liu, Yuxuan Liang, Shu Wang, Li Liu. 1701-1705 [doi]
- Brain Structure-Function Interaction Network for Fluid Cognition PredictionJing Xia, Yi Hao Chan, Deepank Girish, Jagath C. Rajapakse. 1706-1710 [doi]
- Pre-Post Interaction Learning for Brain Tumor Segmentation with Missing MRI ModalitiesLinyu Xing, Mengxi Chen, Jiangchao Yao, Ya Zhang 0002, Yanfeng Wang. 1711-1715 [doi]
- CT and MRI Fusion with Anisotropic Guided FilteringYuping Huang, Weisheng Li 0001, Guofen Wang, Xiaoyu Qiao, Huanyu Chen. 1716-1720 [doi]
- Effective Connectivity-Based Multi-View Feature Learning Method for Dementia Diagnosis with fNIRS SignalYingwei Zhang, Changru Guo, Yiqiang Chen, Zeping Lv, Qing Li. 1721-1725 [doi]
- Image2Points: A 3D Point-Based Context Clusters GAN for High-Quality PET Image ReconstructionJiaqi Cui, Yan Wang, Lu Wen, Pinxian Zeng, Xi Wu, Jiliu Zhou, Dinggang Shen. 1726-1730 [doi]
- SAM-Guided Enhanced Fine-Grained Encoding with Mixed Semantic Learning for Medical Image CaptioningZhenyu Zhang, Benlu Wang, Weijie Liang, Yizhi Li, Xuechen Guo, Guanhong Wang, Shiyan Li, Gaoang Wang. 1731-1735 [doi]
- SDEMG: Score-Based Diffusion Model for Surface Electromyographic Signal DenoisingYu-Tung Liu, Kuan-Chen Wang, Kai-Chun Liu, Sheng-Yu Peng, Yu Tsao. 1736-1740 [doi]
- EEG-Based Fast Auditory Attention Detection in Real-Life Scenarios Using Time-Frequency Attention MechanismZhuang Xie, Jianguo Wei, Wenhuan Lu, Zhongjie Li, Chunli Wang, Gaoyan Zhang. 1741-1745 [doi]
- A Bi-Pyramid Multimodal Fusion Method for the Diagnosis Of Bipolar DisordersGuoxin Wang, Sheng Shi, Shan An, Fengmei Fan, Wenshu Ge, Qi Wang, Feng Yu, Zhiren Wang. 1746-1750 [doi]
- Multi-Task Self-Supervised Learning for Medical Image SegmentationBo Wang, Hang Zhao, Xiongfei Li, Mingjie Tian, Bo Huang, Feiyang Yang. 1751-1755 [doi]
- Cross-Modal Synthesis of Structural MRI and Functional Connectivity Networks via Conditional ViT-GANsYuda Bi, Anees Abrol, Jing Sui, Vince D. Calhoun. 1756-1760 [doi]
- Self-Supervised Learning for Sleep Stage Classification with Temporal Augmentation and False Negative SuppressionFangyao Shen, Zehao Zhang, Yong Peng, Hongjie Guo, Lina Chen, Hong Gao 0001. 1761-1765 [doi]
- VMCC-NET: Uncovering Challenging Regions in Semi-Supervised Medical Image Segmentation with Voxel Mask Based Cyclic-Consistency NetworkYujie Liu, Peng Zhou, Zongmin Li. 1766-1770 [doi]
- SAM-OCTA: A Fine-Tuning Strategy for Applying Foundation Model to OCTA Image Segmentation TasksChengliang Wang, Xinrun Chen, Haojian Ning, Shiying Li. 1771-1775 [doi]
- Semi-Supervised Domain Adaptation for EEG-Based Sleep Stage ClassificationShitao Zheng, Dongrui Wu. 1776-1780 [doi]
- Self-Supervised Cross-Level Consistency Learning For Fundus Image ClassificationQi Bi, Hao Zheng, Xu Sun 0006, Jingjun Yi, Wentian Zhang, Yawen Huang, Yuexiang Li, Yefeng Zheng 0001. 1781-1785 [doi]
- Tackling Electrode Shift in Gesture Recognition with HD-EMG Electrode SubsetsJoao Pereira, Dimitrios Halatsis, Balint Hodossy, Dario Farina. 1786-1790 [doi]
- Flattening Singular Values of Factorized Convolution for Medical ImagesZexin Feng, Na Zeng, Jiansheng Fang, Xingyue Wang, XiaoXi Lu, Heng Meng, Jiang Liu 0001. 1791-1795 [doi]
- Confidence-Aware Spatial-Temporal Attention Graph Convolutional Network for Skeleton-Based Expert-Novice Level ClassificationTatsuki Seino, Naoki Saito 0006, Takahiro Ogawa 0001, Satoshi Asamizu, Miki Haseyama. 1796-1800 [doi]
- Deep Manifold Transformation for Protein Representation LearningBozhen Hu, Zelin Zang, Cheng Tan 0012, Stan Z. Li. 1801-1805 [doi]
- Temporal-Spatial Prediction: Pre-Training on Diverse Datasets for EEG ClassificationZiyi Li, Li-Ming Zhao, Wei-Long Zheng, Bao-Liang Lu. 1806-1810 [doi]
- Nonlinearity Detection and Compensation for EEG-Based Speech TrackingJohanna Wilroth, Emina Alickovic, Martin A. Skoglund, Martin Enqvist. 1811-1815 [doi]
- Out-of-Distribution Detection for Learning-Based Chest X-Ray DiagnosisWenlong Chen, Chuanwen Feng, Ao Ke, Xike Xie, S. Kevin Zhou. 1816-1820 [doi]
- Prompt-Based Personalized Federated Learning for Medical Visual Question AnsweringHe Zhu, Ren Togo, Takahiro Ogawa 0001, Miki Haseyama. 1821-1825 [doi]
- Efficient Polyp Segmentation via Integrity LearningZiqiang Chen, Kang Wang, Yun Liu. 1826-1830 [doi]
- A Robust and Scalable Method with an Analytic Solution for Multi-Subject fMRI Data AnalysisTrung Vu, Hanlu Yang 0001, Francisco Laport, Ben Gabrielson, Vince D. Calhoun, Tülay Adali. 1831-1835 [doi]
- Multitask Classification of Antimicrobial Peptides for Simultaneous Assessment of Antimicrobial Property and Structural FoldMichaela Areti Zervou, Effrosyni Doutsi, Yannis Pantazis, Panagiotis Tsakalides. 1836-1840 [doi]
- Functional Emotion Transformer for EEG-Assisted Cross-Modal Emotion RecognitionWei-Bang Jiang, Ziyi Li, Wei-Long Zheng, Bao-Liang Lu. 1841-1845 [doi]
- Breast Ultrasound Computer-Aided Diagnosis Using Structure-Aware Triplet Path NetworksErlei Zhang, Weihao Chen, Xiaowei Xu 0004, Zhicheng Zhang, Jinglei Li. 1846-1850 [doi]
- Unidirectional Brain-Computer Interface: Artificial Neural Network Encoding Natural Images to fMRI Response in the Visual CortexRuixing Liang, Xiangyu Zhang, Qiong Li, Lai Wei, Hexin Liu, Avisha Kumar, Kelley M. Kempski Leadingham, Joshua Punnoose, Leibny Paola Garcia, Amir Manbachi. 1851-1855 [doi]
- Progressive Learning Based Knowledge Distillation for Low Resolution Cerebral Microbleed SegmentationTianxiang Xia, Rong Zhang, Zhenzuo Chen, Guomin Xie, Xiping Wu, Zhongyue Lv, Lijun Guo. 1856-1860 [doi]
- PN-DetX: A Dedicated Framework for Pulmonary Nodule Detection in X-Ray ImagesChenglin Liu, Binquan Wang, Zhi Wu. 1861-1865 [doi]
- Real-Time Privacy-Preserving Fall Risk Assessment with a Single Body-Worn Tracking CameraChiao-Yi Wang, Faranguisse Kakhi Sadrieh, Yi-Ting Shen, Giovanni Oppizzi, Li-Qun Zhang, Yang Tao. 1866-1870 [doi]
- CEMOAE: A Dynamic Autoencoder with Masked Channel Modeling for Robust EEG-Based Emotion RecognitionYu-Ting Lan, Wei-Bang Jiang, Wei-Long Zheng, Bao-Liang Lu. 1871-1875 [doi]
- DCL-Net: Dual Contrastive Learning Network for Semi-Supervised Multi-Organ SegmentationLu Wen, Zhenghao Feng, Yun Hou, Peng Wang, Xi Wu, Jiliu Zhou, Yan Wang. 1876-1880 [doi]
- Dual Contrastive Learning Guided Pathological Image Re-StainingYuexiao Liang, Zhineng Chen, Xin Chen, Caiyan Jia, Xiongjun Ye, Xieping Gao. 1881-1885 [doi]
- Model-Based Label-to-Image Diffusion for Semi-Supervised Choroidal Vessel SegmentationKun Huang, Xiao Ma 0011, Na Su, Songtao Yuan, Qiang Chen 0004. 1886-1890 [doi]
- Medical Vision-Language Representation Learning with Cross-Modal Multi-Teacher Contrastive DistillationBingzhi Chen, Jiawei Zhu, Yishu Liu, Biqing Zeng, Jiahui Pan, Meirong Ding. 1891-1895 [doi]
- Label Rectified and Graph Adaptive Semi-Supervised Regression for Electrode Shifted Gesture RecognitionChengxi Zhu, Yong Peng 0001, Yinfeng Fang, Wanzeng Kong. 1896-1900 [doi]
- High-Accuracy Anxiety Disorder Identification Through Subspace-Enhanced Hypergraph Neural NetworkYibin Tang, Jikang Ding, Aimin Jiang, Chun Wang, Yuan Gao. 1901-1905 [doi]
- Hybrid Module with Multiple Receptive Fields and Self-Attention Layers for Medical Image SegmentationWenbo Qi, Wenyong Zhou, Ngai Wong, S. C. Chan 0001. 1906-1910 [doi]
- ADHD Diagnosis and Biomarker Detection Based on Multimodal Graph Convolutional Neural NetworkYuan Gao 0007, Xiaotong Wang, Aimin Jiang, Ying Chen 0013, Yibin Tang. 1911-1915 [doi]
- HIQ: One-Shot Network Quantization for Histopathological Image ClassificationXinrui Chen, Renao Yan, Yizhi Wang, Jiawen Li 0005, Junru Cheng, Tian Guan, Yonghong He. 1916-1920 [doi]
- EEG Emotion Recognition Based on Dynamical Graph Attention NetworkYi Guo, Chao Tang, Hao Wu, Badong Chen. 1921-1925 [doi]
- Eigendecomposition-Based Spatial-Temporal Attention for Brain Cognitive States IdentificationJiwon Lee, Eunsong Kang, Junyeong Maeng, Heung-Il Suk. 1921-1925 [doi]
- Multimodal Multi-View Spectral-Spatial-Temporal Masked Autoencoder for Self-Supervised Emotion RecognitionPengxuan Gao, Tianyu Liu, Jia-Wen Liu, Bao-Liang Lu, Wei-Long Zheng. 1926-1930 [doi]
- Semi-Supervised Volumetric Medical Image Segmentation via Class Prototype Guided Distribution-Aligned Representation LearningXiangyu Kong, Zeyu Ren, Lu Liu. 1931-1935 [doi]
- CC-DA: Cross-Domain Consistency Data Augmentation for 3D Tumor SegmentationJiezhou He, Zhiming Luo, Wei Peng, Songzhi Su, Shaozi Li. 1936-1940 [doi]
- A DenseNet-Based Method for Decoding Auditory Spatial Attention with EEGXiran Xu, Bo Wang, Yujie Yan, Xihong Wu, Jing Chen. 1946-1950 [doi]
- SPTESleepNet: Automatic Sleep Staging Model Based On Strip Patch Embeddings And Transformer EncoderXiao Chen, Xiaokun Dai, Xueli Liu, Xinrong Chen. 1951-1955 [doi]
- Non-iterative Pyramid Network for Unsupervised Deformable Medical Image RegistrationZongmin Li, Xuanting Li, Jiayue Fan, Zhonghao Du, Chaozhi Yang. 1956-1960 [doi]
- A Novel Medical Image Fusion Framework Integrating Multi-scale Encoder-Decoder with Discrete Wavelet DecompositionRenhe Liu, Yu Liu, Han Wang, Kai Hu, Shan Du. 1961-1965 [doi]
- An Accurate and Efficient Neural Network for OCTA Vessel Segmentation and a New DatasetHaojian Ning, Chengliang Wang, Xinrun Chen, Shiying Li. 1966-1970 [doi]
- GM-VRC: Semantic Topological Data Ensemble Approach for EEG Signal ClassificationSrikireddy Dhanunjay Reddy, Tharun Kumar Reddy. 1971-1975 [doi]
- A Learning-Based Multi-Node Fusion Positioning Method Using Wearable Inertial SensorsYifan Song, Songpengcheng Xia, Jiarui Yang, Ling Pei. 1976-1980 [doi]
- MMS: Morphology-Mixup Stylized Data Generation for Single Domain Generalization in Medical Image SegmentationXiaochen He, Baoyao Yang, Fei Lyu 0004. 1981-1985 [doi]
- DualGCN-MIL: Whole Slide Image Classification Based on Double Relationship Graph LearningMei Yu, Hexin Wang, Xuzhou Fu, Jie Gao, Zhiqiang Liu, Xuewei Li. 1986-1990 [doi]
- Distribution-Aware Contrastive Learning for Robust Medical Image SegmentationZheyun Qin, Xiaoming Xi, Yilong Yin. 1991-1995 [doi]
- Modeling Quasi-Periodic Dependency via Self-Supervised Pre-Training for Respiratory Sound ClassificationWenjie Song 0003, Jiqing Han 0001, Jianchen Li, Guibin Zheng, Tieran Zheng, Yongjun He. 1996-2000 [doi]
- CEDNet: A Continuous Emotion Detection Network for Naturalistic Stimuli Using MEG SignalsZeming He, Gaoyan Zhang. 2001-2005 [doi]
- Texture-Unet: A Texture-Aware Network for Bone Marrow Smear Whole-Slide Image Region of Interest SegmentationJian Chen, Xing Wu, Chengliang Wang, Zailin Yang, Xuelian Wu, Longrong Ran, Yao Liu. 2006-2010 [doi]
- Improving Limited Supervised Foot Ulcer Segmentation Using Cross-Domain Augmentation StrategiesShang-Jui Kuo, Po-Han Huang, Chia-Ching Lin, Jeng-Lin Li, Ming-Ching Chang. 2011-2015 [doi]
- BNMTrans: A Brain Network Sequence-Driven Manifold-Based Transformer for Cognitive Impairment Detection Using EEGRuihan Qin, Zhenxi Song, Huixia Ren, Zian Pei, Lin Zhu, Xue Shi, Yi Guo, Honghai Liu 0001, Min Zhang, Zhiguo Zhang. 2016-2020 [doi]
- EmoTVR: A Hybrid Model to Estimate Continuous-Time and Continuous-Level Emotion from ElectroencephalographyXinxu Zhou, Zhen Liang, Weishan Ye, Junqi Xue, Honghai Liu, Min Zhang, Zhiguo Zhang. 2021-2025 [doi]
- Clinical Scores Prediction and Medication Adjustment for Course of Parkinson's DiseaseHan Chen, Wenxuan Wu, Xiaofen Xing, Xiangmin Xu. 2026-2030 [doi]
- Learning a Convex Patch-Based Synthesis Model via Deep EquilibriumStanislas Ducotterd, Sebastian Neumayer, Michael Unser. 2031-2035 [doi]
- A Neurophysiological-Auditory "Listen Receipt" for Communication EnhancementChristine Beauchene, Michael S. Brandstein, Thomas F. Quatieri, Eric Thompson, Christopher J. Smalt. 2036-2040 [doi]
- Transforming Cardiovascular Health: A Transformer-Based Approach to Continuous, Non-Invasive Blood Pressure Estimation via Radar SensingNastassia Vysotskaya, Noah Maul, Alessandra Fusco, Souvik Hazra, Jens Harnisch, Tomás Arias-Vergara, Andreas K. Maier. 2041-2045 [doi]
- Multimodal Breathing Rate Estimation Using Facial Motion and rPPG from RGB CameraMigyeong Gwak, Korosh Vatanparvar, Li Zhu, Nafiul Rashid, Mohsin Y. Ahmed, Jungmok Bae, Jilong Kuang, Alex Gao 0001. 2046-2050 [doi]
- A Neural Syntax Parser for Coronary Artery Anatomical Labeling in Coronary CT AngiographyChen Zhou, Lingjing Hu. 2051-2055 [doi]
- Adaptive Multiview Community-Preserved Graph Convolutional Network for Multiatlas-Based Functional Connectivity AnalysisWei Wang, Xingcan Hu, Li Xiao, Yu-Ping Wang. 2056-2060 [doi]
- Augmenting Transformer Autoencoders with Phenotype Classification for Robust Detection of Psychotic RelapsesNiki Efthymiou, George Retsinas, Panagiotis Paraskevas Filntisis, Petros Maragos. 2061-2065 [doi]
- Localization and Tracking of Gold Nanoparticles Using mmWave FMCW RadarYonathan Eder, Ravit Abel, Avi Schroeder, Yonina C. Eldar. 2066-2070 [doi]
- Multimodal Imaging Feature Extraction with Reference Canonical Correlation Analysis Underlying IntelligenceRam Sapkota, Bishal Thapaliya, Pranav Suresh, Bhaskar Ray, Vince D. Calhoun, Jingyu Liu 0001. 2071-2075 [doi]
- Graph-Based Permutation Patterns for the Analysis of Task-Related fMRI Signals on DTI Networks in Mild Cognitive ImpairmentJohn Stewart Fabila-Carrasco, Avalon Campbell-Cousins, Mario A. Parra-Rodriguez, Javier Escudero. 2076-2080 [doi]
- Patient-Adaptive and Learned MRI Data Undersampling Using Neighborhood ClusteringSiddhant Gautam, Angqi Li, Saiprasad Ravishankar. 2081-2085 [doi]
- Multi-Source Domain Adaptation with Transformer-Based Feature Generation for Subject-Independent EEG-Based Emotion RecognitionShadi Sartipi, Mujdat Çetin. 2086-2090 [doi]
- Heart Rate Variability Estimation with Dynamic Fine Filtering and Global-Local Context Outlier RemovalRamesh Kumar Sah, Md. Mahbubur Rahman, Viswam Nathan, Li Zhu, Jungmok Bae, Christina Rosa, Wendy Berry Mendes, Jilong Kuang, Jun Alex Gao. 2091-2095 [doi]
- Inducing Inductive Bias in Vision Transformer for EEG ClassificationRabindra Khadka, Pedro G. Lind, Gustavo B. M. Mello, Michael A. Riegler, Anis Yazidi. 2096-2100 [doi]
- End-To-End Personalized Cuff-Less Blood Pressure Monitoring Using ECG and PPG SignalsSuhas BN, Rakshith Sharma Srinivasa, Yashas Malur Saidutta, Jaejin Cho, Ching Hua Lee, Chouchang Yang, Yilin Shen, Hongxia Jin. 2101-2105 [doi]
- Domain Generalization with Fourier Transform and Soft ThresholdingHongyi Pan, Bin Wang, Zheyuan Zhang, Xin Zhu, Debesh Jha, Ahmet Enis Çetin, Concetto Spampinato, Ulas Bagci. 2106-2110 [doi]
- Ballistocardiogram-Based Heart Rate Variability Estimation for Stress Monitoring using Consumer EarbudsDavid J. Lin, Md Mahbubur Rahman, Li Zhu, Viswam Nathan, Jungmok Bae, Christina Rosa, Wendy Berry Mendes, Jilong Kuang, Jun Alex Gao. 2111-2115 [doi]
- FEDKA: Federated Knowledge Augmentation for Multi-Center Medical Image Segmentation on non-IID DataYuhao Zhang, Shaoming Duan, Xinyu Zha, Jinhang Su, Peiyi Han, Chuanyi Liu. 2116-2120 [doi]
- De Novo Molecule Generation with Graph Latent Diffusion ModelConghao Wang, Hiok Hian Ong, Shunsuke Chiba, Jagath C. Rajapakse. 2121-2125 [doi]
- A Novel Discrete Fractional Complex Hadamard Transform for Medical Image EncryptionZi-Chen Fan, Di Li 0006, Susanto Rahardja. 2126-2130 [doi]
- Situational Signal Processing with Ecological Momentary Assessment: Leveraging Environmental Context for Cochlear Implant UsersTaylor Lawson, John H. L. Hansen. 2131-2135 [doi]
- Federated Learning of Tensor Generalized Linear Models with Low Separation RankJose Hoyos Sanchez, Batoul Taki, Waheed U. Bajwa, Anand D. Sarwate. 2136-2140 [doi]
- Subgroup Identification Through Multiplex Community Structure Within Functional Connectivity NetworksH. Yang, Meiby Ortiz-Bouza, T. Vu, Francisco Laport, Vince D. Calhoun, Selin Aviyente, Tülay Adali. 2141-2145 [doi]
- Addressing Confounds in Functional Connectivity Analyses of Calcium ImagingDingding Ye, Charan Santhirasegaran, Ryan Pai, Genevera I. Allen, Joseph Young. 2146-2150 [doi]
- Lesion-Aware Open Set Medical Image Recognition with Domain ShiftYiqian Xu, Rui-Wei Zhao, Rui Feng. 2151-2155 [doi]
- Estimating Directed Spectral Information Flow between Multi-Resolution Time SeriesQiqi Xian, Zhe Sage Chen. 2156-2159 [doi]
- Electroencephalogram Sensor Data Compression Using an Asymmetrical Sparse Autoencoder with a Discrete Cosine Transform LayerXin Zhu, Hongyi Pan, Shuaiang Rong, Ahmet Enis Çetin. 2160-2164 [doi]
- Digital Pathology Image Deblurring Via Local Focus Quality AssessmentYuanpin Zhou, Huogen Wang, Yanfeng Bai, Yidong Wan, Chaohui Jin, Ming Chen, Xiaodong Teng. 2165-2169 [doi]
- An Audio-Textual Diffusion Model for Converting Speech Signals into Ultrasound Tongue Imaging DataYudong Yang, Rongfeng Su, Xiaokang Liu, Nan Yan, Lan Wang. 2170-2174 [doi]
- YOLO-Med: Multi-Task Interaction Network for Biomedical ImagesSuizhi Huang, Shalayiding Sirejiding, Yuxiang Lu, Yue Ding 0001, Leheng Liu, Hui Zhou, Hongtao Lu. 2175-2179 [doi]
- EOFD-Net: Edge Optimization and Feature Denoising for Weakly Supervised Deep Nuclei Segmentation with Point AnnotationsXipeng Pan, Feihu Hou, Zhenbing Liu, Siyang Feng, Rushi Lan. 2180-2184 [doi]
- Motion-Tolerant Radar-Based Heart Sound DetectionYu Rong, Kawon Han, Isabella Lenz, Daniel W. Bliss. 2185-2189 [doi]
- Semantic Reconstruction of Continuous Language from MEG SignalsBo Wang, Xiran Xu, Longxiang Zhang, Boda Xiao, Xihong Wu, Jing Chen. 2190-2194 [doi]
- Subtype-Specific Biomarkers of Alzheimer's Disease from Anatomical and Functional Connectomes via Graph Neural NetworksYi Hao Chan, Jun Liang Ang, Sukrit Gupta, Yinan He, Jagath C. Rajapakse. 2195-2199 [doi]
- Neural2speech: A Transfer Learning Framework for Neural-Driven Speech ReconstructionJiawei Li, Chunxu Guo, Li Fu, Lu Fan, Edward F. Chang, Yuanning Li. 2200-2204 [doi]
- Shifted-Rectangle-Window Based Transformer for Non-Displaced Femoral Neck Fracture DiagnosisQichang Chen, Zhonghang Zhu, Lianxin Wang, Liansheng Wang. 2205-2209 [doi]
- Mosic: Multimodal Semantic Integrated Communication for Health Monitoring in IoT ScenariosMinxi Yang, Dahua Gao, Jiaxuan Li, Wenlong Xu, Xiaodan Song, Guangming Shi. 2210-2214 [doi]
- 2-Net: When Diffusion Meets DiscriminatorYuda Jin, Weidong Chen, Yuanhe Tian, Yan Song 0004, Chenggang Yan, Zhendong Mao. 2215-2219 [doi]
- Enhancing Generalization in Medical Visual Question Answering Tasks Via Gradient-Guided Model PerturbationGang Liu, Hongyang Li, Zerui He, Shenjun Zhong. 2220-2224 [doi]
- Do Self-Supervised Speech and Language Models Extract Similar Representations as Human Brain?Peili Chen, Linyang He, Li Fu, Lu Fan, Edward F. Chang, Yuanning Li. 2225-2229 [doi]
- Detection of Epileptic Seizures in Long EEG Recordings Using an Anomaly Detector with Artifact RejectionKazi Mahmudul Hassan, Xuyang Zhao, Hidenori Sugano, Toshihisa Tanaka. 2230-2234 [doi]
- Joint Spatio-Temporal Filtering of Motion Imagery EEG Signals for Data Alignment in Transfer LearningAimin Jiang, Shanshan Hou, Yibin Tang, Yanping Zhu. 2235-2239 [doi]
- Patch-Level Knowledge Distillation and Regularization for Missing Modality Medical Image SegmentationRuilin Wang, Xiongfei Li, Mingjie Tian, Feiyang Yang, Xiaoli Zhang. 2240-2244 [doi]
- Learn From Zoom: Decoupled Supervised Contrastive Learning For WCE Image ClassificationKunpeng Qiu, Zhiying Zhou, Yongxin Guo. 2245-2249 [doi]
- V-DDPM: MRI Rician Noise Removal Model Based on VST and DDPMYue Hu, Huiying Xu, Xinzhong Zhu, Negalign Wake Hundera. 2250-2254 [doi]
- Deep Regression for Biological Age Estimation in Multiple Organs: Investigations on 40,000 Subjects of the UK BiobankVeronika Ecker, Marcel Früh, Bin Yang, Sergios Gatidis, Thomas Küstner. 2255-2259 [doi]
- Contrmix: Progressive Mixed Contrastive Learning for Semi-Supervised Medical Image SegmentationMeisheng Zhang, Chenye Wang, Wenxuan Zou, Xingqun Qi, Muyi Sun, Wanting Zhou. 2260-2264 [doi]
- Multi-Label Abnormality Classification from 12-Lead ECG Using A 2D Residual U-NetSeorim Hwang, Jaebin Cha, Junyeong Heo, Sungpil Cho, Youngcheol Park. 2265-2269 [doi]
- Towards Disease-Aware Self-Supervised Dynamic Brain Network Learning For Mental DiagnosisZhiyong Jin, Guangqi Wen, Peng Cao 0001, Lingwen Liu, Jinzhu Yang, Xinrong Zhu, Osmar R. Zaïane, Fei Wang 0056. 2270-2274 [doi]
- Delineation of Prostate Cancer Via Enhanced AI-Based Algorithm In Ultrasound ImagesYiwen Ruan, Rui Jin, Zhaorui Liu, Caishan Wang, Lei Zhang, Tao Peng. 2275-2279 [doi]
- Residual Dense Swin Transformer for Continuous Depth-Independent Ultrasound ImagingJintong Hu, Hui Che, Zishuo Li, Wenming Yang. 2280-2284 [doi]
- Predicting rTMS Treatment Effects Using Open-Loop Control and Neural ManifoldHongyu Shi, Kaizhong Zheng, Huaning Wang, Baojuan Li, Badong Chen. 2285-2289 [doi]
- SRECT: Machine-Specific Spatial-Resolution Enhancement in Computed TomographyLi Li, Jiahui He, Yunxin Tang, Youjian Zhang, Jie Wang, Guanqun Zhou, Zhicheng Zhang. 2290-2294 [doi]
- A Novel Multi-Atlas Fusion Model Based On Contrastive Learning For Functional Connectivity Graph DiagnosisJiayu Zhang, Dexuan Xu, Yiwei Lou, Yu Huang. 2295-2299 [doi]
- 3D Automated Quantitative Calculations Based on CT Images of the Hip JointPeng Du, Baijia Ni, Xiaodong Ju, Xingce Wang, Zhongke Wu, Gege Lou, Keying Hua. 2300-2304 [doi]
- Enhancing Healthcare with EOG: A Novel Approach to Sleep Stage ClassificationSuvadeep Maiti, Shivam Kumar Sharma, Raju S. Bapi. 2305-2309 [doi]
- An Attention-Enhanced Retentive Broad Learning System for Subject-Generic Emotion Recognition from EEG SignalsXiaolong Zhong, Fei Wu, Zhong Yin, Gang Liu. 2310-2314 [doi]
- Coupling Self-Supervised and Supervised Contrastive Learning for Multiple Classification of Cervical Cytological Whole Slide ImagesLang Wang, Peng Jiang, Wensi Duan, Dehua Cao, Baochuan Pang, Juan Liu. 2315-2319 [doi]
- Robust Decoding of the Auditory Attention from EEG Recordings Through Graph Convolutional NetworksSiqi Cai, Ran Zhang, Haizhou Li 0001. 2320-2324 [doi]
- A Supervised Information Enhanced Multi-Granularity Contrastive Learning Framework for EEG Based Emotion RecognitionXiang Li, Jian Song, Zhigang Zhao, Chunxiao Wang, Dawei Song 0001, Bin Hu 0001. 2325-2329 [doi]
- Multimodal Survival Ensemble Network: Integrating Genomic and Histopathological Insights for Enhanced Cancer PrognosisChenyi Zhou, Hualiang Wang, Xiaomeng Li 0001, Wanlu Liu, Zuozhu Liu. 2330-2334 [doi]
- Selective Domain-Invariant Feature for Generalizable Deepfake DetectionYingxin Lai, Guoqing Yang, Yifan He, Zhiming Luo, Shaozi Li. 2335-2339 [doi]
- Multi-Task Cascaded Attention Network for Brain Tumor Segmentation and ClassificationGaoxiang Li, Ying Zhang, Yanlin Luo. 2340-2344 [doi]
- Gland Segmentation Via Dual Encoders and Boundary-Enhanced AttentionHuadeng Wang, Jiejiang Yu, Bingbing Li, Xipeng Pan, Zhenbing Liu, Rushi Lan, Xiaonan Luo. 2345-2349 [doi]
- Compact and De-Biased Negative Instance Embedding for Multi-Instance Learning on Whole-Slide Image ClassificationJoohyung Lee 0003, Heejeong Nam, Kwanhyung Lee, Sangchul Hahn. 2350-2354 [doi]
- TD-GPT: Target Protein-Specific Drug Molecule Generation GPTZhengda He, Linjie Chen, Jiaying Xu, Hao Lv, Rui-ning Zhou, Jianhua Hu, Yadong Chen, Yang Gao. 2355-2359 [doi]
- A Complete Method for the 3D Reconstruction of Axonal Pathways from 2 Orthogonal 3D OCT Images of the Lamina CribrosaNan Ding, Florence Rossant, Hélène Urien, Jérémie Sublime, Paul Bastelica, Christophe Baudouin, Michel Pâques. 2360-2364 [doi]
- A Property-Guided Diffusion Model For Generating Molecular GraphsChangsheng Ma, Taicheng Guo, Qiang Yang 0015, Xiuying Chen, Xin Gao 0001, Shangsong Liang, Nitesh Chawla, Xiangliang Zhang 0001. 2365-2369 [doi]
- SASA: Saliency-Aware Self-Adaptive Snapshot Compressive ImagingYaping Zhao, Edmund Y. Lam. 2370-2374 [doi]
- Fast Alignment Algorithm for Cryo-EM Particle Images Based on Harmonic AnalysisMingtao Huang, Ranhao Zhang, Xueming Li, Yuan Shen 0001. 2375-2379 [doi]
- Unified sRGB Real Noise Synthesizing with Adaptive Feature ModulationWenbo Li 0001, Zhipeng Mo, Yilin Shen, Hongxia Jin. 2380-2384 [doi]
- Dual Directional Complementary Gradient Fusion and Deep Refinement for Hyperspectral Image Super ResolutionYinwei Du, Jian Wang, Xing Wu, Xian-Hua Han. 2385-2389 [doi]
- Deep Versatile Hyperspectral Reconstruction Model from A Snapshot Measurement with Arbitrary MasksTakumi Takabe, Xian-Hua Han, Yen-Wei Chen 0001. 2390-2394 [doi]
- Hybrid Convolution-Transformer for Lightweight Single Image Super-ResolutionJiuqiang Li, Yutong Ke. 2395-2399 [doi]
- Hybrid Domain Learning towards Light Field Spatial Super-Resolution using Heterogeneous ImagingZean Chen, Yeyao Chen, Mei Yu 0001, Haiyong Xu, Gangyi Jiang. 2400-2404 [doi]
- SPGFusion: A Semantic Prior Guided Infrared and Visible Image Fusion NetworkQuanquan Xiao, Haiyan Jin, Haonan Su, Fengyuan Zuo, Yuanlin Zhang 0003, Zhaolin Xiao, Bin Wang 0046. 2405-2409 [doi]
- Darkshot: Lighting Dark Images with Low-Compute and High-QualityJiazhang Zheng, Lei Li, Qiuping Liao, Cheng Li, Li Li, Yangxing Liu. 2410-2414 [doi]
- IFNet: Imaging and Focusing Network for handheld mmWave DevicesYadong Li, Dongheng Zhang, Ruixu Geng, Jincheng Wu, Yang Hu, Qibin Sun, Yan Chen. 2415-2419 [doi]
- Sandwiched Lo-Res Simulation for Scalable Flood ModelingRefaldi I. D. Putra, Tatsuya Ishikawa, Naomi Simumba, Michiaki Tatsubori. 2420-2424 [doi]
- Enhanced Low-Rank and Sparse Tucker Decomposition For Image CompletionWenwu Gong, Zhejun Huang, Lili Yan. 2425-2429 [doi]
- Seam Mask Guided Partial Reconstruction with Quantum-Inspired Local Aggregation For Deep Image StitchingChen-Bin Feng, Jie Zhang, Jiaxue Li, Yicong Zhou. 2430-2434 [doi]
- T-Pixel2Mesh: Combining Global and Local Transformer for 3D Mesh Generation from a Single ImageShijie Zhang, Boyan Jiang, Keke He, Junwei Zhu, Ying Tai, Chengjie Wang, Yinda Zhang 0001, Yanwei Fu 0001. 2435-2439 [doi]
- PVitNet: An Effective Approach for Android Malware Detection Using Pyramid Feature Processing and Vision TransformerDenghui Yang, Yifan Ding, Hao Zhang, Yizhou Li. 2440-2444 [doi]
- SAM-DEBLUR: Let Segment Anything Boost Image DeblurringSiwei Li, Mingxuan Liu, Yating Zhang, Shu Chen, Haoxiang Li, Zifei Dou, Hong Chen. 2445-2449 [doi]
- Reference Line Network: On Simultaneous Gaussian Line Detection and Connection Graph InferenceQian Li, Rao Fu, Cheng Wen. 2450-2454 [doi]
- Live Iterative Ptychography with Projection-Based AlgorithmsSimon Welker, Tal Peer, Henry N. Chapman, Timo Gerkmann. 2455-2459 [doi]
- Deep Plug-and-Play Algorithm for Unsaturated ImagingJorge Bacca, Brayan Monroy, Henry Arguello. 2460-2464 [doi]
- Iteratively Preconditioned Guidance of Denoising (Diffusion) Models For Image RestorationTom Tirer. 2465-2469 [doi]
- Score-based Diffusion Models for Photoacoustic Tomography Image ReconstructionSreemanti Dey, Snigdha Saha, Berthy T. Feng, Manxiu Cui, Laure Delisle, Oscar Leong, Lihong V. Wang, Katherine L. Bouman. 2470-2474 [doi]
- Imaging An Evolving Black Hole By Leveraging Shared StructureYvette Y. Lin, Angela F. Gao, Katherine L. Bouman. 2475-2479 [doi]
- A Fast Blind Deblurring Algorithm Using Local Gradient Product PriorJixuan Liang, Yanshan Li. 2480-2484 [doi]
- SPEC-NERF: Multi-Spectral Neural Radiance FieldsJiabao Li, Yuqi Li, Ciliang Sun, Chong Wang, Jinhui Xiang. 2485-2489 [doi]
- KD-Former: Transformer Knowledge Distillation for Image MattingZiwen Li, Bo Xu, Cheng Lu. 2490-2494 [doi]
- Detection in Complex Scenes Using RGB and Depth Multimodal Feature FusionShengli Yan, Yuan Rao, Wenhui Hou. 2495-2499 [doi]
- Hyperspectral Image Reconstruction Using Hierarchical Neural Architecture Search from A Snapshot ImageXian-Hua Han, Huiyan Jiang, Yen-Wei Chen. 2500-2504 [doi]
- Plug-And-Play Algorithm Coupled with Low-Rank Quadratic Envelope Regularization for Compressive Spectral ImagingJorge Bacca, Marcus Carlsson, Brayan Monroy, Henry Arguello. 2505-2509 [doi]
- SGM: A Dataset for 3D Garment Reconstruction from Single Hand-Drawn SketchJia Chen, Jinlong Qin, Saishang Zhong, Kai Yang, Xinrong Hu, Tao Peng, Rui Li. 2510-2514 [doi]
- Image Restoration with Generalized L2 Loss and Convergent Plug-and-Play PriorsKartheek Kumar Reddy Nareddy, Abijith Jagannath Kamath, Chandra Sekhar Seelamantula. 2515-2519 [doi]
- Temporally-Guided Total Variation For Robust Spatiotemporal Fusion Of Satellite ImagesRyosuke Isono, Shunsuke Ono. 2520-2524 [doi]
- Variational Analysis of Adversarial Regularization for Solving Inverse ProblemsAbhishek Shreekant Bhandiwad, Abijith Jagannath Kamath, Siddarth Asokan, Chandra Sekhar Seelamantula. 2525-2529 [doi]
- Single-Pixel Imaging Of Dynamic Flows Using Neural ODE RegularizationAleksei Sholokhov, Joshua Rapp, Saleh Nabi, Steven L. Brunton, J. Nathan Kutz, Hassan Mansour. 2530-2534 [doi]
- Two-Edge-Resolved 3D Non-Line-of-Sight Imaging: A Fisher Information Equalized DiscretizationRobinson Czajkowski, John Murray-Bruce. 2535-2539 [doi]
- Fusion of Multi-Resolution Seismic Tomography Maps with Physics-Informed Probability Graphical ModelsZheng Zhou, Peter Gerstoft, Kim Olsen. 2540-2544 [doi]
- PMDI: Combining Parametric-Model and Depth-Aware Implicit Function for Single-View Human ReconstructionSaishang Zhong, Jiashu Wang, Xinrong Hu. 2545-2549 [doi]
- An Efficient Algorithm For Clustered Multi-Task Compressive SensingAlexander Lin, Demba E. Ba. 2550-2554 [doi]
- Deep Learning Based Single-Shot Profilometry by Three-Channel Binary-Defocused ProjectionTianbo Liu 0007, Songping Mai, Xiaoyu Wang. 2555-2559 [doi]
- Self-Supervised Spatially Variant PSF Estimation for Aberration-Aware Depth-from-DefocusZhuofeng Wu 0003, Yusuke Monno, Masatoshi Okutomi. 2560-2564 [doi]
- Flare-Free Vision: Empowering Uformer with Depth InsightsYousef Kotp, Marwan Torki. 2565-2569 [doi]
- Reflection Removal Using Recurrent Polarization-to-Polarization NetworkWenjiao Bian, Yusuke Monno, Masatoshi Okutomi. 2570-2574 [doi]
- An Efficient Transformer For Demosaicing Via Compressed Multi-Branch Attention MechanismXun Wu, Fanqing Meng, Yaqi Wu, Jiawei Zhang 0002, Feng Zhang. 2575-2576 [doi]
- TA2P: Task-Aware Adaptive Pruning Method for Image Classification on Edge DevicesYanting Wang, Feng Li, Han Zhang. 2580-2584 [doi]
- Coordinate-Based Neural Network for Fourier Phase RetrievalTingyou Li, Zixin Xu, Yong S. Chu, Xiaojing Huang, Jizhou Li. 2585-2589 [doi]
- Spectro-Spatial Hyperspectral Image Reconstruction From Interferometric AcquisitionsDaniele Picone, Mohamad Jouni, Mauro Dalla Mura. 2590-2594 [doi]
- Opnet: Deep Occlusion Perception Network with Boundary Awareness for Amodal Instance SegmentationShihui Zhang, Ziteng Xue, Yuhong Jiang, Houlin Wang. 2595-2599 [doi]
- Toward Quantifiable Face Age TransformationLing Lin 0002, Congcong Zhu, Lin Zhou, Jingrun Chen. 2600-2604 [doi]
- IMFIT: Normal Estimation via Learning Neural Implicit SurfaceRao Fu, Cheng Wen, Qian Li. 2605-2609 [doi]
- Semi-Decoupled 6D Pose Estimation via Multi-Modal Feature FusionZhenhu Zhang, Xin Cao, Li Jin, Xueying Qin, Ruofeng Tong. 2610-2614 [doi]
- DAP: Domain-Aware Prompt Learning for Vision-and-Language NavigationTing Liu, Yue Hu, Wansen Wu, Youkai Wang, Kai Xu, Quanjun Yin. 2615-2619 [doi]
- Unsupervised Disparity Estimation for Light Field VideosShansi Zhang, Edmund Y. Lam. 2620-2624 [doi]
- SBM: Smoothness-Based Minimization for Domain GeneralizationChunqing Ruan, Mengzhu Wang, Shanshan Wang, Tianyi Liang, Wei Yu. 2625-2629 [doi]
- Segment Anything Model Meets Image HarmonizationHaoxing Chen, Yaohui Li, Zhangxuan Gu, Zhuoer Xu, Jun Lan, Huaxiong Li. 2630-2634 [doi]
- CoSLR: Contrastive Chinese Sign Language Recognition with Prior Knowledge and Multi-Tasks Joint LearningTian Yang, Cong Shen, Tiantian Yuan. 2635-2639 [doi]
- Efficient Fusion of Depth Information for Defocus DeblurringJucai Zhai, Yang Liu, Pengcheng Zeng, Chihao Ma, Xinan Wang, Yong Zhao 0010. 2640-2644 [doi]
- Highlight Removal Network Based on an Improved Dichromatic Reflection ModelKun Hu, Zhaoyangfan Huang, Xingjun Wang. 2645-2649 [doi]
- Complementary Fusion Network Based on Frequency Hybrid Attention for PansharpeningYinghui Xing, Litao Qu, Kai Zhang, Yan Zhang, Xiuwei Zhang, Yanning Zhang. 2650-2654 [doi]
- Dropout Multi-Head Attention for Single Image Super-ResolutionChao Yang, Yong Fan, Cheng Lu. 2655-2659 [doi]
- Part Representation Learning with Teacher-Student Decoder for Occluded Person Re-IdentificationShang Gao, Chenyang Yu, Pingping Zhang, Huchuan Lu. 2660-2664 [doi]
- Flipping Consistent and Counterfactual Attention Network for Facial Expression RecognitionWenjie Liu, Xinlong Shi, Xianzhong Liu. 2665-2669 [doi]
- Mutuality Attribute Makes Better Video Anomaly DetectionXingshuo Han, Xiao Wang, Kui Jiang, Wei Liu, Ruimin Hu, Xuefeng Pan, Xin Xu. 2670-2674 [doi]
- Multi-Modality Conditional Diffusion Model for Time Series Forecasting of Live Sales VolumeLijun Wang. 2675-2679 [doi]
- PseKD: Phase-Shift Encoded Knowledge Distillation for Oriented Object Detection in Remote Sensing ImagesChao Wang, Yubiao Yue, Bingchun Luo, Yujie Chen, Jun Xue. 2680-2684 [doi]
- Channel-Spatial Transformer for Efficient Image Super-ResolutionJiuqiang Li, Shilei Zhu. 2685-2689 [doi]
- HMNet: Hierarchical Microscale-Aware Network for Infrared Small Target DetectionYueqian Quan, Honghui Xu, Yidong Yan, Hang Zheng, Jianwei Zheng 0001. 2690-2694 [doi]
- A Hierarchical Multi-Proxy Loss with Dynamic Main-Proxy for Deep Metric LearningLei Zhao, Xiao-lei Zhang. 2695-2699 [doi]
- SPY-Watermark: Robust Invisible Watermarking for Backdoor AttackRuofei Wang, Renjie Wan, Zongyu Guo, Qing Guo, Rui Huang. 2700-2704 [doi]
- Camera Calibration using a Single View of a Symmetric ObjectHui Zhang, Bingran Kuang, Yajie Zhao. 2705-2709 [doi]
- Correcting Faulty Road Maps by Image InpaintingSoojung Hong, KwangHee Choi. 2710-2714 [doi]
- SRP-UOD: Multi-Branch Hybrid Network Framework Based on Structural Re-Parameterization for Underwater Small Object DetectionJinyu Shi, Wenjie Wu. 2715-2719 [doi]
- Read, Spell and Repeat: Scene Text Recognition with Vision-Language Circular RefinementTaiwei Zhang, Zhenghui Hu, Weixin Li, Qingjie Liu, Yunhong Wang. 2720-2724 [doi]
- Changenet: Multi-Temporal Asymmetric Change Detection DatasetDeyi Ji, Siqi Gao, Mingyuan Tao, Hongtao Lu, Feng Zhao. 2725-2729 [doi]
- Diffusioninst: Diffusion Model for Instance SegmentationZhangxuan Gu, Haoxing Chen, Zhuoer Xu. 2730-2734 [doi]
- COLORFLOW: A Conditional Normalizing Flow for Image ColorizationWang-Yin, Peng Lu, Xujun Peng. 2735-2739 [doi]
- MTIDNet: A Multimodal Temporal Interest Detection Network for Video SummarizationXiaoyan Tian, Ye Jin, Zhao Zhang, Peng Liu, Xianglong Tang. 2740-2744 [doi]
- Skin Tone Disentanglement in 2D Makeup Transfer With Graph Neural NetworksMasoud Mokhtari, Fatemeh Taheri Dezaki, Timo Bolkart, Betty Mohler Tesch, Rahul Suresh, Amin Banitalebi-Dehkordi. 2745-2749 [doi]
- Child FER: Domain-Agnostic Facial Expression Recognition in Children Using a Secondary Image Diffusion ModelEungi Lee, Eung-Joo Lee, Syed Muhammad Anwar, Seok Bong Yoo. 2750-2754 [doi]
- Window-Based Convolutional Sparse Coding: Towards A Unified FrameworkLijian Yang, Jian-Xun Mi, Guofen Wang, Weisheng Li 0001. 2755-2759 [doi]
- NERF-GAZE: A Head-Eye Redirection Parametric Model for Gaze EstimationPengwei Yin, Jingjing Wang, Jiawu Dai, Xiaojun Wu. 2760-2764 [doi]
- VGDIFFZERO: Text-To-Image Diffusion Models Can Be Zero-Shot Visual GroundersXuyang Liu, Siteng Huang, Yachen Kang, Honggang Chen, Donglin Wang. 2765-2769 [doi]
- Mrtnet: Multi-Resolution Temporal Network for Video Sentence GroundingWei Ji 0008, You Qin, Long Chen 0016, Yinwei Wei, Yiming Wu 0005, Roger Zimmermann. 2770-2774 [doi]
- Human Guided Cross-Modal Reasoning with Semantic Attention Learning for Visual Question AnsweringLei Liao, Mao Feng, Meng Yang. 2775-2779 [doi]
- Learn to Cluster Faces with Better SubgraphsYuan Cao, Di Jiang, Guanqun Hou, Fan Deng 0007, Xinjia Chen, Qiang Yang 0001. 2780-2784 [doi]
- Hdrtvformer: Efficient SDRTV-to-HDRTV via Affine Transformation and Spatial-Aware TransformerHengsheng Zhang, Xinning Chai, Yuhong Zhang, Rong Xie, Li Song 0001. 2785-2789 [doi]
- Attribute-Aware Head Swapping Guided by 3D ModelingWenbo Zhou, Dongdong Chen, Jing Liao 0001, Jie Zhang 0073, Kejiang Chen, Weiming Zhang 0001, Nenghai Yu. 2790-2794 [doi]
- FFT-Based Selection and Optimization of Statistics for Robust Recognition of Severely Corrupted ImagesElena Camuffo, Umberto Michieli, Jijoong Moon, Daehyun Kim, Mete Ozay. 2795-2799 [doi]
- Estimating Exercise-Induced Fatigue from Thermal Facial ImagesManuel Lage Cañellas, Constantino Álvarez Casado, Le-Nguyen, Miguel Bordallo López. 2800-2804 [doi]
- Image Aesthetics Assessment Via Learnable QueriesZhiwei Xiong, Yunfan Zhang, Zhiqi Shen 0001, Peiran Ren, Han Yu 0001. 2805-2809 [doi]
- A Crowdsourcing Approach to Video Quality AssessmentBabak Naderi, Ross Cutler. 2810-2814 [doi]
- Style Adaptation for Domain-Adaptive Semantic SegmentationTing Li, Jianshu Chao, Deyu An. 2815-2819 [doi]
- Anomaly-Aware Semantic Self-Alignment Framework for Video-Based Person Re-IdentificationZhidan Ran, Xiaobo Lu, Wei Liu. 2820-2824 [doi]
- Implicit Neural Representation For Low-Overhead Graph-Based Holographic-Type CommunicationsTakuya Fujihashi, Sorachi Kato, Toshiaki Koike-Akino. 2825-2829 [doi]
- RL-LOGO: Deep Reinforcement Learning Localization for Logo RecognitionMasato Fujitake. 2830-2834 [doi]
- POSE-HMR: Heuristic Transformer with Postural Prior Constraints for 3D Human Mesh ReconstructionSongqi Pan, Sheng Liu, Yuan Feng, Yineng Zhang, Xiaopeng Tian, Jiantao Yang. 2835-2839 [doi]
- MuSR: Multi-Scale 3D Scenes Reconstruction based on Monocular VideoHan Gao, Hao Wu, Peiwen Dong, Yixin Xu, Fengyuan Xu, Sheng Zhong 0002. 2840-2844 [doi]
- Glocal Cascading Network for Topic Enhanced Visual StorytellingJiaqi Su, Weiran Chen, Yi Ji 0001, Chunping Liu. 2845-2849 [doi]
- Attention Decoupling for Query-Based Object DetectionJia-Wei Ma, Min Liang, Haixia Man, Shu Tian, Jingyan Qin, Xu-Cheng Yin. 2850-2854 [doi]
- Improving Learned Video Compression by Exploring Spatial RedundancyJiayu Yang, Chunhui Yang, Yongqi Zhai, Qi Wang, Xinghao Pan, Ronggang Wang. 2860-2864 [doi]
- NPRF: Neural Painted Radiosity Fields for Neural Implicit Rendering and Surface ReconstructionDriton Salihu, Adam Misik, Yuankai Wu, Constantin Patsch, Eckehard G. Steinbach. 2865-2869 [doi]
- Locality-Enhanced Transformer for Semantic Segmentation of High-Resolution Remote Sensing ImagesXin Li, Feng Xu, Runliang Xia, Nan Xu, Fan Liu, Chi Yuan, Qian Huang, Xin Lyu 0001. 2870-2874 [doi]
- Tokenmotion: Motion-Guided Vision Transformer for Video Camouflaged Object Detection via Learnable Token SelectionZifan Yu, Erfan Bank-Tavakoli, Meida Chen, Suya You, Raghuveer Rao, Sanjeev Agarwal, Fengbo Ren. 2875-2879 [doi]
- Feature-Distribution Perturbation and Calibration for Generalized ReIDQilei Li, Jiabo Huang, Jian Hu 0002, Shaogang Gong. 2880-2884 [doi]
- Domain-Adaptive and Subgroup-Specific Cascaded Temperature Regression for Out-of-Distribution CalibrationJiexin Wang, Jiahao Chen, Bing Su. 2885-2889 [doi]
- ADIFT: Zero-Shot Generative Model Adaption Via Adaptive Domain-Invariant Feature TransferChaofei Wang, Xiangan Zhao, Kai Wang, Shuai Wu, Jiayu Xiao, Guotong Geng. 2890-2894 [doi]
- MGRL: Mutual-Guidance Representation Learning for Text-to-Image Person RetrievalTianle Lv, Shuang Li, Jiaxu Leng, Xinbo Gao 0001. 2895-2899 [doi]
- Improving Motion Deblur By Multi-Output LearningSidun Liu, Peng Qiao, Yong Dou. 2900-2904 [doi]
- Bandwidth-Efficient Inference for Neural Image CompressionShanzhi Yin, Tongda Xu, Yongsheng Liang, Yuanyuan Wang, Yanghao Li, Yan Wang 0002, Jingjing Liu. 2905-2909 [doi]
- Language-Free Compositional Action Generation via Decoupling RefinementXiao Liu, Guangyi Chen, Yansong Tang, Guangrun Wang, Xiao-Ping Zhang, Ser-Nam Lim. 2910-2914 [doi]
- REGIR: Refined Geometry for Single-Image Implicit Clothed Human ReconstructionLi Yao, Ao Gao, Yan Wan. 2915-2919 [doi]
- Entwined Inversion: Tune-Free Inversion For Real Image Faithful Reconstruction and EditingJiancheng Huang, Yifan Liu 0001, Jiaxi Lv, Shifeng Chen. 2920-2924 [doi]
- Self-Supervised Multi-Scale Hierarchical Refinement Method for Joint Learning of Optical Flow and DepthRokia Abdein, Xuezhi Xiang, Yiming Chen, Mingliang Zhai, Abdulmotaleb El-Saddik. 2925-2929 [doi]
- Self-Supervised Face Image Restoration with a One-Shot ReferenceYanhui Guo, Fangzhou Luo, Shaoyuan Xu. 2930-2934 [doi]
- Low Redundant Attention Network for Efficient Image Super-ResolutionYican Liu, Jiacheng Li, Delu Zeng. 2950-2954 [doi]
- Multiscale Scoring Model for Enhanced Urban Perception EvaluationXukai Zhao, Yuxing Lu, Jinzhuo Wang. 2935-2939 [doi]
- Recognition-Guided Diffusion Model for Scene Text Image Super-ResolutionYuxuan Zhou, Liangcai Gao, Zhi Tang 0001, Baole Wei. 2940-2944 [doi]
- Feature-Constrained and Attention-Conditioned Distillation Learning for Visual Anomaly DetectionShuo Zhang 0013, Jing Liu. 2945-2949 [doi]
- LVC-LGMC: Joint Local and Global Motion Compensation for Learned Video CompressionWei Jiang, Junru Li, Kai Zhang, Li Zhang. 2955-2959 [doi]
- RGB Images Enhancing Hyperspectral Image Denoising with Diffusion ModelKeli Deng, Peng Wang, Yuntao Qian. 2960-2964 [doi]
- A Reduced-Reference Quality Assessment Metric for Textured Mesh Digital HumansZicheng Zhang, Yingjie Zhou, Chunyi Li, Kang Fu, Wei Sun, Xiaohong Liu, Xiongkuo Min, Guangtao Zhai. 2965-2969 [doi]
- Implicit Foreground-Guided Network for Anomaly Detection and LocalizationXiaolu Chen, Haote Xu, Chenghao Deng, Xiaotong Tu, Xinghao Ding, Yue Huang 0001. 2970-2974 [doi]
- Look, Listen and Recognise: Character-Aware Audio-Visual SubtitlingBruno Korbar, Jaesung Huh, Andrew Zisserman. 2975-2979 [doi]
- Instant Photorealistic Neural Radiance Fields StylizationShaoxu Li, Ye Pan. 2980-2984 [doi]
- Scale-Free And Task-Generic Attack: Generating Photo-Realistic Adversarial Patterns With Patch Quilting GeneratorXiangbo Gao, Qinliang Lin, Cheng Luo, Weicheng Xie 0001, LinLin Shen, Keerthy Kusumam, Siyang Song. 2985-2989 [doi]
- Text-Video Completion Networks With Motion Compensation And Attention AggregationJianan Wang, Zhiliang Wu, Hanyu Xuan, Yan Yan 0002. 2990-2994 [doi]
- Robust Single-Particle Cryo-EM Image Denoising and RestorationJing Zhang, Tengfei Zhao, Shiyu Hu, Xin Zhao. 2995-2999 [doi]
- AHRNET: Attention and Heatmap-Based Regressor for Hand Pose Estimation and Mesh RecoveryFeng Zhou, Pei Shen, Ju Dai, Na Jiang, Yong Hu, Yu-Kun Lai, Paul L. Rosin. 3000-3004 [doi]
- Sketch-Based 3D Shape Retrieval With Multi-View Fusion TransformerCunjuan Zhu, Dongdong Cui, Qi Jia, Weimin Wang, Yu Liu 0012, Michael S. Lew. 3005-3009 [doi]
- JM-CLIP: A Joint Modal Similarity Contrastive Learning Model for Video-Text RetrievalMingyuan Ge, Yewen Li, Honghao Wu, Mingyong Li. 3010-3014 [doi]
- MSSTNet: A Multi-Scale Spatio-Temporal CNN-Transformer Network for Dynamic Facial Expression RecognitionLinhuang Wang, Xin Kang, Fei Ding, Satoshi Nakagawa, Fuji Ren. 3015-3019 [doi]
- CDCNet: A Fast and Lightweight Dehazing Network with Color Distortion CorrectionYilian Zhong, Jiaming Liu, Xuan Huang, Jingjing Liu, Yibo Fan, Minfeng Wu. 3020-3024 [doi]
- WAVER: Writing-Style Agnostic Text-Video Retrieval Via Distilling Vision-Language Models Through Open-Vocabulary KnowledgeHuy Le, Tung Kieu, Anh Nguyen 0003, Ngan Le. 3025-3029 [doi]
- Multiscale Augmented Normalizing Flows for Image CompressionMarc Windsheimer, Fabian Brand, André Kaup. 3030-3034 [doi]
- ProAug: Prototype-Based Augmentation for Long-Tailed Image ClassificationYan Hong, Jianfu Zhang 0003, Zhongyi Sun 0002, Ke Yan. 3035-3039 [doi]
- Dynamic Mutual-Activated Transformer for Human Motion PredictionShaobo Zhang, Sheng Liu, Fei Gao, Yuan Feng. 3040-3044 [doi]
- Arbitrary Style Transfer with Prototype-Based Channel AlignmentYan Hong, Li Niu 0002, Jianfu Zhang 0003. 3045-3049 [doi]
- One-Stage Deep Stereo NetworkZiming Liu, Ezio Malis, Philippe Martinet. 3050-3054 [doi]
- Leveraging Redundancy in Feature for Efficient Learned Image CompressionPeng Qin, Youneng Bao, Fanyang Meng, Wen Tan, Chao Li, Genhong Wang, Yongsheng Liang. 3055-3059 [doi]
- 3DSAM: Segment Anything in NeRFShangjie Wang, Yan Zhang. 3060-3064 [doi]
- TCMP: End-to-End Topologically Consistent Magnitude Pruning for Miniaturized Graph Convolutional NetworksHichem Sahbi. 3065-3069 [doi]
- DAMP: Distribution-Aware Magnitude Pruning for Budget-Sensitive Graph Convolutional NetworksHichem Sahbi. 3070-3074 [doi]
- Active Learning with Core-Set Sampling and Scale-Sensitive Loss for 3D Object DetectionDejun Zhang, Xiaowei Lin, Benxin Yi, Yiqi Wu. 3075-3079 [doi]
- DEEPOREDNET: Contrastive Learning-Based Attention-Weighted Dual Channel Residual Network for Ocular Redness AssessmentShaopan Wang, Jiezhou He, Xin He, Jiaoyue Hu, Zuguo Liu, Zhiming Luo. 3080-3084 [doi]
- Time-Interval Visual Saliency Prediction in Mammogram ReadingJianxun Lou, Xinbo Wu, Richard White, Yingying Wu, Hantao Liu. 3085-3089 [doi]
- Single Image Reflection Removal Using Feature Difference EnhancementHaifeng Zhao 0001, Rui Zhou, Shaojie Zhang, Yanping Fu. 3090-3094 [doi]
- CROCFUN: Cross-Modal Conditional Fusion Network for PansharpeningMengting Ma, Chenlu Hu, Huanting Zhang, Xiaowen Ma, Tian Feng, Wei Zhang. 3095-3099 [doi]
- Redefining Night Vision: The Power of MSR-Driven Neural ISPJingchao Hou, Guanghui He. 3100-3104 [doi]
- High Resolution Image Quality DatabaseHuang Huang, Qiang Wan, Jari Korhonen. 3105-3109 [doi]
- CAGEN: Controllable Anomaly Generator using Diffusion ModelBolin Jiang, Yuqiu Xie, Jiawei Li, Naiqi Li, Yong Jiang 0001, Shu-Tao Xia. 3110-3114 [doi]
- Fast and Physically Enriched Deep Network for Joint Low-Light Enhancement and Image DeblurringTrung Hoang, Jon S. McElvain, Vishal Monga. 3115-3119 [doi]
- Pseudo-Outlier Synthesis Using Q-Gaussian Distributions for Out-of-Distribution DetectionRyo Nakamura, Ryu Tadokoro, Eisuke Yamagata, Yusuke Kondo, Kensho Hara, Hirokatsu Kataoka, Nakamasa Inoue. 3120-3124 [doi]
- Customized Treatment Per Pixel for Blind Image Super-ResolutionGuanqun Liu 0005, Xiaoshuai Hao. 3125-3129 [doi]
- Multi-Scale Fusion of Gated Neighborhood Attention Transformers for Single Image DerainingYijin Liu, Guoqiang Xiao 0001, Michael S. Lew, Song Wu 0003. 3130-3134 [doi]
- Arbitrary Style Transfer Based on Content Integrity and Style Consistency EnhancementLu Kang, Guoqiang Xiao 0001, Michael S. Lew, Song Wu 0003. 3135-3139 [doi]
- Tail Classes Matter: Long-Tailed Object Detection RevisitedYinglu Zhang, Chenbo Zhang, Lu Zhang 0060, Tianying Liu, Jihong Guan, Xinkai Liang, Jiajia Zhao, Shuigeng Zhou. 3140-3144 [doi]
- A Facial Expression Transfer Method Based on 3DMM and Diffusion ModelsYongjian Zhao, Xinyan Cao, Siqi Liu, Jinming Che, Wei Ren, Jian Cao 0002, Jinlong Lin. 3145-3149 [doi]
- RK-CORE: An Established Methodology for Exploring the Hierarchical Structure within DatasetsYao Lu, Yutian Huang, Jiaqi Nie, Zuohui Chen, Qi Xuan. 3150-3154 [doi]
- Latent Degradation Representation Constraint for Single Image DerainingYuhong He, Long Peng, Lu Wang, Jun Cheng. 3155-3159 [doi]
- Efficient Scene Text Image Super-Resolution with Semantic GuidanceLeoWu TomyEnrique, Xiangcheng Du, Kangliang Liu, Han Yuan, Zhao Zhou, Cheng Jin 0001. 3160-3164 [doi]
- Domain-Wise Invariant Learning for Panoptic Scene Graph GenerationLi Li 0091, You Qin, Wei Ji, Yuxiao Zhou 0006, Roger Zimmermann. 3165-3169 [doi]
- VT-ReID: Learning Discriminative Visual-Text Representation for Polyp Re-IdentificationSuncheng Xiang, Cang Liu, Jiacheng Ruan, Shilun Cai, Sijia Du, Dahong Qian. 3170-3174 [doi]
- DF-VTON: Dense Flow Guided Virtual Try-On NetworkHaoye Dong, Jun Liu, Dong Huang. 3175-3179 [doi]
- DI-MVS: Learning Efficient Multi-View Stereo With Depth-Aware IterationsJianfei Jiang 0005, Mingwei Cao, Jun Yi, Chenglong Li 0002. 3180-3184 [doi]
- Drop Sparse Convolution for 3D Object DetectionTaohong Zhu, Jun Shen, Chali Wang, Huiyuan Xiong. 3185-3189 [doi]
- Gradient-Aware Logit Adjustment Loss for Long-Tailed ClassifierFan Zhang, Wei Qin, Weijieying Ren, Lei Wang, Zetong Chen, Richang Hong. 3190-3194 [doi]
- Open-Vocabulary Skeleton Action Recognition with Diffusion Graph Convolutional Network and Pre-Trained Vision-Language ModelsChao Wei, Zhidong Deng. 3195-3199 [doi]
- Differentiable Resolution Compression and Alignment for Efficient Video Classification and RetrievalRui Deng, Qian Wu, Yuke Li, Haoran Fu. 3200-3204 [doi]
- A Self-Supervised Pressure Map Human Keypoint Detection Approach: Optimizing Generalization and Computational Efficiency Across DatasetsChengzhang Yu, Xianjun Yang, Wenxia Bao, Shaonan Wang, Zhiming Yao. 3205-3209 [doi]
- Robust Face Recognition Based on an Angle-Aware Loss and Masked Autoencoder Pre-TrainingJaehyeop Choi, Youngbaek Kim, Younghyun Lee. 3210-3214 [doi]
- Geometry-Corrected Geodesic Motion Modeling with Per-Frame Camera Motion for 360-Degree Video CompressionAndy Regensky, André Kaup. 3215-3219 [doi]
- Toward Sufficient Spatial-Frequency Interaction for Gradient-Aware Underwater Image EnhancementChen Zhao, Weiling Cai, Chenyu Dong, Ziqi Zeng. 3220-3224 [doi]
- Multi-Dimension Queried and Interacting Network for Stereo Image DerainingYuanbo Wen, Tao Gao 0001, Ziqi Li, Jing Zhang 0052, Ting Chen 0003. 3225-3229 [doi]
- ARFA: An Asymmetric Receptive Field Autoencoder Model for Spatiotemporal PredictionWenxuan Zhang, Xuechao Zou, Li Wu, Xiaoying Wang 0002, Jianqiang Huang, Junliang Xing. 3230-3234 [doi]
- A Lightweight Change Detection Method Based on Feature Interaction and Transformer for High Resolution Remote Sensing ImagesYingjie Tang, Shou Feng, Chunhui Zhao 0003, Yongqi Chen, Yuanze Fan, Maosheng Wei. 3235-3239 [doi]
- SIANet: Support Information-Aware Network for Category-Agnostic Pose EstimationHaisheng Li, Fang Yuan. 3240-3244 [doi]
- Glance, Focus and Refinement Network for Remote Sensing Change DetectionHao Zhang, Zixuan Sun, Yuhui Zheng, Kaihua Zhang, Gang Dong, Lingyan Liang, Yaqian Zhao. 3245-3249 [doi]
- WaterDiff: Perceptual Image Watermarks Via Diffusion ModelYuqi Tan, Yuang Peng, Hao Fang, Bin Chen, Shu-Tao Xia. 3250-3254 [doi]
- AttentionLUT: Attention Fusion-Based Canonical Polyadic LUT for Real-Time Image EnhancementKang Fu, Yicong Peng, Zicheng Zhang, Qihang Xu, Xiaohong Liu, Jia Wang 0004, Guangtao Zhai. 3255-3259 [doi]
- Self-Distilled Dynamic Fusion Network for Language-Based Fashion RetrievalYiming Wu, Hangfei Li, Fangfang Wang, Yilong Zhang, Ronghua Liang. 3260-3264 [doi]
- A Guided Upsampling Network for Short Wave Infrared Images Using Graph RegularizationFrank Sippel, Jürgen Seiler, André Kaup. 3265-3269 [doi]
- Complexity Reduction of Template Matching-Based Reference Picture Padding in Video CodingNicolas Horst, Mathias Wien. 3270-3274 [doi]
- Deep Residual W-Unit Learning with Semantic Embedding for Automatic Pulmonary CT Artery-Vein SeparationHao Qi, Ming Wu, Sunkui Ke, Xiangxing Chen, Hui-Qing Zeng, Yinran Chen, Xiongbiao Luo. 3275-3279 [doi]
- Self-Training Domain Adaptation Via Weight Transmission Between GeneratorsXing Wei, Zhaoxin Ji, Fan Yang, Chong Zhao, Bin Wen, Yang Lu 0015. 3280-3284 [doi]
- TransCycle: A Data Augmentation Method for 3D Human Pose EstimationBowei Zhang, Rongting Xu, Peng Cui. 3285-3289 [doi]
- Granger Connectivity Analysis as a Block-Term Tensor Regression for eSport PlayersAirat Kotliar-Shapirov, Sergei Gostilovich, Anastasia Sozykina, Anh Huy Phan, Andrzej Cichocki. 3290-3294 [doi]
- MapFlow: Multi-Agent Pedestrian Trajectory Prediction Using Normalizing FlowAntonio Luigi Stefani, Niccoló Bisagno, Nicola Conci. 3295-3299 [doi]
- Gravitated Latent Space Loss Generated by Metric Tensor for High-Dynamic Range ImagingHeunseung Lim, Jungkyoo Shin, Hyoungki Choi, Dohoon Kim, Eunwoo Kim, Joonki Paik. 3300-3304 [doi]
- RVDNet: A Two-Stage Network for Real-World Video Desnowing with Domain AdaptationTianhao Xue, Gang Zhou, Runlin He, Zhong Wang, Juan Chen, Zhenhong Jia. 3305-3309 [doi]
- Efficient Architecture Search for Real-Time Instance SegmentationRenqiu Xia, Dongyuan Zhang, Yixin Dong, Juanping Zhao, Wenlong Liao, Tao He, Junchi Yan. 3310-3314 [doi]
- Beyond the Snowfall: Enhancing Snowy Day Object Detection Through Progressive Restoration and Multi-Feature FusionZhong Wang, Gang Zhou, Jing Ma, Tianhao Xue, Zhenhong Jia. 3315-3319 [doi]
- Language-Driven Open-Vocabulary 3D Semantic Segmentation with Knowledge DistillationYuting Wu, Xian-Feng Han, Guoqiang Xiao 0001. 3320-3324 [doi]
- Fine-Grained Features Alignment and Fusion for Text-Video Cross-Modal RetrievalShuili Zhang, Hongzhang Mu, Quangang Li, Chenglong Xiao, Tingwen Liu. 3325-3329 [doi]
- DMKD: Improving Feature-Based Knowledge Distillation for Object Detection Via Dual Masking AugmentationGuang Yang, Yin Tang, Zhijian Wu, Jun Li, Jianhua Xu, Xili Wan. 3330-3334 [doi]
- Exploring Targeted Universal Adversarial Attack for Deep HashingFei Zhu, Wanqian Zhang, Dayan Wu, Lin Wang, Bo Li, Weiping Wang. 3335-3339 [doi]
- CUTDEM: Depth-Aware Enhanced Multi-View Image Mixing for Light Field Super-ResolutionZe-Yu Mi, Yu-Bin Yang. 3340-3344 [doi]
- BEVoxSeg: BEV-Voxel Representation for Fast and Accurate Camera-Based 3D SegmentationHaiyi Liu, Beibei Wang, Lu Zhang, Jianmin Ji, Yanyong Zhang. 3345-3349 [doi]
- Causal-Story: Local Causal Attention Utilizing Parameter-Efficient Tuning for Visual Story SynthesisTianyi Song, Jiuxin Cao, Kun Wang, Bo Liu, Xiaofeng Zhang. 3350-3354 [doi]
- A Multiscale Objective Function for Camera Color CorrectionBahador Rashidi, Kiarash Aghakasiri, Chao Gao, Shuting Zhang, Yue Zhang 0025, Ying Liu, Fengyu Sun. 3355-3359 [doi]
- End-To-End Spatially-Constrained Multi-Perspective Fine-Grained Image CaptioningYifan Zhang, Chunzhen Lin, Donglin Cao, Dazhen Lin. 3360-3364 [doi]
- Efficient Learning on Successive Test Time AugmentationSiyang Pan, Jiaqian Yu, Dongwook Lee, Yiwei Chen, Chao Zhang, Qiang Wang, ByungIn Yoo. 3365-3369 [doi]
- Encoding Time and Energy Model for SVT-AV1 Based on Video ComplexityLena Eichermüller, Gaurang Chaudhari, Ioannis Katsavounidis, Zhijun Lei, Hassene Tmar, Christian Herglotz, André Kaup. 3370-3374 [doi]
- Supplementing Missing Visions Via Dialog for Scene Graph GenerationsZhenghao Zhao, Ye Zhu, Xiaoguang Zhu, Yuzhang Shang, Yan Yan 0002. 3375-3379 [doi]
- Texture and Normal Map Estimation for 3D Face ReconstructionSavas Özkan, Mete Ozay, Tom Robinson. 3380-3384 [doi]
- Autoregressive 3D Shape Completion via Sphere-Guided Disentangled RepresentationJiahui Li, Pourya Shamsolmoali, Yue Lu. 3385-3389 [doi]
- AQF: Assessing the Quality of Hyperspectral Reconstruction with a Learnable MetricPai Chet Ng, Juwei Lu, Konstantinos N. Plataniotis. 3390-3394 [doi]
- DITW: A High-Performance Deep-Independent Template-Based WatermarkingYaokun Fang, Changxi Huang, Chengxin Zhao, Hefei Ling, Xunjie Lin, Jinlong Guo. 3395-3399 [doi]
- Refining 3D Human Mesh via Model-Free Offsets EstimationYouze Xue, Jiansheng Chen, Hongbing Ma, Huimin Ma 0001. 3400-3404 [doi]
- A Comprehensive Framework for Occluded Human Pose EstimationLinhao Xu, Lin Zhao, Xinxin Sun, Di Wang, Guangyu Li, Kedong Yan. 3405-3409 [doi]
- M2SUM: Multi-Granularity Scale-Adaptive Video Summarizer towards Informative Context Representation LearningYunzuo Zhang, Yameng Liu, Weili Kang. 3410-3414 [doi]
- Adaptive Chroma Block Vector Derivation from Luma for Screen Content CodingJunyan Huo, Xue Hao, Shuai Wan, FuZheng Yang 0001, Ming Li. 3415-3419 [doi]
- FPN with GMM Based Feature Enhancement Strategy for Object Detection in Remote Sensing ImagesHongning Liu, Pengming Feng, Mingjie Xie, Dongli Xu, Jian Guan, Guangjun He, Rubo Zhang. 3420-3424 [doi]
- Corner Detection Based on a Rotation-Invariant and Noise-Insensitive Curvature MeasurementXun Sun, Baojiang Zhong, Kai-Kuang Ma. 3425-3429 [doi]
- Enhancing Adversarial Training with Prior Knowledge Distillation for Robust Image CompressionZhi Cao, Youneng Bao, Fanyang Meng, Chao Li, Wen Tan, Genhong Wang, Yongsheng Liang. 3430-3434 [doi]
- Gradient and Brightness Guided Low-Light Enhancement with Attention-Based Self-Paced LearningXiaoyan Sun, Yan Li, De Cheng, Dingwen Zhang, Ling Gao, Luofeng Zhai, Jiande Sun 0001. 3435-3439 [doi]
- Lightweight High-Resolution Subject Matting in the Real WorldPeng Liu, Fanyi Wang, Jingwen Su, Yanhao Zhang, Guojun Qi. 3440-3444 [doi]
- Image Harmonization Based on Hierarchical DynamicsLiuxue Ju, Chengdao Pu, Jun Yu 0001, Wen Su. 3445-3449 [doi]
- Diffevent: Event Residual Diffusion for Image DeblurringPei Wang, Jiumei He, Qingsen Yan, Yu Zhu, Jinqiu Sun, Yanning Zhang. 3450-3454 [doi]
- Autost: Training-Free Neural Architecture Search For Spiking TransformersZiqing Wang, Qidong Zhao, Jinku Cui, Xu Liu 0001, Dongkuan Xu. 3455-3459 [doi]
- Center of Pressure Estimation by Analyzing Walking VideosJiansheng Chen, Yining Qin, Poyu Lin, Jiawei Li, Youze Xue, Huimin Ma 0001. 3460-3464 [doi]
- Trades++: Enhancing Multi-Object Tracking of Real Low Confidence Targets Using a Pyramid-Like Self-Attention ModelChenxin Wen, Yan Gao, Jie Li 0001. 3465-3469 [doi]
- RD-NERF: Neural Robust Distilled Feature Fields for Sparse-View Scene SegmentationYongjia Ma, Bin Dou, Tianyu Zhang, Zejian Yuan. 3470-3474 [doi]
- Align, Adapt and Inject: Audio-Guided Image Generation, Editing and StylizationYue Yang, Kaipeng Zhang, Yuying Ge, Wenqi Shao, Zeyue Xue, Yu Qiao 0001, Ping Luo. 3475-3479 [doi]
- Depth-Guided Dominant Plane Perception for Unsupervised Homography EstimationXiaomei Feng, Qi Jia, Yu Liu, Xin Fan 0001, Longin Jan Latecki. 3480-3484 [doi]
- Adaptive Head Pose Estimation with Real-Time Structured LightYijun Wang, Yuping Ye, Feifei Gu, Zhan Song, Xiaodong Bai. 3485-3489 [doi]
- NERF-AD: Neural Radiance Field With Attention-Based Disentanglement For Talking Face SynthesisChongke Bi, Xiaoxing Liu, Zhilei Liu. 3490-3494 [doi]
- RDANet: Reject Domain Attention Network For Confused Facial Expression RecognitionJintao Luo, Juan Li, Tonglin Cheng. 3495-3499 [doi]
- Privacy Preserving Gaze Estimation Via Federated Learning Adapted To Egocentric VideoYuhu Feng, Keisuke Maeda, Takahiro Ogawa 0001, Miki Haseyama. 3500-3504 [doi]
- Internal Location Assistance for Temporal Action Proposal GenerationSongsong Feng, Shengye Yan. 3505-3509 [doi]
- Template-Guided Data Augmentation for Unbiased Scene Graph GenerationYujie Zang, Yaochen Li, Luguang Cao, Ruitao Lu. 3510-3514 [doi]
- Prediction-Correction Line Segment DetectionZhongyi Sha, Baojiang Zhong. 3515-3519 [doi]
- HADGEO: Image Based 3-DoF Cross-View Geo-Localization with Hard Sample MiningChaoran Li, Chao Yan, Xiaojia Xiang, Jun Lai, Han Zhou, Dengqing Tang. 3520-3524 [doi]
- RTLBP: An Efficient Local Pattern For Facial Images RetrievalNitin Arora, Prachi Sharma, Pradeep Kumar, Subhash Chander Sharma. 3525-3529 [doi]
- RD-cost Regression Speed Up Technique for VVC Intra Block PartitioningM. E. A. Kherchouche, Franck Galpin, Thierry Dumas, Daniel Ménard, L. Zhang. 3530-3534 [doi]
- A New Similarity-Based Relational Knowledge Distillation MethodXiaomeng Xin, Heping Song, Jianping Gou. 3535-3539 [doi]
- Multiscale Attention Distillation for Object DetectionFengshuo Zhang. 3540-3544 [doi]
- Mask6D: Masked Pose Priors for 6D Object Pose EstimationYuechen Xie, Haobo Jiang, Jin Xie 0001. 3545-3549 [doi]
- X-CAUNET: Cross-Color Channel Attention with Underwater Image-Enhancing TransformerAlik Pramanick, Sandipan Sarma, Arijit Sur. 3550-3554 [doi]
- A Two-Stage Dehazing Framework Based on Inverted Image Curve-EnhancementHongwei Luo, Wei Liu, Cheng Chen. 3555-3559 [doi]
- Attention-Based Spatial-Frequency Information Network for Underwater Single Image Super-ResolutionAlik Pramanick, Dhruvil Megha, Arijit Sur. 3560-3564 [doi]
- Label Correction For Sketch-Based 3D Shape RetrievalShuang Liang 0001, Jiaming Lu, Yiyang Cai. 3565-3569 [doi]
- CKT-RCM: Clip-Based Knowledge Transfer and Relational Context Mining for Unbiased Panoptic Scene Graph GenerationNanhao Liang, Yong Liu, Wenfang Sun, Yingwei Xia, Fan Wang. 3570-3574 [doi]
- Color Agnostic Cross-Spectral Disparity EstimationFrank Sippel, Nils Genser, Hannah Och, Jürgen Seiler, André Kaup. 3575-3579 [doi]
- HENet: Hyperbolic-Based Encoder-Decoder Network for Word Spotting in Historical Mongolian DocumentsJing Zhang, Hongxi Wei, Qing Zhang, Xiandong Chen, Jingtao Ma. 3580-3584 [doi]
- Outlier-Robust Feature Selection with ℓ2,1-Norm Minimization and Group Row-Sparsity Induced ConstraintsJie Wang, Zheng Wang, Rong Wang, Feiping Nie 0001, Xuelong Li 0001. 3585-3589 [doi]
- Parallel Augmentation and Dual Enhancement for Occluded Person Re-IdentificationZi Wang, Huaibo Huang, Aihua Zheng, Chenglong Li 0002, Ran He 0001. 3590-3594 [doi]
- Surface-Constrained Progressive Feature Preserving Point Cloud CompressionBaoye Zhang, Wenxiang Shen, Bin Tan, Die Hu 0002, Jun Wu. 3595-3599 [doi]
- Fusing Structure and Appearance Features in Facial Expression Recognition TransformerSiwei Meng, Wuzhen Shi. 3600-3604 [doi]
- Image Coding for Analytics via Adversarially Augmented AdaptationXuelin Shen, Kangsheng Yin, Xu Wang, Yulin He, Shiqi Wang 0001, Wenhan Yang. 3605-3609 [doi]
- Dynamic Clustering and Cluster Contrastive Learning for Unsupervised Person Re-Id With Feature Distribution AlignmentZiqi He, Mengjia Xue, Yunhao Du, Zhicheng Zhao, Fei Su. 3610-3614 [doi]
- FDC-NeRF: Learning Pose-Free Neural Radiance Fields with Flow-Depth ConsistencyHuachen Gao, Shihe Shen, Zhe Zhang, Kaiqiang Xiong, Rui Peng, Zhirui Gao, Qi Wang, Yugui Xie, Ronggang Wang. 3615-3619 [doi]
- CLIP-Font: Semantic Self-Supervised Few-Shot Font Generation with CLIPJialu Xiong, Yefei Wang, Jinshan Zeng. 3620-3624 [doi]
- Task Indicating Transformer for Task-Conditional Dense PredictionsYuxiang Lu, Shalayiding Sirejiding, Bayram Bayramli, Suizhi Huang, Yue Ding 0001, Hongtao Lu. 3625-3629 [doi]
- DiffDub: Person-Generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-EncoderTao Liu, Chenpeng Du, Shuai Fan 0005, Feilong Chen, Kai Yu 0004. 3630-3634 [doi]
- Unsupervised Learning of Facial Optical Flow via Occlusion-Aware Global-Local MatchingYungeng Zhang, Yuan Chang, Yun Shen, Peng Ding, Wei Liang, Mingchuan Yang. 3635-3639 [doi]
- ZE-FESG: A Zero-Shot Feature Extraction Method Based on Semantic Guidance for No-Reference Video Quality AssessmentYachun Mi, Yu Li, Yan Shu, Shaohui Liu. 3640-3644 [doi]
- Spatio-Temporal Action Detection with a Motion Sense and Semantic Correction FrameworkYong Zhang, Chunan Yu, Chenglong Fu 0003, Yuanqi Hu, Ying Zang. 3645-3649 [doi]
- Extending Implicit Neural Representations for Text-to-Image GenerationGuanming Liu, Zhihua Wei, Heng Zhang, Rui Wang, Aiquan Yuan, Chuanbao Liu, Biao Chen, Guodong Cao. 3650-3654 [doi]
- ESTGN: Enhanced Self-Mined Text Guided Super-Resolution Network for Superior Image Super ResolutionQipei Li, Zefeng Ying, Da Pan 0001, Zhaoxin Fan, Ping Shi 0001. 3655-3659 [doi]
- A Framework for Portrait Stylization with Skin-Tone Awareness and Nudity IdentificationSeungkwon Kim, Sangyeon Kim, Seung-Hun Nam. 3660-3664 [doi]
- A General Framework for Rotation Invariant Point Cloud AnalysisShuqing Luo, Wei Gao. 3665-3669 [doi]
- Enhanced Color Palette Modeling For Lossless Screen Content CompressionHannah Och, Shabhrish Reddy Uddehal, Tilo Strutz, André Kaup. 3670-3674 [doi]
- Learning Discriminative Style Representations for Unsupervised and Few-Shot Artistic Portrait Drawing GenerationJunkai Fang, Nan Fang, Fei Huang, Jinglin Zhou, Maoying Qiao, Fei Gao 0006. 3675-3679 [doi]
- Implicit-Knowledge-Guided Align Before Understanding for KB-VQAMao Feng, Lei Liao, Meng Yang. 3680-3684 [doi]
- Improved Screen Content Coding in VVC Using Soft Context FormationHannah Och, Shabhrish Reddy Uddehal, Tilo Strutz, André Kaup. 3685-3689 [doi]
- Efficient Joint Rectification of Photometric and Geometric Distortions in Document ImagesHao Tang, Junyuan Guo, Teng Wang, Yanwei Yu, Chao Wang. 3690-3694 [doi]
- HDPNERF: Hybrid Depth Priors for Neural Radiance Fields from Sparse Input ViewsWangze Xu, Qi Wang, Xinghao Pan, Ronggang Wang. 3695-3699 [doi]
- BFRFormer: Transformer-Based Generator for Real-World Blind Face RestorationGuojing Ge, Qi Song, Guibo Zhu, Yuting Zhang, Jinglu Chen, Miao Xin, Ming Tang 0001, Jinqiao Wang. 3700-3704 [doi]
- CSCNet: Class-Specified Cascaded Network for Compositional Zero-Shot LearningYanyi Zhang, Qi Jia 0001, Xin Fan 0001, Yu Liu 0012, Ran He. 3705-3709 [doi]
- Exploring Spatio-Temporal Discriminative Cues for Group Activity Recognition Via Contrastive LearningMeng Tian, Ye Xiang, Lifang Wu. 3710-3714 [doi]
- Learned Video Compression with Spatial-Temporal OptimizationYiming Wang, Qian Huang, Bin Tang, Wenting Liu, Wenchao Shan, Qian Xu. 3715-3719 [doi]
- DSIS: A Novel (K, N) Threshold Deniable Secret Image Sharing Scheme with Lossless RecoveryZikai Xu, Bin Liu, Fei Hu, Weihai Li, Nenghai Yu. 3720-3724 [doi]
- Towards Intelligent Design: A Self-Driven Framework for Collocated Clothing Synthesis Leveraging Fashion Styles and TexturesMinglong Dong, Dongliang Zhou, Jianghong Ma, Haijun Zhang 0002. 3725-3729 [doi]
- Incremental Tensor Decomposition for Few Shot Neural Radiance FieldQian Li, Cheng Wen, Rao Fu. 3730-3734 [doi]
- Balanced And Discriminative Contrastive Learning For Class-Imbalanced Medical ImagesXuewei Li, Yilong Fan, Hao Zheng, Jie Gao, Xi Wei, Mei Yu. 3735-3739 [doi]
- 3D Pose Estimation from Monocular Video with Camera-Bone Angle Regularization on the Image FeatureAsuka Ishii, Hiroo Ikeda. 3740-3744 [doi]
- Imitating the Human Visual System for Scanpath PredictingMengtang Li, Jie Zhu, Zhixin Huang, Chao Gou. 3745-3749 [doi]
- TALDS-Net: Task-Aware Adaptive Local Descriptors Selection for Few-Shot Image ClassificationQian Qiao, Yu Xie, Ziyin Zeng, Fanzhang Li. 3750-3754 [doi]
- SO-Net: Model-Agnostic Sequential Hand Pose Optimization FrameworkYuanyuan Gao, Pengfei Ren, Mingen Shu, Rui Chu, Jubiao Li, Jing Jin, Wei Li. 3755-3759 [doi]
- DRSM: Efficient Neural 4D Decomposition for Dynamic Reconstruction in Stationary Monocular CamerasWeixing Xie, Xiao Dong, Yong Yang, Qiqin Lin, Jingze Chen, Junfeng Yao, Xiaohu Guo. 3760-3764 [doi]
- Multi-Weather Degradation-Aware Transformer for Image RestorationRuoxi Zhu, Minfeng Wu, Xiankui Xiong, Xuanpeng Zhu, Yibo Fan. 3765-3769 [doi]
- Efficient Hierarchical Stripe Attention for Lightweight Image Super-ResolutionXiaying Chen, Yue Zhou 0005. 3770-3774 [doi]
- Scene Sketch-to-Image Synthesis Based on Multi-Object ControlZhenwei Cheng, Lei Wu 0002, Changshuo Wang 0003, Xiangxu Meng. 3775-3779 [doi]
- PFDM: Parser-Free Virtual Try-On via Diffusion ModelYunfang Niu, Dong Yi, Lingxiang Wu, Zhiwei Liu, Pengxiang Cai, Jinqiao Wang. 3780-3784 [doi]
- Facial Aesthetic Enhancement Network for Asian Faces Based on Differential Facial Aesthetic ActivationsHuanyu Chen, Weisheng Li, Xinbo Gao 0001, Bin Xiao 0002, Feiyan Li, Yuping Huang. 3785-3789 [doi]
- Energy-Aware Resolution Selection for Per-Title EncodingMohammad Ghasempour, Hadi Amirpour, Mohammad Ghanbari 0001, Christian Timmerer. 3790-3794 [doi]
- Lighting Image/Video Style Transfer Methods by Iterative Channel PruningKexin Wu, Fan Tang, Ning Liu, Oliver Deussen, Thi Ngoc Hanh Le, Weiming Dong, Tong-Yee Lee. 3800-3804 [doi]
- A Prior Driven Semi-Supervised ViTGAN for Image RecolorizationSuxian Xiang, Hao Yue, Chenxi Huang 0001, Ping Li. 3805-3809 [doi]
- A Real-Time Video Quality Metric for HTTP Adaptive StreamingHadi Amirpour, Jingwen Zhu, Patrick Le Callet, Christian Timmerer. 3810-3814 [doi]
- Content-Based Objective Evaluation of Artificially Generated Sign Language VideosNeha Tarigopula, Preyas Garg, Skanda Muralidhar, Sandrine Tornay, Dinesh Babu Jayagopi, Mathew Magimai-Doss. 3815-3819 [doi]
- CartoonDiff: Training-free Cartoon Image Generation with Diffusion Transformer ModelsFeihong He, Gang Li, Lingyu Si, Leilei Yan, Shimeng Hou, Hongwei Dong, Fanzhang Li. 3825-3829 [doi]
- Semantic-Guided Network with Contrastive Learning for Video CaptionKaixuan Chen 0006, Qianji Di, Yang Lu 0009, Hanzi Wang. 3830-3834 [doi]
- Eye Motion Matters for 3D Face ReconstructionXuan Wang, Mengyuan Liu. 3835-3839 [doi]
- Stereo-Matching Knowledge Distilled Monocular Depth Estimation Filtered by Multiple Disparity ConsistencyWoonghyun Ka, Jae Young Lee, Jaehyun Choi, Junmo Kim 0002. 3840-3844 [doi]
- SAMF: Small-Area-Aware Multi-Focus Image Fusion for Object DetectionXilai Li, Xiaosong Li, Haishu Tan, Jinyang Li. 3845-3849 [doi]
- Focus Fusion Network for Visible and Infrared Image FusionYihan Zhang, Yichu Fang, Qian Zhang. 3850-3854 [doi]
- Generalized Uncertainty-Based Evidential Fusion with Hybrid Multi-Head Attention for Weak-Supervised Temporal Action LocalizationYuanpeng He, Lijian Li, Tianxiang Zhan, Wenpin Jiao, Chi-Man Pun. 3855-3859 [doi]
- An Explicit Multi-Modal Fusion Method for Sign Language TranslationCong Hu, Biao Fu, Pei Yu, Liang Zhang, Xiaodong Shi, Yidong Chen 0001. 3860-3864 [doi]
- SAR2NDVI: Pre-Training for SAR-to-NDVI Image TranslationDaiki Kimura, Tatsuya Ishikawa, Masanori Mitsugi, Yasunori Kitakoshi, Takahiro Tanaka, Naomi Simumba, Kentaro Tanaka, Hiroaki Wakabayashi, Masato Sampei, Michiaki Tatsubori. 3865-3869 [doi]
- Video Anomaly Prediction: Problem, Dataset and MethodYang Wang, Jun Xu, Jiaogen Zhou, Jihong Guan. 3870-3874 [doi]
- Adaptive-Avg-Pooling Based Attention Vision Transformer for Face Anti-SpoofingJichen Yang, Fangfan Chen, Rohan Kumar Das, Zhengyu Zhu, Shunsi Zhang. 3875-3879 [doi]
- Dual Rank-1 Tensor Attention Module for Convolutional Neural NetworksBaihong Lin, Hanxing Chi, Zengrong Lin, Jun Hu, Liang Wang, Jianxiao Zou, Shicai Fan. 3880-3884 [doi]
- Building Lane-Level Maps from Aerial ImagesJiawei Yao, Xiaochao Pan, Tong Wu, Xiaofeng Zhang. 3890-3894 [doi]
- Proposal Distillation of Multi-Modal Feature Aggregation Network for Video Object DetectionZhenYu Qiu, Qiang Qi, Yang Lu 0009, Yan Yan 0001, Hanzi Wang. 3895-3899 [doi]
- Ellipse Detection Based On Structure-Preserving Anisotropic Edge ExtractionYang Su, Baojiang Zhong, Zikai Wang, Kai-Kuang Ma. 3900-3904 [doi]
- Near-Field Neural Rendering Guided by Single-Shot Photometric StereoJoshna Manoj Reddy, Tony Fredrick, Salman Siddique Khan, Kaushik Mitra. 3905-3909 [doi]
- Extremely Light-Weight Learning Based LDR to PQ HDR Conversion Using Bernstein CurvesDung Vo, ChenGuang Liu, McClain Nelson. 3910-3914 [doi]
- Volumetric 3D Point Cloud Attribute Compression: Learned Polynomial Bilateral Filter for PredictionTam Thuc Do, Philip A. Chou, Gene Cheung. 3915-3919 [doi]
- Slowfast Network for Continuous Sign Language RecognitionJunseok Ahn, Youngjoon Jang, Joon Son Chung. 3920-3924 [doi]
- Gradually Spatio-Temporal Feature Activation for Target TrackingYanfang Deng, Canlong Zhang, Zhixin Li 0001, Chunrong Wei, Zhiwen Wang, Shuqi Pan. 3925-3929 [doi]
- Domain-Adaptive Semantic Segmentation Emerges From Vision-Language Supervised Domain-Debiased Self-TrainingHuaYu Wang, Zekun Jiang, Lingxi Xie, Dongsheng Jiang, Wei Shen 0002, Qi Tian 0001. 3930-3934 [doi]
- Capturing Detail Variations for Lightweight Neural Radiance FieldsZheng Wang, Laurence T. Yang, Bocheng Ren, Jinglin Zhao, Zhe Li, Guolei Zeng. 3935-3939 [doi]
- Boosting Image Quality Assessment Performance: Unsupervised Score Fusion by Deep Maximum a Posteriori EstimationZhongling Wang, Raymond Zhou, Shahrukh Athar, Wenbo Yang, Zhou Wang 0001. 3940-3944 [doi]
- Perceiving Multi-Layer Representations for No-reference Image Quality AssessmentQunyue Huang, Bin Fang 0001, Xi Ai, Tianyu Nie. 3945-3949 [doi]
- Semanticmapper: Region-Specific Domain Adaptation for 3D Shapes Through Lexical DelineationTianci Xie, Siyang Luo, Zhenghan Chen, Xiaoxuan Liang 0002. 3950-3954 [doi]
- GAMAFlow: Estimating 3D Scene Flow via Grouped Attention and Global Motion AggregationZhiqi Li, Xiaosong Yang, Jianjun Zhang. 3955-3959 [doi]
- MMAFlow: Matching-Guided Motion Aggregation for Optical Flow EstimationYongpeng Chang, Guangchun Gao. 3960-3964 [doi]
- NLSIT: A Non-Local Stereo Interaction Transformer for Stereo Image Super-ResolutionHuiyun Cao, Wenqi Huang, Wenming Yang. 3965-3969 [doi]
- VCD: A Video Conferencing Dataset for Video CompressionBabak Naderi, Ross Cutler, Nabakumar Singh Khongbantabam, Yasaman Hosseinkashi, Henrik Turbell, Albert Sadovnikov, Quan Zou. 3970-3974 [doi]
- DT-NeRF: Decomposed Triplane-Hash Neural Radiance Fields For High-Fidelity Talking Portrait SynthesisYaoyu Su, Shaohui Wang, Haoqian Wang. 3975-3979 [doi]
- Buffered Gaussian Modeling for Vectorized HD Map ConstructionAnqi Shi, Huaqiu Chen, Hong Lu, Rui Zhang. 3980-3984 [doi]
- Quantized Decoder in Learned Image Compression for Deterministic ReconstructionEsin Koyuncu, Timofey Solovyev, Johannes Sauer, Elena Alshina, André Kaup. 3985-3989 [doi]
- Online Mouse Behavior Detection by Historical Dependency and Typical InstancesXinyu Yang, Feixiang Zhou, Huiyu Zhou 0001. 3990-3994 [doi]
- Local Optimization Networks for Multi-View Multi-Person Human Posture EstimationJucheng Song, Chi-Man Pun, Haolun Li, Rushi Lan, Jiucheng Xie, Hao Gao 0005. 3995-3999 [doi]
- Noisy Image Restoration Based on Conditional Acceleration Score ApproximationZiqiang Shi, Rujie Liu. 4000-4004 [doi]
- Joint Demosaicing And Denoising With Double Deep Image PriorsTaihui Li, Anish Lahiri, Yutong Dai 0004, Owen Mayer. 4005-4009 [doi]
- Zero-Shot Co-Salient Object Detection FrameworkHaoke Xiao, Lv Tang, Bo Li, Zhiming Luo, Shaozi Li. 4010-4014 [doi]
- Hierarchical Home Action Understanding with Implicit and Explicit Prior KnowledgeYuchen Zhou, Guang Tan, Chao Gou. 4015-4019 [doi]
- Learning Spatio-Temporal Relations with Multi-Scale Integrated Perception for Video Anomaly DetectionHongyu Ye, Ke Xu 0003, Xinghao Jiang, Tanfeng Sun. 4020-4024 [doi]
- Fast Graph-Based Denoising For Point Cloud Color InformationRyosuke Watanabe, Keisuke Nonaka, Eduardo Pavez, Tatsuya Kobayashi, Antonio Ortega. 4025-4029 [doi]
- AEGIS-Net: Attention-Guided Multi-Level Feature Aggregation for Indoor Place RecognitionYuhang Ming 0001, Jian Ma 0001, Xingrui Yang 0001, Weichen Dai 0001, Yong Peng 0001, Wanzeng Kong. 4030-4034 [doi]
- Fine-Granularity Face Sketch SynthesisYangdong Chen, Yanfei Wang, Yuejie Zhang, Rui Feng, Tao Zhang, Xuequan Lu, Shang Gao. 4035-4039 [doi]
- Efficient Learned Image Compression with Selective Kernel Residual Module and Channel-Wise Causal Context ModelHaisheng Fu, Feng Liang 0001, Jie Liang 0001, Zhenman Fang, Guohe Zhang, Jingning Han. 4040-4044 [doi]
- Topology-Regularized Self-Knowledge Distillation for Transductive-Inductive Learning of Brain Disorder DiagnosisYanwu Yang, Xutao Guo, Guoqing Cai, Chenfei Ye, Ting Ma. 4045-4049 [doi]
- Phase Learning Based on Interactive Perception for Limited-Sample Residential Area Semantic SegmentationXinran Lyu, Libao Zhang. 4050-4054 [doi]
- MAS-NET: Mixed-Feature Attention Siamese Network for Change Detection on Remote Sensing ImagesXingyu Ding, Weiqiang Wang. 4055-4059 [doi]
- Wavelet-Decoupling Contrastive Enhancement Network for Fine-Grained Skeleton-Based Action RecognitionHaochen Chang, Jing Chen, Yilin Li, Jixiang Chen, Xiaofeng Zhang. 4060-4064 [doi]
- Adaptive Pedestrian Trajectory Prediction via Target-Directed Angle AugmentationHao Kong, Jie Xu 0021, Shenjian Gong, Jian Yang 0003, Shanshan Zhang. 4065-4069 [doi]
- Embedded Graph Representation for Inter-Frame Coding of Dynamic MeshesXudong Jin, Jianfeng Xu, Kei Kawamura. 4070-4074 [doi]
- Blenda: Domain Adaptive Object Detection Through Diffusion-Based BlendingTzuhsuan Huang, Chen-Che Huang, Chung-Hao Ku, Jun-Cheng Chen. 4075-4079 [doi]
- Unsupervised Remote Sensing Haze Removal Based on Saliency-Guided Transmission RefinementRuohui Zheng, Libao Zhang. 4080-4084 [doi]
- Deep Unrolling Network for SAR Image DespecklingChe Chen, Lin Chen, Xue Jiang 0001, Xingzhao Liu, Abdelhak M. Zoubir. 4085-4089 [doi]
- SDRNet: Saliency-Guided Dynamic Restoration Network for Rain and Haze Removal in Nighttime ImagesWanning Zhu, Lin Tan, Libao Zhang. 4090-4094 [doi]
- Adaptive Multi-Exposure Fusion for Enhanced Neural Radiance FieldsYang Zou, Xingyuan Li 0005, Zhiying Jiang, Tiantian Yan, Jinyuan Liu 0001. 4095-4099 [doi]
- Adaptive Secondary Transform Sets for Video Coding Beyond AV1Yushin Cho, Madhu Krishnan, Xin Zhao, Shan Liu. 4100-4104 [doi]
- Hazy Remote Sensing Images Semantic Segmentation for Weakly Annotation Based on Saliency-Aware Alignment StrategyJunda Xu, Libao Zhang. 4105-4109 [doi]
- Multi-Stage Contrastive Regression for Action Quality AssessmentQi An, Mengshi Qi, Huadong Ma. 4110-4114 [doi]
- Adaptive Super Resolution for One-Shot Talking-Head GenerationLuchuan Song, Pinxin Liu, Guojun Yin, Chenliang Xu. 4115-4119 [doi]
- Mixed-Attention Auto Encoder for Multi-Class Industrial Anomaly DetectionJiangqi Liu, Feng Wang. 4120-4124 [doi]
- DEGAN: Discrimination Enhanced GAN for Perceptual-Oriented Super-ResolutionXiaoyu Jin, Wenqi Huang, Lingyu Liang, Yang Wu, Qunsheng Zeng, Ruiye Zhou, Zhuojun Cai, Jianing Shang, Wenming Yang. 4125-4129 [doi]
- The Devil is in Details: Delving Into Lite FFN Design for Vision TransformersZhiyang Chen, Yousong Zhu, Zhaowen Li, Fan Yang, Chaoyang Zhao, Jinqiao Wang, Ming Tang. 4130-4134 [doi]
- AEAM3D: Adverse Environment-Adaptive Monocular 3D Object Detection via Feature Extraction RegularizationYixin Lei, Xingyuan Li 0005, Zhiying Jiang, Xinrui Ju, Jinyuan Liu 0001. 4135-4139 [doi]
- M3sum: A Novel Unsupervised Language-Guided Video SummarizationHongru Wang 0003, Baohang Zhou, Zhengkun Zhang, Yiming Du, David Ho, Kam-Fai Wong. 4140-4144 [doi]
- CReStyler: Text-Guided Single Image Style Transfer Method Based on CNN and RestormerLong Feng, Guohua Geng, Yong Ren, Zhen Li, Yangyang Liu, Kang Li. 4145-4149 [doi]
- SR-VFA: Accurate Self-Refined Face Alignment in VideosSipeng Yang, Hongyu Huang, Qingchuan Zhu, Xiaogang Jin 0001. 4150-4154 [doi]
- Memory Self-Calibrated Network for Visual GroundingJie Wu, Chunlei Wu, Yiwei Wei, Xiuxuan Shen, Leiquan Wang. 4155-4159 [doi]
- Robust Lightweight Depth Estimation Model via Data-Free DistillationZihan Gao, Peng Gao, Wei Yin, Yifan Liu 0001, Zengchang Qin. 4160-4164 [doi]
- Semantic Latent Decomposition with Normalizing Flows for Face EditingBinglei Li, Zhizhong Huang, Hongming Shan, Junping Zhang. 4165-4169 [doi]
- Low-Light Raw Image Enhancement on a Dataset Suffering Light EffectsXu Zhang, Rui Tang, Guipeng Zhang, Dehui Kong, Ke Xu. 4170-4174 [doi]
- Zigzag Attention: A Structural Aware Module For Lane DetectionJiajun Ling, Yifan Chen, Qimin Cheng, Xiao Huang. 4175-4179 [doi]
- Multi-Object Tracking for Unmanned Aerial Vehicles Based on Multi-Frame Feature FusionJiayin Wen, Dianwei Wang, Jie Fang, Yuanqing Li, Zhijie Xu. 4180-4184 [doi]
- Towards Omniscient Feature Alignment for Video RescalingGuanchen Ding, Chang Wen Chen. 4190-4194 [doi]
- Semantic Segmentation for Multi-Scene Remote Sensing Images with Noisy Labels Based on Uncertainty PerceptionXinran Lyu, Libao Zhang. 4195-4199 [doi]
- Loop Structure-Aware Learning for Fully Automated Pulmonary Fissure Completeness AssessmentLinya Zheng, Fan Zhang, Haichao Peng, Yong Wang, Yinran Chen, Xiongbiao Luo. 4200-4204 [doi]
- Bounding Box-Guided Pseudo Point Clouds Early-Fusion and Density Optimize for 3D Object DetectionShiwei Zhao, Shengye Yan. 4205-4209 [doi]
- Geometry Compression Artifact Removal for V-PCC over a Wide Bitrate RangeJian Xiong 0005, Junhao Wu, Wang Luo, Jiucheng Xie, Hao Gao 0005. 4210-4214 [doi]
- Rate-Quality Based Rate Control Model for Neural Video CompressionShuhong Liao, Chuanmin Jia, Hongfei Fan, Jingwen Yan, Siwei Ma. 4215-4219 [doi]
- Modal Consensus and Contextual Separation for Weakly Supervised Temporal Action LocalizationPeng Liu, Chuanxu Wang, Min Zhao. 4220-4224 [doi]
- Denoising Diffusion Probabilistic Models for Action-Conditioned 3D Motion GenerationMengyi Zhao, Mengyuan Liu, Bin Ren, Shuling Dai, Nicu Sebe. 4225-4229 [doi]
- Segmentation-Driven Infrared and Visible Image Fusion Via Transformer-Enhanced Architecture SearchingHongming Fu, Guanyao Wu, Zhu Liu, Tiantian Yan, Jinyuan Liu 0001. 4230-4234 [doi]
- Unlocking Deep Learning: A BP-Free Approach for Parallel Block-Wise Training of Neural NetworksAnzhe Cheng, Heng Ping, Zhenkun Wang, Xiongye Xiao, Chenzhong Yin, Shahin Nazarian, Mingxi Cheng, Paul Bogdan. 4235-4239 [doi]
- SAM-GEBD: Zero-Cost Approach for Generic Event Boundary DetectionPranay Kashyap, Sourabh Vasant Gothe, Vibhav Agarwal, Jayesh Rajkumar Vachhani. 4240-4244 [doi]
- Maskstr: Guide Scene Text Recognition Models with MaskingBaole Wei, Minghang He, Liangcai Gao, Duoyou Zhou, Xiang Bai, Zhi Tang 0001. 4245-4249 [doi]
- Spatial Formation-Guided Network for Group Activity RecognitionDunbo Ning, Wenjing Chen 0003, Wei Xie 0008, Hao Sun. 4250-4254 [doi]
- Panoramic Image Inpainting with Gated Convolution and Contextual Reconstruction LossLi Yu 0004, Yanjun Gao, Farhad Pakdaman, Moncef Gabbouj. 4255-4259 [doi]
- Photovoltaic Power Forecasting Using Sky Images and Sun MotionArne Berresheim, Antonio Agudo. 4260-4264 [doi]
- Generalizable Two-Branch Framework for Image Class-Incremental LearningChao Wu, Xiaobin Chang, Ruixuan Wang. 4265-4269 [doi]
- Multi-Level Spatial-Temporal Feature Aggregation and Alignment-Based Selective Residual Dense Propagation Module for HDR Video ReconstructionYiyu Liu, Fengshan Zhao, Qin Liu 0002, Takeshi Ikenaga. 4270-4274 [doi]
- Accurate and Robust Scene Text Recognition via Adversarial TrainingXiaomeng Yang, Dongbao Yang, Zhi Qiao, Yu Zhou 0015. 4275-4279 [doi]
- Attribute-Aware Amplification of Facial Feature Sequences for Facial Emotion RecognitionTagon Sompong, Chawan Piansaddhayanon, Ekapol Chuangsuwanich. 4280-4284 [doi]
- Perceptual Quality Evaluation for Faster Playback VideosJiarun Song, Shengnan Wang, Fuzheng Yang. 4285-4289 [doi]
- Improved Image Captioning Via Knowledge Graph-Augmented ModelsSergio Sánchez Santiesteban, Sara Atito, Muhammad Awais 0001, Yi-Zhe Song, Josef Kittler. 4290-4294 [doi]
- Boosting of Implicit Neural Representation-Based Image DenoiserZipei Yan, Zhengji Liu, Jizhou Li. 4295-4299 [doi]
- Straightforward Adaptation of Particle Filter to Fish Eye Images for Top View Pedestrian TrackingHicham Talaoubrid, Khizar Hayat 0002, Baptiste Magnier. 4300-4304 [doi]
- HaltingVT: Adaptive Token Halting Transformer for Efficient Video RecognitionQian Wu, Ruoxuan Cui, Yuke Li, Haoqi Zhu. 4305-4309 [doi]
- Bridging the Gap: Sketch to Color Diffusion Model with Semantic Prompt LearningNing Wang, Yifei She, Rui Xu, Bin Liu, Haojie Li, Zhiyong Wang, Zhihui Wang. 4310-4314 [doi]
- Local Contrast Prior-Guided Cross Aggregation Model for Effective Infrared Small Target DetectionZihang Chen, Zhu Liu, Jinyuan Liu 0001. 4315-4319 [doi]
- Rating-Augmented No-Reference Point Cloud Quality Assessment Using Multi-Task LearningXinyu Wang, Xiaochuan Wang, Ruijun Liu, Xiankai Huang. 4320-4324 [doi]
- Exploring Phonetic Context-Aware Lip-Sync for Talking Face GenerationSe Jin Park, Minsu Kim, Jeongsoo Choi, Yong Man Ro. 4325-4329 [doi]
- PFCF-Net: A Network Based on Progressive Feature Interaction and Cross-Scale Feature Fusion for Remote Sensing Change DetectionXiuzhen He, Yan Wang, Qiaoli Sun, Fangxu Zhou. 4330-4334 [doi]
- Balancing Representation Abstractions and Local Details Preservation for 3D Point Cloud Quality AssessmentMarouane Tliba, Aladine Chetouani, Giuseppe Valenzise, Frédéric Dufaux. 4335-4339 [doi]
- Real-Oriented Object Detection Driven by Intelligent StockbreedingGuowen Kuang, Xin Lu, Jingran Xia, Hao Geng, Xu Wang, Jinfeng Yang. 4345-4349 [doi]
- SAMVG: A Multi-Stage Image Vectorization Model with the Segment-Anything ModelHaokun Zhu, Juang Ian Chong, Teng Hu, Ran Yi, Yu-Kun Lai, Paul L. Rosin. 4350-4354 [doi]
- DONE: Dynamic Neural Representation Via Hyperplane Neural ODEJiaxu Wang, Bo Xu, Hao Cheng, Renjing Xu. 4355-4359 [doi]
- EDM: Synthetic Data from Exemplar Diffusion Model Improves Non-Communicable Diseases DetectionXing Wu, Zhi Li, Junfeng Yao, Quan Qian, Jian Zhang, Qun Sun, Yike Guo. 4360-4364 [doi]
- 3D Hand Joint and Grasping Estimation for Teleoperation SystemLiyuan Qi, Olaoluwa R. Popoola, Jingyan Wang, Muhammad Ali Imran 0001, Wasim Ahmad. 4365-4369 [doi]
- LV-SEGFORMER: Towards More Accurate Leaf-Vein Segmentation with TransformerWanqiang Cai, Bin Wang 0041. 4370-4374 [doi]
- 3D Point Cloud Semantic Segmentation Based on Diffusion ModelChang Liu, Aimin Jiang, Yibin Tang, Yanping Zhu, Qi Chen. 4375-4379 [doi]
- Wavelet-Guided Acceleration of Text Inversion in Diffusion-Based Image EditingGwanhyeong Koo, Sunjae Yoon, Chang D. Yoo. 4380-4384 [doi]
- Face Recognition Using Lensless CameraHatef Otroshi-Shahreza, Alexandre Veuthey, Sébastien Marcel. 4385-4389 [doi]
- Progressively Learning from Macro-Expressions for Micro-Expression RecognitionFeifan Wang, Yuan Zong, Jie Zhu, Mengting Wei, Xiaolin Xu, Cheng Lu 0005, Wenming Zheng. 4390-4394 [doi]
- Fast Intra Mode Prediction Algorithms for SCBS in VVC SCCDayong Wang, Yishen Deng, Weisheng Li, Xin Lu, Frédéric Dufaux, Bo Hang, Ce Zhu. 4395-4399 [doi]
- Adaptive Gaussian Regularization Constrained Sparse Subspace Clustering for Image SegmentationSensen Song, Dayong Ren, Zhenhong Jia, Fei Shi. 4400-4404 [doi]
- Deep Neighbor Layer Aggregation for Lightweight Self-Supervised Monocular Depth EstimationBoya Wang, Shuo Wang, Dong Ye 0003, Ziwen Dou. 4405-4409 [doi]
- FPGNet: Single Image Deraining with High-Frequency Channel and Frequency Domain Prior GuidanceZhaoyong Yan, Gang Li, Lingyu Si, Hongwei Dong. 4410-4414 [doi]
- A 3D Virtual Try-On Method with Global-Local Alignment and Diffusion ModelShougan Pan, Zhengwentai Sun, Chenxing Wang 0002, Junkai Zhang. 4415-4419 [doi]
- Radardiff: Improving Sea Clutter Suppression Using Diffusion Models for Radar ImagesLingyu Si, Gang Li, Hongwei Dong, Changwen Zheng, Fanjiang Xu, Fuchun Sun. 4420-4424 [doi]
- Local Information Guided Global Integration for Infrared Small Target DetectionQiang Li, Qianchen Mao, Wenjie Liu, Jinbao Wang, Wenmin Wang, Bingshu Wang. 4425-4429 [doi]
- Think as People: Context-Driven Multi-Image News Captioning with Adaptive Dual AttentionQiang Yang 0015, Xiaodong Wu, Xiuying Chen, Xin Gao 0001, Xiangliang Zhang 0001. 4430-4434 [doi]
- Breaking Speaker Recognition with PaddingbackZhe Ye, Diqun Yan, Li Dong 0006, Kailai Shen. 4435-4439 [doi]
- Joint Learning of Identity and Vein Features for Enhanced Representations in Vascular BiometricsWeifeng Ou, Lai-Man Po, Xiu-Feng Huang. 4440-4444 [doi]
- Cross-Domain Cross-Task Transfer Mobile Touch-Stroke AuthenticationKensuke Wagata, Andrew Beng Jin Teoh. 4445-4449 [doi]
- Improving VGG-Style Convnet for JPEG SteganalysisZhuoFan Yang, Qiushi Li, Shenghai Luo, Shunquan Tan, Bin Li 0011. 4450-4454 [doi]
- A Multi-Carrier Information Hiding Algorithm Based on Layered Compression of 3D Point Cloud ModelShuai Ren, Yuxiao Li, Bo Li, Hao Gong, Qiuyu Feng. 4455-4459 [doi]
- SourceP: Detecting Ponzi Schemes on Ethereum with Source CodePengcheng Lu, Liang Cai, Keting Yin. 4465-4469 [doi]
- FAMIM: A Novel Frequency-Domain Augmentation Masked Image Model Framework for Domain Generalizable Face Anti-SpoofingTianyi Zheng, Qinji Yu, Zhaoyu Chen, Jia Wang. 4470-4474 [doi]
- Enhancing Targeted Transferability via Feature Space Fine-TuningHui Zeng, Biwei Chen, Anjie Peng. 4475-4479 [doi]
- Data-Free Watermark for Deep Neural Networks by Truncated Adversarial DistillationChao-Bo Yan, Fang-Qi Li, Shi-Lin Wang. 4480-4484 [doi]
- Delving Deeper Into Vulnerable Samples in Adversarial TrainingPengfei Zhao, Haojie Yuan, Qi Chu 0001, Shubin Xu, Nenghai Yu. 4490-4494 [doi]
- Language-Driven Ordinal Learning for Imbalanced Head Pose EstimationYaoxing Wang, Qian Yu, Ling Lin, Zhendong Li, Hao Liu. 4495-4499 [doi]
- Towards Generic Deepfake Detection with Dynamic CurriculumWentang Song, Yuzhen Lin, Bin Li 0011. 4500-4504 [doi]
- XMP: A Cross-Attention Multi-Scale Performer for File Fragment ClassificationJeong Gyu Park, Sisung Liu, Je Hyeong Hong. 4505-4509 [doi]
- Transformer Model with Multi-Type Classification Decisions for Intrusion Attack Detection of Track Traffic and VehicleQuanlong Guan, Tian Zhang, Yu Qin, Yuyu Zhou, Yangguang Zhu, Yuansheng Zhong, Xiujie Huang, Zhifei Duan, Zhefu Li, Changjiang Liu, Xiaofeng Wu. 4510-4514 [doi]
- Enhanced Screen Shooting Resilient Document WatermarkingHeng Wang, Hongxia Wang 0001, Xinyi Huang, Zhenhao Shi 0004. 4515-4519 [doi]
- Invertible Mosaic Image Hiding Network for Very Large Capacity Image SteganographyZihan Chen, Tianrui Liu, Jun-Jie Huang, Wentao Zhao, Xing Bi, Meng Wang. 4520-4524 [doi]
- FUR-API: Dataset and Baselines Toward Realistic API Anomaly DetectionYijun Liu, Honglan Yu, Feifei Dai, Xiaoyan Gu, Chenxu Cui, Bo Li, Weiping Wang. 4525-4529 [doi]
- Securely and Efficiently Outsourcing Neural Network Inference via Parallel MSB ExtractionXin Liu, Ning Xi 0002, Ke Cheng, Jiaxuan Fu, Xinghui Zhu, Yulong Shen, Jianfeng Ma 0001. 4530-4534 [doi]
- Uncovering Strong Ties: A Study of Indirect Sybil Attack on Signed Social NetworkYu Bu, Yulin Zhu, Longling Geng, Kai Zhou 0001. 4535-4539 [doi]
- Improving Visual Quality and Transferability of Adversarial Attacks on Face Recognition Simultaneously with Adversarial RestorationFengfan Zhou, Hefei Ling, Yuxuan Shi, Jiazhong Chen, Ping Li 0021. 4540-4544 [doi]
- Synthesizing Black-Box Anti-Forensics Deepfakes With High Visual QualityBing Fan, Shu Hu 0001, Feng Ding 0007. 4545-4549 [doi]
- Communication-Efficient Laplace Mechanism for Differential Privacy via Random QuantizationAli Moradi Shahmiri, Chih Wei Ling, Cheuk Ting Li. 4550-4554 [doi]
- ADVSV: An Over-the-Air Adversarial Attack Dataset for Speaker VerificationLi Wang, Jiaqi Li, Yuhao Luo, Jiahao Zheng, Lei Wang, Hao Li, Ke Xu, Chengfang Fang, Jie Shi, Zhizheng Wu 0001. 4555-4559 [doi]
- Controllable Semantic Linguistic Steganography via Summarization GenerationRuifan Zhang, Jianyi Liu, Ru Zhang. 4560-4564 [doi]
- A Novel Residual-Guided Learning Method for Image SteganographyMiaoxin Ye, Dongxia Huang, Kangkang Wei, Weiqi Luo 0001. 4565-4569 [doi]
- Attribution-Based Scanline Perturbation Attack on 3D Detectors of Lidar Point CloudsZiyang Yu, Ting Yang, Qiong Chang, Yu Liu 0012, Weimin Wang 0007. 4570-4574 [doi]
- JPEG Encryption with DC Prediction and Run-Based RS Pairs PermutationYanyixiao Wang, Peiya Li. 4575-4579 [doi]
- Scale-Aware Competition Network for Palmprint RecognitionChengrui Gao, Ziyuan Yang 0001, Min Zhu, Andrew Beng Jin Teoh. 4580-4584 [doi]
- Motion Transfer-Driven Intra-Class Data Augmentation for Finger Vein RecognitionXiu-Feng Huang, Lai-Man Po, Weifeng Ou. 4585-4589 [doi]
- Rényi Differential Privacy in the Shuffle Model: Enhanced Amplification BoundsE. Chen, Yang Cao, Yifei Ge. 4590-4594 [doi]
- Exploring Consistent Spatio-Temporal Distortion and Stable 3-D DCT Coefficients for Robust Blind Video WatermarkingFei Zhang, Hongxia Wang, Mingze He, Ling Yang. 4595-4599 [doi]
- Cross-Age Contrastive Learning for Age-Invariant Face RecognitionHaoyi Wang, Victor Sanchez, Chang-Tsun Li. 4600-4604 [doi]
- FSD: An Initial Chinese Dataset for Fake Song DetectionYuankun Xie, Jingjing Zhou, Xiaolin Lu, Zhenghao Jiang, Yuxin Yang, Haonan Cheng, Long Ye. 4605-4609 [doi]
- Adaptive Video Watermarking with Perceptual Guarantee and Efficiency OptimizationFei Zhang, Hongxia Wang, Mingze He, Ling Yang, Jinhe Li. 4610-4614 [doi]
- Enhancing Gender Privacy with Photo-Realistic Fusion of Disentangled Spatial SegmentsPeter Rot, Janez Krizaj, Peter Peer, Vitomir Struc. 4615-4619 [doi]
- CNFA: Conditional Normalizing Flow for Query-Limited AttackRenYang Liu, Wei Zhou, Jinhong Zhang, Haoran Li, Ruxin Wang. 4620-4624 [doi]
- Cross-Attention Watermarking of Large Language ModelsFolco Bertini Baldassini, Huy H. Nguyen, Ching-Chung Chang, Isao Echizen. 4625-4629 [doi]
- Security Equivalence Assessment between Cloud Standards by Mapping of Control ItemsYuchen Wong, Chen Yan, Shengfang Zhai, Cong Li, Qingni Shen. 4630-4634 [doi]
- An Initial Investigation of Neural Replay Simulator for Over-The-Air Adversarial Perturbations to Automatic Speaker VerificationJiaqi Li, Li Wang, Liumeng Xue, Lei Wang, Zhizheng Wu 0001. 4635-4639 [doi]
- AdvShadow: Evading DeepFake Detection via Adversarial Shadow AttackJiatong Liu, Mingcheng Zhang, Jianpeng Ke, Lina Wang 0001. 4640-4644 [doi]
- Semantic Security: A Digital Watermark Method for Image Semantic PreservationTianwei Zuo, Yiping Duan, Qiyuan Du, Xiaoming Tao. 4645-4649 [doi]
- Maskmark: Robust Neural Watermarking for Real and Synthetic SpeechPatrick O'Reilly, Zeyu Jin, Jiaqi Su, Bryan Pardo. 4650-4654 [doi]
- Unintended Memorization in Large ASR Models, and How to Mitigate ItLun Wang, Om Thakkar 0001, Rajiv Mathews. 4655-4659 [doi]
- Enhancing Adversarial Robustness of DNNs Via Weight Decorrelation in TrainingCong Zhang, Yuezun Li, Honggang Qi, Siwei Lyu. 4660-4664 [doi]
- Periocular Biometrics Enhancement Through Multimodal Embeddings And Classifier AdaptationJongWon Hwang, Andrew Beng Jin Teoh. 4665-4669 [doi]
- Scalable Ensemble-Based Detection Method Against Adversarial Attacks For Speaker VerificationHaibin Wu, Heng-Cheng Kuo, Yu Tsao 0001, Hung-yi Lee. 4670-4674 [doi]
- GI-PIP: Do We Require Impractical Auxiliary Dataset for Gradient Inversion Attacks?Yu Sun 0015, Gaojian Xiong, Xianxun Yao, Kailang Ma, Jian Cui. 4675-4679 [doi]
- NWS: Natural Textual Backdoor Attacks Via Word SubstitutionWei Du, Tongxin Yuan, Haodong Zhao, Gongshen Liu. 4680-4684 [doi]
- A One-Class Approach to Detect Super-Resolution Satellite Imagery with Spectral FeaturesEdoardo Daniele Cannas, P. Beaus, Paolo Bestagini, F. Marques, Stefano Tubaro. 4685-4689 [doi]
- Adapter-Based Incremental Learning for Face Forgery DetectionCaili Gao, Qisheng Xu, Peng Qiao, Kele Xu, Xifu Qian, Yong Dou. 4690-4694 [doi]
- Least-Effort Adversarial Attack Against Gait-Based Identity Recognition SystemJianmin Dong, Datian Peng, Taihao Li. 4695-4699 [doi]
- Invariant Motion Representation Learning for 3D Talking Face SynthesisJiyuan Liu, Wenping Wei, Zhendong Li, Guanfeng Li, Hao Liu. 4700-4704 [doi]
- Manticore: An Unsupervised Intrusion Detection System Based on Contrastive Learning in 5G NetworksLu Yuan, Jiyan Sun, Shangyuan Zhuang, YinLong Liu, Liru Geng, Jing Zou, Peizhe Xin, Weiqing Huang, Wei Ma. 4705-4709 [doi]
- CLPSD: Detecting Ethereum Phishing Scams based on Curriculum LearningWenhan Hou, Bo Cui, Yongxin Chen, Ru Li. 4710-4714 [doi]
- Boosting Adversarial Robustness Distillation Via Hybrid Decomposed KnowledgeYulun Wu, Mingrui Lao, Yanming Guo, DongMei Chen, Tianyuan Yu. 4715-4719 [doi]
- InvertedFontNet: Font Watermarking based on Perturbing Style ManifoldNan Sun, Chenxin Zhao, Sijing Xie, Hefei Ling. 4720-4724 [doi]
- Speaker Anonymization Using Neural Audio Codec Language ModelsMichele Panariello, Francesco Nespoli, Massimiliano Todisco, Nicholas W. D. Evans. 4725-4729 [doi]
- MMHSV: A Multimodal Handwritten Signature Verification Fusing Dynamic and Static FeatureQixiang Li, Zhaoya Wang, Lianwen Jin, Nurbiya Yadikar, Kurban Ubul. 4730-4734 [doi]
- Structure Matters: Analyzing Videos Via Graph Neural Networks for Social Media Platform AttributionAndrea Gemelli, Dasara Shullani, Daniele Baracchi, Simone Marinai, Alessandro Piva. 4735-4739 [doi]
- Interpretable Multimodal Out-of-Context Detection with Soft Logic RegularizationHuanhuan Ma, Jinghao Zhang, Qiang Liu, Shu Wu, Liang Wang. 4740-4744 [doi]
- LOFT: Latent Space Optimization and Generator Fine-Tuning for Defending Against DeepfakesShaoyou Zeng, Wenhao Wang, Fangjun Huang, Yanmei Fang. 4750-4754 [doi]
- FREmax: A Simple Method Towards Truly Secure Generative Linguistic SteganographyKaiyi Pang, Minhao Bai, Jinshuai Yang, Huili Wang, Minghu Jiang, Yongfeng Huang 0001. 4755-4759 [doi]
- Poisoning-Free Defense Against Black-Box Model ExtractionHaitian Zhang, Guang Hua 0001, Wen Yang 0001. 4760-4764 [doi]
- A Targeted Adversarial Attack Method for Multi-Classification Malicious Traffic DetectionPeishuai Sun, Chengxiang Si, Shuhao Li, Zhenyu Cheng 0001, Shuyuan Zhao, Qingyun Liu. 4765-4769 [doi]
- VL-FAS: Domain Generalization via Vision-Language Model For Face Anti-SpoofingHao Fang, Ajian Liu, Ning Jiang, Quan Lu, Guoqing Zhao, Jun Wan 0001. 4770-4774 [doi]
- SE-SIS: Shadow-Embeddable Lossless Secret Image Sharing for Greyscale ImagesZikai Xu, Bin Liu, Fei Hu, Weihai Li, Nenghai Yu. 4775-4779 [doi]
- Heterogeneous Face Recognition Using Domain Invariant UnitsAnjith George, Sébastien Marcel. 4780-4784 [doi]
- Voice Anonymization for All-Bias Evaluation of the Voice Privacy Challenge Baseline SystemsAnna Leschanowsky, Ünal Ege Gaznepoglu, Nils Peters. 4785-4789 [doi]
- A Codec-Based Approach for Video Life-Cycle Characterization in Social NetworksGiulia Bertazzini, Daniele Baracchi, Dasara Shullani, Massimo Iuliani, Alessandro Piva. 4790-4794 [doi]
- Uncertainty-Guided Person Search Model with Auxiliary Shallow Feature ExplorationZongyi Li, Zhongyang Li, Yuxuan Shi, Hefei Ling, Jiazhong Chen, Runsheng Wang, Ping Li. 4795-4799 [doi]
- A Fast, Performant, Secure Distributed Training Framework For LLMWei Huang, Yinggui Wang, Anda Cheng, Aihui Zhou, Chaofan Yu, Lei Wang. 4800-4804 [doi]
- Audio Transformer for Synthetic Speech Detection via Formant Magnitude and Phase AnalysisLuca Cuccovillo, Milica Gerhardt, Patrick Aichroth. 4805-4809 [doi]
- Noise Masking Attacks and Defenses for Pretrained Speech ModelsMatthew Jagielski, Om Thakkar 0001, Lun Wang 0001. 4810-4814 [doi]
- Functional Invariants To Watermark Large TransformersPierre Fernandez, Guillaume Couairon, Teddy Furon, Matthijs Douze. 4815-4819 [doi]
- Crypto-Mine: Cryptanalysis Via Mutual Information Neural EstimationBenjamin D. Kim, Vipindev Adat Vasudevan, Jongchan Woo, Alejandro Cohen, Rafael G. L. D'Oliveira, Thomas Stahlbuhk, Muriel Médard. 4820-4824 [doi]
- Innovative Methods for Non-Destructive Inspection of Handwritten DocumentsEleonora Breci, Luca Guarnera, Sebastiano Battiato. 4825-4829 [doi]
- Vulnerability of Face Age Verification to Replay AttacksPavel Korshunov, Anjith George, Gökhan Özbulak, Sébastien Marcel. 4830-4834 [doi]
- Gradient Inversion Attacks on Acoustic Signals: Revealing Security Risks in Audio Recognition SystemsPretom Roy Ovi, Aryya Gangopadhyay. 4835-4839 [doi]
- AdvTTS: Adversarial Text-to-Speech Synthesis Attack on Speaker Identification SystemsChu-Xiao Zuo, Zhi-Jun Jia, Wu-Jun Li. 4840-4844 [doi]
- Discovering Malicious Signatures in Software from Structural InteractionsChenzhong Yin, Hantang Zhang, Mingxi Cheng, Xiongye Xiao, Xinghe Chen, Xin Ren, Paul Bogdan. 4845-4849 [doi]
- Image Steganography with Deep Orthogonal Fusion of Multi-Scale Channel AttentionYinyin Peng, Donghui Hu, Gang Pei, Yaofei Wang. 4850-4854 [doi]
- GSTNet: Gait Spatio-Temporal Network for Gait Recognition Using Millimeter-Wave RadarQiuxia Wu, ZiCheng Wang, Kunming Su, Sangni Xu. 4855-4859 [doi]
- Universal Adversarial Attack Against Speaker Recognition ModelsShoham Hanina, Alon Zolfi, Yuval Elovici, Asaf Shabtai. 4860-4864 [doi]
- On the Privacy of Federated Clustering: a Cryptographic ViewQiongxiu Li, Lixia Luo. 4865-4869 [doi]
- DROPFL: Client Dropout Attacks Against Federated Learning Under Communication ConstraintsWenjun Qian, Qingni Shen, Haoran Xu, Xi Huang, Zhonghai Wu. 4870-4874 [doi]
- Detection and Attribution of Models Trained on Generated DataGe Han, Ahmed Salem, Zheng Li, Shanqing Guo, Michael Backes 0001, Yang Zhang. 4875-4879 [doi]
- MLMTD: A Multi-Layer Malicious Traffic Detection Model Based on Multi-Branch Octave Convolution and Attention MechanismZiang Li, Chengxiang Si, Zhenyu Cheng 0001, Shuyuan Zhao, Yong Ding. 4880-4884 [doi]
- Nebnet: Exploiting Node-Edge Bi-Level Network for Gene Expression PredictionCui Chen, Zuping Zhang, Panrui Tang, Junyu Zhang. 4885-4889 [doi]
- The Collaboration of 3D Convolutions and CRO-TSM in LipreadingYangzhao Xiang, Mutellip Mamut, Nurbiya Yadikar, Ghalipjan Ibrahim, Kurban Ubul. 4890-4894 [doi]
- Effective Image Tampering Localization Via Enhanced Transformer and Co-Attention FusionKun Guo, Haochen Zhu, Gang Cao. 4895-4899 [doi]
- Cross-Modality and Within-Modality Regularization for Audio-Visual Deepfake DetectionHeqing Zou, Meng Shen 0002, Yuchen Hu, Chen Chen 0075, Eng Siong Chng, Deepu Rajan. 4900-4904 [doi]
- HySense: Hybrid Event Occurrence Detection Method for IoT DevicesJian Ge, Jianwu Rui, Hengtai Ma, Bin Li, Yeping He. 4905-4909 [doi]
- Robust and Imperceptible Commercial Camera-Screen Communication with 60Hz Refresh RateHan Zhang, Xutao Yu, Zaichen Zhang, Bingcheng Zhu. 4910-4914 [doi]
- Domain Adaptive Graph ClassificationSiyang Luo, Ziyi Jiang, Zhenghan Chen, Xiaoxuan Liang 0002. 4915-4919 [doi]
- Deep Variational Privacy Funnel: General Modeling with Applications in Face RecognitionBehrooz Razeghi, Parsa Rahimi, Sébastien Marcel. 4920-4924 [doi]
- Causality-Inspired Single-Source Domain Generalization for Face Anti-SpoofingYan He, Fei Peng, Min Long, Kwok-Yan Lam. 4925-4929 [doi]
- Face Reconstruction from Partially Leaked Facial EmbeddingsHatef Otroshi-Shahreza, Sébastien Marcel. 4930-4934 [doi]
- Exploiting Modality-Specific Features for Multi-Modal Manipulation Detection and GroundingJiazhen Wang, Bin Liu 0016, Changtao Miao, Zhiwei Zhao, Wanyi Zhuang, Qi Chu 0001, Nenghai Yu. 4935-4939 [doi]
- Cost Aware Untargeted Poisoning Attack Against Graph Neural NetworksYuwei Han, Yuni Lai, Yulin Zhu, Kai Zhou 0001. 4940-4944 [doi]
- Enhancing Steganography of Generative Image Based on Image RetouchingYue Gao 0003, Jinshuai Yang, Cheng Chen, Kaiyi Pang, Yongfeng Huang 0001. 4945-4949 [doi]
- CPMSVD: Cross-Project Multiclass Software Vulnerability Detection Via Fused Deep Feature and Domain AdaptationGewangzi Du, Liwei Chen, Tongshuai Wu, Chenguang Zhu, Gang Shi. 4950-4954 [doi]
- Streaming Active Learning for Regression Problems Using Regression via ClassificationShota Horiguchi, Kota Dohi, Yohei Kawaguchi. 4955-4959 [doi]
- Seeking Similarities While Removing Differences: Graph Neural Networks Based on Node CorrelationShuangjie Li, Baoming Zhang, Jianqing Song, Yifan Xia, Junyuan Xie, Chongjun Wang. 4960-4964 [doi]
- Mutual Information-Based Fair Active LearningRyosuke Sonoda, Ramya Srinivasan 0002. 4965-4969 [doi]
- AutoFGNN: A Framework for Extracting All Frequency Information from Large-Scale GraphsQi Zhang, Yanfeng Sun, Jipeng Guo 0001, Shaofan Wang, Jinghua Li, Junbin Gao, Baocai Yin. 4970-4974 [doi]
- Audio-Aided Learning Framework for Image Classification with Limited Training ImagesQi Wu, Chengjia Wang, Xiaohui Li, Guangxing Wu, Marta Vallejo, Ruixuan Wang. 4975-4979 [doi]
- Unsupervised Continual Learning of Image Representation Via Rememory-Based SimsiamFeifei Fu, Yizhao Gao, Zhiwu Lu 0001, Haoran Wu, Shiqi Zhao. 4980-4984 [doi]
- Cross-Camera Human Motion Transfer by Time Series AnalysisYaping Zhao, Guanghan Li, Edmund Y. Lam. 4985-4989 [doi]
- Pmmwdeconv: Unsupervised Data-Consistent Blind Passive Millimeterwave Image Deconvolution with Global Context PriorsHao Yang, Ruochen Gu, Zihan Yang, Anyong Hu, Tie Jun Cui, Jungang Miao. 4990-4994 [doi]
- Stable Knowledge Transfer for Contrastive DistillationQiankun Tang. 4995-4999 [doi]
- Enhancing Performance of Coarsened Graphs with Gradient-MatchingWenjie Yang, Shengzhong Zhang, Zengfeng Huang. 5000-5004 [doi]
- DMEL: The Differentiable Log-Mel Spectrogram as a Trainable Layer in Neural NetworksJohn Martinsson, Maria Sandsten. 5005-5009 [doi]
- Offline Reinforcement Learning with Policy Guidance and Uncertainty EstimationLan Wu, Quan Liu, Lihua Zhang, Zhigang Huang. 5010-5014 [doi]
- J-MAE: Jigsaw Meets Masked Autoencoders in X-Ray Security InspectionWeichen Xu, Jian Cao, Tianhao Fu, Awen Bai, Ruilong Ren, Zicong Hu, Xixin Cao, Xing Zhang. 5015-5019 [doi]
- Structure-Aware in-Air Handwritten Text Recognition with Graph-Guided Cross-Modality TranslatorYuyan Chen, Xing Zhao, Ji Gan, Jiaxu Leng, Yan Zhang, Xinbo Gao 0001. 5020-5024 [doi]
- Local Distance Correlation Embedding for Time-Series Analysis on Riemannian ManifoldsLincon S. Souza, Takumi Kobayashi 0001, Yasunori Nishimori, Yasuko Sugase-Miyamoto, Kenji Kawano, Shotaro Akaho, Narihisa Matsumoto. 5025-5029 [doi]
- Dynamic Video Frame Interpolation with Integrated Difficulty Pre-AssessmentBan Chen, Xin Jin, Youxin Chen, Longhai Wu, Jie Chen, Jayoon Koo, Cheul-Hee Hahm. 5030-5034 [doi]
- Language-Guided Few-Shot Semantic SegmentationJing Wang, Yuang Liu, Qiang Zhou, Fan Wang. 5035-5039 [doi]
- Contrastive Learning with Bidirectional Transformers for Knowledge TracingHuijing Zhan, Jung-Jae Kim 0001, Guimei Liu. 5040-5044 [doi]
- Performance Conditioning for Diffusion-Based Multi-Instrument Music SynthesisBen Maman, Johannes Zeitler, Meinard Müller, Amit H. Bermano. 5045-5049 [doi]
- Flexible Keyword Spotting Based on Homogeneous Audio-Text EmbeddingKumari Nishu, Minsik Cho, Paul Dixon, Devang Naik. 5050-5054 [doi]
- Multi-Agent Exploration via Self-Learning and Social LearningShaokang Dong, Chao Li, Wubing Chen, Hongye Cao, Wenbin Li, Yang Gao. 5055-5059 [doi]
- DuNet: A Robust End-to-End Deep Neural Network Framework for Imbalanced ClassificationHaotian Zhang, Hong Qi. 5060-5064 [doi]
- 3TN: Multi-Gate Mixture-of-Experts Based Multi-Valued Treatment Network for Uplift ModelingZexu Sun, Xu Chen. 5065-5069 [doi]
- Boosting Pruned Networks with Linear Over-ParameterizationYu Qian, Xiaoshuang Li, Jian Cao, Jie Zhang, Hufei Li, Jue Chen. 5070-5074 [doi]
- A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater VideosCheng-Yen Yang, Hsiang-Wei Huang, Zhongyu Jiang, Hao Wang, Farron Wallace, Jenq-Neng Hwang. 5075-5079 [doi]
- Contrastive Learning for Regression on Hyperspectral DataMohamad Dhaini, Maxime Berar, Paul Honeine, Antonin Van Exem. 5080-5084 [doi]
- Representation Learning across Feature and Topology Views with Output Correction for Graph Convolutional NetworksShuhao Shi, Zhengyan Wang, Jian Chen 0025, Kai Qiao, Jie Yang 0053, Bin Yan 0002. 5085-5089 [doi]
- Learning Inference-Time Drift Sensor-Actuator for Domain GeneralizationShuoshuo Chen, Yushun Tang, Zhehan Kan, Zhihai He. 5090-5094 [doi]
- CLT: Cooperative Lottery Ticket Hypothesis in Live Streaming Sales PredictionLijun Wang. 5095-5099 [doi]
- Efficient Posenet with Coarse to Fine TransformerShaohua Li, Haixiang Zhang, Hanjie Ma, Jie Feng 0010, Mingfeng Jiang. 5100-5104 [doi]
- SweepMM: A High-Quality Multimodal Dataset for Sweeping Robots in Home Scenarios for Vision-Language ModelWeichen Xu, Xinxin Xu, Tianhao Fu, Jian Cao, Xiaoyang Xu, Yuetian Huang, Xixin Cao, Xing Zhang. 5105-5109 [doi]
- Mitigating Optimization Conflict in Domain Adversarial Neural Network via Uncertainty-AwareZhiqun Pan, Yongxiong Wang, Jiapeng Zhang, Xiaoming Wang, Guangpeng Wang. 5110-5114 [doi]
- ProbMCL: Simple Probabilistic Contrastive Learning for Multi-Label Visual ClassificationAhmad Sajedi, Samir Khaki, Yuri A. Lawryshyn, Konstantinos N. Plataniotis. 5115-5119 [doi]
- Trend-Heuristic Reinforcement Learning Framework for News-Oriented Stock Portfolio ManagementWei Ding, Zhennan Chen, Hanpeng Jiang, Yuanguo Lin, Fan Lin. 5120-5124 [doi]
- Learning Invariant Representation with Consistency and Diversity for Semi-Supervised Source Hypothesis TransferXiaodong Wang, Junbao Zhuo, Shuhao Cui, Shuhui Wang, Yuejian Fang. 5125-5129 [doi]
- Diffusion-Based Pose Refinement and Multi-Hypothesis Generation for 3D Human Pose EstimationHongbo Kang, Yong Wang, Mengyuan Liu, Doudou Wu, Peng Liu, Xinlin Yuan, Wenming Yang. 5130-5134 [doi]
- Credible Teacher for Semi-Supervised Object Detection in Open SceneJingyu Zhuang, Kuo Wang, Liang Lin, Guanbin Li. 5135-5139 [doi]
- Federated Learning on Distributed Graphs Considering Multiple HeterogeneitiesBaiqi Li, Yedi Ma, Yufei Liu, Hongyan Gu, Zhenghan Chen, Xinli Huang. 5140-5144 [doi]
- Real-Time Multi-Human Parsing on Embedded DevicesRockson Agyeman, Bernhard Rinner. 5145-5149 [doi]
- G2G: Generalized Learning by Cross-Domain Knowledge Transfer for Federated Domain GeneralizationXinqian Chen, Jin Zhang, Xiaoli Gong. 5150-5154 [doi]
- Search Robust and Adaptable ArchitectureRuicheng Niu, Ziyuan Zhu, Chaofei Li, Dan Meng. 5155-5159 [doi]
- DMT: Comprehensive Distillation with Multiple Self-Supervised TeachersYuang Liu, Jing Wang, Qiang Zhou, Fan Wang, Jun Wang, Wei Zhang. 5160-5164 [doi]
- Neural Network Training Strategy To Enhance Anomaly Detection Performance: A Perspective On Reconstruction Loss AmplificationYeongHyeon Park, Sungho Kang 0002, Myung-Jin Kim, Hyeonho Jeong, Hyunkyu Park 0003, Hyeong-Seok Kim, Juneho Yi. 5165-5169 [doi]
- A Stochastic Gradient Approach for Communication Efficient Confederated LearningBin Wang, Jun Fang, Hongbin Li 0001, Yonina C. Eldar. 5170-5174 [doi]
- GPTCN: Gated Parallel Transformer Convolutional Networks for Downstream-Task User Representation Learning on App UsageYingjie Sun, Fanrui Zeng, Jiamin Xiao, Yuxiao Deng, Yifan Ding, Yizhou Li. 5175-5179 [doi]
- SPASE: Spatial Saliency Explanation For Time Series ModelsPranay Lohia, Badri Narayana Patro, Naveen Panwar, Vijay Agneeswaran. 5180-5184 [doi]
- Self-Knowledge Distillation with Learning from Role-Model SamplesKai Xu 0012, Lichun Wang 0002, Huiyong Zhang, Baocai Yin. 5185-5189 [doi]
- Spatio-Temporal Data Mining with Information Integrity Protection: Graph Signal Based Air Quality PredictionJuncheng Jin, Junhao Zhang, Junjie Tang, Shengrui Liang, Zehui Qu. 5190-5194 [doi]
- Meta Structure Search for Link Weight Prediction in Heterogeneous GraphsXiaoou Zhang, Yang Gao, Yang Liu, Yujia Zhu, Peng Zhang, Chuan Zhou 0001, Qingyun Liu, Hongyang Chen. 5195-5199 [doi]
- DeformMLP: Dynamic Large-Scale Receptive Field MLP Networks for Human Motion PredictionHaitao Huang, Chi-Man Pun, Haolun Li, Mengqi Liu, Jian Xiong 0005, Hao Gao 0005. 5200-5204 [doi]
- Enhanced Unsupervised Domain Adaptation with Dual-Attention Between Classification and Domain AlignmentYifan Pan, Guibo Luo, Bairong Li, Yuesheng Zhu. 5205-5209 [doi]
- AdaPlus: Integrating Nesterov Momentum and Precise Stepsize Adjustment on Adamw BasisLei Guan. 5210-5214 [doi]
- Segmented Error Minimisation (Semi) for Robust Training of Deep Learning Models with Non-Linear Shifts in Reference DataHarry J. Davies, Yuyang Miao, Amir Nassibi, Morteza Khaleghimeybodi, Danilo P. Mandic. 5215-5219 [doi]
- Meta-Learning With Versatile Loss Geometries for Fast Adaptation Using Mirror DescentYilang Zhang, Bingcong Li, Georgios B. Giannakis. 5220-5224 [doi]
- Sequential Acquisition of Features and Experts for Datum-Wise ClassificationSachini Piyoni Ekanayake, Daphney-Stavroula Zois. 5225-5229 [doi]
- Edge Attention Learning for Efficient Camouflaged Object DetectionZijian Liu, Ping Jiang, Lixin Lin, Xiaoheng Deng. 5230-5234 [doi]
- DURRNET: Deep Unfolded Single Image Reflection Removal Network with Joint PriorJun-Jie Huang, Tianrui Liu, Jingyuan Xia, Meng Wang, Pier Luigi Dragotti. 5235-5239 [doi]
- Token-Based Spatiotemporal Representation of the EventsBin Jiang, Zhihao Li, M. Salman Asif, Xun Cao, Zhan Ma. 5240-5244 [doi]
- Dynamic Frequency Domain Graph Convolutional Network for Traffic ForecastingYujie Li, Zezhi Shao, Yongjun Xu, Qiang Qiu, Zhaogang Cao, Fei Wang. 5245-5249 [doi]
- Improving Cross-Domain Few-Shot Classification with Multilayer PerceptronShuanghao Bai, Wanqi Zhou, Zhirong Luan, Donglin Wang, Badong Chen. 5250-5254 [doi]
- Offline Reinforcement Learning with Generative Adversarial Networks and Uncertainty EstimationLan Wu, Quan Liu, Lihua Zhang, Zhigang Huang. 5255-5259 [doi]
- Interpretable Face Aging: Enhancing Conditional Adversarial Autoencoders with Lime ExplanationsChristos Korgialas, Evangelia Pantraki, Constantine Kotropoulos. 5260-5264 [doi]
- Touring Sampling With Pushforward MapsVivien Cabannes, Charles Arnal. 5265-5269 [doi]
- Enhancing Multi-Task Models For Recommendation with Tensor Trace NormBoqi Dai, Kai Ouyang, Jun Yuan, Miaoxin Chen, Xingyu Lu, Weiwen Liu, Rui Zhang, Hai-Tao Zheng 0002. 5270-5274 [doi]
- LaCViT: A Label-Aware Contrastive Fine-Tuning Framework for Vision TransformersZijun Long, Richard McCreadie, Gerardo Aragon-Camarasa, Zaiqiao Meng. 5275-5279 [doi]
- Dynamic Model Structure Adjustment to Realize Quantum Continual Learning Based on Quantum DataHailiang Xu, Haozhen Situ. 5280-5284 [doi]
- Adaptive Multi-Armed Bandit Learning for Task Offloading in Mobile Edge ComputingLin Wang, Jingjing Zhang. 5285-5289 [doi]
- Enhancing Generalization Of Invisible Facial Privacy Cloak Via Gradient AccumulationXuannan Liu, Yaoyao Zhong, Weihong Deng, Hongzhi Shi, Xingchen Cui, Yunfeng Yin, Dongchao Wen. 5290-5294 [doi]
- Modeling Route Representation With Mixed-Scale Hierarchical TransformerHanyuan Zhang, Yuqi Chen 0018, Xinyu Zhang, Qize Jiang, Liang Li, Baihua Zheng, Weiwei Sun. 5295-5299 [doi]
- Randomized Maximum Likelihood Via High-Dimensional Bayesian OptimizationValentin Breaz, Richard Wilkinson. 5300-5304 [doi]
- Deep Neural Network Models Trained with a Fixed Random Classifier Transfer Better Across DomainsHafiz Tiomoko Ali, Umberto Michieli, Ji Joong Moon, Daehyun Kim, Mete Ozay. 5305-5309 [doi]
- Multi-Source DOA Estimation With Statistical Coverage GuaranteesIshan D. Khurjekar, Peter Gerstoft. 5310-5314 [doi]
- On the Open Prompt Challenge in Conditional Audio GenerationErnie Chang, Sidd Srinivasan, Mahi Luthra, Pin-Jie Lin, Varun Nagaraja, Forrest N. Iandola, Zechun Liu, Zhaoheng Ni, Changsheng Zhao 0002, Yangyang Shi, Vikas Chandra. 5315-5319 [doi]
- In-Context Prompt Editing for Conditional Audio GenerationErnie Chang, Pin-Jie Lin, Yang Li 0183, Sidd Srinivasan, Gaël Le Lan, David Kant, Yangyang Shi, Forrest N. Iandola, Vikas Chandra. 5320-5324 [doi]
- Synchformer: Efficient Synchronization From Sparse CuesVladimir Iashin, Weidi Xie, Esa Rahtu, Andrew Zisserman. 5325-5329 [doi]
- Learning Active Subspaces for Effective and Scalable Uncertainty Quantification in Deep Neural NetworksSanket R. Jantre, Nathan M. Urban, Xiaoning Qian, Byung-Jun Yoon. 5330-5334 [doi]
- Multivariate Fourier Distribution Perturbation: Domain Shifts with Uncertainty in Frequency DomainXianfeng Li, Weijie Chen, Shicai Yang, Yishuang Li, Wenhao Guan, Lin Li. 5335-5339 [doi]
- A Probability Gradient Based Approach for Sampling Boundaries of In-Domain DataMiao Jing, Vidhyasaharan Sethu, Beena Ahmed. 5340-5344 [doi]
- BPDO: Boundary Points Dynamic Optimization for Arbitrary Shape Scene Text DetectionJinzhi Zheng, Libo Zhang 0001, Yanjun Wu, Chen Zhao. 5345-5349 [doi]
- RCIF: Towards Robust Distributed DNN Collaborative Inference Under Highly Lossy NetworksYujun Cheng, Zhewei Zhang, Shengjin Wang. 5350-5354 [doi]
- View Crafting For Instance-Level Representation from Scene ImagesBin Liu, YuChen Luo, Shaofeng Zhang, Zehuan Yuan, Changdong Xu, Boan Chen, Junchi Yan. 5355-5359 [doi]
- Social Lode: Human Trajectory Prediction with Latent OdesKexin Ke, Jian Yang, Yingjie Liu, Mingsong Chen, Xian Wei, Xuan Tang. 5360-5364 [doi]
- Towards a Unified View of Adversarial Training: A Contrastive PerspectiveJen-Tzung Chien, Yuan-An Chen. 5365-5369 [doi]
- FDIG: A Fine-Grained Data Integration Approach for Group RecommendationQiwei Ye, Fan Yang, Jing Lu, Yu Tang, Linbo Qiao, Yunwei Zhao. 5370-5374 [doi]
- Active Explainable Recommendation with Limited Labeling BudgetsJingsen Zhang, Xiaohe Bo, Chenxi Wang, Quanyu Dai, Zhenhua Dong, Ruiming Tang, Xu Chen. 5375-5379 [doi]
- Image Mixing and Gradient Smoothing to Enhance the SAR Image Attack TransferabilityYue Xu, Xin Liu, Kun He, Shao Huang, Yaodong Zhao, Jie Gu. 5380-5384 [doi]
- Learning Generalizable Visual Representations via Self-Supervised Information BottleneckXin Liu, Yali Li, Shengjin Wang. 5385-5389 [doi]
- Joint Classification of Hyperspectral and Lidar Data Using Cross-Modal Hierarchical Frequency Fusion NetworkZheng Zeng, Tiecheng Song, Xinran Ma, Yinghao Jiu, Huaiyi Sun. 5390-5394 [doi]
- A Sequential Averaging Plug-and-Play Method for Image Restoration Via Fixed-Point ProjectionShuchang Zhang, Hongxia Wang. 5395-5399 [doi]
- COPHTC: Contrastive Learning with Prompt Tuning for Hierarchical Text ClassificationFuhan Cai, Zhongqiang Zhang, Duo Liu, Xiangzhong Fang. 5400-5404 [doi]
- Prompting to Prompt for Rehearsal-Free Class Incremental LearningGuangzhi Zhao, Yuting Hou, Kedian Mu. 5405-5409 [doi]
- Enhancing Short- and Long-Term Sea Surface Temperature Forecasting with a Static and Dynamic Learnable Personalized Graph Convolution NetworkXiaohan Li, Zhaofeng He, Kai Huang, Zhibo Yang, Gaowei Zhang. 5410-5414 [doi]
- Unravel Anomalies: an End-to-End Seasonal-Trend Decomposition Approach for Time Series Anomaly DetectionZhenwei Zhang, Ruiqi Wang, Ran Ding, Yuantao Gu. 5415-5419 [doi]
- Decoupled Self-Adaptive Distribution Regularization for Few-Shot Image ClassificationBingzhi Chen, Haoming Zhou, Yishu Liu, Biqing Zeng, Guangming Lu, Zheng Zhang 0006. 5420-5424 [doi]
- Exploiting Spatial-Temporal Data for Sleep Stage Classification via Hypergraph LearningYuze Liu, Ziming Zhao 0010, Tiehua Zhang, Kang Wang, Xin Chen, Xiaowei Huang, Jun Yin, Zhishu Shen. 5430-5434 [doi]
- uaMix-MAE: Efficient Tuning of Pretrained Audio Transformers with Unsupervised Audio MixturesAfrina Tabassum, Dung N. Tran, Trung Dang, Ismini Lourentzou, Kazuhito Koishida. 5435-5439 [doi]
- Spatial-Temporal Interaction Decoding Transformer for Unsupervised Multivariate Time Series Anomaly DetectionSonglin Yang, Jing Li, Kuanzhi Shi, Yu Chen, Yunlong Zhu, Xudong He, Jinlong Wu, Chenling Pan. 5440-5444 [doi]
- Signal Transformer: Complex-Valued Attention and Meta-Learning for Signal RecognitionYing Peng, Yihong Dong, Muqiao Yang, Songtao Lu, Qingjiang Shi. 5445-5449 [doi]
- Multi-Signal Fusion of Social Diffusion Graph with Bi-Directional Semantic ConsistencyHuacheng Li, Chunhe Xia, Tianbo Wang, Wanshuang Lin, Changnan Jiang, Chen Chen 0098, Yuan Zhao. 5450-5454 [doi]
- Inputmix: A Strategy to Regularize and Balance Multi-Modality and Multi-View Model LearningJunjie Wang, Tomas Nordström. 5455-5459 [doi]
- Hearing Loss Detection From Facial Expressions in One-On-One ConversationsYufeng Yin 0002, Ishwarya Ananthabhotla, Vamsi Krishna Ithapu, Stavros Petridis, Yu-Hsiang Wu, Christi Miller. 5460-5464 [doi]
- PolarDB: Formula-Driven Dataset for Pre-Training Trajectory EncodersSota Miyamoto, Takuma Yagi, Yuto Makimoto, Mahiro Ukai, Yoshitaka Ushiku, Atsushi Hashimoto 0001, Nakamasa Inoue. 5465-5469 [doi]
- Enhancing the Domain Robustness of Self-Supervised Pre-Training with Synthetic ImagesMohamad Hassan N. C, Avigyan Bhattacharya, Victor G. Turrisi da Costa, Biplab Banerjee, Elisa Ricci 0001. 5470-5474 [doi]
- MERG: Multi-Dimensional Edge Representation Generation Layer for Graph Neural NetworksYuxin Song, Cheng Luo, Aaron Jackson, Xi Jia, Weicheng Xie 0001, LinLin Shen, Hatice Gunes, Siyang Song. 5475-5479 [doi]
- One-Stage Training Generative Paradigm for Generalized Zero-Shot LearningShiran Bian, Xiaofan Li, Yachao Zhang, Jiayong Zhong, Yanyun Qu. 5480-5484 [doi]
- Maximal Coding Rate Reduction for Graph EmbeddingsZhengyang Chi, Junbin Gao. 5485-5489 [doi]
- Bi-Directional Motion Attention with Contrastive Learning for Few-Shot Action RecognitionHanyu Guo, Wanchuan Yu, Yan Yan 0001, Hanzi Wang. 5490-5494 [doi]
- Trusted Deep Domain Adaptation with Uncertainty Measure Based on Evidence TheoryYing Lv, Jianpeng Ma, Qilin Li, Gang Xu. 5495-5499 [doi]
- Defending against Clean-Image Backdoor Attack in Multi-Label ClassificationCheng-Yi Lee, Cheng-Chang Tsai, Ching-Chia Kao, Chun-Shien Lu, Chia-Mu Yu. 5500-5504 [doi]
- Dataset Distillation with Channel Efficient ProcessWenbo Zhou, Guoqing Zheng, Xinghao Ding. 5505-5509 [doi]
- Visually Dehallucinative Instruction GenerationSungguk Cha, Jusung Lee, Younghyun Lee, Cheoljong Yang. 5510-5514 [doi]
- Towards Resource-Efficient and Secure Federated Multimedia RecommendationGuohui Li 0001, Xuanang Ding, Ling Yuan, Lu Zhang, Qian Rong. 5515-5519 [doi]
- Multi-Teacher Distillation for Incremental Object DetectionLe Jiang, Hongqiang Cheng, Xiaozhou Ye, Ye Ouyang. 5520-5524 [doi]
- Enhancing Adversarial Transferability in Object Detection with Bidirectional Feature DistortionXinlong Ding, Jiansheng Chen, Hongwei Yu, Yu Shang, Huimin Ma 0001. 5525-5529 [doi]
- From Convolutional Sparse Coding To *-NMF Factorization of Time-Frequency CoefficientsJean-Baptiste Malagnoux, Matthieu Kowalski. 5530-5534 [doi]
- Diffusion Optimistic Learning for Min-Max OptimizationH. Cai, Sulaiman A. Alghunaim, Ali H. Sayed. 5535-5539 [doi]
- Hierarchical VAE Based Semantic Communications for POMDP TasksDezhao Chen, Wenhui Hua. 5540-5544 [doi]
- Hypergraph-Enhanced Self-Supervised Robust Graph Learning for Social RecommendationShiwei Liu, Yong Xu, Siliang Ma. 5545-5549 [doi]
- Anomaly Detection from a Frequency Perspective: M-Band Wavelet Packet Anomaly Detection NetworkZuogang Shang, Zhibin Zhao, Shibin Wang, Ruqiang Yan 0001. 5550-5554 [doi]
- FastGAT: Simple and Efficient Graph Attention Neural Network with Global-Aware Adaptive Computational Node AttentionShenzhi Yang, Li Zhang, Xiaofang Zhang. 5555-5559 [doi]
- Urban Traffic Flow Forecasting Based on Spatial-Temporal Graph Contrastive LearningLin Pan, Qianqian Ren. 5560-5564 [doi]
- Interpreting Memorization in Deep Learning from Data DistributionLikun Zhang, Jingwei Sun, Shoukun Guo, FengHua Li, Jin Cao, Ben Niu 0001. 5565-5569 [doi]
- Joint INDSCAL Decomposition Meets Blind Source SeparationLe Trung Thanh, Karim Abed-Meraim, Philippe Ravier, Olivier Buttelli, Ales Holobar. 5570-5574 [doi]
- Tensorial Convolutive Blind Source SeparationLe Trung Thanh, Karim Abed-Meraim, Philippe Ravier, Olivier Buttelli, Ales Holobar. 5575-5579 [doi]
- Local and Global Feature Adaptive Adjustment Network for Remote Sensing Image Scene ClassificationFeng Cao, Chang Liu, Deyu Li, Yuhua Qian, Chao Zhang 0046, Hu Zhang. 5580-5584 [doi]
- IRLSG: Invariant Representation Learning for Single-Domain Generalization in Medical Image SegmentationZiwei Niu, Hao Sun 0013, Shuyi Ouyang, Shiao Xie, Yen-Wei Chen 0001, Ruofeng Tong 0001, Lanfen Lin. 5585-5589 [doi]
- Federated CINN Clustering for Accurate Clustered Federated LearningYuhao Zhou, Minjia Shi, Yuxin Tian, Yuanxi Li, Qing Ye, Jiancheng Lv 0001. 5590-5594 [doi]
- Micro-Expression Recognition by Fusing Action Unit Detection and Spatio-Temporal FeaturesLei Wang 0017, Pinyi Huang, Wangyang Cai, Xiyao Liu 0001. 5595-5599 [doi]
- MEAT: Median-Ensemble Adversarial Training for Improving Robustness and GeneralizationZhaozhe Hu, Jia-Li Yin, Bin Chen, Luojun Lin, Bo-Hao Chen, Ximeng Liu. 5600-5604 [doi]
- SIMMKD: Simple Mask-Flow Keypoint Detection for Both Typhoon Detection and Typhoon Eye LocationYunling Feng, Yang Lei, Xinjie Yang, Jian Xu, Xingxian Liu, Bo Xiao, Yajing Xu. 5605-5609 [doi]
- Federated Dataset Dictionary Learning for Multi-Source Domain AdaptationFabiola Espinoza Castellon, Eduardo Fernandes Montesuma, Fred Maurice Ngolè Mboula, Aurélien Mayoue, Antoine Souloumiac, Cédric Gouy-Pailler. 5610-5614 [doi]
- Beyond Empirical Windowing: An Attention-Based Approach for Trust Prediction In Autonomous VehiclesMinxue Niu, Zhaobo K. Zheng, Kumar Akash, Teruhisa Misu. 5615-5619 [doi]
- Multi-Source Domain Adaptation Meets Dataset Distillation through Dataset Dictionary LearningEduardo Fernandes Montesuma, Fred Maurice Ngolè Mboula, Antoine Souloumiac. 5620-5624 [doi]
- DCS: Debiased Contrastive Learning with Weak Supervision for Time Series ClassificationRongyao Cai, Linpeng Peng, Zhengming Lu, Kexin Zhang, Yong Liu. 5625-5629 [doi]
- A Contrario Paradigm for Yolo-Based Infrared Small Target DetectionAlina Ciocarlan, Sylvie Le Hégarat-Mascle, Sidonie Lefebvre, Arnaud Woiselle, Clara Barbanson. 5630-5634 [doi]
- Retaining Informative Latent Variables in Probabilistic SegmentationM. M. Amaan Valiuddin, Christiaan G. A. Viviers, Ruud van Sloun, Peter H. N. de With, Fons van der Sommen. 5635-5639 [doi]
- Domaindiff: Boost Out-of-Distribution Generalization with Synthetic DataQiaowei Miao, Junkun Yuan, Shengyu Zhang 0001, Fei Wu 0001, Kun Kuang. 5640-5644 [doi]
- Regularized Conditional Alignment for Multi-Domain Text ClassificationJuntao Hu, Yuan Wu. 5645-5649 [doi]
- Modality Re-Balance for Visual Question Answering: A Causal FrameworkXinpeng Lv, Wanrong Huang, Haotian Wang 0001, Ruochun Jin, Xueqiong Li, Zhipeng Lin, Shuman Li, Yongquan Feng, Yuhua Tang. 5650-5654 [doi]
- Understanding Probe Behaviors Through Variational Bounds of Mutual InformationKwangHee Choi, Jee-weon Jung, Shinji Watanabe 0001. 5655-5659 [doi]
- EC-NAS: Energy Consumption Aware Tabular Benchmarks for Neural Architecture SearchPedram Bakhtiarifard, Christian Igel, Raghavendra Selvan. 5660-5664 [doi]
- FAVANO: Federated Averaging with Asynchronous NodesLouis Leconte, Van Minh Nguyen, Eric Moulines. 5665-5669 [doi]
- Robustness Against Adversarial Attacks Via Learning Confined Adversarial PolytopesShayan Mohajer Hamidi, Linfeng Ye. 5670-5674 [doi]
- Multilingual Transliteration for Pan-Indic Keyboard InputJerome R. Bellegarda. 5675-5679 [doi]
- Wavelet-Inspired Multiscale Graph Convolutional Recurrent Network for Traffic ForecastingQipeng Qian, Tanwi Mallick. 5680-5684 [doi]
- Probabilistic Spike Train InferenceAbhisek Chakraborty. 5685-5689 [doi]
- SPCL-MER: Supervised Prototypical Contrastive Learning for Micro-Expression RecognitionXiqiao Fang, Qingfeng Wu, Lu Cao. 5690-5694 [doi]
- Viewing Writing as Video: Optical Flow based Multi-Modal Handwritten Mathematical Expression RecognitionHanbo Cheng, Jun Du, Pengfei Hu 0006, Jiefeng Ma, Zhenrong Zhang, Mobai Xue. 5695-5699 [doi]
- Continuous Review and Timely Correction: Enhancing the Resistance to Noisy Labels via Self-Not-True DistillationJingyi Wang, Da Huang, Xinghao Wu, Yuhua Tang, Long Lan. 5700-5704 [doi]
- Cubic Knowledge Distillation for Speech Emotion RecognitionZhibo Lou, Shinta Otake, Zhengxiao Li, Rei Kawakami, Nakamasa Inoue. 5705-5709 [doi]
- LEFormer: A Hybrid CNN-Transformer Architecture for Accurate Lake Extraction from Remote Sensing ImageryBen Chen, Xuechao Zou, Yu Zhang 0165, Jiayu Li, Kai Li 0022, Junliang Xing, Pin Tao. 5710-5714 [doi]
- A Fine-Grained Tri-Modal Interaction Model for Multimodal Sentiment AnalysisYuxing Zhi, Junhuai Li, Huaijun Wang, Jing Chen, Ting Cao. 5715-5719 [doi]
- The Selectivity and Competition of the Mind's Eye in Visual PerceptionEdward Kim 0006, Maryam Daniali, Jocelyn Rego, Garrett T. Kenyon. 5720-5724 [doi]
- FDNet: A Novel Multivariate Time Series Classification Model Through Fusing Feature and DifferenceFei Gao 0014, Luofeng Zhang, Yuanming Zhang. 5725-5729 [doi]
- Scalable Model-Based Gaussian Process ClusteringAnirban Chakraborty 0008, Abhisek Chakraborty. 5730-5734 [doi]
- Multivariate Time Series Forecasting with Causal-Temporal Attention NetworkWenbo Liu, Yifan He, Jihong Guan, Shuigeng Zhou. 5735-5739 [doi]
- ESA: Expert-and-Samples-Aware Incremental Learning Under Longtail DistributionJie Mei, Jenq-Neng Hwang. 5740-5744 [doi]
- TCNAS: Transformer Architecture Evolving in Code Clone DetectionHongyan Xu, Xiaohuan Pei, Xiu Su, Shan You, Chang Xu. 5745-5749 [doi]
- FincGAN: A Gan Framework of Imbalanced Node Classification on Heterogeneous Graph Neural NetworkHung-Chun Hsu, Ting-Le Lin, Bo-Jun Wu, Ming-Yi Hong, Che Lin, Chih-Yu Wang 0001. 5750-5754 [doi]
- Enhancing Audio-Visual Question Answering with Missing Modality via Trans-Modal Associative LearningKyu Ri Park, Youngmin Oh, Jung-Uk Kim. 5755-5759 [doi]
- Autonomous Generative Feature Replay for Non-Exemplar Class-Incremental LearningYinjie Zhang, Ming Shao, Wenlong Shi, Haifeng Xia, Siyu Xia. 5760-5764 [doi]
- MVITP: Multi-View Image-Text Perception for Few-Shot Remote Sensing Image ClassificationChen Yang, Tongtong Liu, Didi Jiao, Wenhui Li 0002. 5765-5769 [doi]
- Similarity Knowledge Distillation with Calibrated MaskQi Wang, Wenxin Yu, Lu Che, Chang Liu, Zhiqiang Zhang, Jun Gong, Peng Chen. 5770-5774 [doi]
- Offline Reinforcement Learning Based on Next State SupervisionJie Yan, Quan Liu, Lihua Zhang. 5775-5779 [doi]
- A Novel Contrastive Diffusion Graph Convolutional Network for Few-Shot Skeleton-Based Action RecognitionChao Wei, Zhidong Deng. 5780-5784 [doi]
- 1-D Spatial Attention in Binarized Convolutional Neural NetworksHyunjin Kim 0001, Jungwoo Shin, Wansoo Kim, Alberto A. Del Barrio. 5785-5789 [doi]
- Causally Uncovering Bias in Video Micro-Expression RecognitionPei-Sze Tan, Sailaja Rajanala, Arghya Pal, Shu-Min Leong, Raphaël C.-W. Phan, Huey Fang Ong. 5790-5794 [doi]
- TRET: Two Stream-Based Regionally Enhanced Transformers for Person Re-IdentificationKyoungoh Lee, Kwang-Ju Kim, Pyong-Kun Kim, In-Su Jang. 5795-5799 [doi]
- On The Equivalence Of Dynamic Mode Decomposition And Complex Nonnegative Matrix FactorizationMasahiro Kohjima. 5800-5804 [doi]
- Communication-Efficient Federated Learning Through Adaptive Weight Clustering And Server-Side DistillationVasileios Tsouvalas, Aaqib Saeed, Tanir Ozcelebi, Nirvana Meratnia. 5805-5809 [doi]
- DBS: Differentiable Budget-Aware Searching For Channel PruningZhaokai Zhang, Tianpeng Feng, Yang Liu, Chunnan Sheng, Fanyi Wang, He Cai. 5810-5814 [doi]
- Deformation And Penetration Hybrid Detection-Net For Parcels Inspection In Industrial Supply ChainZhi Chen, Cuifeng Du, Xiujie Huang, Zelong Lin, Yuyu Zhou, Quanlong Guan, Zhefu Li, Shuanghuan Lv, Xiaofeng Wu, Xiaotian Zhuang. 5815-5819 [doi]
- Smooth Start: A Unified Approach for Gradual Transition from Cold to Old in Recommender SystemsJianwen Yang, Xiao Zhang, Jun Xu. 5820-5824 [doi]
- Learning With Non-Uniform Label Noise: A Cluster-Dependent Weakly Supervised ApproachMengtian Zhang, Bo Jiang 0003, Yuye Ling, Xinbing Wang. 5825-5829 [doi]
- Segment Anything Model Guided Semantic Knowledge Learning For Remote Sensing Change DetectionZixuan Sun, Huihui Song, Kaihua Zhang, Gang Dong, Lingyan Liang, Yaqian Zhao. 5830-5834 [doi]
- GLMAE: Graph Representation Learning Method Combining Generative Learning and Masking AutoencoderYunfeng Xu, Shaohui Zhao, Hexun Fan, Jialin Wang. 5835-5839 [doi]
- A Reconstruction-Based Feature Adaptation for Anomaly Detection with Self-Supervised Multi-Scale AggregationZuo Zuo, Zongze Wu 0001, Badong Chen, Xiaopin Zhong. 5840-5844 [doi]
- Forecasting Torsional Resonance in Electric Vehicles by Learning a Quantile RegressorFusataka Kuniyoshi, Inazumi Masanobu, Toshiyuki Koga. 5845-5849 [doi]
- Power-Aware Task-Based Learning of Neuromorphic ADCsTal Vol, Loai Danial, Nir Shlezinger. 5850-5854 [doi]
- Proximal Bellman Mappings for Reinforcement Learning and Their Application to Robust Adaptive FilteringYuki Akiyama, Konstantinos Slavakis. 5855-5859 [doi]
- Adaptive Multi-View Joint Contrastive Learning on GraphsLong Chen, Qianqian Ren, Zilong Li, Hui Xu. 5860-5864 [doi]
- Conversation Clique-Based Model for Emotion Recognition In ConversationZhongQuan Jian, Jiajian Li, Junfeng Yao, Meihong Wang, Qingqiang Wu 0001. 5865-5869 [doi]
- A Learning Resource Recommendation Algorithm Based on Online Learning BehaviorHaoxin Xu, Bihao Hu, Xiaoqing Gu, Longwei Zheng. 5870-5874 [doi]
- Noise-Disentangled Graph Contrastive Learning via Low-Rank and Sparse Subspace DecompositionGehang Zhang, Jiawei Sheng, Shicheng Wang, Tingwen Liu. 5880-5884 [doi]
- Learning a Low-Rank Feature Representation: Achieving Better Trade-Off Between Stability and Plasticity in Continual LearningZhenrong Liu, Yang Li, Yi Gong 0001, Yik-Chung Wu. 5885-5889 [doi]
- Multi-Agent Sparse Interaction Modeling is an Anomaly Detection ProblemChao Li, Shaokang Dong, Shangdong Yang, Hongye Cao, Wenbin Li, Yang Gao. 5890-5894 [doi]
- Quantum Topic Model: Topic Modeling Using Variational Quantum CircuitsWenbo Qiao, Peng Zhang, Jiaming Zhao, Chang Yang. 5895-5899 [doi]
- Asformer: Learning From Adjacent ScaleHanpeng Jiang, Zhennan Chen, Wei Ding, Fan Lin. 5900-5904 [doi]
- Learning from Easy to Hard: Multi-Task Learning with Data SchedulingZeyu Liu, Heyan Chai, Qing Liao 0001. 5905-5909 [doi]
- SSTA: Salient Spatially Transformed AttackRenYang Liu, Wei Zhou 0011, Sixing Wu, Jun Zhao 0007, Kwok-Yan Lam. 5910-5914 [doi]
- Dynamic Replay Training for Class-Incremental LearningYan Yang, Dongdong Ren, Chenglei Peng, Jing Huo, Wenbin Li, Yang Gao. 5915-5919 [doi]
- Long-Term Action Anticipation Based on Contextual AlignmentConstantin Patsch, Jinghan Zhang, Yuankai Wu, Marsil Zakour, Driton Salihu, Eckehard G. Steinbach. 5920-5924 [doi]
- Distill Vision Transformers to CNNs via Teacher CollaborationSunqi Lin, Chong Wang, Yujie Zheng, Chenchen Tao, Xinmiao Dai, Yuqi Li. 5925-5929 [doi]
- Refinement Bird's Eye View Feature for 3D Lane Detection with Dual-Branch View Transformation ModuleHao Ren, Mingwei Wang, Xinyu Lei, Mengli Zhang, Wenpeng Li, Chen Liu. 5930-5934 [doi]
- SGT: Self-Guided Transformer for Few-Shot Semantic SegmentationKangkang Ai, Haigen Hu, Qianwei Zhou, Qiu Guan. 5935-5939 [doi]
- Topology-Dependent Privacy Bound for Decentralized Federated LearningQiongxiu Li, Wenrui Yu, Changlong Ji, Richard Heusdens. 5940-5944 [doi]
- Federated PAC-Bayesian Learning on Non-IID DataZihao Zhao, Yang Liu, Wenbo Ding, Xiao-Ping Zhang 0003. 5945-5949 [doi]
- Image Retrieval with Composed Query by Multi-Scale Multi-Modal FusionZelong Sun, Guoxing Yang, Zhiwu Lu 0001, Hao Jiang, Guojie Zhu, Zhao Cao. 5950-5954 [doi]
- G-SHARP: Globally Shared Kernel with Pruning for Efficient CNNsEunseop Shin, Incheon Cho, Muhammad Awais, A. F. M. Shahab Uddin, YounHo Jang, Sung-Ho Bae. 5955-5959 [doi]
- WFTNet: Exploiting Global and Local Periodicity in Long-Term Time Series ForecastingPeiyuan Liu, Beiliang Wu, Naiqi Li, Tao Dai 0001, Fengmao Lei, Jigang Bao, Yong Jiang 0001, Shu-Tao Xia. 5960-5964 [doi]
- Representation and Boundary Enhancement for Action Segmentation Using TransformerShang-Fu Chen, Cheng-Xun Wen, Wen-Huang Cheng, Kai-Lung Hua. 5965-5969 [doi]
- Adaptive Kalmannet: Data-Driven Kalman Filter with Fast AdaptationXiaoyong Ni, Guy Revach, Nir Shlezinger. 5970-5974 [doi]
- Tensor Low-Rank Approximation of Finite-Horizon Value FunctionsSergio Rozada, Antonio G. Marques. 5975-5979 [doi]
- Privacy-Preserving Deep Learning Using Deformable Operators for Secure Task LearningFabian Perez, Jhon Lopez, Henry Arguello. 5980-5984 [doi]
- Architecture-Agnostic Iterative Black-Box Certified Defense Against Adversarial PatchesDi Yang, Yihao Huang 0006, Qing Guo 0005, Felix Juefei-Xu, Ming Hu 0003, Yang Liu 0003, Geguang Pu. 5985-5989 [doi]
- Image Attribution by Generating ImagesAniket Singh, Anoop M. Namboodiri. 5990-5994 [doi]
- Fourier Domain Approach for Galaxy Spectra Decontamination and DeconvolutionMostafa Bella, Shahram Hosseini, Hicham Saylani, Thierry Contini, Tristan Grégoire, Yannick Deville. 5995-5999 [doi]
- Elevating Visual Prompting in Transfer Learning Via Pruned Model Ensembles: No Retrain, No PainBrian Zhang, Yuguang Yao, Sijia Liu 0001. 6000-6004 [doi]
- Simple Contrastive Representation Learning for Time Series ForecastingXiaochen Zheng, Xingyu Chen, Manuel Schürch, Amina Mollaysa, Ahmed Allam, Michael Krauthammer. 6005-6009 [doi]
- Bayesian Optimization with Gaussian Processes for Robust LocalizationWilliam F. Jenkins, Peter Gerstoft. 6010-6014 [doi]
- Search for Gravitational Wave Probes - A Self-Supervised Learning for Pulsars Based on Signal ContextsShen Wang, Xiaofeng Cheng, Ming Xie, Yuhang Ling, Chao Liu, Mingmin Chi, Pei Wang, Zhongyi Sun 0002, Yabiao Wang. 6015-6019 [doi]
- Learned Layered Coding for Successive Refinement in the Wyner-Ziv ProblemBoris Joukovsky, Brent De Weerdt, Nikos Deligiannis. 6020-6024 [doi]
- MTRGL: Effective Temporal Correlation Discerning Through Multi-Modal Temporal Relational Graph LearningJunwei Su, Shan Wu, Jinhui Li. 6025-6029 [doi]
- Semantic-Enhanced Supervised Contrastive LearningPingyue Zhang, Mengyue Wu, Kai Yu 0004. 6030-6034 [doi]
- Adaptive Parameter Sharing for Multi-Agent Reinforcement LearningDapeng Li, Na Lou, Bin Zhang, Zhiwei Xu, Guoliang Fan. 6035-6039 [doi]
- Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-SupervisionYiping Wei, Kunyu Peng, Alina Roitberg, Jiaming Zhang 0001, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang 0001, Rainer Stiefelhagen. 6040-6044 [doi]
- MACCN: Multi-Modal Adaptive Co-Attention Fusion Contrastive Learning Networks for Fake News DetectionZepu Yi, Songfeng Lu, Xueming Tang, Junjun Wu, Jianxin Zhu. 6045-6049 [doi]
- Haformer: Heterogeneous Aggregation Transformer for Single Image DerainingDuolin Sun, Yimou Wang, Joey Zhaoyu Zuo, Huan Zheng. 6050-6054 [doi]
- Prototype-Guided Masking for Unsupervised Domain AdaptationKai-Wen Chen, Chen-Kuo Chiang. 6055-6059 [doi]
- Generative Extension Positive Pairs and Improving Sample Selection Based on Contrastive Learning for Unsupervised Person Re-IdentificationZheng-An Zhu, Chen-Kuo Chiang. 6060-6064 [doi]
- Heuristic-Driven, Type-Specific Embedding in Parallel Spaces for Enhancing Knowledge Graph ReasoningYao Liu, Yongfei Zhang, Xin Wang, Shan Yang. 6065-6069 [doi]
- Bayesian-Boosted MetaLoc: Efficient Training and Guaranteed Generalization for Indoor LocalizationDongze Wu, Jun Gao, Feng Yin. 6070-6074 [doi]
- Learning Dynamics of Low-Precision Clipped SGD with MomentumRoula Nassif, Soummya Kar, Stefan Vlaski. 6075-6079 [doi]
- Tree Network Design for Faster Distributed Machine Learning Process with Distributed Dual Coordinate AscentMyung Cho, Meghana Chikkam, Weiyu Xu, Lifeng Lai. 6080-6084 [doi]
- Trajectory set Empowered Hypergraph Transformer for Mobile Sensor Based Traffic PredictionHanyuan Zhang, Xinyu Zhang, Qize Jiang, Liang Li, Baihua Zheng, Weiwei Sun. 6085-6089 [doi]
- Matrix Factorization in Tropical and Mixed Tropical-Linear AlgebrasIoannis Kordonis, Emmanouil Theodosis, George Retsinas, Petros Maragos. 6090-6094 [doi]
- 2D Human Pose Estimation Calibration and Keypoint Visibility ClassificationZhongyu Jiang, Haorui Ji, Cheng-Yen Yang, Jenq-Neng Hwang. 6095-6099 [doi]
- FedAQT: Accurate Quantized Training with Federated LearningRenkun Ni, Yonghui Xiao, Phoenix Meadowlark, Oleg Rybakov, Tom Goldstein, Ananda Theertha Suresh, Ignacio López-Moreno, Mingqing Chen, Rajiv Mathews. 6100-6104 [doi]
- Encoding Seasonal Climate Predictions with Modular Neural NetworkSmit Marvaniya, Jitendra Singh, Nicolas Galichet, Fred Ochieng Otieno, Geeth de Mel, Kommy Weldemariam. 6105-6109 [doi]
- Streaming Anchor Loss: Augmenting Supervision with Temporal SignificanceUtkarsh Oggy Sarawgi, John Berkowitz, Vineet Garg, Arnav Kundu, Minsik Cho, Sai Srujana Buddi, Saurabh Adya, Ahmed H. Tewfik. 6110-6114 [doi]
- Exploration of Visual Prompt in Grounded Pre-Trained Open-Set DetectionQibo Chen, Weizhong Jin, Shuchang Li, Mengdi Liu, Li Yu, Jian Jiang, Xiaozheng Wang. 6115-6119 [doi]
- A Robust Quantile Huber Loss with Interpretable Parameter Adjustment in Distributional Reinforcement LearningParvin Malekzadeh, Konstantinos N. Plataniotis, Zissis Poulos, Zeyu Wang. 6120-6124 [doi]
- Multimodal Transformer with a Low-Computational-Cost GuaranteeSungjin Park, Edward Choi. 6125-6129 [doi]
- OADAS: Optimizing Global Perturbation Attacks with Dual-Path Attribution SynergyXinlei Gao, Jing Liu. 6130-6134 [doi]
- Boundary-Driven Active Learning for Anomaly Detection in Time Series Data StreamsXiaohui Zhou, Yijie Wang 0001, Hongzuo Xu, Mingyu Liu. 6135-6139 [doi]
- Human Motion Generation via Conditioned GMVAE with TUNetYongqi Liu, Jiashuang Zhou, Xiaoqin Du. 6140-6144 [doi]
- Enhancing Event Sequence Modeling with Contrastive Relational InferenceYan Wang, Zhixuan Chu, Tao Zhou, Caigao Jiang, Hongyan Hao, Minjie Zhu, Xindong Cai, Qing Cui, Longfei Li, James Y. Zhang, Siqiao Xue, Jun Zhou. 6145-6149 [doi]
- Noise-BERT: A Unified Perturbation-Robust Framework with Noise Alignment Pre-Training for Noisy Slot Filling TaskJinxu Zhao, Guanting Dong, Yueyan Qiu, Tingfeng Hui, Xiaoshuai Song, Daichi Guo, Weiran Xu. 6150-6154 [doi]
- Radar Recognition in the Wild: Enhancing Radar Emitter Recognition through Auto-Correlation Model-Agnostic Meta LearningYixian Luo, Shaowu Yang, Tianrui Liu, Huibin Tan, Ruochun Jin, Hengzhu Liu, Xueqiong Li. 6155-6159 [doi]
- Adaptive Image-Enhanced Knowledge Graph CompletionMeng Gao, Wei Chen 0056, Tengjiao Wang 0003, Dawei Lu, Jiabin Zheng. 6160-6164 [doi]
- Titan: Bringing the Deep Image Prior to Implicit RepresentationsLorenzo Luzi, Daniel LeJeune, Ali Siahkoohi, Sina Alemohammad, Vishwanath Saragadam, Hossein Babaei, Naiming Liu, Zichao Wang 0001, Richard G. Baraniuk. 6165-6169 [doi]
- Spatio-Temporal Correlation Learning for Multiple Object TrackingYajun Jian, Chihui Zhuang, Wenyan He, Kaiwen Du, Yang Lu 0009, Hanzi Wang. 6170-6174 [doi]
- Image Augmentation with Controlled Diffusion for Weakly-Supervised Semantic SegmentationWangyu Wu, Tianhong Dai, Xiaowei Huang 0001, Fei Ma, Jimin Xiao. 6175-6179 [doi]
- Delay Embedding for Matrix Graphical Model Learning from Dependent DataJitendra K. Tugnait. 6180-6184 [doi]
- Improving Open-Set Recognition with Bayesian Metric LearningTong Chen, Guanchao Feng, Petar M. Djuric. 6185-6189 [doi]
- Reparameterization Head for Efficient Multi-Input NetworksKeke Tang, Wenyu Zhao, Weilong Peng, Xiang Fang, Xiaodong Cui, Peican Zhu, Zhihong Tian. 6190-6194 [doi]
- Fast Personalized Text to Image Synthesis with Attention InjectionYuxuan Zhang, Yiren Song, Jinpeng Yu 0002, Han Pan, Zhongliang Jing. 6195-6199 [doi]
- Audio Match Cutting: Finding and Creating Matching Audio Transitions in Movies and VideosDennis Fedorishin, Lie Lu, Srirangaraj Setlur, Venu Govindaraju. 6200-6204 [doi]
- Large-Scale Multi-View Multiple ClusteringXiaolong Xiong, Jinhan Cui, Rui Xie, Shuzhan Guo, Jun Zhou. 6205-6209 [doi]
- FW-Shapley: Real-Time Estimation of Weighted Shapley ValuesPranoy Panda, Siddharth Tandon, Vineeth N. Balasubramanian. 6210-6214 [doi]
- Robustness Evaluation of Machine Learning Models for Robot Arm Action Recognition in Noisy EnvironmentsElaheh Motamedi, Kian Behzad, Rojin Zandi, Hojjat Salehinejad, Milad Siami. 6215-6219 [doi]
- Pyramid: A Heterogeneous Data Integration Algorithm Based on Hierarchical GraphSining Jiang, Yujun Lan, Weigang Wang, Zhongwen Guo. 6220-6224 [doi]
- Beyond the Limit of Weight-Sharing: Pioneering Space-Evolving NAS with Large Language ModelsXiu Su, Shan You, Hongyan Xu, Xiuxing Li, Jun Long, Yi Chen, Chang Xu. 6225-6229 [doi]
- PU-EdgeFormer++: An Advanced Hierarchical Edge Transformer for Arbitrary-Scale Point Cloud Upsampling using Distance FieldsDohoon Kim, MinWoo Shin, Jaeseok Ryu, Heunseung Lim, Joonki Paik. 6230-6234 [doi]
- Enhancing Noisy Label Learning Via Unsupervised Contrastive Loss with Label Correction Based on Prior KnowledgeMasaki Kashiwagi, Keisuke Maeda, Ren Togo, Takahiro Ogawa 0001, Miki Haseyama. 6235-6239 [doi]
- Edge Deployable Distributed Evolutionary Optimization based Calibration method for Neural QuantizationUtsav Tiwari, Srinivas Soumitri Miriyala, Vikram Nelvoy Rajendiran. 6240-6244 [doi]
- Bregman Graph Neural NetworkJiayu Zhai, Lequan Lin, Dai Shi, Junbin Gao. 6250-6254 [doi]
- Debiasing Recommenders Through Personalized Popularity-Aware MarginsRuiguo Yu, Yue Chen, Mankun Zhao, Jian Yu, Tianyi Xu, Mei Yu, Xuewei Li 0001. 6255-6259 [doi]
- Mixed Precision Neural Quantization with Multi-Objective Bayesian Optimization for on-Device DeploymentSrinivas Soumitri Miriyala, P. K. Suhas, Utsav Tiwari, Vikram Nelvoy Rajendiran. 6260-6264 [doi]
- Biomimetic Mappings for Active Sonar Object Recognition in ClutterSangwook Park, Angeles Salles, Kathryne Allen, Cynthia F. Moss, Mounya Elhilali. 6265-6269 [doi]
- IPCL: Iterative Pseudo-Supervised Contrastive Learning to Improve Self-Supervised Feature RepresentationSonal Kumar, Anirudh Phukan, Arijit Sur. 6270-6274 [doi]
- ECIL-MU: Embedding Based Class Incremental Learning and Machine UnlearningZhiwei Zuo, Zhuo Tang, Bin Wang, Kenli Li 0001, Anwitaman Datta. 6275-6279 [doi]
- DG-RainDiff: Depth-Guided Dynamic Message Passing Diffusion Model for Mixture of Rain RemovalRongwei Yu, Peihao Zhang, Jingyi Xiang. 6280-6284 [doi]
- Diff-HOD: Diffusion Model for Object Detection in Hazy Weather ConditionsYizhan Li, Rongwei Yu, Junjie Shi, Lina Wang 0001. 6285-6289 [doi]
- Disentangle Estimation of Causal Effects from Cross-Silo DataYuxuan Liu, Haozhao Wang, Shuang Wang, Zhiming He, Wenchao Xu 0001, Jialiang Zhu, Fan Yang. 6290-6294 [doi]
- Prototype Calibration with Synthesized Samples for Zero-Shot Chinese Character RecognitionXiang Ao 0002, Xiaohui Li, Xu-Yao Zhang, Chenglin Liu 0001. 6295-6299 [doi]
- Vision-Sensor Attention Based Continual Multimodal Egocentric Activity RecognitionShaoxu Cheng, Chiyuan He, Kailong Chen, Linfeng Xu 0001, Hongliang Li 0001, Fanman Meng, Qingbo Wu 0001. 6300-6304 [doi]
- Pseudo Labels Regularization for Imbalanced Partial-Label LearningMingyu Xu, Zheng Lian, Bin Liu, Zerui Chen, Jianhua Tao 0001. 6305-6309 [doi]
- Visual-Linguistic Representation Learning with Deep Cross-Modality Fusion for Referring Multi-Object TrackingWenyan He, Yajun Jian, Yang Lu, Hanzi Wang. 6310-6314 [doi]
- Attr-Int: A Simple and Effective Entity Alignment Framework for Heterogeneous Knowledge GraphsLinyan Yang, Jingwei Cheng, Chuanhao Xu, Xihao Wang, Jiayi Li, Fu Zhang. 6315-6319 [doi]
- Task-Wise Prompt Query Function for Rehearsal-Free Continual LearningShuai Chen, Mingyi Zhang 0004, Junge Zhang, Kaiqi Huang. 6320-6324 [doi]
- Hyperbolic Diffusion Procrustes Analysis for Intrinsic Representation of Hierarchical Data SetsYa-Wei Eileen Lin, Yuval Kluger, Ronen Talmon. 6325-6329 [doi]
- Mitigating Intra-Class Variance in Few-Shot Point Cloud ClassificationYiqi Wu, Kelin Song, Xuan Huang, Dejun Zhang. 6330-6334 [doi]
- F2GNN: An Adaptive Filter with Feature Segmentation for Graph-Based Fraud DetectionGuanghui Hu, Yang Liu, Qing He, Xiang Ao. 6335-6339 [doi]
- Multi-View Subspace Clustering With Consensus Graph Contrastive LearningJie Zhang, Yuan Sun, Yu Guo 0006, Zheng Wang 0037, Feiping Nie 0001, Fei Wang 0008. 6340-6344 [doi]
- General Phrase Debiaser: Debiasing Masked Language Models at a Multi-Token LevelBingkang Shi, Xiaodan Zhang, Dehan Kong, Yulei Wu, Zongzhen Liu, Honglei Lyu, Longtao Huang. 6345-6349 [doi]
- Fairness-Aware Job Scheduling for Multi-Job Federated LearningYuxin Shi, Han Yu 0001. 6350-6354 [doi]
- Boosting Zero-Shot Human-Object Interaction Detection with Vision-Language TransferSandipan Sarma, Pradnesh Kalkar, Arijit Sur. 6355-6359 [doi]
- On Fine-Tuning Pre-Trained Speech Models With EMA-Target Self-Supervised LossHejung Yang, Hong-Goo Kang. 6360-6364 [doi]
- Water Leak Detection via Domain AdaptationDaniele Ugo Leonzio, Paolo Bestagini, Marco Marcon, Gian Paolo Quarta, Stefano Tubaro. 6365-6369 [doi]
- PLS: Unsupervised Domain Adaptation for 3D Object Detection Via Pseudo-Label SizesShijie Chen, Rongquan Wang, Xin Li, Yuchen Wu, Haizhuang Liu, Jiansheng Chen, Huimin Ma 0001. 6370-6374 [doi]
- A Novel Local-Global Feature Fusion Framework for Body-Weight Exercise Recognition with Pressure Mapping SensorsDavinder Pal Singh, Lala Shakti Swarup Ray, Bo Zhou 0005, Sungho Suh, Paul Lukowicz. 6375-6379 [doi]
- EK-Net: Real-Time Scene Text Detection with Expand Kernel DistanceBoyuan Zhu, Fagui Liu, Xi Chen, Quan Tang 0001. 6380-6384 [doi]
- OLKAVS: An Open Large-Scale Korean Audio-Visual Speech DatasetJeongkyun Park, Jung-Wook Hwang, KwangHee Choi, Seung-Hyeon Lee, Jun Hwan Ahn, Rae-Hong Park, Hyung-Min Park. 6385-6389 [doi]
- Multi-Level Contrastive Learning For Hybrid Cross-Modal RetrievalYiming Zhao, Haoyu Lu, Shiqi Zhao, Haoran Wu, Zhiwu Lu 0001. 6390-6394 [doi]
- Time Changed Normalizing Flows for Accurate SDE ModelingNaoufal El Bekri, Lucas Drumetz, Franck Vermet. 6395-6399 [doi]
- Online Caching With Switching Cost and Operational Long-Term Constraints: An Online Learning ApproachZifan Jia, Qingsong Liu, Xiaoyan Gu, Haihui Fan, Feifei Dai, Bo Li, Weiping Wang. 6400-6404 [doi]
- Semi-Supervised Metrics-Based Self-Training Root Cause Analysis for Cloud-Native Systems with Class-Imbalanced DataYing Huang, Qingfeng Du, Yongqi Han, Cheng He, Fulong Tian. 6405-6409 [doi]
- SparseSpikformer: A Co-Design Framework for Token and Weight Pruning in Spiking TransformerYue Liu, Shanlin Xiao, Bo Li, Zhiyi Yu. 6410-6414 [doi]
- Phase Retrieval by Tensor Total Least SquaresJiani Liu 0002, Ce Zhu, Yang Chen, Xiaolin Huang, Yipeng Liu 0001. 6415-6419 [doi]
- Learning Representations from Explainable and Connectionist Approaches for Visual Question AnsweringAakansha Mishra, Srinivas Soumitri Miriyala, Vikram Nelvoy Rajendiran. 6420-6424 [doi]
- Joint Embedding Learning and Latent Subspace Probing for Cross-Domain Few-Shot Keyword SpottingMete Ozay. 6425-6429 [doi]
- Temporal Relational Context Learning for Extrapolation Reasoning on Temporal Knowledge GraphsShuxian Huang, Ye Wang 0015, Kai Chen, Yan Jia 0001. 6430-6434 [doi]
- MISA: Unveiling the Vulnerabilities in Split Federated LearningWei Wan, Yuxuan Ning, Shengshan Hu, Lulu Xue, Minghui Li, Leo Yu Zhang, Hai Jin 0001. 6435-6439 [doi]
- 3D Parallelism for Transformers via Integer ProgrammingHao Zheng, Peng Liang 0017, Yu Tang, Yanqi Shi, Linbo Qiao, Dongsheng Li 0001. 6440-6444 [doi]
- Target Optimization Direction Guided Transfer Learning for Image ClassificationKelvin Ting Zuo Han, Shengxuming Zhang, Gerard Marcos Freixas, Zunlei Feng, Cheng Jin 0001. 6445-6449 [doi]
- Lipschitz-Constrained Convolutional Layers Using Convex ProjectionBhartendu Kumar, Kunal N. Chaudhury. 6450-6454 [doi]
- DefocusSR: An Efficient Framework for Defocus Image Super-Resolution Guided by Depth InformationQirong Liang, Da Pan 0001, Zefeng Ying, Ping Shi 0001. 6455-6459 [doi]
- A Multi-Scale Bimodal Fusion Network for Robust and Accurate Online Handwriting RecognitionZhen Xu, Ziqiang Chen, Yaqiang Wu, Hui Li, Wanjun Lv, Lianwen Jin, QianYing Wang. 6460-6464 [doi]
- Unraveling Explainable Reinforcement Learning Using Behavior Tree StructuresKejia Wan, Yuntao Liu, Hengzhu Liu, Xinhai Xu. 6465-6469 [doi]
- Disentangled Graph Representation with Contrastive Learning for Rumor DetectionHaoyu Liu, Yuanhai Xue, Xiaoming Yu. 6470-6474 [doi]
- Optimal ANN-SNN Conversion with Group NeuronsLiuzhenghao Lv, Wei Fang, Li Yuan 0007, Yonghong Tian 0001. 6475-6479 [doi]
- Multi-Modality Action Recognition Based on Dual Feature Shift in Vehicle Cabin MonitoringDan Lin, Philip Hann Yung Lee, Yiming Li, Ruoyu Wang, Kim-Hui Yap, Bingbing Li, You Shing Ngim. 6480-6484 [doi]
- A Meta-Preconditioning Approach for Deep Q-LearningSpilios Evmorfos, Athina P. Petropulu. 6485-6489 [doi]
- Rademacher Complexity Regularization for Correlation-Based Multiview Representation LearningMaurice Kuschel, Tanuj Hasija, Timothy Marrinan. 6490-6494 [doi]
- Variational Connectionist Temporal Classification for Order-Preserving Sequence ModelingZheng Nan, Ting Dang, Vidhyasaharan Sethu, Beena Ahmed. 6495-6499 [doi]
- Towards Video-Text Retrieval Adversarial AttackHaozhe Yang, Yuhan Xiang, Ke Sun, Jianlong Hu, Xianming Lin. 6500-6504 [doi]
- Dual-Mix for Cross-Modal Retrieval with Noisy LabelsFeng Ding, Xiu Liu, Xinyi Wang, Fangming Zhong. 6505-6509 [doi]
- Accurate Gigapixel Crowd Counting by Iterative Zooming and RefinementArian Bakhtiarnia, Qi Zhang, Alexandros Iosifidis. 6510-6514 [doi]
- EPA: Neural Collapse Inspired Robust Out-of-distribution DetectorJiawei Zhang, Yufan Chen, Cheng Jin, Lei Zhu, Yuantao Gu. 6515-6519 [doi]
- A PLS-Integrated Lasso Method With Application in Index TrackingShiqin Tang, Yining Dong, S. Joe Qin. 6520-6524 [doi]
- A Comparative Study on Annotation Quality of Crowdsourcing and LLM Via Label AggregationJiyi Li. 6525-6529 [doi]
- Multi-Band Speech Tensor Decomposition for Interactive Feature Extraction in Early Dysphagia ScreeningFei He, Yipeng Liu, Da Shen, Yangyang Jiang, Ying Li, Ce Zhu. 6530-6534 [doi]
- GMTR: Graph Matching TransformersJinpei Guo, Shaofeng Zhang, Runzhong Wang, Chang Liu 0021, Junchi Yan. 6535-6539 [doi]
- CDA-MBPO: Corrected Data Aggregation for Model-Based Policy OptimizationXin Du, Shan Zhong, Wenhao Ying, Yi Wang, Shengrong Gong. 6540-6544 [doi]
- Multi-Relational Graph Diffusion Neural Network with Parallel Retention for Stock Trends ClassificationZinuo You, Pengju Zhang, Jin Zheng, John Cartlidge. 6545-6549 [doi]
- Multi-Level Augmentation Consistency Learning and Sample Selection for Semi-Supervised Domain GeneralizationMei Yu, Yujian Zhang, Xuewei Li 0001, Ruixuan Zhang, Han Jiang 0004, Jie Gao 0008, Zhiqiang Liu 0002. 6550-6554 [doi]
- Two-Stage Transfer Learning for Fusion and Classification of Airborne Hyperspectral ImageryBenjamin Rise, Murat Uney, Xiaowei Huang. 6555-6559 [doi]
- EiffHDR: An Efficient Network for Multi-Exposure High Dynamic Range ImagingXiang Zhang, Qiang Zhu, Tao Hu, Qingsen Yan. 6560-6564 [doi]
- DGLP: Incorporating Orientation Information for Enhanced Link Prediction in Directed GraphsYusen Zhang, Yusong Tan, Songlei Jian, Qingbo Wu 0003, Kenli Li 0001. 6565-6569 [doi]
- GCIA: A Black-Box Graph Injection Attack Method Via Graph Contrastive LearningXiao Liu, Jun-Jie Huang, Wentao Zhao. 6570-6574 [doi]
- Filter-Enhanced Hypergraph Transformer for Multi-Behavior Sequential RecommendationZhufeng Shao, Shoujin Wang, Wenpeng Lu, Weiyu Zhang, Hongjiao Guan, Long Zhao 0002. 6575-6579 [doi]
- Multiway-Adapter: Adapting Multimodal Large Language Models for Scalable Image-Text RetrievalZijun Long, George Killick, Richard McCreadie, Gerardo Aragon-Camarasa. 6580-6584 [doi]
- Grounded-Instruct-Pix2Pix: Improving Instruction Based Image Editing with Automatic Target GroundingArtur Shagidanov, Hayk Poghosyan, Xinyu Gong, Zhangyang Wang, Shant Navasardyan, Humphrey Shi. 6585-6589 [doi]
- Memory-Augmented Online Video Anomaly DetectionLeonardo Rossi, Vittorio Bernuzzi, Tomaso Fontanini, Massimo Bertozzi, Andrea Prati 0001. 6590-6594 [doi]
- Stochastic Configuration Networks for Laboratory Seismic Time-to-Failure PredictionYuanhang Qiu. 6595-6599 [doi]
- Personalized Local Differentially Private Federated Learning with Adaptive Client SamplingYizhou Chen, Wangjie Xu, Xincheng Wu, Meng Zhang, Bing Luo. 6600-6604 [doi]
- Synthesizing Aβ-PET Via An Image And Label Conditioning Latent Diffusion Model For Detecting Amyloid StatusZaixin Ou, Yongsheng Pan, Yuanning Li, Fang Xie, Qihao Guo, Dinggang Shen. 6610-6614 [doi]
- Saliency Prediction of Sports Videos: A Large-Scale Database and a Self-Adaptive ApproachMinglang Qiao, Mai Xu, ShiJie Wen, Lai Jiang, Shengxi Li, Tao Xu, Yunjin Chen, Leonid Sigal. 6615-6619 [doi]
- Semantic Distillation and Structural Alignment Network for Fake News DetectionShangdong Liu, Xiaofan Yue, Fei Wu 0004, Jing Sun, Yujian Feng, Yimu Ji. 6620-6624 [doi]
- Push4Rec: Temporal and Contextual Trend-Aware Transformer Push Notification RecommenderChu-Chun Yu, Ming-Yi Hong, Chiok-Yew Ho, Che Lin. 6625-6629 [doi]
- Pareto Graph Self-Supervised LearningZhengyu Chen 0001, Teng Xiao, Donglin Wang, Min Zhang. 6630-6634 [doi]
- Momentum-Imbued Langevin Dynamics (MILD) for Faster SamplingNishanth Shetty, Manikanta Bandla, Nishit Neema, Siddarth Asokan, Chandra Sekhar Seelamantula. 6635-6639 [doi]
- HyperGANStrument: Instrument Sound Synthesis and Editing With Pitch-Invariant HypernetworksZhe Zhang, Taketo Akama. 6640-6644 [doi]
- AdaFL: Adaptive Client Selection and Dynamic Contribution Evaluation for Efficient Federated LearningQingming Li, Xiaohang Li, Li Zhou, Xiaoran Yan. 6645-6649 [doi]
- Sunflower Strategy for Bayesian Relational Data AnalysisMasahiro Nakano, Ryohei Shibue, Kunio Kashino. 6650-6654 [doi]
- Transformer-Inspired Lightweight Model for Efficient Time Series ForecastingXu Wang, Kele Xu, Ting Yu, Bo Ding, Dawei Feng. 6655-6659 [doi]
- Improve Deep Forest with Learnable Layerwise Augmentation Policy SchedulesHongyu Zhu 0004, Sichu Liang, Wentao Hu, Fang-Qi Li, Yali Yuan, Shi-Lin Wang, Guang Cheng 0001. 6660-6664 [doi]
- Motion Latent Diffusion for Stochastic Trajectory PredictionWeishang Wu, Xiaoheng Deng. 6665-6669 [doi]
- Enhancing Cross-Domain Detection: Adaptive Class-Aware Contrastive TransformerZiru Zeng, Yue Ding 0001, Hongtao Lu. 6670-6674 [doi]
- CommIN: Semantic Image Communications as an Inverse Problem with INN-Guided Diffusion ModelsJiakang Chen, Di You, Deniz Gündüz, Pier Luigi Dragotti. 6675-6679 [doi]
- Complex Bounded Component Analysis: Identifiability and AlgorithmJingzhou Hu, Kejun Huang. 6680-6684 [doi]
- Following the Embedding: Identifying Transition Phenomena in Wav2vec 2.0 Representations of Speech AudioPatrick Cormac English, Erfan A. Shams, John D. Kelleher, Julie Carson-Berndsen. 6685-6689 [doi]
- Temporal Inconsistency-Based Active LearningTianjiao Wan, Yutao Dou, Kele Xu, Zijian Gao, Bo Ding, Dawei Feng, Huaimin Wang. 6690-6694 [doi]
- Data-Scarce Condition Modeling Requires Model-Based Prior RegularizationNikolaus Mutsam, Alexander Fuchs 0009, Fabio Ziegler, Franz Pernkopf. 6695-6699 [doi]
- Gradient Reactivation Enhanced Causal Attention for Out-Of-Distribution Generalizable Graph ClassificationXu Wang, Pengfei Gu, Yudong Zhang, Binwu Wang, Pengkun Wang, Yang Wang. 6700-6704 [doi]
- STS-CCL: Spatial-Temporal Synchronous Contextual Contrastive Learning for Urban Traffic ForecastingLincan Li, Kaixiang Yang, Jichao Bi, Fengji Luo. 6705-6709 [doi]
- Extrinsic Versus APP Information Feedback in Turbo VEP MU-MIMO Receivers: Optimization Via Deep UnfoldingArthur Michon, Charly Poulliat, Adam Mekhiche, Antonio Maria Cipriano. 6710-6714 [doi]
- Multi-Attention Enhanced Discriminator for GAN-Based Anomalous Sound DetectionShuxin Liu, Jiliang Li, Wei Ke, Hao Yin. 6715-6719 [doi]
- Adaptive Quantization with Mixed-Precision Based on Low-Cost ProxyJunzhe Chen, Qiao Yang, Senmao Tian, Shunli Zhang. 6720-6724 [doi]
- Contrastive Deep Nonnegative Matrix Factorization For Community DetectionYuecheng Li, Jialong Chen, Chuan Chen 0001, Lei Yang, Zibin Zheng. 6725-6729 [doi]
- Towards Multi-Domain Face Landmark Detection with Synthetic Data from Diffusion ModelYuanming Li, Gwantae Kim, Jeong-gi Kwak, Bonhwa Ku, Hanseok Ko. 6730-6734 [doi]
- EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio CaptioningJaeyeon Kim, Jaeyoon Jung, Jinjoo Lee, Sang Hoon Woo. 6735-6739 [doi]
- PECR: Parameter-Efficient Transfer Learning with Cross-Modal Representation Learning for Remote Sensing Visual Question AnsweringPengfei Li, Jinlong He, Gang Liu, Shenjun Zhong. 6740-6744 [doi]
- Cross-Image Distillation for Semi-Supervised Semantic SegmentationNan Zhang, Fan Xiao, Junlin Hou, Ruiwei Zhao, Xiaobo Zhang, Rui Feng. 6745-6749 [doi]
- Novel Architecture of Deep Feature-Based Gaussian Processes with an Ensemble of KernelsYuanqing Song, Yuhao Liu 0002, Petar M. Djuric. 6750-6754 [doi]
- Context-Aware and Contrastiveness-Driven Feature Learning for Cross-Domain Few-Shot Hyperspectral Image ClassificationSuhua Zhang, Fangming Zhong, Zhikui Chen. 6755-6759 [doi]
- Understanding Data Augmentation From A Robustness PerspectiveZhendong Liu, Jie Zhang, Qiangqiang He, Chongjun Wang. 6760-6764 [doi]
- Paste and Harmonize via Denoising: Subject-Driven Image Editing with Frozen Pre-Trained Diffusion ModelXin Zhang, Jiaxian Guo, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa. 6765-6769 [doi]
- Cross-Lingual Learning in Multilingual Scene Text RecognitionJeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa. 6770-6774 [doi]
- Subnetwork-To-Go: Elastic Neural Network with Dynamic Training and Customizable InferenceKai Li, Yi Luo. 6775-6779 [doi]
- From Game Theory to Visual Recognition: Advancing DNN RobustnessZhendong Liu, Wenyu Jiang, Ming Guo, Chongjun Wang. 6780-6784 [doi]
- Mutual Information Assisted Graph Convolution Network for Cold-Start RecommendationWenbo Wang, Ben Chen, Bingquan Liu, Xinxin Wang, Luwei Yang, Wen Jiang, Wei Ning, Jian Guan. 6785-6789 [doi]
- Fusing Multi-Level Features from Audio and Contextual Sentence Embedding from Text for Interview-Based Depression DetectionJunqi Xue, Ruihan Qin, Xinxu Zhou, Honghai Liu 0001, Min Zhang, Zhiguo Zhang. 6790-6794 [doi]
- Pixel-Superpixel Contrastive Learning and Pseudo-Label Correction for Hyperspectral Image ClusteringRenxiang Guan, Zihao Li, Xianju Li, Chang Tang. 6795-6799 [doi]
- Class-Wise Buffer Management for Incremental Object Detection: An Effective Buffer Training StrategyJunsu Kim, Sumin Hong, Chanwoo Kim, Jihyeon Kim, Yihalem Yimolal Tiruneh, Jeongwan On, Jihyun Song, Sunhwa Choi, SeungRyul Baek. 6800-6804 [doi]
- Context-Aware Preference Learning System Based on Dirichlet Process Gaussian Mixture ModelXianbo Xu, Bart van Erp, Tanya Ignatenko. 6805-6809 [doi]
- On Estimating Link Prediction Uncertainty Using Stochastic CenteringPuja Trivedi, Danai Koutra, Jayaraman J. Thiagarajan. 6810-6814 [doi]
- NAC: Mitigating Noisy Correspondence in Cross-Modal Matching Via Neighbor Auxiliary CorrectorYuqing Li, Haoming Huang, Jian Xu, Shao-Lun Huang. 6815-6819 [doi]
- T-Foley: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound SynthesisYoonjin Chung, Junwon Lee, Juhan Nam. 6820-6824 [doi]
- Exploring the Utility of CLIP Priors for Visual Relationship PredictionRakshith Subramanyam, T. S. Jayram, Rushil Anirudh, Jayaraman J. Thiagarajan. 6825-6829 [doi]
- T-EnFP: An Efficient Transformer Encoder-Based System for Driving Behavior ClassificationBin Guo, John H. L. Hansen. 6830-6834 [doi]
- Analysis of the Memorization and Generalization Capabilities of AI Agents: are Continual Learners Robust?Minsu Kim, Walid Saad. 6840-6844 [doi]
- Target Localization Based on Multistatic MIMO Radar via Double Coupled Canonical Polyadic DecompositionGuo-Zhao Liao, Xiao-Feng Gong, Qiu-Hua Lin. 6845-6849 [doi]
- Beyond Simple Text Style Transfer: Unveiling Compound Text Style Transfer with Prompt-Based Pre-Trained Language ModelsShuai Ju, Chenxu Wang. 6850-6854 [doi]
- An Adaptive Algorithm for Tracking Third-Order Coupled Canonical Polyadic DecompositionXin-Tong Liu, Xiao-Feng Gong, Dong Zhao, Qiu-Hua Lin. 6855-6859 [doi]
- Stability of Graph Convolutional Neural Networks Through The Lens of Small Perturbation AnalysisLucia Testa, Claudio Battiloro, Stefania Sardellitti, Sergio Barbarossa. 6865-6869 [doi]
- FIBA: Federated Invisible Backdoor AttackLu Zhang, Baolin Zheng. 6870-6874 [doi]
- Quantum Privacy Aggregation of Teacher Ensembles (QPATE) for Privacy Preserving Quantum Machine LearningWilliam M. Watkins, Heehwan Wang, Sangyoon Bae, Huan-Hsin Tseng, Jiook Cha, Samuel Yen-Chi Chen, Shinjae Yoo. 6875-6879 [doi]
- Freq2Time: Weakly Supervised Learning of Camera-Based RPPG from Heart RateJeremy Speth, Korosh Vatanparvar, Li Zhu, Jilong Kuang, Alex Gao 0001. 6880-6884 [doi]
- Unsupervised Optimal Power Flow Using Graph Neural NetworksDamian Owerko, Fernando Gama, Alejandro Ribeiro. 6885-6889 [doi]
- AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation ModelsYuan Tseng, Layne Berry, Yiting Chen, I-Hsiang Chiu, Hsuan-Hao Lin, Max Liu, Puyuan Peng, Yi-Jen Shih, Hung-Yu Wang, Haibin Wu, Poyao Huang 0001, Chun-Mao Lai, Shang-wen Li 0001, David Harwath, Yu Tsao 0001, Abdelrahman Mohamed, Chi-Luen Feng, Hung-yi Lee. 6890-6894 [doi]
- Discriminative Semi-Supervised Feature Selection Via a Class-Credible Pseudo-Label Learning FrameworkXin Qi, Han Zhang 0012, Feiping Nie 0001. 6895-6899 [doi]
- A Comparison of Parameter-Efficient ASR Domain Adaptation Methods for Universal Speech and Language ModelsKhe Chai Sim, Zhouyuan Huo, Tsendsuren Munkhdalai, Nikhil Siddhartha, Adam Stooke, Zhong Meng, Bo Li 0028, Tara N. Sainath. 6900-6904 [doi]
- DDI-CoCo: A Dataset for Understanding the Effect of Color Contrast in Machine-Assisted Skin Disease DetectionMing-Chang Chiu, Yingfei Wang, Yen-Ju Kuo, Pin-Yu Chen. 6905-6909 [doi]
- Convergent Plug-And-Play Using Contractive DenoisersPravin Nair, Kunal N. Chaudhury. 6910-6914 [doi]
- Towards Building The FederatedGPT: Federated Instruction TuningJianyi Zhang, Saeed Vahidian, Martin Kuo, Chunyuan Li, Ruiyi Zhang, Tong Yu 0001, Guoyin Wang 0002, Yiran Chen 0001. 6915-6919 [doi]
- Fast Approximation of the Generalized Sliced-Wasserstein DistanceDung Le, Huy Nguyen, Khai Nguyen, Trang Nguyen, Nhat Ho. 6920-6924 [doi]
- Multi-Interest Learning for Multi-Modal Paper RecommendationXiaoteng Shen, LiangCai Su, Xi Xiao, Yi Li. 6925-6929 [doi]
- Personalized Federated Learning with Attention-Based Client SelectionZihan Chen, Jundong Li, Cong Shen. 6930-6934 [doi]
- Uncertainty-Guided Contrastive Learning For Single Source Domain GeneralisationAnastasios Arsenos, Dimitrios D. Kollias, Evangelos Petrongonas, Christos Skliros, Stefanos D. Kollias. 6935-6939 [doi]
- Physics-Guided Variational Graph Autoencoder For Air Quality InferenceEsther Rodrigo Bonet, Nikos Deligiannis. 6940-6944 [doi]
- Fracture Assembly with Segmentation And Iterative RegistrationJinhyeok Kim, Inha Lee, Kyungdon Joo. 6945-6949 [doi]
- HAROOD: Human Activity Classification and Out-Of-Distribution Detection with Short-Range FMCW RadarSabri Mustafa Kahya, Muhammet Sami Yavuz, Eckehard G. Steinbach. 6950-6954 [doi]
- Privacy Preserving Federated Learning from Multi-Input Functional Proxy Re-EncryptionXinyu Feng 0002, Qingni Shen, Cong Li, Yuejian Fang, Zhonghai Wu. 6955-6959 [doi]
- Audio-Journey: Open Domain Latent Diffusion Based Text-To-Audio GenerationJackson Michaels, Juncheng B. Li, Laura Yao, Lijun Yu, Zach Wood-Doughty, Florian Metze. 6960-6964 [doi]
- Neural Stochastic Differential Equations with Change Points: A Generative Adversarial ApproachZhongchang Sun, Yousef El-Laham, Svitlana Vyetrenko. 6965-6969 [doi]
- MEPE: A Minimalist Ensemble Policy Evaluation Operator for Deep Reinforcement LearningQiang He, Xinwen Hou. 6970-6974 [doi]
- Variance Reduction Can Improve Trade-Off in Multi-Objective LearningHeshan Devaka Fernando, Lisha Chen, Songtao Lu, Pin-Yu Chen, Miao Liu, Subhajit Chaudhury, Keerthiram Murugesan, Gaowen Liu, Meng Wang 0003, Tianyi Chen. 6975-6979 [doi]
- Generalized Multi-Source Inference for Text Conditioned Music Diffusion ModelsEmilian Postolache, Giorgio Mariani, Luca Cosmo, Emmanouil Benetos, Emanuele Rodolà. 6980-6984 [doi]
- Conformalized Multimodal Uncertainty Regression and ReasoningDomenico Parente, Nastaran Darabi, Alex C. Stutts, Theja Tulabandhula, Amit Ranjan Trivedi. 6985-6989 [doi]
- DeepGRE: Global Robustness Evaluation of Deep Neural NetworksTianle Zhang, Jiaxu Liu, Yanghao Zhang, Ronghui Mu, Wenjie Ruan. 6990-6994 [doi]
- GPT-4 Driven Cinematic Music Generation Through Text ProcessingMuhammad Taimoor Haseeb, Ahmad Hammoudeh, Gus Xia. 6995-6999 [doi]
- Prioritizing Data Acquisition for end-to-end Speech Model ImprovementAlkis Koudounas, Eliana Pastor, Giuseppe Attanasio, Luca de Alfaro, Elena Baralis. 7000-7004 [doi]
- Fixed Inter-Neuron Covariability Induces Adversarial RobustnessMuhammad A. Shah, Bhiksha Raj. 7005-7009 [doi]
- Unsupervised multiple domain translation through controlled Disentanglement in variational autoencoderAntonio Almudévar, Théo Mariotte, Alfonso Ortega Giménez, Marie Tahon. 7010-7014 [doi]
- AttHear: Explaining Audio Transformers Using Attention-Aware NMFAlican Akman, Björn W. Schuller. 7015-7019 [doi]
- Knowledge-Based Convolutional Neural Network for the Simulation and Prediction of Two-Phase Darcy FlowsZakaria Elabid, Daniel Busby, Abdenour Hadid. 7020-7024 [doi]
- Counting Network for Learning from Majority LabelKaito Shiku, Shinnosuke Matsuo, Daiki Suehiro, Ryoma Bise. 7025-7029 [doi]
- PHYOT: Physics-Informed Object Tracking in Surveillance CamerasKawisorn Kamtue, José M. F. Moura, Orathai Sangpetch, Paulo Garcia. 7030-7034 [doi]
- Spatiotemporal Group Anomaly Detection via Graph Total Variation on TensorsMert Indibi, Selin Aviyente. 7035-7039 [doi]
- Augment on Manifold: Mixup Regularization with UMAPYousef El-Laham, Elizabeth Fons, Dillon Daudert, Svitlana Vyetrenko. 7040-7044 [doi]
- Graph Convolutional Neural Networks In The Companion ModelJohn Shi, Shreyas Chaudhari, José M. F. Moura. 7045-7049 [doi]
- Identifying Attack-Specific Signatures in Adversarial ExamplesHossein Souri, Pirazh Khorramshahi, Chun Pong Lau 0001, Micah Goldblum, Rama Chellappa. 7050-7054 [doi]
- Federated Learning under Restricted user AvailabilityPeriklis Theodoropoulos, Konstantinos E. Nikolakakis, Dionysis Kalogerias. 7055-7059 [doi]
- HMM-based CSI Embedding for Trajectory Recovery from RSS Measurements of Non-Cooperative DevicesZheng Xing, Junting Chen. 7060-7064 [doi]
- Skip-Step Contrastive Predictive Coding for Time Series Anomaly DetectionKexin Zhang, Qingsong Wen, Chaoli Zhang, Liang Sun, Yong Liu. 7065-7069 [doi]
- GBSD: Generative Bokeh with Stage DiffusionJieren Deng, Xin Zhou 0017, Hao Tian, Zhihong Pan 0001, Derek Aguiar. 7070-7074 [doi]
- SpectrumNet: Spectrum-Based Trajectory Encode Neural Network for Pedestrian Trajectory PredictionShaohua Liu, Yinglong Zhu, Pengfei Yao, Tianlu Mao, Zhaoqi Wang. 7075-7079 [doi]
- Ten-Guard: Tensor Decomposition for Backdoor Attack Detection in Deep Neural NetworksKhondoker Murad Hossain, Tim Oates 0001. 7080-7084 [doi]
- A Machine-Learning Model for Detecting Depression, Anxiety, and Stress from SpeechMashrura Tasnim, Ramon E. Diaz-Ramos, Eleni Stroulia, Luis A. Trejo. 7085-7089 [doi]
- SEA-GNN: Sequence Extension Augmented Graph Neural Network for Sequential RecommendationGeyunqian Zu, Shengjie Zhao, Jin Zeng, Shilong Dong, Zixuan Chen. 7090-7094 [doi]
- Higher Order Multiple Graph Filtering for Structured Graph LearningLiang Du 0003, Xiaodong Li, Yan Chen, Gui Yang, Mian Ilyas Ahmad, Peng Zhou 0006. 7095-7099 [doi]
- The Power of Few: Accelerating and Enhancing Data Reweighting with Coreset SelectionMohammad Jafari, Yimeng Zhang, Yihua Zhang, Sijia Liu. 7100-7104 [doi]
- Improving Continual Learning of Acoustic Scene Classification via Mutual Information OptimizationMuqiao Yang, Umberto Cappellazzo, Xiang Li, Bhiksha Raj. 7105-7109 [doi]
- Graph-Enhanced Hybrid Sampling for Multi-Armed Bandit RecommendationFen Wang, Taihao Li, Wuyue Zhang, Xue Zhang, Cheng Yang. 7110-7114 [doi]
- Engineering the Neural Collapse Geometry of Supervised-Contrastive LossJaidev Gill, Vala Vakilian, Christos Thrampoulidis. 7115-7119 [doi]
- Source-Free Domain Adaptation for Millimeter Wave Radar Based Human Activity RecognitionJin Liu 0012, Dejiao Zeng, Ludi Li, Hanhe Lin, Xu Tian. 7120-7124 [doi]
- uSee: Unified Speech Enhancement And Editing with Conditional Diffusion ModelsMuqiao Yang, Chunlei Zhang, Yong Xu 0004, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu 0001. 7125-7129 [doi]
- Communication-Efficient Decentralized Dynamic Kernel LearningPing Xu, Yue Wang, Xiang Chen, Zhi Tian. 7135-7139 [doi]
- Enhanced KPI Anomaly Detection: An Unsupervised Hybrid Model with Dynamic ThresholdYilin Wang, Tao Chen, Yuliang Tang, Lianfen Huang. 7140-7144 [doi]
- UNIDEAL: Curriculum Knowledge Distillation Federated LearningYuwen Yang, Chang Liu 0078, Xun Cai, Suizhi Huang, Hongtao Lu, Yue Ding 0001. 7145-7149 [doi]
- Learning Disentangled Speech Representations with Contrastive Learning and Time-Invariant RetrievalYimin Deng, Huaizhen Tang, Xulong Zhang 0001, Ning Cheng 0001, Jing Xiao 0006, Jianzong Wang. 7150-7154 [doi]
- Bridging The Domain Gap Arising from Text Description Differences for Stable Text-To-Image GenerationTian Tan 0016, Weimin Tan, Xuhao Jiang, Yueming Jiang, Bo Yan 0001. 7155-7159 [doi]
- Distributed Stochastic Contextual Bandits for Protein Drug InteractionJiabin Lin, Karuna Anna Sajeevan, Bibek Acharya, Shana Moothedath, Ratul Chowdhury. 7160-7164 [doi]
- Graph Identification and Upper Confidence Evaluation for Causal Bandits with Linear ModelsChen Peng, Di Zhang, Urbashi Mitra. 7165-7169 [doi]
- DIB-X: Formulating Explainability Principles for a Self-Explainable Model Through Information Theoretic LearningChangkyu Choi, Shujian Yu, Michael Kampffmeyer, Arnt-Børre Salberg, Nils Olav Handegard, Robert Jenssen. 7170-7174 [doi]
- K-Means Clustering Based on Chebyshev Polynomial Graph FilteringLiang Du 0003, Yunhui Liang, Mian Ilyas Ahmad, Peng Zhou. 7175-7179 [doi]
- Adversarial Domain Adaptation for Classification with Nested DichotomiesAkram Heidarizadeh, George K. Atia. 7180-7184 [doi]
- CLAF: Contrastive Learning with Augmented Features for Imbalanced Semi-Supervised LearningBowen Tao, Lan Li, Xin-Chun Li, De-Chuan Zhan. 7185-7189 [doi]
- StableMiss+: Prediction with Incomplete Data Under Agnostic Mask Distribution ShiftYichen Zhu, Bo Jiang 0003. 7190-7194 [doi]
- An Efficient Alternating Riemannian/Projected Gradient Descent Ascent Algorithm for Fair Principal Component AnalysisMeng Xu, Bo Jiang 0010, Wenqiang Pu, Ya-Feng Liu, Anthony Man-Cho So. 7195-7199 [doi]
- Phase-Space-Guided Deep Learning For Time Series ForecastingJingze Lu, Kaijun Ren, Taikang Yuan, Wuxin Wang. 7200-7204 [doi]
- Sequential Detection of Anomalies in Noisy Outputs of an Unknown Function Using Gaussian and Yule-Simon ProcessesLiu Yang, Kurt Butler, Petar M. Djuric. 7205-7209 [doi]
- Functionally Similar Multi-Label Knowledge DistillationBinghan Chen, Jianlong Hu, Xiawu Zheng, Wei Lin, Fei Chao 0001, Rongrong Ji. 7210-7214 [doi]
- Self-Supervised Dual Generative Networks for Edge-Preserving Image SmoothingHuiqing Qi, Shengli Tan, Xiaoliu Luo. 7215-7219 [doi]
- Comparing and Combining Audio Processing and Deep Learning Features for Classification of Heartbeat SoundsVinícius Araújo Rabello Landeira, Jardel Oliveira Santos, Hitoshi Nagano. 7220-7224 [doi]
- Self-Supervised Pulse-Aware Interpretable Disentangled ECG Representation LearningChun-Ti Chou, Vincent S. Tseng. 7225-7229 [doi]
- Mitigate Replication and Copying in Diffusion Models with Generalized Caption and Dual Fusion EnhancementChenghao Li, Dake Chen, Yuke Zhang, Peter A. Beerel. 7230-7234 [doi]
- The Double-Edged Sword Of AI Safety: Balancing Anomaly Detection and OOD Generalization Via Model AnchoringVivek Sivaraman Narayanaswamy, Rushil Anirudh, Jayaraman J. Thiagarajan. 7235-7239 [doi]
- INCPrompt: Task-Aware Incremental Prompting for Rehearsal-Free Class-Incremental LearningZhiyuan Wang, Xiaoyang Qu, Jing Xiao 0006, Bokui Chen, Jianzong Wang. 7240-7244 [doi]
- On the Convergence of Hierarchical Federated Learning with Gradient Quantization and Imperfect TransmissionHaofeng Sun, Hui Tian 0003, Wanli Ni, Jingheng Zheng. 7245-7249 [doi]
- 3ARL: Moment-Embedded Mean-Field Multi-Agent Reinforcement Learning for Continuous Action SpaceHuaze Tang, Yuanquan Hu, Fanfan Zhao, Junji Yan, Ting Dong, Wenbo Ding. 7250-7254 [doi]
- Stage-Regularized Neural Stein Critics For Testing Goodness-Of-Fit Of Generative ModelsMatthew Repasky, Xiuyuan Cheng, Yao Xie 0002. 7255-7259 [doi]
- Hear-Your-Action: Human Action Recognition by Ultrasound Active SensingRisako Tanigawa, Yasunori Ishii. 7260-7264 [doi]
- P2DT: Mitigating Forgetting in Task-Incremental Learning with Progressive Prompt Decision TransformerZhiyuan Wang, Xiaoyang Qu, Jing Xiao 0006, Bokui Chen, Jianzong Wang. 7265-7269 [doi]
- Exploring Self-Explainable Street-Level IP Geolocation with Graph Information BottleneckKai Yang, Wenxin Tai, Zhenhui Li, Ting Zhong, Guangqiang Yin, Yong Wang, Fan Zhou 0002. 7270-7274 [doi]
- Source-Free Online Domain Adaptive Semantic Segmentation of Satellite Images Under Image DegradationFahim Faisal Niloy, Kishor Kumar Bhaumik, Simon S. Woo. 7275-7279 [doi]
- Enhancing GAN Performance Through Neural Architecture Search and Tensor DecompositionPrasanna Reddy Pulakurthi, Mahsa Mozaffari, Sohail A. Dianat, Majid Rabbani, Jamison Heard, Raghuveer Rao. 7280-7284 [doi]
- AUTOSGM: A Unified Lowpass Regularization Framework for Accelerated LearningOluwasegun A. Somefun, Stefan Lee, V. John Mathews. 7285-7289 [doi]
- Accurate Interpolation of Scattered Data Via Learning Relation GraphShizhe Ding, Boyang Xia, Jingyan Sui, Dongbo Bu. 7290-7294 [doi]
- Tensor-Guided Interpolation For Off-Grid Power Spectrum Map ConstructionHao Sun, Junting Chen, Yuan Luo. 7295-7299 [doi]
- A Sound Approach: Using Large Language Models to Generate Audio Descriptions for Egocentric Text-Audio RetrievalAndreea-Maria Oncescu, João F. Henriques, Andrew Zisserman, Samuel Albanie, A. Sophia Koepke. 7300-7304 [doi]
- Non Commutative Convolutional Signal Models in Neural Networks: Stability to Small DeformationsAlejandro Parada-Mayorga, Landon Butler, Alejandro Ribeiro. 7305-7309 [doi]
- A Parameterized Generative Adversarial Network Using Cyclic Projection for Explainable Medical Image ClassificationsXiangyu Xiong, Yue Sun, Xiaohong Liu, Chan-Tong Lam, Tong Tong 0001, Hao Chen, Qinquan Gao, Wei Ke 0001, Tao Tan. 7310-7314 [doi]
- Graph Networks Stand Strong: Enhancing Robustness via Stability ConstraintsZhe Zhao, Pengkun Wang, Haibin Wen, Yudong Zhang 0005, Binwu Wang, Yang Wang 0015. 7315-7319 [doi]
- Adaptive Spatial-Temporal Hypergraph Fusion Learning for Next POI RecommendationYantong Lai, Yijun Su, Lingwei Wei, Tianci Wang, Daren Zha, Xin Wang. 7320-7324 [doi]
- Multi-Modal GPT-4 Aided Action Planning and Reasoning for Self-driving VehiclesFangyuan Chi, Yixiao Wang, Panos Nasiopoulos, Victor C. M. Leung. 7325-7329 [doi]
- Diffstock: Probabilistic Relational Stock Market Predictions Using Diffusion ModelsDivyanshu Daiya, Monika Yadav, Harshit Singh Rao. 7335-7339 [doi]
- Multi-Stage Learning for Radar Pulse Activity SegmentationZi Huang, Akila Pemasiri, Simon Denman, Clinton Fookes, Terrence Martin. 7340-7344 [doi]
- Mutual Information Based Noise Scale Optimization for Gradient Leakage Resistant Federated LearningChao Zheng, Liming Wang, Zhen Xu 0009, Hongjia Li. 7345-7349 [doi]
- Data Augmentation via Subgroup Mixup for Improving FairnessMadeline Navarro, Camille Olivia Little, Genevera I. Allen, Santiago Segarra. 7350-7354 [doi]
- Expression Domain Translation Network for Cross-Domain Head ReenactmentTaewoong Kang, Jeongsik Oh, Jaeseong Lee, Sunghyun Park 0005, Jaegul Choo. 7356-7359 [doi]
- Boosting Zero-Shot Node Classification via Dependency Capture and Discriminative Feature LearningWenxin Liang, Zhiliang Hao, Han Liu 0008, Hongyang Chen. 7360-7364 [doi]
- SPDG-Net: Semantics Preserving Domain Augmentation through Style Interpolation for Multi-Source Domain GeneralizationAdvait Kumar, Shirsha Bose, Mohamad Hassan N. C, Biplab Banerjee. 7365-7369 [doi]
- HLS-FGVC: Hierarchical Label Semantics Enhanced Fine-Grained Visual ClassificationShichuan Zhang, Sunyi Zheng, Zhongyi Shui, Lin Yang. 7370-7374 [doi]
- Diversifying Cross-Domain Few-Shot Learning via Multimodal Image EditingZhipeng Lin, Wenjing Yang 0002, Long Lan, Mingyang Geng, Haotian Wang 0001, Haoang Chi, Xueqiong Li, Ji Wang 0001. 7375-7379 [doi]
- ECPNet: An Enhanced Curve Perception Network for Lane DetectionYunzuo Zhang, Yuxin Zheng, Cunyu Wu, Tian Zhang, Yameng Liu. 7380-7384 [doi]
- Cutransnet: Transformers to Make Strong Encoders for Multi-Task Vision Perception of Autonomous DrivingJianping Li, Xiao Ke, ZhiHao Wang, JinCheng Wan, Guozhen Tan. 7385-7389 [doi]
- Co-Salient Object Detection via Discriminative Prototypes ContrastJunyi Wang, Bin Chen, Wenrui Fan, Yongjiang Liu. 7390-7394 [doi]
- Differentially Private Federated Frank-WolfeRobin Francis, Sundeep Prabhakar Chepuri. 7395-7399 [doi]
- CGN: A Simple Yet Effective Multi-Channel Gated Network for Long-Term Time Series ForecastingZhao Sun, Yulong Pei, Defu Li, Qinke Peng. 7400-7404 [doi]
- When Training-Free NAS Meets Vision Transformers: A Neural Tangent Kernel PerspectiveQiqi Zhou 0001, Yichen Zhu. 7405-7409 [doi]
- Revisiting the Equivalence of In-Context Learning and Gradient Descent: The Impact of Data DistributionSadegh Mahdavi, Renjie Liao, Christos Thrampoulidis. 7410-7414 [doi]
- Killing It With Zero-Shot: Adversarially Robust Novelty DetectionHossein Mirzaei, Mohammad Jafari, Hamid Reza Dehbashi, Zeinab Sadat Taghavi, Mohammad Sabokrou, Mohammad Hossein Rohban. 7415-7419 [doi]
- InvariantOODG: Learning Invariant Features of Point Clouds for Out-of-Distribution GeneralizationZhimin Zhang 0008, Xiang Gao, Wei Hu 0003. 7420-7424 [doi]
- DiffRENT: A Diffusion Model for Recording Environment Transfer of SpeechJaekwon Im, Juhan Nam. 7425-7429 [doi]
- Activation Compression of Graph Neural Networks Using Block-Wise Quantization with Improved Variance MinimizationSebastian Eliassen, Raghavendra Selvan. 7430-7434 [doi]
- Strategic Arms with Side Communication Prevail Over Low-Regret MAB AlgorithmsAhmed Ben Yahmed, Clément Calauzènes, Vianney Perchet. 7435-7439 [doi]
- Fast Test Error Rates for Gradient-Based Algorithms on Separable DataPuneesh Deora, Bhavya Vasudeva, Vatsal Sharan, Christos Thrampoulidis. 7440-7444 [doi]
- Renyi Divergences Learning for explainable classification of SAR Image PairsMatthieu Gallet, Ammar Mian, Abdourrahmane M. Atto. 7445-7449 [doi]
- Synthia's Melody: A Benchmark Framework for Unsupervised Domain Adaptation in AudioChia-Hsin Lin, Charles Jones, Björn W. Schuller, Harry Coppock, Alican Akman. 7450-7454 [doi]
- Adversarial Robustness of Convolutional Models Learned in the Frequency DomainSubhajit Chaudhury, Toshihiko Yamasaki. 7455-7459 [doi]
- Continual Learning with Class-Level Minimally Interfered UpdateGuanglu Wang, Xianchao Zhang 0001, Han Liu 0008, Xiaotong Zhang 0003, Jie Mu, Wentao Yang, Linlin Zong. 7460-7464 [doi]
- Balanced Learning for Multi-Domain Long-Tailed Speaker RecognitionJanghoon Cho, Sunghyun Park, Hyunsin Park, Hyoungwoo Park, Seunghan Yang, Sungrack Yun. 7465-7469 [doi]
- Privacy-Preserving Attention-Weighted Multi-Source Domain Adaptation for EEG Motor ImageryYu-Mei Huang, Hui-Nien Hung, Vincent S. Tseng. 7470-7474 [doi]
- Towards 3D Computational Periscopy with an Ordinary Camera: a Separable Non-Linear Least Squares FormulationFadlullah Raji, John Murray-Bruce. 7475-7479 [doi]
- Beta Quantile Regression for Robust Estimation of Uncertainty in the Presence of OutliersHaleh Akrami, Omar Zamzam, Anand A. Joshi, Sergül Aydöre, Richard M. Leahy. 7480-7484 [doi]
- Self-Adaptive Scale Handling for Forecasting Time Series with Scale HeterogeneityXu Zhang, Zhengang Huang, Yunzhi Wu, Xun Lu, Erpeng Qi, Yunkai Chen, Zhongya Xue, Peng Wang, Wei Wang. 7485-7489 [doi]
- SCRN: A Spectrogram Convolutional Recurrent Network for AoA Estimation Using Bluetooth 5Wentao Shi, Baoqi Huang, Bing Jia. 7490-7494 [doi]
- Progressive Image Synthesis from Semantics to Details with Denoising Diffusion GANGuoxing Yang, Haoyu Lu, Chongxuan Li, Guang Zhou, Haoran Wu, Zhiwu Lu 0001. 7495-7499 [doi]
- GFMAE: Self-Supervised GNN-Free Masked AutoencodersYulan Hu, Sheng Ouyang, Zhirui Yang, Yi Zhao, Junchen Wan, Fuzheng Zhang, Zhongyuan Wang, Yong Liu. 7500-7504 [doi]
- Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity DetectionNianlong Gu, Kanghwi Lee, Maris Basha, Sumit Kumar Ram, Guanghao You, Richard H. R. Hahnloser. 7505-7509 [doi]
- Treemil: A Multi-Instance Learning Framework for Time Series Anomaly Detection with Inexact SupervisionChen Liu, Shibo He, Haoyu Liu, Shizhong Li. 7510-7514 [doi]
- Hypergraph Transformer for Semi-Supervised ClassificationZexi Liu, Bohan Tang, Ziyuan Ye, Xiaowen Dong 0001, Siheng Chen, Yanfeng Wang. 7515-7519 [doi]
- A New Pre-Training Paradigm for Offline Multi-Agent Reinforcement Learning with Suboptimal DataLinghui Meng 0001, Xi Zhang, Dengpeng Xing, Bo Xu 0002. 7520-7524 [doi]
- Facilitating Message Passing with Potential Links for Knowledge Graph CompletionHong Yin, Jiang Zhong, Rongzhen Li, Jiaqi Wang, Chen Wang, Qizhu Dai, Xue Li 0001. 7525-7529 [doi]
- Importance of Negative Sampling in Weak Label LearningAnkit Shah 0001, Fuyu Tang, Zelin Ye, Rita Singh, Bhiksha Raj. 7530-7534 [doi]
- Sequential Monte Carlo Graph Convolutional Network for Dynamic Brain ConnectivityFengfan Zhao, Ercan Engin Kuruoglu. 7535-7539 [doi]
- Ranking Enhanced Fine-Grained Contrastive Learning for RecommendationYunhang Yao, Min Gao 0001, Hongwei Zhou, Zongwei Wang 0002, Zehua Zhao, Qingyu Xiong. 7540-7544 [doi]
- DACR: Distribution-Augmented Contrastive Reconstruction for Time-Series Anomaly DetectionLixu Wang, Shichao Xu, Xinyu Du, Qi Zhu 0002. 7545-7549 [doi]
- Graph Neural Networks are More Powerful than We ThinkCharilaos I. Kanatsoulis, Alejandro Ribeiro. 7550-7554 [doi]
- Impact of Sampling Strategies on the Monitoring of Climate Regime Shifts with a Learning Data Assimilation MethodPerrine Bauchot, Angélique Drémeau, Florian Sévellec, Ronan Fablet. 7555-7559 [doi]
- Noise-Resistant Graph Neural Network for Node ClassificationZichao Deng, Han Yu 0001. 7560-7564 [doi]
- Importance Sampling Based Federated Unsupervised Representation LearningNazreen Shah, Prachi Goyal, Ranjitha Prasad. 7565-7569 [doi]
- Interpretable Policy Extraction with Neuro-Symbolic Reinforcement LearningRajdeep Dutta, Qincheng Wang, Ankur Singh, Dhruv Kumarjiguda, Xiaoli Li 0001, Senthilnath Jayavelu. 7570-7574 [doi]
- Communication Efficient Private Federated Learning Using DitheringBurak Hasircioglu, Deniz Gündüz. 7575-7579 [doi]
- Computational Complexity of Asynchronous Policy Iteration for Two-Player Zero-Sum Markov GamesChenyu Xu, Sihai Zhang, Zhengdao Wang. 7580-7584 [doi]
- Partial Convolutional Based-Radio Map Reconstruction for Urban Environments with Inaccessible AreasFanhua Li, Yuanyuan Deng, Bo Zhou 0012, QiHui Wu. 7585-7589 [doi]
- Zero-Shot Imitation Policy Via Search In Demonstration DatasetFederico Malato, Florian Leopold, Andrew Melnik, Ville Hautamäki. 7590-7594 [doi]
- Federated Learning via Consensus Mechanism on Heterogeneous Data: A New Perspective on ConvergenceShu Zheng, Tiandi Ye, Xiang Li, Ming Gao. 7595-7599 [doi]
- A Counterfactual Inspired Framework For Quantifying Edge Effects On GNNs FairnessYuefeng Ma, Lanzhen Guo. 7600-7604 [doi]
- Adversarial Jamming for Autoencoder Distribution MatchingWaleed El-Geresy, Deniz Gündüz. 7605-7609 [doi]
- SOD-UAV: Small Object Detection For Unmanned Aerial Vehicle Images Via Improved YOLOv7Yujie Li, Yifu Wang, Zihang Ma, Xinghe Wang, Yutao Tang. 7610-7614 [doi]
- When Green Learning Meets Federated Learning: Toward Distributed Learning with Low Complexity and Model HeterogeneityYi-Cheng Lai, Chen-Yu Wang, Feng-Tsun Chien. 7615-7619 [doi]
- Expressive Acoustic Guitar Sound Synthesis with an Instrument-Specific Input Representation and Diffusion OutpaintingHounsu Kim, Soonbeom Choi, Juhan Nam. 7620-7624 [doi]
- Unrolled Proximal Gradient Descent Method for Non-Negative Least Squares ProblemAkash Sen, Pradyumna Pradhan, Ramunaidu Randhi, C. S. Sastry. 7625-7629 [doi]
- Automated Labeling of Automotive Radar Azimuth MultipathStav Danino, Igal Bilik. 7630-7634 [doi]
- Hierarchical Attacks on Large-Scale Graph Neural NetworksJianfu Zhang 0003, Yan Hong, Dawei Cheng, Liqing Zhang 0001, Qibin Zhao. 7635-7639 [doi]
- Object Detection Oriented Privacy-Preserving Frame-Level Video Anomaly DetectionJiawei Yan, Yuxing Yang, Syed Mohsen Naqvi. 7640-7644 [doi]
- Multi-Stage Progressive Refinement and RoI Context Enhancement Network for Small Logo DetectionSonghui Zhao, Sujuan Hou. 7645-7649 [doi]
- Context-Aware Transformer for Single Image Rain Streaks RemovalZhihua Chen, Lei Liang, Yeting Huang, Lei Dai, Ran Li, Bin Sheng 0001. 7650-7654 [doi]
- Leveraging Tensor Subspace Prior: Enhanced Sum of Nuclear Norm Minimization for Tensor CompletionLi Ge, Xue Jiang, Lin Chen, Xingzhao Liu, Martin Haardt. 7655-7659 [doi]
- Efficient Content Reconstruction for High Dynamic Range ImagingXiang Zhang, Tao Hu, Jiashuang He, Qingsen Yan. 7660-7664 [doi]
- Diversity-Aware Buffer for Coping with Temporally Correlated Data Streams in Online Test-Time AdaptationMario Döbler, Florian Marencke, Robert A. Marsden, Bin Yang. 7665-7669 [doi]
- Hierarchical Metadata Information Constrained Self-Supervised Learning for Anomalous Sound Detection under Domain ShiftHaiyan Lan, Qiaoxi Zhu, Jian Guan 0001, Yuming Wei, Wenwu Wang 0001. 7670-7674 [doi]
- Neural Ordinary Differential Equations with Trainable SolversSaid Ouala, Laurent Debreu, Bertrand Chapron, Fabrice Collard, Lucile Gaultier, Ronan Fablet. 7675-7679 [doi]
- Test-Time Distribution Learning Adapter for Cross-Modal Visual ReasoningYi Zhang, Ce Zhang 0009. 7680-7684 [doi]
- Language Guided Adversarial PurificationHimanshu Singh, A V. Subramanyam. 7685-7689 [doi]
- Vision Transformer with 2D Explicit Position EncodingYujie Li, Zihang Ma, Xinghe Wang, Yifu Wang, Benying Tan. 7690-7694 [doi]
- Which is the Better Teacher Action? A New Ranking Model and DatasetMing Fang, Xinning Du, Qi Liu, Yunpeng Zhou, Qiwen Liang, Shuhua Liu. 7695-7699 [doi]
- Activity Recognition Method Based on Kernel Supervised Laplacian EigenmapsPengjia Tu, Cheng Tian, Dandan Du, Junhuai Li, Huaijun Wang. 7700-7704 [doi]
- Accelerating Gradient Descent for Over-Parameterized Asymmetric Low-Rank Matrix Sensing via PreconditioningCheng Cheng, Ziping Zhao 0002. 7705-7709 [doi]
- Synonym Replacement and Generation Enhancement for Document AugmentationJianwei Sun, Yang An, Xinyu Jiang, Qian Li, Yulong Liu, Yongshun Gong. 7710-7714 [doi]
- Multi-Rate Variable-Length CSI Compression for FDD Massive MIMOBumsu Park, Heedong Do, Namyoon Lee. 7715-7719 [doi]
- Leveraging Noisy Labels of Nearest Neighbors for Label Correction and Sample SelectionHua Jiang, Yixiong Chen, Li Liu, Xiaoguang Han 0001, Xiao-Ping Zhang 0003. 7720-7724 [doi]
- Self-Supervised Reinforcement Learning for Out-of-Distribution Recovery via Auxiliary RewardYufeng Xie, Yinan Wang, Han Wang, Qingshan Li. 7725-7729 [doi]
- Multi-Grained Multimodal Interaction Network for Sentiment AnalysisLingyong Fang, Gongshen Liu, Ru Zhang. 7730-7734 [doi]
- Adaptive Order Aggregator and Extractor Graph Neural NetworkLing Guo, Guoguo Ai, Hui Yan. 7735-7739 [doi]
- CENet: Content-Aware Enhanced Network for Practical Scene ParsingKai Song, Zhengtan Wang, Huhe Dai, Yuan Zheng 0002. 7740-7744 [doi]
- PoisonPrompt: Backdoor Attack on Prompt-Based Large Language ModelsHongwei Yao, Jian Lou 0001, Zhan Qin. 7745-7749 [doi]
- On Optimizing Timesteps of an EDM Based Diffusion Sampling ProcedureHuiwen Luo, Guoqiang Zhang. 7750-7754 [doi]
- Multimodal Transformer Distillation for Audio-Visual SynchronizationXuanjun Chen, Haibin Wu, Chung-Che Wang, Hung-yi Lee, Jyh-Shing Roger Jang. 7755-7759 [doi]
- Spectral Graph Neural Networks with Generalized Laguerre ApproximationZhengpin Li, Jian Wang. 7760-7764 [doi]
- One-Step Late Fusion Multi-View Clustering with Compressed SubspaceQiyuan Ou, Pei Zhang 0008, Sihang Zhou 0001, En Zhu. 7765-7769 [doi]
- Rapid Change Localization in Dynamic Graphical ModelsAbrar Zahin, Weizhi Li, Gautam Dasarathy. 7770-7774 [doi]
- SAM: A Self-Adaptive Attention Module for Context-Aware Recommendation SystemJiabin Liu, Zheng Wei, Zhengpin Li, Xiaojun Mao, Jian Wang, Zhongyu Wei, Qi Zhang. 7775-7779 [doi]
- A Simple and Effective Method for Anomaly Detection on Attributed Graphs via Feature ConsistencyCheng Zhou, Guangxia Li, Yulong Shen. 7780-7784 [doi]
- A Unified DNN-Based System for Industrial Pipeline SegmentationDimitrios Psarras, Christos Papaioannidis, Vasileios Mygdalis, Ioannis Pitas. 7785-7789 [doi]
- Non-Stationary Bandits with Periodic Behavior: Harnessing Ramanujan Periodicity Transforms to Conquer Time-Varying ChallengesParth Thaker, Vineet Gattani, Vignesh Tirukkonda, Pouria Saidi, Gautam Dasarathy. 7790-7794 [doi]
- A Novel Cross-Sensor Self-Supervised Learning Method for Rotating Machinery Fault DiagnosisHao Hu, Zhixi Feng, Ruoxue Li, Yue Ma, Shuyuan Yang. 7795-7799 [doi]
- Double Reverse Regularization Network Based on Self-Knowledge Distillation for SAR Object ClassificationBo Xu, Hao Zheng, Zhigang Hu, Liu Yang, Meiguang Zheng, Xianting Feng, Wei Lin. 7800-7804 [doi]
- How to Bridge Graph and Sequence Patterns in Session-Based Recommendation? A Self-Supervised MethodXinglong Wu, Hui He, Zejun Wang, Yu Tai, Sheng Yin, Hongwei Yang, Weizhe Zhang. 7805-7809 [doi]
- Dual-Stream Contrastive Predictive Network with Joint Handcrafted Feature View for SAR Ship ClassificationXianting Feng, Hao Zheng, Zhigang Hu, Liu Yang, Meiguang Zheng. 7810-7814 [doi]
- Neighborhood-Enhanced Multimodal Collaborative Filtering for Item Cold Start RecommendationGuohui Li, Li Zou, Zhiying Deng, Qi Chen. 7815-7819 [doi]
- Text Region Multiple Information Perception Network for Scene Text DetectionJinzhi Zheng, Libo Zhang 0001, Yanjun Wu, Chen Zhao. 7820-7824 [doi]
- Robust Localization of Key Fob Using Channel Impulse Response of Ultra Wide Band Sensors for Keyless Entry SystemsAbhiram Kolli, Filippo Casamassima, Horst Possegger, Horst Bischof. 7825-7829 [doi]
- Analyzing Adversarial Vulnerabilities of Graph Lottery TicketsSubhajit Dutta Chowdhury, Zhiyu Ni, Qingyuan Peng, Souvik Kundu 0009, Pierluigi Nuzzo 0002. 7830-7834 [doi]
- Self-Motion As Supervision For Egocentric Audiovisual LocalizationCalvin Murdock, Ishwarya Ananthabhotla, Hao Lu, Vamsi Krishna Ithapu. 7835-7839 [doi]
- Speech-Driven Emotional 3D Talking Face Animation Using Emotional EmbeddingsSeongmin Lee 0002, Jeonghaeng Lee, Hyewon Song, Sanghoon Lee 0001. 7840-7844 [doi]
- TransAVS: End-to-End Audio-Visual Segmentation with TransformerYuhang Ling, Yuxi Li, Zhenye Gan, Jiangning Zhang, Mingmin Chi, Yabiao Wang. 7845-7849 [doi]
- Cliprerank: An Extremely Simple Method For Improving Ad-Hoc Video SearchAozhu Chen, Fangming Zhou, Ziyuan Wang, Xirong Li 0001. 7850-7854 [doi]
- A Relation-Aware Heterogeneous Graph Transformer on Dynamic Fusion for Multimodal Classification TasksYimo Ren, Jinfa Wang, Jie Liu, Peipei Liu, Hong Li 0004, Hongsong Zhu, Limin Sun 0001. 7855-7859 [doi]
- VK-G2T: Vision and Context Knowledge Enhanced Gloss2textLiqiang Jing, Xuemeng Song, Xinxing Zu, Na Zheng, Zhongzhou Zhao, Liqiang Nie. 7860-7864 [doi]
- A Tri-Dynamic Preprocessing Framework for UGC Video CompressionFei Zhao, Mengxi Guo, Shijie Zhao, Junlin Li, Li Zhang, Xiaodong Xie. 7865-7869 [doi]
- Exploring Latent Cross-Channel Embedding for Accurate 3D Human Pose Reconstruction in a Diffusion FrameworkJunkun Jiang, Jie Chen. 7870-7874 [doi]
- SJTU-TMQA: A Quality Assessment Database for Static Mesh with Texture MapBingyang Cui, Qi Yang 0003, Kaifa Yang, Yiling Xu, Xiaozhong Xu, Shan Liu 0001. 7875-7879 [doi]
- Multimodal Graph-Based Audio-Visual Event LocalizationZhen Wang, Dongyuan Li, Manabu Okumura. 7880-7884 [doi]
- Object-Conditioned Bag of Instances for Few-Shot Personalized Instance RecognitionUmberto Michieli, Jijoong Moon, Daehyun Kim, Mete Ozay. 7885-7889 [doi]
- Unified Pretraining Target Based Video-Music Retrieval with Music Rhythm and Video Optical Flow InformationTianjun Mao, Shansong Liu, Yunxuan Zhang, Dian Li, Ying Shan. 7890-7894 [doi]
- Visual Prompt Tuning for Weakly Supervised Phrase GroundingPengyue Lin, Zhihan Yu, Mingcong Lu, Fangxiang Feng, Ruifan Li, Xiaojie Wang 0006. 7895-7899 [doi]
- Adaptive Confidence Multi-View Hashing for Multimedia RetrievalJian Zhu, Yu Cui, Zhangmin Huang, Xingyu Li, Lei Liu, Lingfang Zeng, Li-Rong Dai. 7900-7904 [doi]
- Joint-Semantics Multi-Similarity Hashing for Cross-Modal RetrievalWeigang Wang, Zhongwen Guo, Chao Yang, Jinxin Wang, Sining Jiang, Tianao Zhang. 7905-7909 [doi]
- Circular Decomposition and Cross-Modal Recombination for Multimodal Sentiment AnalysisHaijian Liang, Weicheng Xie 0001, Xilin He, Siyang Song, LinLin Shen. 7910-7914 [doi]
- Humtrans: A Novel Open-Source Dataset for Humming Melody Transcription and BeyondShansong Liu, Xu Li, Dian Li, Ying Shan. 7915-7919 [doi]
- Temporal Conditional Coding for Dynamic Point Cloud Geometry CompressionBowen Huang, Davi Lazzarotto, Touradj Ebrahimi. 7920-7924 [doi]
- Fusing Modality-Specific Representations and Decisions for Multimodal Emotion RecognitionYu-Ping Ruan, Shoukang Han, Taihao Li, Yanfeng Wu. 7925-7929 [doi]
- Modality-Dependent Sentiments Exploring for Multi-Modal Sentiment ClassificationJingzhe Li, Chengji Wang, Zhiming Luo, Yuxian Wu, Xingpeng Jiang. 7930-7934 [doi]
- CLIP-Based Synergistic Knowledge Transfer for text-based Person RetrievalYating Liu, Yaowei Li, Zimo Liu, Wenming Yang, Yaowei Wang 0001, Qingmin Liao. 7935-7939 [doi]
- Hourglass-AVSR: Down-Up Sampling-Based Computational Efficiency Model for Audio-Visual Speech RecognitionFan Yu, Haoxu Wang, Ziyang Ma, Shiliang Zhang. 7940-7944 [doi]
- FreeTalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker NaturalnessSicheng Yang, Zunnan Xu, Haiwei Xue, Yongkang Cheng, Shaoli Huang, Mingming Gong, Zhiyong Wu 0001. 7945-7949 [doi]
- Weakly Supervised Few-Shot Segmentation Through Textual PromptShengzhe You, Libo Weng, Fei Gao 0014. 7950-7954 [doi]
- Region-Adaptive Video Sharpening Via Rate-Perception OptimizationYingxue Pang, Shijie Zhao, Mengxi Guo, Junlin Li, Li Zhang. 7955-7959 [doi]
- Large Language Models Augmented Rating Prediction in Recommender SystemSichun Luo, Jiansheng Wang, Aojun Zhou, Li Ma, Linqi Song. 7960-7964 [doi]
- MFT-PCQA: Multi-Modal Fusion Transformer for No-Reference Point Cloud Quality AssessmentYating Liu, Ziyu Shan, Yujie Zhang, Yiling Xu. 7965-7969 [doi]
- Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-Training and Multi-Modal TokensMinsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe 0001, Yong Man Ro. 7970-7974 [doi]
- Underlying-Complementarity and Surrounding-Correspondence for Multi-View ClusteringNan Li, Songlin Du. 7975-7979 [doi]
- Visually Guided Binaural Audio Generation with Cross-Modal ConsistencyMiao Liu, Jing Wang, Xinyuan Qian, Xiang Xie. 7980-7984 [doi]
- ECM-OPCC: Efficient Context Model for Octree-Based Point Cloud CompressionYiqi Jin, Ziyu Zhu, Tongda Xu, Yuhuan Lin, Yan Wang. 7985-7989 [doi]
- Binauralmusic: A Diverse Dataset for Improving Cross-Modal Binaural Audio GenerationYunqi Li, Shulin Liu, Haonan Cheng, Long Ye. 7990-7994 [doi]
- Video-Language Graph Convolutional Network for Human Action RecognitionRui Zhang, Xiaoran Yan. 7995-7999 [doi]
- Implicit Neural Multiple Description for DNA-Based Data StorageTrung-Hieu Le, Xavier Pic, Jeremy Mateos, Marc Antonini. 8000-8004 [doi]
- Music-to-Dance Poses: Learning to Retrieve Dance Poses from MusicBo-Wei Tseng, Kenneth Yang, Yu-Hua Hu, Wen-Li Wei, Jen-Chun Lin. 8005-8009 [doi]
- Multi-Source Dynamic Interactive Network Collaborative Reasoning Image CaptioningQiang Su, Zhixin Li. 8010-8014 [doi]
- Electroencephalogram Helps Few-Shot LearningXiaoya Fan, Yuntao Liu, Zhong Wang 0001. 8015-8019 [doi]
- NeRI: Implicit Neural Representation of LiDAR Point Cloud Using Range Image SequenceRuixiang Xue, Jiaxin Li, Tong Chen 0004, Dandan Ding, Xun Cao, Zhan Ma. 8020-8024 [doi]
- Textual Tokens Classification for Multi-Modal Alignment in Vision-Language TrackingZhongjie Mao, Yucheng Wang, Xi Chen, Jia Yan. 8025-8029 [doi]
- AttA-NET: Attention Aggregation Network for Audio-Visual Emotion RecognitionRuijia Fan, Hong Liu 0008, Yidi Li, Peini Guo, Guoquan Wang, Ti Wang. 8030-8034 [doi]
- GeneFormer: Learned Gene Compression using Transformer-Based Context ModelingZhanbei Cui, Tongda Xu, Jia Wang, Yu Liao, Yan Wang. 8035-8039 [doi]
- C-CLAPA: Improving Text-Audio Cross Domain Retrieval with Captioning and AugmentationsAmit Sofer, Shlomo E. Chazan. 8040-8044 [doi]
- Transferring Structure Knowledge: A New Task to Fake News Detection towards Cold-Start PropagationLingwei Wei, Dou Hu 0001, Wei Zhou 0019, Songlin Hu. 8045-8049 [doi]
- 3M-Transformer: A Multi-Stage Multi-Stream Multimodal Transformer for Embodied Turn-Taking PredictionMehdi Fatan, Emanuele Mincato, Dimitra Pintzou, Mariella Dimiccoli. 8050-8054 [doi]
- MOMA: Mixture-of-Modality-Adaptations for Transferring Knowledge from Image Models Towards Efficient Audio-Visual Action RecognitionKai Wang, Dimitrios Hatzinakos. 8055-8059 [doi]
- Facial Micro-Motion-Aware Mixup for Micro-Expression RecognitionZhuoyao Gu, Miao Pang, Zhen Xing, Weimin Tan, Xuhao Jiang, Bo Yan. 8060-8064 [doi]
- Text-Driven Talking Face Synthesis by Reprogramming Audio-Driven ModelsJeongsoo Choi, Minsu Kim, Se Jin Park, Yong Man Ro. 8065-8069 [doi]
- Towards Robust Multimodal Prompting with Missing ModalitiesJaehyuk Jang, Yooseung Wang, Changick Kim. 8070-8074 [doi]
- Dual-Color Granularity Alignment for Text-Based Person SearchWeichen Zhao, Yuxing Lu, Ge Jiao, Yuan Yang. 8075-8079 [doi]
- Data-Driven Lattices for Vector QuantizationNatalie Lang, Itamar Assaf, Omer Bokobza, Nir Shlezinger. 8080-8084 [doi]
- Caption Unification for Multi-View Lifelogging Images Based on In-Context Learning with Heterogeneous Semantic ContentsMasaya Sato, Keisuke Maeda, Ren Togo, Takahiro Ogawa 0001, Miki Haseyama. 8085-8089 [doi]
- Audio-Visual Child-Adult Speaker Classification in Dyadic InteractionsAnfeng Xu, Kevin Huang, TianTian Feng, Helen Tager-Flusberg, Shrikanth Narayanan. 8090-8094 [doi]
- Enhancing Reinforcement Learning via Causally Correct Input Identification and Targeted InterventionJiwei Shen, Hu Lu, Hao Zhang, Shujing Lyu, Yue Lu. 8095-8099 [doi]
- GLMB 3D Speaker Tracking with Video-Assisted Multi-Channel Audio Optimization FunctionsXinyuan Qian, Zexu Pan, Qiquan Zhang, Kainan Chen, Shoufeng Lin. 8100-8104 [doi]
- Camera-Radar Association for Data AnnotationChanul Park, Dahyun Jeon, Seongwook Lee. 8105-8109 [doi]
- Rethinking Normals: Direction Guided Point Cloud RecognitionKuan Liu, Yanmin Zhu, Zhaobo Wang, Ke Wang, Gang Zhou. 8110-8114 [doi]
- MDAVIF: A Multi-Domain Acoustical-Visual Information Fusion Model for Depression Recognition from Vlog DataTianfei Ling, Deyuan Chen, Baobin Li. 8115-8119 [doi]
- Axis Order Invariance Learned from Point CloudsKuan Liu, Yanmin Zhu, Zhaobo Wang, Ke Wang, Gang Zhou. 8120-8124 [doi]
- Prompting Large Language Models with Fine-Grained Visual Relations from Scene Graph for Visual Question AnsweringJiapeng Liu, Chengyang Fang, Liang Li, Bing Li, Dayong Hu, Can Ma. 8125-8129 [doi]
- Segment then Match: Find the Carrier before Reasoning in Scene-Text VQAChengyang Fang, Liang Li, Jiapeng Liu, Bing Li, Dayong Hu, Can Ma. 8130-8134 [doi]
- Emotion-Aligned Contrastive Learning Between Images and MusicShanti Stewart, Kleanthis Avramidis, TianTian Feng, Shrikanth Narayanan. 8135-8139 [doi]
- Multi-Object Editing in Personalized Text-To-Image Diffusion Model Via Segmentation GuidanceHaruka Matsuda, Ren Togo, Keisuke Maeda, Takahiro Ogawa 0001, Miki Haseyama. 8140-8144 [doi]
- CLIP-MSA: Incorporating Inter-Modal Dynamics and Common Knowledge to Multimodal Sentiment Analysis With ClipQi Huang, Pingting Cai, Tanyue Nie, Jinshan Zeng. 8145-8149 [doi]
- MLCA-AVSR: Multi-Layer Cross Attention Fusion Based Audio-Visual Speech RecognitionHe Wang, Pengcheng Guo, Pan Zhou, Lei Xie. 8150-8154 [doi]
- BEVLOC: End-to-End 6-DoF Localization Via Cross-Modality Correlation Under Bird's Eye ViewNanjie Chen, Jinping Wang, Hao Chen, Ying Shen, Shuai Wang, Xiaojun Tan. 8155-8159 [doi]
- Long Term Memory-Enhanced Via Causal Reasoning for Text-To-Video RetrievalDingxin Cheng, Shuhan Kong, Wenyu Wang, Meixia Qu, Bin Jiang 0011. 8160-8164 [doi]
- SIMFALL: A Data Generator for RF-Based Fall DetectionJiamu Li, Dongheng Zhang, Qi Chen, Yadong Li, Jianyang Wang, Wenxuan Li, Yang Hu 0006, Qibin Sun, Yan Chen 0007. 8165-8169 [doi]
- Reducing the Complexity of Normalizing Flow Architectures for Point Cloud Attribute CompressionRodrigo B. Pinheiro, Jean-Eudes Marvie, Giuseppe Valenzise, Frédéric Dufaux. 8170-8174 [doi]
- A Fine-Grained Attribute Pre-Labeling Method Based on Label Dependency and Feature Similarity DynamicsHao-Chiang Shao, Yu-Hsien Lin, Chia-Wen Lin. 8175-8179 [doi]
- Incomplete Multi-View Clustering Via Inference and EvaluationBinqiang Huang, Zhijie Huang, Shoujie Lan, Qinghai Zheng, Yuanlong Yu. 8180-8184 [doi]
- Enhancing Expressiveness in Dance Generation Via Integrating Frequency and Music Style InformationQiaochu Huang, Xu He, Boshi Tang, Haolin Zhuang, Liyang Chen, Shuochen Gao, Zhiyong Wu 0001, Haozhi Huang 0004, Helen Meng. 8185-8189 [doi]
- Concentrated Reasoning and Unified Reconstruction for Multi-Modal Media ManipulationWeichen Zhao, Yuxing Lu, Ge Jiao, Yuan Yang. 8190-8194 [doi]
- Audio-Visual Speech Recognition In-The-Wild: Multi-Angle Vehicle Cabin Corpus and Attention-Based MethodAlexandr Axyonov, Dmitry Ryumin, Denis Ivanko, Alexey M. Kashevnik, Alexey Karpov 0001. 8195-8199 [doi]
- MMRBN: Rule-Based Network for Multimodal Emotion RecognitionXi Chen. 8200-8204 [doi]
- Towards an Objective Quality Metric for Interpolated Directional Room Impulse ResponsesHualin Ren, Christian Ritz 0001, Jiahong Zhao, Daeyoung Jang. 8205-8209 [doi]
- Comparison of Conditions for Omnidirectional Video with Spatial Audio in Terms of Subjective Quality and Impacts on Objective Metrics Resolving PowerAndréas Pastor, Pierre R. Lebreton, Toinon Vigier, Patrick Le Callet. 8210-8214 [doi]
- Position-Aware Active Learning for Multi-Modal Entity AlignmentBaogui Xu, Yafei Lu, Bing Su 0001, Xiaoran Yan. 8215-8219 [doi]
- Unified Speech and Gesture Synthesis Using Flow MatchingShivam Mehta, Ruibo Tu, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter. 8220-8224 [doi]
- PhISANet: Phonetically Informed Speech Animation NetworkSalvador Medina, Sarah L. Taylor, Carsten Stoll, Gareth Edwards, Alex Hauptmann 0001, Shinji Watanabe 0001, Iain A. Matthews. 8225-8229 [doi]
- ETP: Learning Transferable ECG Representations via ECG-Text Pre-TrainingChe Liu, Zhongwei Wan, Sibo Cheng, Mi Zhang 0002, Rossella Arcucci. 8230-8234 [doi]
- AutoSen: Improving Automatic WiFi Human Sensing through Cross-Modal AutoencoderQian Gao, Yanling Hao, Yuanwei Liu. 8235-8239 [doi]
- Modality Drop-Out for Multimodal Device Directed Speech Detection Using Verbal and Non-Verbal FeaturesGautam Krishna, Sameer Dharur, Oggi Rudovic, Pranay Dighe, Saurabh Adya, Ahmed Hussen Abdelaziz, Ahmed H. Tewfik. 8240-8244 [doi]
- Enhancing Image-Text Matching with Adaptive Feature AggregationZuhui Wang, Yunting Yin, I. V. Ramakrishnan. 8245-8249 [doi]
- Long-Term Social Interaction Context: The Key to Egocentric Addressee DetectionDeqian Kong, Furqan Khan, Xu Zhang, Prateek Singhal, Ying Nian Wu. 8250-8254 [doi]
- Sec2Sec Co-Attention Transformer for Video-Based Apparent Affective PredictionMingwei Sun, Kunpeng Zhang. 8255-8259 [doi]
- Human Motion Capture Data Segmentation Based on ST-GCNXiuyun Ma, Na Lv. 8260-8264 [doi]
- Balancing Easy and Hard Distortions: A Multi-Rate Knowledge Distillation Strategy for Blind Image Quality AssessmentDesen Yuan. 8265-8269 [doi]
- Character Attribute Extraction from Movie Scripts Using LLMsSabyasachee Baruah, Shrikanth Narayanan. 8270-8275 [doi]
- EmoTalker: Emotionally Editable Talking Face Generation via Diffusion ModelBingyuan Zhang, Xulong Zhang 0001, Ning Cheng 0001, Jun Yu 0001, Jing Xiao 0006, Jianzong Wang. 8276-8280 [doi]
- Exploring Multi-Modal Control in Music-Driven Dance GenerationRonghui Li, Yuqin Dai, Yachao Zhang, Jun Li, Jian Yang 0003, Jie Guo, Xiu Li. 8281-8285 [doi]
- Learning Fine-Grained Information Alignment for Calibrated Cross-Modal RetrievalJianhua Dong, Shengrong Zhao, Hu Liang. 8286-8290 [doi]
- Diffradar: High-Quality Mmwave Radar Perception With Diffusion Probabilistic ModelJincheng Wu, Ruixu Geng, Yadong Li, Dongheng Zhang, Zhi Lu, Yang Hu 0006, Yan Chen 0007. 8291-8295 [doi]
- Conversational Co-Speech Gesture Generation via Modeling Dialog Intention, Emotion, and Context with Diffusion ModelsHaiwei Xue, Sicheng Yang, Zhensong Zhang, Zhiyong Wu 0001, Minglei Li 0001, Zonghong Dai, Helen Meng. 8296-8300 [doi]
- Inter-Modality and Intra-Sample Alignment for Multi-Modal Emotion RecognitionYusong Wang, Dongyuan Li, Jialun Shen. 8301-8305 [doi]
- Driver Scanpath Prediction Based On Inverse Reinforcement LearningZhixin Huang, Yuchen Zhou, Jie Zhu, Chao Gou. 8306-8310 [doi]
- Keep Knowledge in Perception: Zero-Shot Image Aesthetic AssessmentGuolong Wang, Yike Tan, Hangyu Lin, Chuchun Zhang. 8311-8315 [doi]
- Gesture Generation Via Diffusion Model with Attention MechanismLingling Li, Weicong Li, Qiyuan Ding, Chengpei Tang, Keze Wang. 8316-8320 [doi]
- Enhancing Spatial Audio Generation with Source Separation and Channel Panning LossWootaek Lim, Juhan Nam. 8321-8325 [doi]
- ControlCap: Controllable Captioning via No-Fuss LexiconQiujie Xie, Qiming Feng, Yuejie Zhang, Rui Feng, Tao Zhang 0022, Shang Gao. 8326-8330 [doi]
- Graph-Based Environment Representation for Vision-and-Language Navigation in Continuous EnvironmentsTing Wang, Zongkai Wu, Feiyu Yao, Donglin Wang. 8331-8335 [doi]
- A Novel Multimodal Sentiment Analysis Model Based on Gated Fusion and Multi-Task LearningXin Sun, Xiangyu Ren, Xiaohao Xie. 8336-8340 [doi]
- Enhancing Realism in 3D Facial Animation Using Conformer-Based Generation and Automated Post-ProcessingYi Zhao, Chunyu Qiang, Hao Li, Yulan Hu, Wangjin Zhou, Sheng Li. 8341-8345 [doi]
- FCC-MF: Detecting Violence in Audio-Visual Context with Frame-Wise Cluster Contrast and Modality-Stage FloodingJiaqing He, Yanzhen Ren, Liming Zhai, Wuyang Liu. 8346-8350 [doi]
- The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker ExtractionShilong Wu, Chenxi Wang, Hang Chen, Yusheng Dai, Chenyue Zhang, Ruoyu Wang 0029, Hongbo Lan, Jun Du, Chin-Hui Lee 0001, Jingdong Chen, Sabato Marco Siniscalchi, Odette Scharenborg, Zhong-qiu Wang, Jia Pan, Jianqing Gao. 8351-8355 [doi]
- Cross Pseudo-Labeling for Semi-Supervised Audio-Visual Source LocalizationYuxin Guo, Shijie Ma, Yuhao Zhao, Hu Su, Wei Zou. 8356-8360 [doi]
- Speech Guided Masked Image Modeling for Visually Grounded SpeechJongbhin Woo, Hyeonggon Ryu, Arda Senocak, Joon Son Chung. 8361-8365 [doi]
- Learning Density Regulated and Multi-View Consistent Unsigned Distance FieldsRui Zhang, Jingyi Xu, Weidong Yang, Lipeng Ma, Menglong Chen, Ben Fei. 8366-8370 [doi]
- Fast Cross-Modality Knowledge Transfer via a Contextual Autoencoder TransformationMin Zheng, Chunpeng Wu, Yue Wang, Yantao Jia, Weiwei Liu, Long Lin, Shuai Chen, Fei Zhou. 8371-8375 [doi]
- Evidence-Aware Multimodal Chinese Social Media Rumor DetectionKaixuan Wu, Donglin Cao. 8376-8380 [doi]
- Small Object Detection on the Water Surface Based on Radar and Camera FusionQiancheng Wei, Xiaoping Jiang, Ying Liu, Qiya Su, Muyao Yu. 8381-8385 [doi]
- ScanPCGC: Learning-Based Lossless Point Cloud Geometry Compression using Sequential Slice RepresentationJiangwei Deng, Yuhao An, Thomas H. Li, Shan Liu 0001, Ge Li 0002. 8386-8390 [doi]
- TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive LearningChaeyoung Jung, Suyeon Lee, Kihyun Nam, Kyeongha Rho, You Jin Kim, Youngjoon Jang, Joon Son Chung. 8391-8395 [doi]
- Does Video Summarization Require Videos? Quantifying the Effectiveness of Language in Video SummarizationYoonsoo Nam, Adam Lehavi, Daniel Yang, Digbalay Bose, Swabha Swayamdipta, Shrikanth Narayanan. 8396-8400 [doi]
- Learning Spectral Canonical ℱ-Correlation Representation for Face Super-ResolutionYun-Hao Yuan, Mingzhi Hao, Yun Li 0010, Jipeng Qiang, Yi Zhu 0006, Xiaobo Shen 0001. 8401-8405 [doi]
- Multi-Dimensional Geometric Feature-Based Calibration Method for LiDAR and Camera FusionYuhan Hao, Xin Jin 0002, Dongyu Du. 8406-8410 [doi]
- Talking Face Generation for Impression Conversion Considering Speech SemanticsSaki Mizuno, Nobukatsu Hojo, Kazutoshi Shinoda, Keita Suzuki, Mana Ihori, Hiroshi Sato, Tomohiro Tanaka, Naotaka Kawata, Satoshi Kobashikawa, Ryo Masumura. 8411-8415 [doi]
- Cross-Modal Multiscale Difference-Aware Network for Joint Moment Retrieval and Highlight DetectionMingyao Zhou, Wenjing Chen 0003, Hao Sun 0014, Wei Xie 0008. 8416-8420 [doi]
- CM-PIE: Cross-Modal Perception for Interactive-Enhanced Audio-Visual Video ParsingYaru Chen, Ruohao Guo, Xubo Liu, Peipei Wu, Guangyao Li, Zhenbo Li, Wenwu Wang 0001. 8421-8425 [doi]
- Efficient Point Cloud Attribute Compression Framework using Attribute-Guided Graph Fourier TransformJingshu Zhang, Yueru Chen, Guoqing Liu, Wei Gao 0003, Ge Li. 8426-8430 [doi]
- Key Points Centered Sparse Hashing for Cross-Modal RetrievalZhikai Hu, Yiu-ming Cheung, Mengke Li, Weichao Lan, Donglin Zhang. 8431-8435 [doi]
- Efficient Point Cloud Attribute Compression Using Rich Parallelizable Context ModelRuishan Huang, Pengpeng Yu, Shaolin Liao, Fan Liang. 8436-8440 [doi]
- Speaker-Centric Multimodal Fusion Networks for Emotion Recognition in ConversationsBiyun Yao, Wuzhen Shi. 8441-8445 [doi]
- mmBaT: A Multi-Task Framework for Mmwave-Based Human Body Reconstruction and Translation PredictionJiarui Yang, Songpengcheng Xia, Yifan Song, Qi Wu 0007, Ling Pei. 8446-8450 [doi]
- Multi-Beam Multiplexing Design with Phase-Only Excitation Based on Hybrid Beamforming ArchitecturesJunwei Zhang, Shufeng Li, Libiao Jin, Wei Liu, Hing-Cheung So. 8451-8455 [doi]
- Analysis of an Elliptic Localization Algorithm Using Fixed Point IterationYanbin Zou, Liehu Wu, Yimao Sun. 8456-8460 [doi]
- Robust Beamforming for DFRC Systems in Complex EnvironmentsXue Xiong, Hao Liang, Bin Liao. 8461-8465 [doi]
- Joint Multi-Band DOA Estimation Using Low-Rank Matrix RecoveryZhengang Guo, Wei Dai. 8466-8470 [doi]
- Adaptive Grid 2-D Direction of Arrival Estimation Method Using an Integrated DictionaryYanan Wu, Andreas Jakobsson. 8471-8475 [doi]
- Multidimensional Scaling-Based TDOA Localization in Modified Polar RepresentationBeichuan Tang, Yimao Sun, K. C. Ho 0001, Lei Zhang, Yanbing Yang. 8476-8480 [doi]
- Generalized Deterministic-Random Tradeoff of Integrated Sensing and Communications: The Sensing-Optimal Operating PointYifeng Xiong, Fan Liu, Marco Lops. 8481-8485 [doi]
- A New Fourth-Order Sparse Array Generator Based on Sum-Difference Co-Array AnalysisHaodong Guo, Hua Chen 0004, Hongguang Lin, Wei Liu 0001, Qing Shen, Gang Wang 0007. 8486-8490 [doi]
- Using Temporal Consistency for Compressed Sensing in High-Resolution mmWave SoundingSebastian Semper, J. Chuang, Samuel Berweger, Camillo Gentile. 8491-8495 [doi]
- Identifiability Analysis of Sensor Arrays with Sensors off Half-Wavelength GridMd. Waqeeb T. S. Chowdhury, Yimin D. Zhang, Wei Liu 0001, Maria S. Greco. 8496-8500 [doi]
- A CCM-Based Joint DOA-Frequency Estimation and Signal Recovery with Efficient Sub-Nyquist SamplingLiang Liu, Zhouchen Li, Jiancheng An, Lu Gan 0003, Hongbin Li 0001. 8501-8505 [doi]
- DOA Estimation for Switch-Element Arrays Based on Sparse RepresentationLiang Liu, Zhouchen Li, Jiancheng An, Lu Gan, Hongbin Li. 8506-8510 [doi]
- Low-Rank Constrained Multichannel Signal Denoising Considering Channel-Dependent Sensitivity Inspired by Self-Supervised Learning for Optical Fiber SensingNoriyuki Tonami, Wataru Kohno, Sakiko Mishima, Yumi Arai, Reishi Kondo, Tomoyuki Hino. 8511-8515 [doi]
- Three-Dimensional Spatial-Temporal Near-Field Passive Localization Based on an Exact Spatial Propagation ModelJiaxiong Fang, Juan Liu, Hua Chen, Wei Liu, Ye Tian, Gang Wang. 8516-8520 [doi]
- Direct Position Determination by Covariance-Fitting on the Riemannian Manifold of Hermitian Positive Definite MatricesJoseph S. Picard, Amitay Bar, Ronen Talmon. 8521-8525 [doi]
- Reduced-Dimensional Decomposition and Eigenspace Reconstruction of Coherent Sources with Arbitrary Rectangle ArraysXiangtian Meng, Fenggang Yan, Maria Greco 0001, Fulvio Gini, Ming Jin 0004. 8526-8530 [doi]
- Detector Design for Distributed Multichannel Radar Sensors in Colored Interference EnvironmentsMoein Ahmadi, Mohammad Alaee Kerahroodi, Linlong Wu, Bhavani Shankar M. R., Björn E. Ottersten. 8531-8535 [doi]
- On the Design of Planar Differential Microphone Arrays with Specified Beamwidth or Sidelobe LevelXueqin Luo, Jilu Jin, Gongping Huang, Yingke Zhao, Jingdong Chen, Jacob Benesty. 8536-8540 [doi]
- An MVDR-Embedded U-Net Beamformer for Effective and Robust Multichannel Speech EnhancementChing Hua Lee, Kashyap Patel, Chouchang Yang, Yilin Shen, Hongxia Jin. 8541-8545 [doi]
- Subspace-Based Co-Array Processing For Nested Arrays without EigendecompositionXinghao Qu, Zhigang Shang, Gang Qiao, Jixing Qin, Xuerui Liu. 8546-8550 [doi]
- Predicting Fall Events by a Spatio-Temporal Topological Network with Multiple Wearable SensorsXiaohu Li, Jiawei Liu, Guorui Liao, Mingrui Yin, Shu Wang, Guoxin Su, Jun Liao, Li Liu 0001. 8551-8555 [doi]
- User-Assisted Networked Sensing in OFDM Cellular Network with Erroneous Anchor Position InformationXianzhen Guo, Qin Shi 0004, Liang Liu 0003, Shuowen Zhang. 8556-8560 [doi]
- Beamforming Through Online Convex Combination of Differential BeamformersJilu Jin, Xueqin Luo, Gongping Huang, Jingdong Chen, Jacob Benesty. 8561-8565 [doi]
- All Neural Kronecker Product Beamforming for Speech Extraction with Large-Scale Microphone ArraysWeixin Meng, Xiaoyu Li, Andong Li, Jian Li, Xiaodong Li 0002, Chengshi Zheng. 8566-8570 [doi]
- Privacy-Preserving Distributed Optimisation using Stochastic PDMMSebastian O. Jordan, Qiongxiu Li, Richard Heusdens. 8571-8575 [doi]
- TransMUSIC: A Transformer-Aided Subspace Method for DOA Estimation with Low-Resolution ADCSJunkai Ji, Wei Mao, Feng Xi, Shengyao Chen. 8576-8580 [doi]
- Deep Convolution Network Based Super Resolution DOA Estimation with Toeplitz and Sparse PriorChenkang Duan, Ye Tian, Wei Liu. 8581-8585 [doi]
- Multi-Speaker Localization in the Circular Harmonic Domain on Small Aperture Microphone Arrays Using Deep Convolutional NetworksKunkun SongGong, Pufen Zhang, Xiongwei Zhang, Meng Sun 0001, Wenwu Wang 0001. 8586-8590 [doi]
- Further Results on the Design Of Real-Valued Wideband Beamformers Using Adaptive-Array-Theory-Inspired Weighted Least SquaresRuiwa Sun, Congwei Feng, Huawei Chen. 8591-8595 [doi]
- Through-The-Wall Radar Imaging With Wall Clutter Removal Via Riemannian Optimization On The Fixed-Rank ManifoldHugo Brehier, Arnaud Breloy, Chengfang Ren, Guillaume Ginolhac. 8596-8600 [doi]
- Channel Estimation and Prediction in Wireless Communications Assisted by Semi-Passive RISMirza Asif Haider, Yimin D. Zhang, Elias Aboutanios. 8601-8605 [doi]
- Sensing-Aided Communication Channel Estimation with Tensor-Based Moving Target LocalizationLuning Lin, Hang Zheng, Sergiy A. Vorobyov, Chengwei Zhou, Zhiguo Shi 0001. 8606-8610 [doi]
- Unified Analysis of Correlation-Aware Joint Sparse Support Recovery with ℓ0-Norm ConstraintWenzhe Lu, Mingyu Jiang, Heng Qiao. 8611-8615 [doi]
- High-Resolution Through-Wall Imaging Using Data Fusion and ReasoningZihan Chen, Xiaolu Zeng, Xiaopeng Yang, Jiarong Zhao, Junbo Gong. 8616-8620 [doi]
- Situation-Aware Adaptive Transmit Beamforming for Automotive RadarsEdoardo Focante, Nitin Jonathan Myers, Geethu Joseph, Ashish Pandharipande. 8621-8625 [doi]
- Automotive Radar Point Cloud Parametric Density Estimation using Camera ImagesTunç Alkanat, Ashish Pandharipande. 8636-8640 [doi]
- A Novel 3-D Focusing Scheme for Distributed SAR TomographyShen Zhong, Zhongyu Li 0001, Junjie Wu 0001, Jianyu Yang. 8641-8645 [doi]
- Block Adaptive Subspace Pursuit Method for Wall Clutter MitigationJiancheng Liao, Xiaolu Zeng, Xiaopeng Yang, Zixiang Yin, Junbo Gong. 8646-8650 [doi]
- Transmit Beampattern Optimization for MIMO-ISAC Systems with Hybrid BeamformingWeijie Chen, Yaling Deng, Chongtao Guo, Yuan Ma, Bin Liao. 8651-8655 [doi]
- Auditory Cortex-Inspired Spectral Attention Modulation for Binaural Sound Localization in HRTF MismatchWaradon Phokhinanan, Nicolas Obin, Sylvain Argentieri. 8656-8660 [doi]
- Fast and Efficient Sequential Radar Parameter Estimation in MIMO-OTFS SystemsKuranage Roche Rayan Ranasinghe, Hyeon Seok Rou, Giuseppe Thadeu Freitas de Abreu. 8661-8665 [doi]
- Fundamental Limits of Direction Finding in Distributed Arrays Exploiting Auxiliary SourcesZongyu Wang, Yuhan Li, Yihan Su, Tianyao Huang, Yimin Liu. 8666-8670 [doi]
- Close-Range Direction of Arrival Estimation in the Presence of Clock JitterAndreas Jansson 0002, Andreas Jakobsson. 8671-8675 [doi]
- ZIV-Zakai Bound for DOA Estimation with Gain-Phase ErrorSihan Wen, Zongyu Zhang, Chengwei Zhou, Zhiguo Shi 0001. 8681-8685 [doi]
- CST-Former: Transformer with Channel-Spectro-Temporal Attention for Sound Event Localization and DetectionYusun Shul, Jung-Woo Choi. 8686-8690 [doi]
- Jointly Learning Selection Matrices for Transmitters, Receivers and Fourier Coefficients in Multichannel ImagingHan Wang, Yiming Zhou, Eduardo Pérez, Florian Römer. 8691-8695 [doi]
- LOCSELECT: Target Speaker Localization with an Auditory Selective Hearing MechanismYu Chen, Xinyuan Qian, Zexu Pan, Kainan Chen, Haizhou Li 0001. 8696-8700 [doi]
- An Optimized Interleaved OFDM Chirp Orthogonal Waveform Design for Dechirped Miniature MMW MIMO RadarBiao Xue, Gong Zhang 0002, Fulvio Gini, Maria S. Greco, Henry Leung. 8701-8705 [doi]
- FDA-MIMO Radar Using Ambiguity Function for Target Two-Dimensional LocalizationWen-Qin Wang. 8706-8710 [doi]
- Deep Learning Inversion of Ocean Wave Spectrum from SAR Satellite ObservationsS. P. Tripathi, Bertrand Chapron, Fabrice Collard, Gilles Guitton, Manuel Lopez-Radcenco, Alexis Mouche, Ronan Fablet. 8711-8715 [doi]
- Three-Dimensional Decoupled Atomic Norm MinimizationMohammadreza Bagheri Jazi, Seyed Mohammad Karbasi, Prabhu Babu. 8716-8720 [doi]
- Adaptive Joint Channel Estimation/Data Detection in Flexible Multicarrier MIMO Systems - A Tensor-Based ApproachEleftherios Kofidis. 8721-8725 [doi]
- Robust Near-Field Beamforming for Millimeter Wave Communication System with Aperture PerturbationsGerald C. Nwalozie, Damir Rakhimov, Martin Haardt. 8726-8730 [doi]
- Batch Substitution Calibration of a MEMS Microphone Array: Impact of Sensor Performance Dispersion on Directivity EstimationM. Hartenstein, F. Ollivier, F. Silva, P. Luizard. 8731-8735 [doi]
- Harmonic Retrieval for Non-Circular Coherent Signals via Double Decoupled Atomic Norm MinimizationYu Zhang 0068, Yue Wang 0019, Zhipeng Cai, Fangqing Wen, Gong Zhang 0002. 8736-8740 [doi]
- Multispectral RF Imaging Using Multiple Narrow-Band FMCW SignalsZiyu Zhou, Wei Dai. 8741-8745 [doi]
- Max-Min Beamforming for Multi-User Massive MIMO Systems: An Alternating Projection-Based ApproachMenghong Cai, Bin Wang, Jun Fang. 8746-8750 [doi]
- Enabling Orientation-Free Mmwave-Based Vital Sign Sensing with Multi-Domain Signal AnalysisHanqin Gong, Dongheng Zhang, Jinbo Chen, Yadong Li, Guixin Xu, Yuqin Yuan, Yang Hu 0006, Yan Chen 0007. 8751-8755 [doi]
- Frequency-Domain Signal Reconstruction for Dynamic Time-Domain Weighting Hybrid Precoding with Beam SquintJinyi Yang, Lin Chen, Xue Jiang, Wei Liu. 8756-8760 [doi]
- Sparse Bayesian Learning-Based Direct Localization for Distributed Sensor Arrays with Unknown Gain and Phase ErrorsYuexian Wang, Qianyuan Shi, Chuang Han, Ling Wang 0001, Chintha Tellambura. 8761-8765 [doi]
- Fast Algorithm Design for the Constant-Envelope Precoding in Massive MIMO Communications with Interference ExploitationChunxuan Shi, Yongzhe Li, Ran Tao 0003. 8766-8770 [doi]
- Design of Spatial-Slow-Time Constant-Modulus Waveform Transmission and Receive Adaptive Filter for Dual-Function Radar Communications with Reconfigurable Intelligent SurfaceYuxuan Zhen, Chunxuan Shi, Yongzhe Li, Ran Tao 0003. 8771-8775 [doi]
- OFDM Waveform Design with Good Correlation Level and Peak-to-Mean Envelope Power Ratio for the Joint MIMO Radar And CommunicationsXiaonan Xu, Yongzhe Li, Ran Tao 0003, Tao Shan. 8776-8780 [doi]
- A Real-Time Active Speaker Detection System Integrating an Audio-Visual Signal with a Spatial Querying MechanismIlya Gurvich, Ido Leichter, Dharmendar Reddy Palle, Yossi Asher, Alon Vinnikov, Igor Abramovski, Vishak Gopal, Ross Cutler, Eyal Krupka. 8781-8785 [doi]
- A Graph Neural Network Based Approach for Fault Delineation in Seismic Data using Graph Total Variation and MultigraphPatitapaban Palo, Aurobinda Routray, Ritesh Chandra Tewari. 8786-8790 [doi]
- Newtonalized Orthogonal Matching Pursuit for Mixed Far-Field and Near-Field Source LocalizationQi Zhang, Hong Jiang, Yunchang Liu. 8791-8795 [doi]
- Gridless Parameter Estimation in Partly Calibrated Rectangular ArraysTianyi Liu, Sai Pavan Deram, Khaled Ardah, Martin Haardt, Marc E. Pfetsch, Marius Pesavento. 8796-8800 [doi]
- Joint Robust Optimal Transmit and Receive Beamforming Designs for a DFRC System for the MIMO Radar and Secondary Multicast Communication in a Cognitive Radio NetworkYongwei Huang, Jiachao Liang. 8801-8805 [doi]
- GCC-PHAT Re-Imagined - A U-Net Filter for Audio TDOA Peak-SelectionJens Gulin, Kalle Åström. 8806-8810 [doi]
- Enhancing AoA Estimation Via Phase Modeling of Bluetooth 5 CTE SignalsWentao Shi, Tao Zhang, Baoqi Huang, Bing Jia. 8811-8815 [doi]
- Fusion of Audio and Visual Embeddings for Sound Event Localization and DetectionDavide Berghi, Peipei Wu, Jinzheng Zhao, Wenwu Wang 0001, Philip J. B. Jackson. 8816-8820 [doi]
- Binaural Sound Source Localization Using a Hybrid Time and Frequency Domain ModelGil Geva, Olivier Warusfel, Shlomo Dubnov, Tammuz Dubnov, Amir Amedi, Yacov Hel-Or. 8821-8825 [doi]
- Identifiability Study of Near-Field Automotive SARMichael Shifrin, Joseph Tabrikian, Igal Bilik. 8826-8830 [doi]
- Solution and Analysis For 3-D Localization In Closed-Form Integrating Sa and TDOA MeasurementsTianyi Xing, Yimao Sun, Lihua Ni, Xiangyu Peng, Kehao Zhang, Qun Wan. 8831-8835 [doi]
- Selective User Forwarded Cell-Free Massive MIMO with Quantized SymbolsMaria Francis, K. V. S. Hari. 8836-8840 [doi]
- Target Signal Power Improvement and Clutter Suppression via Beamforming for Integrated Sensing and Communication SystemsSikai Ge, Zhiqiang Wei 0001, Zai Yang. 8841-8845 [doi]
- Reweighted Atomic Norm Minimization for One-Bit Multichannel Spectral Compressed SensingWeichao Zheng, Zai Yang. 8846-8850 [doi]
- Multitarget Tracking in the Presence of Velocity Ambiguity for Automotive RadarMahdi Koloushani, Mohammad Mahdi Naghsh, Mohammad Reza Taban, Seyed Mohammad Karbasi. 8851-8855 [doi]
- CRC-Aided Learned Ensembles of Belief-Propagation Polar DecodersTomer Raviv, Alon Goldmann, Ofek Vayner, Yair Be'ery, Nir Shlezinger. 8856-8860 [doi]
- Sparse Channel Representation and Estimation in Near Field CommunicationsXing Zhang, Haiyang Zhang, Yonina C. Eldar. 8861-8865 [doi]
- Global Optimization of Active RIS in Linear TimeHeedong Do, Namyoon Lee. 8866-8870 [doi]
- Bayesian Activity Detection for Massive Connectivity in Cell-Free IoT NetworksHao Zhang, Qingfeng Lin, Yang Li 0035, Lei Cheng 0003, Yik-Chung Wu. 8871-8875 [doi]
- A Novel Demodulation and Selection Pilot Power Trade-Off for Codebook-Based IRS with Imperfect Channel EstimatesSriram Ganesan, Neelesh B. Mehta, Rimalapudi Sarvendranath. 8876-8880 [doi]
- Cooperative Sensing Via Matrix Factorization of the Partially Received Sample Covariance MatrixRui Zhou, Wenqiang Pu, Licheng Zhao, Ming-Yi You, Qingjiang Shi, Sergios Theodoridis. 8881-8885 [doi]
- Data-Aided Channel Estimation Utilizing Gaussian Mixture ModelsFranz Weißer, Nurettin Turan, Dominik Semmler, Wolfgang Utschick. 8886-8890 [doi]
- Correlation-Based Machine Learning Techniques for Channel Estimation with Fluid AntennasShuyan Ji, Constantinos Psomas, John Thompson. 8891-8895 [doi]
- Friends to Help: Saving Federated Learning from Client DropoutHeqiang Wang, Jie Xu 0001. 8896-8900 [doi]
- Meta-Knowledge Enhanced Data Augmentation for Federated Person Re-IdentificationChunli Song, Xiaohua Chen, Wenqiu Zhu, Yucan Zhou, Xiaoyan Gu, Bo Li. 8901-8905 [doi]
- A Stochastic Proximal WMMSE for Ergodic Sum Rate MaximizationXiaotong Zhao, Xi Wang, Juncheng Wang 0001, Qingjiang Shi. 8906-8910 [doi]
- A Smoothed Bregman Proximal Gradient Algorithm for Decentralized Nonconvex OptimizationWenqiang Pu, Jiawei Zhang, Rui Zhou, Xiao Fu 0001, Mingyi Hong. 8911-8915 [doi]
- Federated Learning with Instance-Dependent Noisy LabelLei Wang, Jieming Bian, Jie Xu. 8916-8920 [doi]
- Location Optimization for RIS Aided mmWave Downlink NetworkQian Xiang, Cong Sun 0002, Danpu Liu. 8921-8925 [doi]
- Unified Probability Distributions of Generalized Composite Fading with Inverse-Type Distributions of Large-Scale Shadowing/FluctuationsChin Choy Chai, Xiao-Ping Zhang 0002. 8926-8930 [doi]
- Globally Optimal Beamforming Design for Integrated Sensing and Communication SystemsZhiguo Wang, Jiageng Wu, Ya-Feng Liu, Fan Liu 0005. 8931-8935 [doi]
- Decentralizing Coherent Joint Transmission Precoding Via Deterministic EquivalentsYuhao Liu 0005, Xinyu Bian, Yizhou Xu, Tianqi Hou, Wenjie Wang, Yuyi Mao, Jun Zhang 0004. 8936-8940 [doi]
- Anti-Deception Jamming Power Optimization Strategy for Multi-Target Tracking Tasks in Multi-Radar SystemsJun Sun, Ye Yuan, Maria Sabrina Greco, Fulvio Gini, Wei Yi. 8941-8945 [doi]
- Composite Federated Learning with Heterogeneous DataJiaojiao Zhang, Jiang Hu, Mikael Johansson 0001. 8946-8950 [doi]
- Congestion-Aware Distributed Task Offloading in Wireless Multi-Hop Networks Using Graph Neural NetworksZhongyuan Zhao 0002, Jake B. Perazzone, Gunjan Verma, Santiago Segarra. 8951-8955 [doi]
- Blind Beamforming for Intelligent Reflecting Surface: A Reinforcement Learning ApproachWenhai Lai, Kaiming Shen. 8956-8960 [doi]
- RIS Localization and Spatially Wideband Filtering EffectsDario Tagliaferri, Marouan Mizmizi, Silvia Mura, Umberto Spagnolini. 8961-8965 [doi]
- A Robust GLRT Detector Against Missing Data in Cooperative SensingJinghui Guan, Rui Zhou, Wenqiang Pu, Qingjiang Shi, Tsung-Hui Chang. 8966-8970 [doi]
- Assessing GNSS Carrier-to-Noise-Density Ratio Estimation in The Presence of Meaconer InterferenceEmile Ghizzo, Axel Garcia Pena, Julien Lesouple, Carl Milner, Christophe Macabiau. 8971-8975 [doi]
- PJSCC: A Puncturing-Based Joint Source Channel Coding Scheme with Hierarchical Down-Sampling LayerYihao Chen, Bin Tan, Jun Wu 0006, Die Hu 0002. 8976-8980 [doi]
- Learnable Statistical Moments Pooling for Automatic Modulation ClassificationClayton A. Harper, Mitchell A. Thornton, Eric C. Larson. 8981-8985 [doi]
- Time-Modulated Intelligent Reflecting Surface for Waveform SecurityZhaoyi Xu, Athina P. Petropulu. 8986-8990 [doi]
- Low-Complexity Vector Source Coding for Discrete Long Sequences with Unknown DistributionsLeah Woldemariam, Hang Liu 0007, Anna Scaglione. 8991-8995 [doi]
- Binary Signal Alignment: Optimal Solution is Polynomial-Time and Linear-Time Solution is Quasi-OptimalSpyridon Peppas, Nicholas D. Sidiropoulos. 8996-9000 [doi]
- A Binary BP Decoding Using Posterior Adjustment for Quantum LDPC CodesTzu-Hsuan Huang, Yeong-Luh Ueng. 9001-9005 [doi]
- Efficient Federated Learning with Smooth Aggregation for Non-IID Data from Multiple EdgesQianru Wang, Qingyang Li, Bin Guo, JiangTao Cui. 9006-9010 [doi]
- Deep Reinforcement Learning for Energy Minimization in Multi-RIS-Aided Cell-Free MEC NetworksMengying Sun, Wanli Ni, Xiaodong Xu 0001, Xiaofeng Tao. 9011-9015 [doi]
- Privacy-Aware Joint Source-Channel Coding For Image Transmission Based On Disentangled Information BottleneckLunan Sun, Caili Guo, Mingzhe Chen, Yang Yang 0057. 9016-9020 [doi]
- Omnidirectional Multi-Rotor Aerial Vehicle Pose Optimization: A Novel Approach to Physical Layer SecurityDaniel Bonilla Licea, Giuseppe Silano, Mounir Ghogho, Martin Saska. 9021-9025 [doi]
- Energy-Saving Cell-Free Massive MIMO Precoders with a per-AP Wideband Kronecker Channel ModelEmanuele Peschiera, Xavier Mestre, François Rottenberg. 9026-9030 [doi]
- Channel Estimation in Underdetermined Systems Utilizing Variational AutoencodersMichael Baur, Nurettin Turan, Benedikt Fesl, Wolfgang Utschick. 9031-9035 [doi]
- An Efficient Algorithm for Multiuser Sum-Rate Maximization of Large-Scale Active RIS-Aided MIMO SystemQian Zhang 0093, Mingjie Shao, Qiang Li 0017, Ju Liu. 9036-9040 [doi]
- Secure Energy Efficiency Fairness Maximization in Backscatter Throughput Constrained UAV-Assisted Data CollectionJiawang Zeng, Deepak Mishra 0001, Hassan Habibi Gharakheili, Aruna Seneviratne. 9041-9045 [doi]
- AutoCali: Enhancing AoA-based Indoor Localization through Automatic Phase CalibrationPengfei Yin, Dongheng Zhang, Tianyu Zhang, Shuai Yang, Guanzhong Wang, Yang Hu 0006, Yan Chen 0007. 9046-9050 [doi]
- Understanding Gaussian Noise Mismatch: A Hellinger Distance ApproachKexin Huang, Chaohua Shi, Lu Gan 0005, Hongqing Liu. 9051-9055 [doi]
- Deep Optimization of Relay Networks-Using Relays as NeuronsItsik Bergel. 9056-9060 [doi]
- Towards Faster End-to-End Data Transmission Over Voice ChannelsWeijun Zhang, Hao Han, Mingwei Li, Yulong Tian. 9061-9065 [doi]
- One-bit Quantization Robust to Angle-of-Arrivals for Uniform Linear Antenna ArrayShenjian Wang, Shuichi Ohno. 9066-9070 [doi]
- Joint Admission Control and Beamformer Design for Mobile Users: Stay Here or Move to a Better Position?Jingran Lin, Weijie Xiong, Qiang Li 0017, Xiangze Kong, Yuhan Zhang. 9071-9075 [doi]
- Personalized Over-The-Air Federated Learning with Personalized Reconfigurable Intelligent SurfacesJiayu Mao, Aylin Yener. 9076-9080 [doi]
- How Secure is the Time-Modulated Array-Enabled OFDM Directional Modulation?Zhihao Tao, Zhaoyi Xu, Athina P. Petropulu. 9081-9085 [doi]
- Hardware Impairments-Aware Design of Noncoherent Grassmannian ConstellationsDiego Cuevas, Javier Álvarez-Vizoso, Mikel Gutiérrez, Ignacio Santamaría, Vít Tucek, Gunnar Peters. 9086-9090 [doi]
- Joint Beamforming and Compression Design for Per-Antenna Power Constrained Cooperative Cellular NetworksXilai Fan, Ya-Feng Liu, Bo Jiang 0010. 9091-9095 [doi]
- Classification-Oriented Semantic Wireless CommunicationsEmrecan Kutay, Aylin Yener. 9096-9100 [doi]
- Multicast Transmission Design With Enhanced DOF For MIMO Coded Caching SystemsMohammad NaseriTehrani, Mohammad Javad Salehi, Antti Tölli. 9101-9105 [doi]
- Optimizing Synchronization Delay for Digital Twin over Wireless NetworksZhaohui Yang 0001, Mingzhe Chen, Yuchen Liu 0001, Zhaoyang Zhang 0001. 9106-9110 [doi]
- Leaky Waveguide Antennas for Downlink Wideband THz CommunicationsYaela Gabay, Nir Shlezinger, Tirza Routtenberg, Yasaman Ghasempour, George C. Alexandropoulos, Yonina C. Eldar. 9111-9115 [doi]
- Mitigating Data Injection Attacks on Federated LearningOr Ohev Shalom, Amir Leshem, Waheed U. Bajwa. 9116-9120 [doi]
- Adaptive Reweighted Sparse Belief Propagation Decoding for Polar CodesRobert M. Oliveira, Rodrigo C. de Lamare. 9121-9125 [doi]
- Subspace-Based Detection in OFDM ISAC Systems Under Different ConstellationsYangming Lai, Musa Furkan Keskin, Henk Wymeersch, Luca Venturino, Wei Yi, Lingjiang Kong. 9126-9130 [doi]
- Joint Computing and Communication Resource Allocation for TDMA-Based Binary Computation OffloadingM. Amin Manouchehrpour, Timothy N. Davidson. 9136-9140 [doi]
- Metasurface-Based Receivers with 1-bit ADCs for Multi-User Uplink CommunicationsPanagiotis N. Gavriilidis, Italo Atzeni, George C. Alexandropoulos. 9141-9145 [doi]
- Multi-Model Wireless Federated Learning with Downlink BeamformingChong Zhang, Min Dong 0001, Ben Liang 0001, Ali Afana, Yahia Ahmed. 9146-9150 [doi]
- Stein Variational Gradient Descent-Based Detection for Random Access with Preambles in MTCXin Zhu, Hongyi Pan, Salih Atici, Ahmet Enis Çetin. 9151-9155 [doi]
- Utilizing Second-Order Information in Noisy Information-Sharing Environments for Distributed OptimizationZhaoye Pan, Haoqi Yang, Huikang Liu. 9156-9160 [doi]
- Optimal Beamforming Structure for Rate Splitting Multiple AccessTianyu Fang, Yijie Mao. 9161-9165 [doi]
- UAV-Based Dynamic Object Tracking with Radio MapYangrui Dong, Fan Li 0001, Cunyan Ma, Chen He 0002, Z. Jane Wang 0001. 9166-9170 [doi]
- CROSSWORD: A Semantic Approach To Text Compression Via MaskingMingxiao Li, Rui Jin, Liyao Xiang, Kaiming Shen, Shuguang Cui. 9171-9175 [doi]
- Joint Blind Deconvolution And Demixing Of Sparse Signals Via Factorization And Nonconvex OptimizationMengting Chen, Ziping Zhao 0002. 9176-9180 [doi]
- State-Augmented Information Routing In Communication Systems With Graph Neural NetworksSourajit Das, Navid Naderializadeh, Alejandro Ribeiro. 9181-9185 [doi]
- Enabling Secure Wireless Communications via Movable AntennasZhenqiao Cheng, Nanxi Li, Jianchi Zhu, Xiaoming She, Chongjun Ouyang, Peng Chen 0028. 9186-9190 [doi]
- Integrated Localization and Communication in 3GPP Industrial EnvironmentsGirim Kwon, Zhenyu Liu 0003, Andrea Conti 0001, Hyuncheol Park, Moe Z. Win. 9191-9195 [doi]
- SemDA: Communication-Efficient Data Aggregation Through Distributed Semantic TransmissionYaru Zhao, Yakun Huang. 9196-9200 [doi]
- On The Resilience Of Online Federated Learning To Model Poisoning Attacks Through Partial SharingEhsan Lari, Vinay Chakravarthi Gogineni, Reza Arablouei, Stefan Werner 0001. 9201-9205 [doi]
- Optimal Structure of Receive Beamforming for Over-the-Air ComputationHongbin Zhu, Hua Qian. 9206-9210 [doi]
- Coverage Analysis For mmWave UAV Networks with Static and Dynamic BlockagesCunyan Ma, Xiaoya Li 0003, Yangrui Dong, Chen He 0002. 9211-9215 [doi]
- Pilot Length Minimization via AP-UE Clustering in Cell-Free SystemsAnubhab Chowdhury, Chandra R. Murthy. 9216-9220 [doi]
- Coding for the Unsourced B-Channel with Erasures: Enhancing the Linked Loop CodeWilliam W. Zheng, Jamison R. Ebert, Stefano Rini, Jean-François Chamberland. 9221-9225 [doi]
- Robust Symbol-Level Precoding via a Symbol-Perturbed Zero-Forcing StructureWai-Yiu Keung, Yatao Liu, Wing-Kin Ma. 9226-9230 [doi]
- FED-SDS: Adaptive Structured Dynamic Sparsity for Federated Learning Under Heterogeneous ClientsYujun Cheng, Zhewei Zhang, Shengjin Wang. 9231-9235 [doi]
- Uplink Symbol Detection in Dynamic TDD MIMO Systems with AP-AP InterferenceMartin Andersson, Tung Thanh Vu, Pål K. Frenger, Erik G. Larsson. 9236-9240 [doi]
- Scaling Results for Robust Distributed Estimation in Sensor Networks Using Order StatisticsUmar Rashid 0003, Rafay Chughtai. 9241-9245 [doi]
- Asynchronous Diffusion Learning with Agent Subsampling and Local UpdatesElsa Rizk, Kun Yuan, Ali H. Sayed. 9246-9250 [doi]
- Transmitting Data Through Reconfigurable Intelligent Surface: A Spatial Sigma-Delta Modulation ApproachWai-Yiu Keung, Hei Victor Cheng, Wing-Kin Ma. 9251-9255 [doi]
- Near-Field MIMO Channel Reconstruction Via Limited Geometry FeedbackShima Eslami, Bikshapathi Gouda, Antti Tölli. 9256-9260 [doi]
- LoFi User Scheduling for Multiuser MIMO Wireless SystemsAlexandra Gallyas-Sanhueza, Gian Marti, Victoria M. T. Palhares, Reinhard Wiesmayr, Christoph Studer. 9261-9265 [doi]
- Tag Antenna Structure Calibrated Backscattering Signal DetectionAmus Chee Yuen Goay, Deepak Mishra 0001, Ross Murch, Aruna Seneviratne. 9266-9270 [doi]
- An Asymptotically Achievable Rate Bound for Establishing High-Fidelity Entanglements in Quantum NetworksZhenyu Liu 0003, Stefano Maranò 0001, Moe Z. Win. 9271-9275 [doi]
- Fast and Accurate Root Cause Analysis Based on Signalling Messages for 5G NetworksZhaorui Guo, Jiyan Sun, Jiadong Fu, Lu Yuan, Shangyuan Zhuang, Liru Geng, YinLong Liu, Wei Ma. 9276-9280 [doi]
- Client-Free Federated Unlearning via Training Reconstruction with Anchor Subspace CalibrationChaohao Fu, Weijia Jia 0001, Na Ruan. 9281-9285 [doi]
- Energy Efficient Wake-Up Solution for Large-Scale Internet of Underwater Things NetworksAbdulaziz Al-Amodi, Nour Kouzayha, Nasir Saeed, Mudassir Masood, Tareq Y. Al-Naffouri. 9286-9290 [doi]
- Non-Uniform Frequency Spacing for Regularization-Free Gridless DOAYifan Wu, Michael B. Wakin, Peter Gerstoft, Yongsung Park. 9291-9295 [doi]
- A Variable Smoothing for Nonconvexly Constrained Nonsmooth Optimization with Application to Sparse Spectral ClusteringKeita Kume, Isao Yamada. 9296-9300 [doi]
- Learned ISTA with Error-Based Thresholding for Adaptive Sparse CodingZiang Li, Kailun Wu, Yiwen Guo, Changshui Zhang. 9301-9305 [doi]
- Multistatic Passive Detection of Cyclostationary SignalsStefanie Horstmann, David Ramírez 0001, Peter J. Schreier. 9306-9310 [doi]
- Multiple Player Tracking With 3D Projection and Spatio-Temporal Information In Multi-View Sports VideosYi-Peng Wang, Wei-Ta Chu. 9311-9315 [doi]
- Distributed Decision-Making for Community Structured NetworksValentina Shumovskaia, Mert Kayaalp, Ali H. Sayed. 9316-9320 [doi]
- Shift Operator and Separation Filter for Different Period Mixed Signals Using Companion MatrixSoo-Chang Pei, Kuo-Wei Chang. 9321-9325 [doi]
- Diagonalize Integral Graph by DCTSoo-Chang Pei, Kuo-Wei Chang. 9326-9330 [doi]
- D3: Dual-Domain Defenses for Byzantine-Resilient Decentralized Resource AllocationRunhua Wang, Qing Ling 0001, Zhi Tian. 9331-9335 [doi]
- On the Tradeoff Between Privacy Preservation and Byzantine-Robustness in Decentralized LearningHaoxiang Ye, Heng Zhu, Qing Ling 0001. 9336-9340 [doi]
- Cross Branch Feature Fusion Decoder for Consistency Regularization-Based Semi-Supervised Change DetectionYan Xing, Qi'ao Xu, Jingcheng Zeng, Rui Huang, Sihua Gao, Weifeng Xu, Yuxiang Zhang, Wei Fan. 9341-9345 [doi]
- Misspecified Time-Delay and Doppler Estimation over Non Gaussian ScenariosLorenzo Ortega, Stefano Fortunati. 9346-9350 [doi]
- Social Learning with Adaptive ModelsMarco Carpentiero, Virginia Bordignon, Vincenzo Matta, Ali H. Sayed. 9351-9355 [doi]
- A Saliency Enhanced Feature Fusion Based Multiscale RGB-D Salient Object Detection NetworkRui Huang, Qingyi Zhao, Yan Xing, Sihua Gao, Weifeng Xu, Yuxiang Zhang, Wei Fan. 9356-9360 [doi]
- Global Convergence of Alternating Direction Method of Multipliers for Invex Objective LossesSamuel Pinilla, Siu Lun Yeung, Jeyan Thiyagalingam. 9361-9365 [doi]
- A Joint Data Compression and Time-Delay Estimation Distributed Systems via Extremum EncodingAmir Weiss, Yuval Kochman, Gregory W. Wornell. 9366-9370 [doi]
- Privacy Leakage In Graph Signal To Graph Matching ProblemsHang Liu 0007, Anna Scaglione, Sean Peisert. 9371-9375 [doi]
- Ellipse Detection Based on Contrast-Guided Arc EnhancementZikai Wang, Baojiang Zhong, Kai-Kuang Ma. 9376-9380 [doi]
- Vector Nonlinear Hawkes Model with InhibitionSyed Ahmed Pasha, Victor Solo. 9381-9385 [doi]
- Lossy Compression of Adjacency Matrices by Graph Filter BanksKenta Yanagiya, Junya Hara, Hiroshi Higashi, Yuichi Tanaka 0001, Antonio Ortega. 9386-9390 [doi]
- Visual Adapt for RGBD TrackingGuangtong Zhang, Qihua Liang, Zhiyi Mo, Ning Li, Bineng Zhong. 9391-9395 [doi]
- Multi-View Interactive Compromise Learning for Group RecommendationJiuqiang Li 0001, Shilei Zhu. 9396-9400 [doi]
- Blind Deconvolution of Sparse Graph Signals in the Presence of PerturbationsVictor M. Tenorio, Samuel Rey, Antonio G. Marques. 9406-9410 [doi]
- Mesh-RTUME: Universal Manifold Embedding for Estimating 3D Rigid Transformations of SurfacesYuval Haitman, Joseph M. Francos. 9411-9415 [doi]
- Robust Low-Rank Correlation FittingThu Ha Phi, Alexandre Hippert-Ferrer, Florent Bouchard, Arnaud Breloy. 9416-9420 [doi]
- Provable Randomized Coordinate Descent for Matrix CompletionMatthew Callahan, Trung Vu, Raviv Raich. 9421-9425 [doi]
- A Method for Bilevel Optimization with Convex Lower-Level ProblemHan Shen, Santiago Paternain, Gaowen Liu, Ramana Kompella, Tianyi Chen. 9426-9430 [doi]
- Mixed Graph Signal Analysis of Joint Image Denoising / InterpolationNiruhan Viswarupan, Gene Cheung, Fengbo Lan, Michael S. Brown. 9431-9435 [doi]
- Zero-Shot Object Detection with Partitioned Contrastive Feature AlignmentHaohe Li, Chong Wang 0001, Shenghao Yu, Zheng Huo, Yujie Zheng, Jiangbo Qian. 9436-9440 [doi]
- Optimizing k in kNN Graphs with Graph Learning PerspectiveAsuka Tamaru, Junya Hara, Hiroshi Higashi, Yuichi Tanaka 0001, Antonio Ortega. 9441-9445 [doi]
- Adaptive Fourier Decomposition Based Signal Extraction on Weak Electromagnetic FieldZhenhuan Xu, Yongfei Wu, Liming Zhang 0002, Yidi Li. 9446-9450 [doi]
- IFNET: Integrating Data Augmentation and Decoupled Attention Fusion for 3D Object DetectionZhenchang Xia, Guanqun Zheng, Shengwu Xiong, Jia Wu 0001, Junyin Wang, Chenghu Du. 9451-9455 [doi]
- Computing an Entire Solution Path of a Nonconvexly Regularized Convex Sparse ModelYi Zhang, Isao Yamada. 9456-9460 [doi]
- Dynamic Privacy Allocation for Locally Differentially Private Federated Learning with Composite ObjectivesJiaojiao Zhang, Dominik Fay, Mikael Johansson 0001. 9461-9465 [doi]
- Multi-Linear Kernel Regression and Imputation via Manifold Learning: The Dynamic MRI CaseDuc Thien Nguyen, Konstantinos Slavakis. 9466-9470 [doi]
- External Division of Two Proximity Operators: An Application to Signal Recovery with Structured SparsityKyohei Suzuki, Masahiro Yukawa. 9471-9475 [doi]
- Accelerated Recovery of Spectrally Sparse Signals Via Modified Proximal Gradient in Hankel SpaceXi Yao, Wei Dai. 9476-9480 [doi]
- AlphaRotate: A Rotation Detection Benchmark Using TensorFlowXue Yang 0005, Yue Zhou 0005, Wenlong Liao, Tao He, Junchi Yan. 9481-9485 [doi]
- Multi-Antenna ISAC Receiver with n-Tuple Blind DeconvolutionRoman Jacome, Edwin Vargas, Kumar Vijay Mishra, Brian M. Sadler, Henry Arguello. 9486-9490 [doi]
- Sequential Wasserstein Uncertainty Sets for Minimax Robust Online Change DetectionYiran Yang, Liyan Xie. 9491-9495 [doi]
- Towards Optimized Multi-Channel Modulo-ADCs: Moduli Selection Strategies and Bit Depth AnalysisWenyi Yan, Lu Gan 0005, Shaoqing Hu, Hongqing Liu. 9496-9500 [doi]
- Online Auditing of Information FlowMor Oren-Loberman, Vered Azar, Wasim Huleihel. 9501-9505 [doi]
- Parameter Estimation Via Expectation Maximization - Expectation Consistent AlgorithmFangqing Xiao, Dirk Slock. 9506-9510 [doi]
- Robust Regression Analysis Based on the K-DivergenceYair Sorek, Koby Todros. 9511-9515 [doi]
- Evolution Backcasting of Edge Flows From Partial Observations Using Simplicial Vector Autoregressive ModelsRohan T. Money, Joshin Krishnan, Baltasar Beferull-Lozano, Elvin Isufi. 9516-9520 [doi]
- Multiple Object Tracking Based on Occlusion-Aware Embedding Consistency LearningYaoqi Hu, Axi Niu, Yu Zhu 0004, Qingsen Yan, Jinqiu Sun, Yanning Zhang. 9521-9525 [doi]
- Soft Image Segmentation Using Gradient Graph Laplacian RegularizerFei Chen, Gene Cheung, Xue Zhang. 9526-9530 [doi]
- Sparse Regularization Based on Reverse Ordered Weighted L1-Norm and Its Application to Edge-Preserving SmoothingTakayuki Sasaki, Yukihiro Bandoh, Masaki Kitahara. 9531-9535 [doi]
- A New Perspective on Understanding Resolution Limit Via an Asymptotic Study of Christoffel-Darboux Kernel Based Spectrum EstimatorMingyu Jiang, Wenzhe Lu, Heng Qiao. 9536-9540 [doi]
- A Convergent Primal-Dual Deep Plug-and-Play Algorithm for Constrained Image RestorationYodai Suzuki, Ryosuke Isono, Shunsuke Ono. 9541-9545 [doi]
- Ranking of Visual Trackers Using Robust Error NormsJulien Valognes, Maria A. Amer. 9546-9550 [doi]
- Unlabelled Sensing with Priors: Algorithm and BoundsGarweet Sresth, Ajit Rajwade 0001, Satish Mulleti. 9551-9555 [doi]
- Reversible Jump Markov Chain Monte Carlo for Pulse FittingFred Goodyer, Bashar I. Ahmad, Simon J. Godsill. 9556-9560 [doi]
- Efficient Multi-Channel Speech Enhancement with Spherical Harmonics Injection for Directional EncodingJiahui Pan, Pengjie Shen, Hui Zhang 0031, Xueliang Zhang 0001. 9561-9565 [doi]
- Adaptive Sensor Selection with Deterministic Priors for DoA TrackingKaushani Majumder, Sibi Raj B. Pillai, Yonina C. Eldar, Satish Mulleti. 9566-9570 [doi]
- Dynamic Bandwidth Variational Mode DecompositionAndreas G. Angelou, Georgios K. Apostolidis, Leontios J. Hadjileontiadis. 9571-9575 [doi]
- Unitary Approximate Message Passing for Matrix FactorizationZhengdao Yuan, Qinghua Guo 0001, Yonina C. Eldar, Yonghui Li 0001. 9576-9580 [doi]
- On the Convergence of Single-Timescale Multi-Sequence Stochastic Approximation Without Fixed Point SmoothnessYue Huang, Zhaoxian Wu, Qing Ling 0001. 9581-9585 [doi]
- Learn to Track-Before-Detect via Neural Dynamic ProgrammingEyal Fishel Ben, Nikita Tsarov, Tslil Tapiro, Itay Nuri, Nir Shlezinger. 9586-9590 [doi]
- Graph Local-Smooth Dictionary LearningQuentin Laborde, Antoine Mazarguil, Laurent Oudre. 9591-9595 [doi]
- Vector Approximate Message Passing with Arbitrary I.I.D. Noise PriorsMohamed Akrout, Tiancheng Gao, Faouzi Bellili, Amine Mezghani. 9596-9600 [doi]
- Finding Representative Sampling Subsets on Graphs via SubmodularityTianyi Li, Geert Leus. 9601-9605 [doi]
- Unsupervised Anomaly Detection for Multivariate Time Series Using Diffusion ModelRongyao Hu, Xinyu Yuan, Yan Qiao, Benchu Zhang, Pei Zhao. 9606-9610 [doi]
- Distributed Vector Approximate Message PassingMukilan Karuppasamy, Mohamed Akrout, Faouzi Bellili, Amine Mezghani. 9611-9615 [doi]
- CAG-FPN: Channel Self-Attention Guided Feature Pyramid Network for Object DetectionJie Chang, Huhe Dai, Yuan Zheng 0002. 9616-9620 [doi]
- Multi-Sensor Multi-Scan Radar Sensing of Multiple Extended TargetsMartin Voigt Vejling, Christophe A. N. Biscio, Petar Popovski. 9621-9625 [doi]
- Statistical and Computational Limits of Detecting and Recovering Hidden SubmatricesMarom Dadon, Wasim Huleihel, Tamir Bendory. 9626-9630 [doi]
- Cardinality-Constrained Binary Quadratic Optimization via Extreme Point Pursuit, with Application to the Densest K-Subgraph ProblemYa Liu, Junbin Liu, Wing-Kin Ma. 9631-9635 [doi]
- Frequency Estimation via Sub-Nyquist Unlimited SamplingYuliang Zhu, Ruiming Guo, Peiyu Zhang, Ayush Bhandari. 9636-9640 [doi]
- On the Generalization Error of Byzantine-Resilient Decentralized LearningHaoxiang Ye, Qing Ling 0001. 9641-9645 [doi]
- Joint Signal Interpolation / Time-Varying Graph Estimation Via Smoothness and Low-Rank PriorsSaghar Bagheri, Gene Cheung, Tim Eadie, Antonio Ortega. 9646-9650 [doi]
- A Novel Iterative Thresholding Algorithm for Arctangent Regularization ProblemZihao He, Qianyu Shu, Jinming Wen, Hing-Cheung So. 9651-9655 [doi]
- Large Covariance Matrix Estimation Based on Factor Models via Nonconvex OptimizationShanshan Zou, Ziping Zhao. 9656-9660 [doi]
- Frequency Analysis and Filter Design for Directed Graphs with Polar DecompositionSemin Kwak, Laura Shimabukuro, Antonio Ortega. 9661-9665 [doi]
- Signal Reconstruction from Nonideal Samples in Fractional Fourier Transform DomainXiaoping Liu, Gong Chen, Jun Shi, Ran Tao. 9666-9670 [doi]
- Probabilistic Simplex Component Analysis via Variational Auto-EncodingYuening Li, Xiao Fu 0001, Wing-Kin Ma. 9671-9675 [doi]
- Blind Separation of Noisy Mixtures Over Galois FieldsOri Ohayon, Arie Yeredor. 9677-9680 [doi]
- Learning Signals and Graphs from Time-Series Graph Data with Few CausesPanagiotis Misiakos, Vedran Mihal, Markus Püschel. 9681-9685 [doi]
- Physically-Constrained Block-Term Tensor Decomposition for Polarimetric Image RecoverySaulo Cardoso Barreto, Julien Flamant, Sebastian Miron, David Brie. 9686-9690 [doi]
- Extension of Clifford Data Regression Methods for Quantum Error MitigationJordi Pérez-Guijarro, Alba Pagès-Zamora, Javier R. Fonollosa. 9691-9695 [doi]
- Imposing Early and Asymptotic Constraints on LiGME with Application to Nonconvex Enhancement of Fused Lasso ModelsWataru Yata, Isao Yamada. 9696-9700 [doi]
- End-to-End Learning of Gaussian Mixture Proposals Using Differentiable Particle Filters and Neural NetworksBenjamin Cox, Sara Pérez-Vieites, Nicolas Zilberstein, Martin Sevilla, Santiago Segarra, Víctor Elvira. 9701-9705 [doi]
- A Modified Cramér-Rao Bound for Discrete-Time Markovian Dynamic SystemsSara El Bouch, Jérôme Galy, Eric Chaumette, Jordi Vilà-Valls. 9706-9710 [doi]
- Dual-Channel Unlimited Sampling for Bandpass SignalsGal Shtendel, Ayush Bhandari. 9711-9715 [doi]
- Sparse PCA with False Discovery Rate Controlled Variable SelectionJasin Machkour, Arnaud Breloy, Michael Muma, Daniel P. Palomar, Frédéric Pascal 0001. 9716-9720 [doi]
- Bayesian Topology Inference on Partially Known Networks from Input-Output PairsMartin Sevilla, Santiago Segarra. 9721-9725 [doi]
- Recursive-Tail-FISTA for Sparse Signal RecoveryPradyumna Pradhan, Shaik Basheeruddin Shah, Ramunaidu Randhi, Yonina C. Eldar. 9726-9730 [doi]
- Neuromorphic Sensing Meets Unlimited SamplingAbijith Jagannath Kamath, Chandra Sekhar Seelamantula. 9731-9735 [doi]
- Riemannian Diffusion Adaptation over Graphs with Application to Online Distributed PCAXiuheng Wang, Ricardo Augusto Borsoi, Cédric Richard. 9736-9740 [doi]
- On Time-Encoded Sampling for Multigenerator Shift Invariant SpacesRoshaan Soundarapandian, Amitalok J. Budkuley, Stefano Rini. 9741-9745 [doi]
- Hodge-Aware Contrastive LearningAlexander Möllers, Alexander Immer, Vincent Fortuin, Elvin Isufi. 9746-9750 [doi]
- A Wasserstein Graph Distance Based on Distributions of Probabilistic Node EmbeddingsMichael Scholkemper, Damin Kühn, Gerion Nabbefeld, Simon Musall, Björn Kampa, Michael T. Schaub. 9751-9755 [doi]
- Dynamic Random Feature Gaussian Processes for Bayesian Optimization of Time-Varying FunctionsFernando Llorente 0001, Petar M. Djuric. 9756-9760 [doi]
- Directed Scattering for Knowledge Graph-Based Cellular Signaling AnalysisAarthi Venkat, Joyce A. Chew, Ferran Cardoso Rodriguez, Christopher J. Tape, Michael Perlmutter, Smita Krishnaswamy. 9761-9765 [doi]
- Asymptotic Behavior of Super-Resolution Sparse Bayesian LearningDmitriy Shutin. 9766-9770 [doi]
- Exact Classification of NMR Spectra from NMR SignalsPedro Izquierdo Lehmann, Aline Xavier, Marcelo E. Andia, Carlos A. Sing-Long. 9771-9775 [doi]
- The Rao, Wald, And Likelihood-Ratio Tests under Generalized Self-ConcordanceLang Liu, Zaïd Harchaoui. 9776-9780 [doi]
- A Graph-Prediction-Based Approach for Debiasing Underreported DataHanyang Jiang, Yao Xie 0002. 9781-9785 [doi]
- Symmetric VAR(1) Modelling with Guaranteed StabilityXinhui Rong, Victor Solo. 9786-9790 [doi]
- Sequence of Linear Program for Robust Phase RetrievalSeonho Kim, Kiryung Lee. 9791-9795 [doi]
- Risk-Managed Sparse Index Tracking Via Market Graph ClusteringEisuke Yamagata, Shunsuke Ono. 9796-9800 [doi]
- Irregularity-Aware Bandlimited Approximation for Graph Signal InterpolationDarukeesan Pakiyarajah, Eduardo Pavez, Antonio Ortega. 9801-9805 [doi]
- Graph Signal Processing: The 2D Companion ModelJohn Shi, José M. F. Moura. 9806-9810 [doi]
- Federated Quantum Machine Learning with Differential PrivacyRod Rofougaran, Shinjae Yoo, Huan-Hsin Tseng, Samuel Yen-Chi Chen. 9811-9815 [doi]
- Sketched Column-Based Matrix Approximation With Side InformationJeongMin Chae, Praneeth Narayanamurthy, Selin Bac, Shaama Mallikarjun Sharada, Urbashi Mitra. 9816-9820 [doi]
- Multivariate Density Estimation Using Low-Rank Fejér-Riesz FactorizationParis A. Karakasis, Nicholas D. Sidiropoulos. 9821-9825 [doi]
- Kalman Filtering With Unlimited SensingHongwei Wang, Xi Zheng, Hongbin Li 0001. 9826-9830 [doi]
- A Riemannian-Based Joint Design Framework of MIMO Radar Transmit Waveform And Receive Filter Via Information TheoryJie Li, Yan Huang, QiHui Wu, Arye Nehorai. 9831-9835 [doi]
- An Efficient Hierarchical Block Coordinate Descent Method for Time-Varying Graphical LassoZhaoye Pan, Xiaolu Wang, Huikang Liu, Jun Zhang. 9836-9840 [doi]
- Spectrogram Smoothing for Estimation of the Evolutionary Spectra of Uniformly Modulated ProcessesSkyepaphora Griffith, Glen Takahara, Wesley S. Burr. 9841-9845 [doi]
- Object Correlation Matrix for Two-Stage Object Detection NetworkBing Wang, Hangbin Ye, Xingpeng Zhang, Dong He, Xin Wang, Qiuli Wang 0004, Chunlan Zhao. 9846-9850 [doi]
- Self Knowledge Distillation Based On Layer-Wise Weighted Feature Imitation For Efficient Object DetectionLiangqi Zhong, Shengye Yan. 9851-9855 [doi]
- Optimal Transport Distances for Directed, Weighted Graphs: A Case Study With Cell-Cell Communication NetworksJames Shiniti Nagai, Ivan G. Costa, Michael T. Schaub. 9856-9860 [doi]
- Learning Graphs and Simplicial Complexes from DataAndrei Buciulea, Elvin Isufi, Geert Leus, Antonio G. Marques. 9861-9865 [doi]
- Neural Network-Based Symbolic Regression for Empirical Modeling of the Behavior of a Planetary GearboxNacer Yousfi, Karim Abed-Meraim, Yosra Marnissi, Maxime Leiber, Mohamed El Badaoui. 9866-9870 [doi]
- Cramer-Rao Bound for Admittance Matrix Estimation under Laplacian ConstraintsMorad Halihal, Tirza Routtenberg. 9871-9875 [doi]
- Inferring Time Varying Signals over Uncertain GraphsMohammad Sabbaqi, Elvin Isufi. 9876-9880 [doi]
- Spiral Shape Matters: Novel Bio-Inspired Cochlear CepstrumHessa Alfalahi, Ahsan H. Khandoker, Leontios J. Hadjileontiadis. 9881-9885 [doi]
- Robust Recovery of Joint Sparse Signals via Simultaneous Orthogonal Matching PursuitYuxuan Zhang, Jian Wang. 9886-9890 [doi]
- An Unsupervised Segmentation of Vocal Breath SoundsShivani Yadav, Dipanjan Gope, K. Uma Maheswari, Prasanta Kumar Ghosh. 9891-9895 [doi]
- Disentangling the Spectral Properties of the Hodge Laplacian: Not All Small Eigenvalues are EqualVincent P. Grande, Michael T. Schaub. 9896-9900 [doi]
- Tensor Graph Decomposition for Temporal NetworksBishwadeep Das, Elvin Isufi. 9901-9905 [doi]
- Learning the Barankin Lower Bound on DOA Estimation ErrorHai Victor Habi, Hagit Messer, Yoram Bresler. 9906-9910 [doi]
- Cyclic Misspecified Cramer-Rao Bound for Periodic Parameter EstimationMalaak Khatib, Nadav Harel, Yochai Ben-Horin, Yael Radzyner, Tirza Routtenberg. 9911-9915 [doi]
- Asymptotically Tight Misspecified Bayesian Cramér-Rao BoundNadav E. Rosenthal, Joseph Tabrikian. 9916-9920 [doi]
- Enhancing Hyperspectral Anomaly Detection by Difference-of-Convex Sparse Anomaly ModelingKoyo Sato, Kazuki Naganuma, Shunsuke Ono. 9921-9925 [doi]
- Filamentary Convolution for Spoken Language Identification: A Brain-Inspired ApproachBoyuan Zhang, Shuyuan Zhu, Tong Xie, Xibang Yang, Yahui Liu, Bing Zeng. 9926-9930 [doi]
- Recovering Missing Node Features with Local Structure-Based EmbeddingsVictor M. Tenorio, Madeline Navarro, Santiago Segarra, Antonio G. Marques. 9931-9935 [doi]
- A Spectral Analysis of Graph Neural Networks on Dense and Sparse GraphsLuana Ruiz, Ningyuan Teresa Huang, Soledad Villar. 9936-9940 [doi]
- TACos: Learning Temporally Structured Embeddings for Few-Shot Keyword Spotting with Dynamic Time WarpingKevin Wilkinghoff, Alessia Cornaggia-Urrigshardt. 9941-9945 [doi]
- Leverage Causal Graphs and Rumor-Refuting Texts for Interpretable Rumor AnalysisJiawen Huang, Donglin Cao, Dazhen Lin. 9946-9950 [doi]
- Towards Controlled Table-to-Text Generation with Scientific ReasoningZhixin Guo, Jianping Zhou 0004, Jiexing Qi, Mingxuan Yan, Ziwei He, Guanjie Zheng, Zhouhan Lin, Xinbing Wang, Chenghu Zhou. 9951-9955 [doi]
- Distilling Distributional Uncertainty from a Gaussian ProcessJeremy H. M. Wong, Nancy F. Chen. 9956-9960 [doi]
- Graph-Aware Multi-View Fusion for Rumor Detection on Social MediaYang Wu, Jing Yang, Liming Wang, Zhen Xu. 9961-9965 [doi]
- A Unified Framework for Multi-Intent Spoken Language Understanding with PromptingFeifan Song 0001, Lianzhe Huang, Houfeng Wang. 9966-9970 [doi]
- Unsupervised Learning of Neural Semantic Mappings with the Hungarian Algorithm for Compositional SemanticsXiang Zhang, Shizhu He, Kang Liu, Jun Zhao. 9971-9975 [doi]
- Language Model is a Branch Predictor for Simultaneous Machine TranslationAoxiong Yin, Tianyun Zhong, Haoyuan Li, Siliang Tang, Zhou Zhao. 9976-9980 [doi]
- Taming Prompt-Based Data Augmentation for Long-Tailed Extreme Multi-Label Text ClassificationPengyu Xu, MingYang Song, Ziyi Li, Sijin Lu, Liping Jing, Jian Yu. 9981-9985 [doi]
- SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASRZhiyun Fan, Linhao Dong, Jun Zhang 0066, Lu Lu 0015, Zejun Ma. 9986-9990 [doi]
- How Can Personalized Context Help? Exploring Joint Retrieval of Passage and Personalized ContextHui Wan, Hongkang Li, Songtao Lu, Xiaodong Cui, Marina Danilevsky. 9991-9995 [doi]
- An Empirical Investigation of Domain Adaptation Ability for Chinese Spelling Check ModelsXi Wang, Ruoqing Zhao, Hongliang Dai, Piji Li. 9996-10000 [doi]
- Communication-Efficient Personalized Federated Learning for Speech-to-Text TasksYichao Du, Zhirui Zhang, Linan Yue, Xu Huang 0008, Yuqing Zhang, Tong Xu 0001, Linli Xu, Enhong Chen. 10001-10005 [doi]
- Zero Resource Code-Switched Speech Benchmark Using Speech Utterance Pairs for Multiple Spoken LanguagesKuan-Po Huang, Chih-Kai Yang, Yu-Kuan Fu, Ewan Dunbar, Hung-yi Lee. 10006-10010 [doi]
- Automatic Temporal Alignment for Pitch Estimation EvaluationDesheng Wang, Jing Wang, Hao Zheng, Yanbin Hou. 10011-10015 [doi]
- A Soft Contrastive Learning-Based Prompt Model for Few-Shot Sentiment AnalysisJingyi Zhou, Jie Zhou 0015, Jiabao Zhao, Siyin Wang, Haijun Shan, Tao Gui, Qi Zhang 0001, Xuanjing Huang 0001. 10016-10020 [doi]
- GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Accurate Speech Emotion RecognitionYu Pan, Yanni Hu, Yuguang Yang 0005, Wen Fei, Jixun Yao, Heng Lu 0004, Lei Ma, Jianjun Zhao. 10021-10025 [doi]
- In-Context Learning for Few-Shot Nested Named Entity RecognitionMeishan Zhang, Bin Wang 0004, Hao Fei 0001, Min Zhang 0005. 10026-10030 [doi]
- Adaptive Prompt Construction Method for Relation ExtractionZhenbin Chen, Zhixin Li 0001, Ying Huang, Zhenjun Tang. 10031-10035 [doi]
- Large Scale Self-Supervised Pretraining for Active Speaker DetectionOtavio Braga, Wei Xia, Keith Johnson, Alice Chuang, Yunfan Ye, Olivier Siohan, Tuan-Anh Nguyen. 10036-10040 [doi]
- Invertible Voice Conversion with Parallel DataZexin Cai, Ming Li 0026. 10041-10045 [doi]
- Incomplete Observations Bias Suppression for Abductive Natural Language InferenceYu Gu, Xianlong Luo, Meng Yang. 10046-10050 [doi]
- Implicit Enhancement of Target Speaker in Speaker-Adaptive ASR through Efficient Joint OptimizationMinghui Wu, Haitao Tang, Jiahuan Fan, Ruoyu Wang, Hang Chen, Yanyong Zhang, Jun Du, Hengshun Zhou, Lei Sun, Xin Fang, Tian Gao, Genshun Wan, Jia Pan, Jianqing Gao. 10051-10055 [doi]
- Empowering Vision-Language Models for Reasoning Ability through Large Language ModelsYueting Yang, Xintong Zhang, Jinan Xu, Wenjuan Han. 10056-10060 [doi]
- G2PU: Grapheme-To-Phoneme Transducer with Speech UnitsHeting Gao, Mark Hasegawa-Johnson, Chang D. Yoo. 10061-10065 [doi]
- Ultra-Lightweight Neural Differential DSP Vocoder for High Quality Speech SynthesisPrabhav Agrawal, Thilo Köhler, Zhiping Xiu, Prashant Serai, Qing He. 10066-10070 [doi]
- Sensi-BERT: Towards Sensitivity Driven Fine-Tuning for Parameter-Efficient Language ModelSouvik Kundu 0002, Sharath Nittur Sridhar, Maciej Szankin, Sairam Sundaresan. 10071-10075 [doi]
- Robust Self-Supervised Learning with Contrast Samples for Natural Language UnderstandingJie Liu, Xue Han, Chao Deng, Junlan Feng. 10076-10080 [doi]
- Multi-View Speaker Embedding Learning for Enhanced Stability and DiscriminabilityLiang He 0003, Zhihua Fang, Zuoer Chen, Minqiang Xu, Ying Meng, PengHao Wang. 10081-10085 [doi]
- 3TQA: Multi-View, Multi-Hop and Multi-Stage Reasoning for Temporal Question AnsweringZhiyuan Zha, Pengnian Qi, Xigang Bao, Mengyuan Tian, Biao Qin. 10086-10090 [doi]
- Introducing Multilingual Phonetic Information to Speaker Embedding for Speaker VerificationZhida Song, Liang He 0003, PengHao Wang, Ying Hu 0005, Hao Huang 0009. 10091-10095 [doi]
- Personalization of CTC-Based End-to-End Speech Recognition Using Pronunciation-Driven Subword TokenizationZhihong Lei, Ernest Pusateri, Shiyi Han, Leo Liu, Mingbin Xu, Tim Ng, Ruchir Travadi, Youyuan Zhang, Mirko Hannemann, Man-Hung Siu, Zhen Huang 0001. 10096-10100 [doi]
- Improving Speech Emotion Recognition with Unsupervised Speaking Style TransferLeyuan Qu, Wei Wang 0310, Cornelius Weber, Pengcheng Yue, Taihao Li, Stefan Wermter. 10101-10105 [doi]
- Temporal Knowledge Graph Embedding using Householder TransformationsSensen Zhang, Xun Liang 0001, Simin Niu, Junlan Feng, Chen Feng, Mengwei Wang. 10106-10110 [doi]
- Emotion Neural Transducer for Fine-Grained Speech Emotion RecognitionSiyuan Shen, Yu Gao, Feng Liu 0039, Hanyang Wang, Aimin Zhou. 10111-10115 [doi]
- S-Evaluator: Enhance Factual Consistency Evaluator with Adversarial Data Synthesized by Large Language ModelJunnan Liu, Wenlong Du, Qingquan Li, Xuewei Wang, Zhongjun Zhou, Jin Liu. 10116-10120 [doi]
- BigVSAN: Enhancing GAN-Based Neural Vocoders with Slicing Adversarial NetworkTakashi Shibuya 0001, Yuhta Takida, Yuki Mitsufuji. 10121-10125 [doi]
- Self-Supervised Pretraining for Robust Personalized Voice Activity Detection in Adverse ConditionsHolger Severin Bovbjerg, Jesper Jensen 0001, Jan Østergaard, Zheng-Hua Tan. 10126-10130 [doi]
- An End-to-End EEG Channel Selection Method with Residual Gumbel Softmax for Brain-Assisted Speech EnhancementQing-Tian Xu, Jie Zhang, Zhen-Hua Ling. 10131-10135 [doi]
- Conformer is All You Need for Visual Speech RecognitionOscar Chang, Hank Liao, Dmitriy Serdyuk, Ankit Shahy, Olivier Siohan. 10136-10140 [doi]
- Two-Step Knowledge Distillation for Tiny Speech EnhancementRayan Daod Nathoo, Mikolaj Kegler, Marko Stamenovic. 10141-10145 [doi]
- Co-Occurrence Graph-Enhanced Hierarchical Prediction of ICD CodesSoha Sadat Mahdi, Eirini Papagiannopoulou, Nikos Deligiannis, Hichem Sahli. 10146-10150 [doi]
- Contextual Biasing of Named-Entities with Large Language ModelsChuanneng Sun, Zeeshan Ahmed, Yingyi Ma, Zhe Liu 0011, Lucas Kabela, Yutong Pang, Ozlem Kalinli. 10151-10155 [doi]
- Robust Spoof Speech Detection Based on Multi-Scale Feature Aggregation and Dynamic ConvolutionHaochen Wu, Jie Zhang, Zhentao Zhang, Wenting Zhao, Bin Gu, Wu Guo. 10156-10160 [doi]
- Temporal Convolution Shrinkage Network for Keyword SpottingHai Zhu, Xin Wang, Kun Wang, Huayi Zhan. 10161-10165 [doi]
- What Do Self-Supervised Speech and Speaker Models Learn? New Findings from a Cross Model Layer-Wise AnalysisTakanori Ashihara, Marc Delcroix, Takafumi Moriya, Kohei Matsuura, Taichi Asami, Yusuke Ijima. 10166-10170 [doi]
- Unleashing Trigger-Free Event Detection: Revealing Event Correlations Via a Contrastive Derangement FrameworkHongzhan Lin 0001, Haiqin Yang, Ziyang Luo, Jing Ma 0004. 10171-10175 [doi]
- Multiple Representation Transfer from Large Language Models to End-to-End ASR SystemsTakuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Masayasu Muraoka, George Saon. 10176-10180 [doi]
- A Unified Front-End Framework for English Text-to-Speech SynthesisZelin Ying, Chen Li 0042, Yu Dong, Qiuqiang Kong, Qiao Tian, Yuanyuan Huo, Yuxuan Wang 0002. 10181-10185 [doi]
- Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic CodingChunyu Qiang, Hao Li, Hao Ni, He Qu, Ruibo Fu, Tao Wang 0074, Longbiao Wang, Jianwu Dang 0001. 10186-10190 [doi]
- Promoting Independence of Depression and Speaker Features for Speaker Disentanglement in Speech-Based Depression DetectionLishi Zuo, Man-Wai Mak, Youzhi Tu. 10191-10195 [doi]
- Learning Speech Representation from Contrastive Token-Acoustic PretrainingChunyu Qiang, Hao Li, Yixin Tian, Ruibo Fu, Tao Wang 0074, Longbiao Wang, Jianwu Dang 0001. 10196-10200 [doi]
- Accent-Specific Vector Quantization for Joint Unsupervised and Supervised Training in Accent Robust Speech RecognitionLi Li, Yijie Li, Dongxing Xu, Haoran Wei, Yanhua Long. 10201-10205 [doi]
- SDIF-DA: A Shallow-to-Deep Interaction Framework with Data Augmentation for Multi-Modal Intent DetectionShijue Huang, Libo Qin 0001, Bingbing Wang, Geng Tu, Ruifeng Xu. 10206-10210 [doi]
- Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast ConformerMaxime Burchi, Krishna C. Puvvada, Jagadeesh Balam, Boris Ginsburg, Radu Timofte. 10211-10215 [doi]
- TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-Device ASR ModelsYuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra. 10216-10220 [doi]
- A Speaker Recognition Method Based on Stable LearningJian Zhang, Jing Ma, Xiaochen Guo, Lin Li, Liang He. 10221-10225 [doi]
- Multi-Source Unsupervised Transfer Components Learning for Cross-Domain Speech Emotion RecognitionShenjie Jiang, Peng Song 0002, Shaokai Li, Run Wang, Wenming Zheng. 10226-10230 [doi]
- Build a 50+ Hours Chinese Mandarin Corpus for Children's Speech RecognitionHao Xu, Jing Yang, Jiahao Wang, Wenxin Hu. 10231-10235 [doi]
- Unsupervised Accent Adaptation Through Masked Language Model Correction of Discrete Self-Supervised Speech UnitsJakob Poncelet, Hugo Van Hamme. 10236-10240 [doi]
- On Real-Time Multi-Stage Speech Enhancement SystemsLingjun Meng, Jozef Coldenhoff, Paul Kendrick, Tijana Stojkovic, Andrew Harper, Kiril Ratmanski, Milos Cernak. 10241-10245 [doi]
- RL-EMO: A Reinforcement Learning Framework for Multimodal Emotion RecognitionChengwen Zhang, Yuhao Zhang, Bo Cheng 0001. 10246-10250 [doi]
- Automatic Recognition of Gesture Identity and Onset of Cued-SpeechAnnahita Sarré, Hagar Salpeter, Deliane Bechar, Laurent Cohen, Yair Lakretz. 10251-10255 [doi]
- Attention-Guided Adaptation for Code-Switching Speech RecognitionBobbi Aditya, Mahdin Rohmatillah, Liang-Hsuan Tai, Jen-Tzung Chien. 10256-10260 [doi]
- Forgetting Private Textual Sequences in Language Models Via Leave-One-Out EnsembleZhe Liu 0011, Ozlem Kalinli. 10261-10265 [doi]
- A Study on the Adverse Impact of Synthetic Speech on Speech RecognitionJian Huang, Yancheng Bai, Yang Cai, Wei Bian. 10266-10270 [doi]
- Voxblink: A Large Scale Speaker Verification Dataset on CameraYuke Lin, Xiaoyi Qin, Guoqing Zhao, Ming Cheng, Ning Jiang, Haiying Wu, Ming Li. 10271-10275 [doi]
- Hint-Enhanced In-Context Learning Wakes Large Language Models Up For Knowledge-Intensive TasksYifan Wang, Qingyan Guo, Xinzhe Ni, Chufan Shi, Lemao Liu, Haiyun Jiang, Yujiu Yang. 10276-10280 [doi]
- Can ChatGPT Serve as a Multi-Criteria Decision Maker? A Novel Approach to Supplier EvaluationXihui Wang, Xiaojun Wu. 10281-10285 [doi]
- SADA: Saudi Audio Dataset for ArabicSadeen Alharbi, Areeb Alowisheq, Zoltán Tüske, Kareem Darwish, Abdullah Alrajeh, Abdulmajeed Alrowithi, Aljawharah Bin Tamran, Asma Ibrahim, Raghad Aloraini, Raneem Alnajim, Ranya Alkahtani, Renad Almuasaad, Sara Alrasheed, Shaykhah Alsubaie, Yaser Alonaizan. 10286-10290 [doi]
- TB-ResNet: Bridging the Gap from TDNN to ResNet in Automatic Speaker Verification with Temporal-Bottleneck EnhancementSunmook Choi, Sanghyeok Chung, Seungeun Lee, Soyul Han, Taein Kang, Jaejin Seo, Il-Youp Kwak, Seungsang Oh. 10291-10295 [doi]
- Can We Trust Explainable AI Methods on ASR? An Evaluation on Phoneme RecognitionXiaoliang Wu, Peter Bell 0001, Ajitha Rajan. 10296-10300 [doi]
- TextrolSpeech: A Text Style Control Speech Corpus with Codec Language Text-to-Speech ModelsShengpeng Ji, Jialong Zuo, Minghui Fang 0002, Ziyue Jiang 0004, Feiyang Chen, Xinyu Duan, Baoxing Huai, Zhou Zhao. 10301-10305 [doi]
- OpenTE: Open-Structure Table Extraction From TextHaoyu Dong 0001, Mengkang Hu, Qinyu Xu, Haochen Wang, Yue Hu 0002. 10306-10310 [doi]
- Can Large-Scale Vocoded Spoofed Data Improve Speech Spoofing Countermeasure with a Self-Supervised Front End?Xin Wang 0037, Junichi Yamagishi. 10311-10315 [doi]
- Paralinguistics-Enhanced Large Language Modeling of Spoken DialogueGuan-Ting Lin, Prashanth Gurunath Shivakumar, Ankur Gandhe, Chao-Han Huck Yang, Yile Gu, Shalini Ghosh, Andreas Stolcke, Hung-yi Lee, Ivan Bulyko. 10316-10320 [doi]
- SYNTHE-SEES: Face Based Text-to-Speech for Virtual SpeakerJae-Hyun Park, Joon Gyu Maeng, Taejun Bak, Young-Sun Joo. 10321-10325 [doi]
- Wav2vec-VC: Voice Conversion via Hidden Representations of Wav2vec 2.0Jaemin Lim, Kiyeon Kim. 10326-10330 [doi]
- Joint Multi-Facts Reasoning Network for Complex Temporal Question Answering Over Knowledge GraphRikui Huang, Wei Wei 0002, Xiaoye Qu, Wenfeng Xie, Xianling Mao, Dangyang Chen. 10331-10335 [doi]
- Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Automatic Speaker VerificationDuc-Tuan Truong, Ruijie Tao, Jia Qi Yip, Kong-Aik Lee, Eng Siong Chng. 10336-10340 [doi]
- Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic ModelsJu-ho Kim, Jungwoo Heo, Hyun-seo Shin, Chan-yeong Lim, Ha-Jin Yu. 10341-10345 [doi]
- SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization AbilityXian Shi, Yexin Yang, Zerui Li, Yanni Chen, Zhifu Gao, Shiliang Zhang. 10346-10350 [doi]
- Inversive-Reasoning Augmentation for Natural Language InferenceXixi Zhou, Xin Jie, Sheng Zhou 0004, Keyue Shi, Zhi Yu, Jiajun Bu, Haishuai Wang. 10351-10355 [doi]
- MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech SeparationShengkui Zhao, Yukun Ma, Chongjia Ni, Chong Zhang 0003, Hao Wang 0199, Trung Hieu Nguyen 0001, Kun Zhou 0003, Jia Qi Yip, Dianwen Ng, Bin Ma 0001. 10356-10360 [doi]
- Hierarchical Speaker Representation for Target Speaker ExtractionShulin He, Huaiwen Zhang, Wei Rao, Kanghao Zhang, Yukai Ju, Yang Yang, Xueliang Zhang 0001. 10361-10365 [doi]
- Are Soft Prompts Good Zero-Shot Learners for Speech Recognition?Dianwen Ng, Chong Zhang 0003, Ruixi Zhang, Yukun Ma, Fabian Ritter Gutierrez, Trung Hieu Nguyen 0001, Chongjia Ni, Shengkui Zhao, Eng Siong Chng, Bin Ma 0001. 10366-10370 [doi]
- Zero Shot Audio To Audio Emotion Transfer With Speaker DisentanglementSoumya Dutta, Sriram Ganapathy. 10371-10375 [doi]
- Generating High-Quality Adversarial Examples with Universal Perturbation-Based Adaptive Network and Improved Perceptual LossZhuhai Li, Wu Guo, Jie Zhang. 10376-10380 [doi]
- Leveraging Timestamp Information for Serialized Joint Streaming Recognition and TranslationSara Papi, Peidong Wang, Junkun Chen, Jian Xue, Naoyuki Kanda, Jinyu Li 0001, Yashesh Gaur. 10381-10385 [doi]
- Improving Vision-Inspired Keyword Spotting Using Dynamic Module Skipping in Streaming Conformer EncoderAlexandre Bittar, Paul Dixon, Mohammad Samragh, Kumari Nishu, Devang Naik. 10386-10390 [doi]
- LITEVSR: Efficient Visual Speech Recognition by Learning from Speech Representations of Unlabeled DataHendrik Laux, Emil Mededovic, Ahmed Hallawa, Lukas Martin, Arne Peine, Anke Schmeink. 10391-10395 [doi]
- Cross-Modal Parallel Training for Improving End-to-End Accented Speech RecognitionRenchang Dong, Yijie Li, Dongxing Xu, Yanhua Long. 10396-10400 [doi]
- Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTSYifan Yang, Feiyu Shen, Chenpeng Du, Ziyang Ma, Kai Yu 0004, Daniel Povey, Xie Chen 0001. 10401-10405 [doi]
- What Do Neural Networks Listen to? Exploring the Crucial Bands in Speech Enhancement Using SINC-ConvolutionKuan-Hsun Ho, Jeih-Weih Hung, Berlin Chen. 10406-10410 [doi]
- Speaker-Adaptive Lipreading Via Spatio-Temporal Information LearningYi He, Lei Yang, Hanyi Wang, Yun Zhu, Shilin Wang. 10411-10415 [doi]
- BWSNET: Automatic Perceptual Assessment of Audio SignalsClément Le Moine Veillon, Victor Rosi, Pablo Arias Sarah, Léane Salais, Nicolas Obin. 10416-10420 [doi]
- Target Speech Extraction with Pre-Trained Self-Supervised Learning ModelsJunyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Shoko Araki, Jan Cernocký. 10421-10425 [doi]
- Hybrid Attention Time-Frequency Analysis Network for Single-Channel Speech EnhancementZehua Zhang, Xingwei Liang, Ruifeng Xu, Mingjiang Wang. 10426-10430 [doi]
- Diffusion-Based Speech Enhancement in Matched and Mismatched Conditions Using a Heun-Based SamplerPhilippe Gonzalez, Zheng-Hua Tan, Jan Østergaard, Jesper Jensen 0001, Tommy Sonne Alstrøm, Tobias May. 10431-10435 [doi]
- Comparable Demonstrations Are Important In In-Context Learning: A Novel Perspective On Demonstration SelectionCaoyun Fan, Jidong Tian, Yitian Li, Hao He 0007, Yaohui Jin. 10436-10440 [doi]
- S2E: Towards an End-to-End Entity Resolution Solution from Acoustic SignalKangrui Ruan, Xin He, Jiyang Wang, Xiaozhou Zhou, Helian Feng, Ali Kebarighotbi. 10441-10445 [doi]
- JPIS: A Joint Model for Profile-Based Intent Detection and Slot Filling with Slot-to-Intent AttentionThinh Pham, Dat Quoc Nguyen. 10446-10450 [doi]
- A Multimodal Approach to Device-Directed Speech Detection with Large Language ModelsDominik Wagner, Alexander W. Churchill, Siddharth Sigtia, Panayiotis G. Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi. 10451-10455 [doi]
- Speaker Adaptation For Enhancement Of Bone-Conducted SpeechAmin Edraki, Wai-Yip Chan, Jesper Jensen 0001, Daniel Fogerty. 10456-10460 [doi]
- Synthetic Conversations Improve Multi-Talker ASRThai Binh Nguyen, Alexander Waibel. 10461-10465 [doi]
- Self-Supervised Domain Exploration with an Optimal Transport Regularization for Open Set Cross-Domain Speech Emotion RecognitionRuiteng Zhang, Jianguo Wei, Xugang Lu, Yongwei Li, Wenhuan Lu, Di Jin 0001, Junhai Xu. 10466-10470 [doi]
- Visual Speech Recognition for Languages with Limited Labeled Data Using Automatic Labels from WhisperJeong Hun Yeo, Minsu Kim, Shinji Watanabe 0001, Yong Man Ro. 10471-10475 [doi]
- Target Speaker Extraction by Directly Exploiting Contextual Information in the Time-Frequency DomainXue Yang, Changchun Bao, Jing Zhou, Xianhong Chen. 10476-10480 [doi]
- An Efficient and Interpretable Speech Enhancement Network Via Deep Dictionary LearningXinmeng Xu, Yiqun Zhang, Weiping Tu, Yuhong Yang 0001. 10481-10485 [doi]
- Curricular Contrastive Regularization for Speech Enhancement with Self-Supervised RepresentationsXinmeng Xu, Chang-Han, Yiqun Zhang, Weiping Tu, Yuhong Yang 0001. 10486-10490 [doi]
- A Study on Combining Non-Parallel and Parallel Methodologies for Mandarin-English Cross-Lingual Voice ConversionChang Huai You, Minghui Dong. 10491-10495 [doi]
- Multi-Modal Emotion Recognition Using Multiple Acoustic Features and Dual Cross-Modal TransformerYanfeng Wu, Pengcheng Yue, Leyuan Qu, Taihao Li, Yu-Ping Ruan. 10496-10500 [doi]
- Reflow-TTS: A Rectified Flow Model for High-Fidelity Text-to-SpeechWenhao Guan, Qi Su, Haodong Zhou, Shiyu Miao, Xingjia Xie, Lin Li, Qingyang Hong. 10501-10505 [doi]
- SICRN: Advancing Speech Enhancement through State Space Model and Inplace Convolution TechniquesChangjiang Zhao, Shulin He, Xueliang Zhang. 10506-10510 [doi]
- Lightweight Multi-Axial Transformer with Frequency Prompt for Single Channel Speech EnhancementXingwei Liang, Zehua Zhang, Mingjiang Wang, Ruifeng Xu. 10511-10515 [doi]
- MSG-BART: Multi-Granularity Scene Graph-Enhanced Encoder-Decoder Language Model for Video-Grounded Dialogue GenerationHongcheng Liu, Zhe Chen, Hui Li, Pingjie Wang, Yanfeng Wang, Yu Wang 0027. 10516-10520 [doi]
- Frame-Wise Streaming End-to-End Speaker Diarization with Non-Autoregressive Self-Attention-Based AttractorsDi Liang, Nian Shao, Xiaofei Li. 10521-10525 [doi]
- Enhancing Generative Aspect-Based Sentiment Analysis with Relation-Level Supervision and PromptYifan Yang, Yice Zhang, Ruifeng Xu. 10526-10530 [doi]
- CIF-T: A Novel CIF-Based Transducer Architecture for Automatic Speech RecognitionTian-Hao Zhang, Dinghao Zhou, Guiping Zhong, Jiaming Zhou, Baoxiang Li. 10531-10535 [doi]
- PromptASR for Contextualized ASR with Controllable StyleXiaoyu Yang, Wei Kang, Zengwei Yao, Yifan Yang, Liyong Guo, Fangjun Kuang, Long Lin, Daniel Povey. 10536-10540 [doi]
- Knowledge-Aware Prompt Learning Framework for Korean-Chinese Microblog Sentiment AnalysisXinyu Yang, Hengxuan Wang, Huiling Jin, Zhenguo Zhang, Xiaojie Yuan. 10541-10545 [doi]
- Improving Attention-Based End-to-End Speech Recognition by Monotonic Alignment Attention Matrix ReconstructionZiyang Zhuang, Kun Zou, Chenfeng Miao, Ming Fang, Tao Wei, Zijian Li, Wei Hu, Shaojun Wang, Jing Xiao. 10546-10550 [doi]
- Feature Mixing-Based Active Learning for Multi-Label Text ClassificationXue Han, Qing Wang, Yitong Wang, Jiahui Wang, Chao Deng, Junlan Feng. 10551-10555 [doi]
- CausalME: Balancing bi-modalities in Visual Question AnsweringChenji Lu, Ge Bai, Shilong Li, Ying Liu, Xiyan Liu, Zerong Zeng, Ruifang Liu. 10556-10560 [doi]
- CIF-RNNT: Streaming ASR Via Acoustic Word Embeddings with Continuous Integrate-and-Fire and RNN-TransducersWen Shen Teo, Yasuhiro Minami. 10561-10565 [doi]
- CDUMA: An Adaptive Approach for Mitigating Confounder for MCQAShilong Li, Chenji Lu, Ge Bai, Ying Liu, Xiyan Liu, Zhang Zhang, Ruifang Liu. 10566-10570 [doi]
- Promptvc: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language PromptsJixun Yao, Yuguang Yang 0005, Yi Lei, Ziqian Ning, Yanni Hu, Yu Pan, Jingjing Yin, Hongbin Zhou, Heng Lu 0004, Lei Xie 0001. 10571-10575 [doi]
- Cross-Modal Alignment for End-to-End Spoken Language Understanding Based on Momentum Contrastive LearningBeida Zheng, Mijit Ablimit, Askar Hamdulla. 10576-10580 [doi]
- HM-CONFORMER: A Conformer-Based Audio Deepfake Detection System with Hierarchical Pooling and Multi-Level Classification Token Aggregation MethodsHyun-seo Shin, Jungwoo Heo, Ju-ho Kim, Chan-yeong Lim, Wonbin Kim, Ha-Jin Yu. 10581-10585 [doi]
- Extending Multilingual ASR to New Languages Using Supplementary Encoder and Decoder ComponentsYerbolat Khassanov, Zhipeng Chen, Tianfeng Chen, Tze Yuang Chong, Wei Li, Lu Lu, Zejun Ma. 10586-10590 [doi]
- Unimodal Aggregation for CTC-Based Speech RecognitionYing Fang, Xiaofei Li. 10591-10595 [doi]
- Automatic Detection Of Sleepiness-Related Syndromes and Symptoms Using Voice and Speech BiomarkersVincent P. Martin, Jean-Luc Rouas, Pierre Philip. 10596-10600 [doi]
- Hierarchical Emotion Prediction and Control in Text-to-Speech SynthesisSho Inoue, Kun Zhou 0003, Shuai Wang 0016, Haizhou Li 0001. 10601-10605 [doi]
- Estimating Symptoms and Clinical Signs Instead of Disorders: The Path Toward The Clinical Use of Voice and Speech Biomarkers In PsychiatryVincent P. Martin, Jean-Luc Rouas. 10606-10610 [doi]
- Learning Emotion-Invariant Speaker Representations for Speaker VerificationJingguang Tian, Xinhui Hu, Xinkang Xu. 10611-10615 [doi]
- Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity VocoderYicheng Gu, Xueyao Zhang, Liumeng Xue, Zhizheng Wu 0001. 10616-10620 [doi]
- LCB-Net: Long-Context Biasing for Audio-Visual Speech RecognitionFan Yu, Haoxu Wang, Xian Shi, Shiliang Zhang. 10621-10625 [doi]
- Towards Automatic Data Augmentation for Disordered Speech RecognitionZengrui Jin, Xurong Xie, Tianzi Wang, Mengzhe Geng, Jiajun Deng, Guinan Li, Shujie Hu, Xunying Liu. 10626-10630 [doi]
- PECER: Empathetic Response Generation Via Dynamic Personality Extraction and Contextual Emotional ReasoningMingxiu Cai, Daling Wang, Shi Feng 0001, Yifei Zhang 0003. 10631-10635 [doi]
- Towards Improving Speech Emotion Recognition Using Synthetic Data Augmentation from Emotion ConversionKarim M. Ibrahim, Antony Perzo, Simon Leglaive. 10636-10640 [doi]
- Phoneme-Aware Encoding for Prefix-Tree-Based Contextual ASRHayato Futami, Emiru Tsunoo, Yosuke Kashiwagi, Hiroaki Ogawa, Siddhant Arora, Shinji Watanabe 0001. 10641-10645 [doi]
- Sorting, Reasoning, and Extraction: An Easy-to-Hard Reasoning Framework for Document-Level Event Argument ExtractionHao Li, Yanan Cao, Yubing Ren, Fang Fang 0009, Lanxue Zhang, Yingjie Li, Shi Wang. 10646-10650 [doi]
- Cross-Target Stance Detection by Exploiting Target Analytical PerspectivesDaijun Ding, Rong Chen, Liwen Jing, Bowen Zhang, Xu Huang, Li Dong, Xiaowen Zhao, Ge Song. 10651-10655 [doi]
- Speech Relationship Learning for Cross-Corpus Speech Emotion RecognitionYinru He, Guihua Wen, Pei Yang 0001, Dongliang Chen. 10656-10660 [doi]
- Langwave: Realistic Voice Generation Based on High-Order Langevin DynamicsZiqiang Shi, Rujie Liu. 10661-10665 [doi]
- Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-Talker SpeechJunjie Li, Ruijie Tao, Zexu Pan, Meng Ge, Shuai Wang 0016, Haizhou Li 0001. 10666-10670 [doi]
- Fspen: an Ultra-Lightweight Network for Real Time Speech EnhancementLei Yang, Wei Liu, Ruijie Meng, Gunwoo Lee, Soonho Baek, Han-gil Moon. 10671-10675 [doi]
- An Adapter-Based Unified Model for Multiple Spoken Language Processing TasksVarsha Suresh, Salah Aït-Mokhtar, Caroline Brun, Ioan Calapodescu. 10676-10680 [doi]
- Enhanced Transfer Learning with Efficient Modeling and Adaptive Fusion of Knowledge Via Prompt TuningMinghui Xu, Zishan Guo, Yulong Zeng, Deyi Xiong. 10681-10685 [doi]
- Translatotron 3: Speech to Speech Translation with Monolingual DataEliya Nachmani, Alon Levkovitch, Yifan Ding, Chulayuth Asawaroengchai, Heiga Zen, Michelle Tadmor Ramanovich. 10686-10690 [doi]
- Mapache: Masked Parallel Transformer for Advanced Speech Editing and SynthesisGuillermo Cámbara, Patrick Lumban Tobing, Mikolaj Babianski, Ravichander Vipperla, Duo Wang, Ron Shmelkin, Giuseppe Coccia, Orazio Angelini, Arnaud Joly, Mateusz Lajszczak, Vincent Pollet. 10691-10695 [doi]
- Improving Design of Input Condition Invariant Speech EnhancementWangyou Zhang, Jee-weon Jung, Yanmin Qian. 10696-10700 [doi]
- Improving Kinyarwanda Speech Recognition Via Semi-Supervised LearningAntoine Nzeyimana. 10701-10705 [doi]
- Concss: Contrastive-based Context Comprehension for Dialogue-Appropriate Prosody in Conversational Speech SynthesisYayue Deng, Jinlong Xue, YuKang Jia, Qifei Li, Yichen Han, Fengping Wang, Yingming Gao, Dengfeng Ke, Ya Li. 10706-10710 [doi]
- Unsupervised Multi-Domain Data Selection for ASR Fine-TuningNikolaos Lagos, Ioan Calapodescu. 10711-10715 [doi]
- Multilingual Distilwhisper: Efficient Distillation of Multi-Task Speech Models Via Language-Specific ExpertsThomas Palmeira Ferraz, Marcely Zanon Boito, Caroline Brun, Vassilina Nikoulina. 10716-10720 [doi]
- STaR: Distilling Speech Temporal Relation for Lightweight Speech Self-Supervised Learning ModelsKangwook Jang, Sungnyun Kim, Hoirin Kim. 10721-10725 [doi]
- Pro-HAN: A Heterogeneous Graph Attention Network for Profile-based Spoken Language UnderstandingDechuan Teng, Chunlin Lu, Xiao Xu 0005, Wanxiang Che, Libo Qin 0001. 10726-10730 [doi]
- Employing Real Training Data for Deep Noise SuppressionZiyi Xu, Marvin Sach, Jan Pirklbauer, Tim Fingscheidt. 10731-10735 [doi]
- Fregrad: Lightweight and Fast Frequency-Aware Diffusion VocoderTan Dat Nguyen, Ji-Hoon Kim, Youngjoon Jang, Jaehun Kim, Joon Son Chung. 10736-10740 [doi]
- A Study on Graph Embedding for Speaker RecognitionLiang He, Ruida Li, Mengqi Niu. 10741-10745 [doi]
- EmoRED: A Dataset for Relation Extraction in Texts with EmoticonsLingxing Kong, Zheng Ma, Jianbing Zhang, Liang He 0009, Jiajun Chen. 10746-10750 [doi]
- Dual Parameter-Efficient Fine-Tuning for Speaker Representation Via Speaker Prompt Tuning and AdaptersZhe Li, Man-Wai Mak, Helen Mei-Ling Meng. 10751-10755 [doi]
- USM-Lite: Quantization and Sparsity Aware Fine-Tuning for Speech Recognition with Universal Speech ModelsShaojin Ding, David Qiu, David Rim, Yanzhang He, Oleg Rybakov, Bo Li 0028, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Zhonglin Han, Jian Li, Amir Yazdanbakhsh, Shivani Agrawal. 10756-10760 [doi]
- Frame-to-Utterance Convergence: A Spectra-Temporal Approach for Unified Spoofing DetectionAwais Khan 0007, Khalid Mahmood Malik, Shah Nawaz. 10761-10765 [doi]
- Improving Oral Reading Fluency Assessment Through Sub-Sequence Matching of Acoustic Word EmbeddingsYihao Wang, Zhongdi Wu, Joseph Nese, Akihito Kamata, Vedant Nilabh, Eric C. Larson. 10766-10770 [doi]
- Recovering from Privacy-Preserving Masking with Large Language ModelsArpita Vats, Zhe Liu, Peng Su, Debjyoti Paul, Yingyi Ma, Yutong Pang, Zeeshan Ahmed, Ozlem Kalinli. 10771-10775 [doi]
- Zero-Shot Intent Classification Using a Semantic Similarity Aware Contrastive Loss and Large Language ModelJaejin Cho, Rakshith Sharma Srinivasa, Ching Hua Lee, Yashas Malur Saidutta, Chouchang Yang, Yilin Shen, Hongxia Jin. 10776-10780 [doi]
- High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion ModelsChunyu Qiang, Hao Li, Yixin Tian, Yi Zhao, Ying Zhang, Longbiao Wang, Jianwu Dang 0001. 10781-10785 [doi]
- GR0: Self-Supervised Global Representation Learning for Zero-Shot Voice ConversionYunyun Wang, Jiaqi Su, Adam Finkelstein, Zeyu Jin. 10786-10790 [doi]
- Towards End-to-End Spoken Grammatical Error CorrectionStefano Bannò, Rao Ma, Mengjie Qian, Kate M. Knill, Mark J. F. Gales. 10791-10795 [doi]
- Multi-Task Learning for Front-End Text Processing in TTSWonjune Kang, Yun Wang, Shun Zhang, Arthur Hinsvark, Qing He. 10796-10800 [doi]
- COLLD: Contrastive Layer-to-Layer Distillation for Compressing Multilingual Pre-Trained Speech EncodersHeng-Jui Chang, Ning Dong, Ruslan Mavlyutov, Sravya Popuri, Yu-An Chung. 10801-10805 [doi]
- AV2WAV: Diffusion-Based Re-Synthesis from Continuous Self-Supervised Features for Audio-Visual Speech EnhancementJu-Chieh Chou, Chung-Ming Chien, Karen Livescu. 10806-10810 [doi]
- Towards A World-English Language Model for on-Device Virtual AssistantsRricha Jalota, Lyan Verwimp, Markus Nußbaum-Thom, Amr El-Desoky Mousa, Arturo Argueta, Youssef Oualil. 10811-10815 [doi]
- Hot-Fixing Wake Word Recognition for End-to-End ASR Via Neural Model ReprogrammingPin-Jui Ku, I-Fan Chen, Chao-Han Huck Yang, Anirudh Raju, Pranav Dheram, Pegah Ghahremani, Brian King, Jing Liu, Roger Ren, Phani Sankar Nidadavolu. 10816-10820 [doi]
- Stable Distillation: Regularizing Continued Pre-Training for Low-Resource Automatic Speech RecognitionAshish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha. 10821-10825 [doi]
- Maximum-Entropy Adversarial Audio Augmentation for Keyword SpottingZuzhao Ye, Gregory Ciccarelli, Brian Kulis. 10826-10830 [doi]
- Leveraging Self-Supervised Speech Representations for Domain Adaptation in Speech EnhancementChing Hua Lee, Chouchang Yang, Rakshith Sharma Srinivasa, Yashas Malur Saidutta, Jaejin Cho, Yilin Shen, Hongxia Jin. 10831-10835 [doi]
- Post-Training Embedding Alignment for Decoupling Enrollment and Runtime Speaker Recognition ModelsChenyang Gao, Brecht Desplanques, Chelsea J.-T. Ju, Aman Chadha, Andreas Stolcke. 10836-10840 [doi]
- Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASRJunwen Bai, Bo Li 0028, Qiujia Li, Tara N. Sainath, Trevor Strohman. 10841-10845 [doi]
- Large Language Models As A Proxy For Human Evaluation In Assessing The Comprehensibility Of Disordered Speech TranscriptionKatrin Tomanek, Jimmy Tobin, Subhashini Venugopalan, Richard Cave, Katie Seaver, Jordan R. Green, Rus Heywood. 10846-10850 [doi]
- Controllable Speaking Styles Using A Large Language ModelAtli Sigurgeirsson, Simon King 0001. 10851-10855 [doi]
- Correction Focused Language Model Training For Speech RecognitionYingyi Ma, Zhe Liu 0011, Ozlem Kalinli. 10856-10860 [doi]
- Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search ApproachTaejin Park, Kunal Dhawan, Nithin Rao Koluguri, Jagadeesh Balam. 10861-10865 [doi]
- Diarist: Streaming Speech Translation with Speaker DiarizationMu Yang, Naoyuki Kanda, Xiaofei Wang 0009, Junkun Chen, Peidong Wang, Jian Xue, Jinyu Li 0001, Takuya Yoshioka. 10866-10870 [doi]
- FIRNet: Fundamental Frequency Controllable Fast Neural Vocoder With Trainable Finite Impulse Response FilterYamato Ohtani, Takuma Okamoto, Tomoki Toda, Hisashi Kawai. 10871-10875 [doi]
- Keep Decoding Parallel With Effective Knowledge Distillation From Language Models To End-To-End Speech RecognisersMichael Hentschel, Yuta Nishikawa, Tatsuya Komatsu, Yusuke Fujita. 10876-10880 [doi]
- Emohrnet: High-Resolution Neural Network Based Speech Emotion RecognitionAkshay Muppidi, Martin Radfar. 10881-10885 [doi]
- Enhancing Code-Switching Speech Recognition With Interactive Language BiasesHexin Liu, Leibny Paola Garcia, Xiangyu Zhang, Andy W. H. Khong, Sanjeev Khudanpur. 10886-10890 [doi]
- Contrastive Speaker Embedding With Sequential DisentanglementYouzhi Tu, Man-Wai Mak, Jen-Tzung Chien. 10891-10895 [doi]
- Contextualized Automatic Speech Recognition With Attention-Based Bias Phrase Boosted Beam SearchYui Sudo, Muhammad Shakeel 0001, Yosuke Fukumoto, Yifan Peng, Shinji Watanabe 0001. 10896-10900 [doi]
- Leveraging in-the-wild Data for Effective Self-supervised Pretraining in Speaker RecognitionShuai Wang 0016, Qibing Bai, Qi Liu 0018, Jianwei Yu, Zhengyang Chen, Bing Han, Yanmin Qian, Haizhou Li 0001. 10901-10905 [doi]
- Shapley Value Guided Extractive Text SummarizationXiaoxia Cheng, Weiming Lu 0001. 10906-10910 [doi]
- A Prompt-Based Method with Multi-View Optimization for Open Relation ExtractionYing Zhang, Depeng Dang, Ning Wang, Hu Gao. 10911-10915 [doi]
- DETS: End-to-End Single-Stage Text-to-Speech Via Hierarchical Diffusion GAN ModelsLinqin Wang, Zhengtao Yu 0001, Shengxiang Gao, Cunli Mao, Yuxin Huang. 10916-10920 [doi]
- A Chat about Boring Problems: Studying GPT-Based Text NormalizationYang Zhang 0089, Travis M. Bartley, Mariana Graterol-Fuenmayor, Vitaly Lavrukhin, Evelina Bakhturina, Boris Ginsburg. 10921-10925 [doi]
- Multi-Modality Speech Recognition Driven by Background Visual ScenesCheng Luo, Yiguang Liu, Wenhui Sun, Zhoujian Sun. 10926-10930 [doi]
- Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel OptimizationA F. M. Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen. 10931-10935 [doi]
- Unsupervised Speech Recognition with N-skipgram and Positional Unigram MatchingLiming Wang, Mark Hasegawa-Johnson, Chang D. Yoo. 10936-10940 [doi]
- Enhancing Multilingual Speech Recognition through Language Prompt Tuning and Frame-Level Language AdapterSong Li, Yongbin You, Xuezhi Wang 0008, Ke Ding, Guanglu Wan. 10941-10945 [doi]
- Adversarial Learning on Compressed Posterior Space for Non-Iterative Score-based End-to-End Text-to-SpeechWon-Gook Choi, Donghyun Seong, Joon-Hyuk Chang. 10946-10950 [doi]
- On-Device Constrained Self-Supervised Learning for Keyword Spotting via Quantization Aware Pre-Training and Fine-TuningGene-Ping Yang, Yue Gu, Sashank Macha, Qingming Tang, Yuzong Liu. 10951-10955 [doi]
- Revise the NLU: A Prompting Strategy for Robust Dialogue SystemMahdin Rohmatillah, Jen-Tzung Chien. 10956-10960 [doi]
- Electrolaryngeal Speech Intelligibility Enhancement through Robust Linguistic EncodersLester Phillip Violeta, Wen-Chin Huang, Ding Ma, Ryuichi Yamamoto, Kazuhiro Kobayashi, Tomoki Toda. 10961-10965 [doi]
- Towards Optimal Voice Disentanglement with Weak SupervisionMohammad Rasool Izadi, Yujia Yan, Shuo Zhang, Robert Stevenson. 10966-10970 [doi]
- Joint End-to-End Spoken Language Understanding and Automatic Speech Recognition Training Based on Unified Speech-to-Text Pre-TrainingEesung Kim, Yun Tang, Taeyeon Ki, Divya Neelagiri, Vijendra Raj Apsingekar. 10971-10975 [doi]
- Towards High-Performance and Low-Latency Feature-Based Speaker Adaptation of Conformer Speech Recognition SystemsJiajun Deng, Xurong Xie, Guinan Li, Mingyu Cui, Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Zhaoqing Li, Xunying Liu. 10976-10980 [doi]
- Branchformer-Based TDNN for Automatic Speaker VerificationYuhang Sun, Chenxing Li, Biao Li. 10981-10985 [doi]
- Parameter Efficient Finetuning for Speech Emotion Recognition and Domain AdaptationNineli Lashkarashvili, Wen Wu, Guangzhi Sun, Philip C. Woodland. 10986-10990 [doi]
- Libriheavy: A 50,000 Hours ASR Corpus with Punctuation Casing and ContextWei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Yifan Yang, Liyong Guo, Long Lin, Daniel Povey. 10991-10995 [doi]
- CPAUG: Refining Copy-Paste Augmentation for Speech Anti-SpoofingLinjuan Zhang, Kong-Aik Lee, Lin Zhang, Longbiao Wang, Baoning Niu. 10996-11000 [doi]
- Enhancing Conversation Smoothness in Language Learning Chatbots: An Evaluation of GPT4 for ASR Error CorrectionLong Mai, Julie Carson-Berndsen. 11001-11005 [doi]
- KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo LabelsJiaming Zhou, Shiwan Zhao, Yaqi Liu, Wenjia Zeng, Yong Chen, Yong Qin. 11006-11010 [doi]
- Joint Inference of Speaker Diarization and ASR with Multi-Stage Information SharingWeiqing Wang, Danwei Cai, Ming Cheng, Ming Li 0026. 11011-11015 [doi]
- STREAMVC: Real-Time Low-Latency Voice ConversionYang Yang 0010, Yury Kartynnik, Yunpeng Li, Jiuqiang Tang, Xing Li, George Sung, Matthias Grundmann. 11016-11020 [doi]
- Neural Network-Based Virtual Microphone Estimation with Virtual Microphone and Beamformer-Level Multi-Task LossHanako Segawa, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Takeshi Yamada, Shoji Makino. 11021-11025 [doi]
- Large Language Model-Based Emotional Speech Annotation Using Context and Acoustic Feature for Speech Emotion RecognitionJennifer Santoso, Kenkichi Ishizuka, Taiichi Hashimoto. 11026-11030 [doi]
- How Does End-To-End Speech Recognition Training Impact Speech Enhancement Artifacts?Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri. 11031-11035 [doi]
- PVCG: Prompt-Based Vision-Aware Classification and Generation for Multi-Modal Rumor DetectionTing Zou, Zhong Qian, Peifeng Li, Qiaoming Zhu. 11036-11040 [doi]
- Soft Alignment of Modality Space for End-to-End Speech TranslationYuhao Zhang, Kaiqi Kou, Bei Li, Chen Xu 0008, Chunliang Zhang, Tong Xiao, Jingbo Zhu. 11041-11045 [doi]
- Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech RecognitionShaoshi Ling, Yuxuan Hu, Shuangbei Qian, Guoli Ye, Yao Qian, Yifan Gong 0001, Ed Lin, Michael Zeng 0001. 11046-11050 [doi]
- Fine-Grained Disentangled Representation Learning For Multimodal Emotion RecognitionHaoqin Sun, Shiwan Zhao, Xuechen Wang, Wenjia Zeng, Yong Chen, Yong Qin. 11051-11055 [doi]
- Loss Masking Is Not Needed In Decoder-Only Transformer For Discrete-Token-Based ASRQian Chen 0003, Wen Wang, Qinglin Zhang, Siqi Zheng, Shiliang Zhang, Chong Deng, Yukun Ma, Hai Yu, Jiaqing Liu, Chong Zhang 0003. 11056-11060 [doi]
- Type-Aware Decoding Via Explicitly Aggregating Event Information for Document-Level Event ExtractionGang Zhao, Yidong Shi, Shudong Lu, Xinjie Yang, Guanting Dong, Jian Xu, Xiaocheng Gong, Si Li 0001. 11061-11065 [doi]
- MF-AED-AEC: Speech Emotion Recognition by Leveraging Multimodal Fusion, ASR Error Detection, and ASR Error CorrectionJiajun He, Xiaohan Shi, Xingfeng Li 0001, Tomoki Toda. 11066-11070 [doi]
- A Cross Search Method for Data Augmentation in Neural Machine TranslationMengchao Zhang, Mei Tu, Fan Zhang, Song Liu. 11071-11075 [doi]
- SlideSpeech: A Large Scale Slide-Enriched Audio-Visual CorpusHaoxu Wang, Fan Yu, Xian Shi, YueZhang Wang, Shiliang Zhang, Ming Li. 11076-11080 [doi]
- Asymmetric Clean Segments-Guided Self-Supervised Learning for Robust Speaker VerificationChong-Xin Gan, Man-Wai Mak, Weiwei Lin 0002, Jen-Tzung Chien. 11081-11085 [doi]
- Prompt-Driven Target Speech DiarizationYidi Jiang, Zhengyang Chen, Ruijie Tao, Liqun Deng, Yanmin Qian, Haizhou Li 0001. 11086-11090 [doi]
- Data Driven Grapheme-to-Phoneme Representations for a Lexicon-Free Text-to-SpeechAbhinav Garg, Jiyeon Kim, Sushil Khyalia, Chanwoo Kim 0001, Dhananjaya Gowda. 11091-11095 [doi]
- MHPS: Multimodality-Guided Hierarchical Policy Search for Knowledge Graph ReasoningChen Gao, Xugong Qin, Peng Zhang, Yongquan He, Xinjian Huang, Ming Zhou, Liehuang Zhu, Qingfeng Tan. 11096-11100 [doi]
- Nuclear-Norm Maximization for Low-Rank UpdatesHuanxi Liu, Yuanzhao Zhai, Kele Xu, Dawei Feng, Yiying Li. 11101-11105 [doi]
- Dualvc 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice ConversionZiqian Ning, Yuepeng Jiang, Pengcheng Zhu 0004, Shuai Wang, Jixun Yao, Lei Xie 0001, Mengxiao Bi. 11106-11110 [doi]
- Investigating Salient Representations and Label Variance in Dimensional Speech Emotion AnalysisVikramjit Mitra, Jingping Nie, Erdrin Azemi. 11111-11115 [doi]
- VoiceFlow: Efficient Text-To-Speech with Rectified Flow MatchingYiwei Guo, Chenpeng Du, Ziyang Ma, Xie Chen 0001, Kai Yu 0004. 11121-11125 [doi]
- Sifisinger: A High-Fidelity End-to-End Singing Voice Synthesizer Based on Source-Filter ModelJianwei Cui, Yu Gu, Chao Weng, Jie Zhang, Liping Chen, Lirong Dai 0001. 11126-11130 [doi]
- Text-Only Unsupervised Domain Adaptation for Neural Transducer-Based ASR Personalization Using Synthesized DataDong-hyun Kim, Jae Hong Lee, Joon-Hyuk Chang. 11131-11135 [doi]
- Esihgnn: Event-State Interactions Infused Heterogeneous Graph Neural Network for Conversational Emotion RecognitionXupeng Zha, Huan Zhao, Zixing Zhang 0001. 11136-11140 [doi]
- Unifying One-Shot Voice Conversion and Cloning with Disentangled Speech RepresentationsHui Lu, Xixin Wu, Haohan Guo, Songxiang Liu, Zhiyong Wu 0001, Helen Meng. 11141-11145 [doi]
- Leveraging Speech PTM, Text LLM, And Emotional TTS For Speech Emotion RecognitionZiyang Ma, Wen Wu, Zhisheng Zheng, Yiwei Guo, Qian Chen 0003, Shiliang Zhang, Xie Chen 0001. 11146-11150 [doi]
- Multimodal Sentiment Analysis Based on 3D Stereoscopic AttentionJian Huang, Yuanyuan Pu, Dongming Zhou, Hang Shi, Zhengpeng Zhao, Dan Xu 0001, Jinde Cao. 11151-11155 [doi]
- Generative Context-Aware Fine-Tuning of Self-Supervised Speech ModelsSuwon Shon, Kwangyoun Kim, Prashant Sridhar, Yi-Te Hsu, Shinji Watanabe 0001, Karen Livescu. 11156-11160 [doi]
- Residualtransformer: Residual Low-Rank Learning With Weight-Sharing For Transformer LayersYiming Wang, Jinyu Li 0001. 11161-11165 [doi]
- Latent Filling: Latent Space Data Augmentation for Zero-Shot Speech SynthesisJae-Sung Bae, Joun Yeop Lee, Ji-Hyun Lee, Seongkyu Mun, Taehwa Kang, Hoon-Young Cho, Chanwoo Kim 0001. 11166-11170 [doi]
- Mdrt: Multi-Domain Synthetic Speech LocalizationAmit Kumar Singh Yadav, Kratika Bhagtani, Sriram Baireddy, Paolo Bestagini, Stefano Tubaro, Edward J. Delp. 11171-11175 [doi]
- Adaptive Data Augmentation for Aspect Sentiment Quad PredictionWenyuan Zhang, Xinghua Zhang, Shiyao Cui, Kun Huang, Xuebin Wang, Tingwen Liu. 11176-11180 [doi]
- Diacorrect: Error Correction Back-End for Speaker DiarizationJiangyu Han, Federico Landini, Johan Rohdin, Mireia Díez, Lukás Burget, Yuhang Cao, Heng Lu, Jan Cernocký. 11181-11185 [doi]
- NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker VerificationHyunjun Heo, Ui-Hyeop Shin, Ran Lee, Youngju Cheon, Hyung-Min Park. 11186-11190 [doi]
- Alleviating Hallucinations Via Supportive Window Indexing in Abstractive SummarizationJiaxin Duan, Fengyu Lu, Junfei Liu. 11191-11195 [doi]
- Investigating the Clusters Discovered By Pre-Trained AV-HuBERTAnja Virkkunen, Marek Sarvas, Guangpu Huang, Tamás Grósz, Mikko Kurimo. 11196-11200 [doi]
- TRUST-SER: On The Trustworthiness Of Fine-Tuning Pre-Trained Speech Embeddings For Speech Emotion RecognitionTianTian Feng, Rajat Hebbar, Shrikanth Narayanan. 11201-11205 [doi]
- Label Dependencies-Aware Set Prediction Networks for Multi-Label Text ClassificationXinkai Du, Quanjie Han, Yalin Sun, Chao Lv, Maosong Sun. 11206-11210 [doi]
- Whisper-Based Transfer Learning for Alzheimer Disease Classification: Leveraging Speech Segments with Full Transcripts as PromptsJinpeng Li, Wei-Qiang Zhang 0001. 11211-11215 [doi]
- Iterative Autoregressive Generation for Abstractive SummarizationJiaxin Duan, Fengyu Lu, Junfei Liu. 11216-11220 [doi]
- Attention-Driven Multichannel Speech Enhancement in Moving Sound Source ScenariosYuzhu Wang, Archontis Politis, Tuomas Virtanen. 11221-11225 [doi]
- An Effective Mixture-Of-Experts Approach For Code-Switching Speech Recognition Leveraging Encoder DisentanglementTzu-Ting Yang, Hsin-Wei Wang, Yi-Cheng Wang, Chi-Han Lin, Berlin Chen. 11226-11230 [doi]
- Collaborative Watermarking for Adversarial Speech SynthesisLauri Juvela, Xin Wang 0037. 11231-11235 [doi]
- Extending Large Language Models for Speech and Audio CaptioningChangli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan 0019, Wei Li 0119, Lu Lu 0015, Zejun Ma, Chao Zhang 0031. 11236-11240 [doi]
- Dynamic Multi-Scale Context Aggregation for Conversational Aspect-Based Sentiment Quadruple AnalysisYuqing Li, Wenyuan Zhang 0002, Binbin Li, Siyu Jia, Zisen Qi, Xingbang Tan. 11241-11245 [doi]
- HAFFormer: A Hierarchical Attention-Free Framework for Alzheimer's Disease Detection From Spontaneous SpeechZhongren Dong, Zixing Zhang 0001, Weixiang Xu, Jing Han 0010, Jianjun Ou, Björn W. Schuller. 11246-11250 [doi]
- One-Class Knowledge Distillation for Spoofing Speech DetectionJingze Lu, Yuxiang Zhang, Wenchao Wang, Zengqiang Shang, Pengyuan Zhang. 11251-11255 [doi]
- RSED: Zero-Shot Relation Triplet Extraction via Relation Selection and Entity Boundary DetectionYuquan Lan, Dongxu Li, Yunqi Zhang, Hui Zhao, Gang Zhao. 11256-11260 [doi]
- STYLECAP: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-Supervised Learning ModelsKazuki Yamauchi, Yusuke Ijima, Yuki Saito. 11261-11265 [doi]
- DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech SynthesisYu Gu, Qiushi Zhu, Guangzhi Lei, Chao Weng, Dan Su 0002. 11266-11270 [doi]
- BCC: Bidirectional Consistency Constraint Method for Hierarchical Text ClassificationYinghan Shen, Yu Yan, Dechun Yin, Huawei Shen. 11271-11275 [doi]
- Hystoc: Obtaining Word Confidences for Fusion of End-To-End ASR SystemsKarel Benes, Martin Kocour, Lukás Burget. 11276-11280 [doi]
- NTT Speaker Diarization System for Chime-7: Multi-Domain, Multi-Microphone end-to-end and Vector Clustering DiarizationNaohiro Tawara, Marc Delcroix, Atsushi Ando, Atsunori Ogawa. 11281-11285 [doi]
- Unsupervised Topic-Conditional Extractive SummarizationItzik Malkiel, Yakir Yehuda, Jonathan Ephrath, Ori Katz, Oren Barkan, Nir Nice, Noam Koenigstein. 11286-11290 [doi]
- Dynamic Data Sampler for Cross-Language Transfer Learning in Large Language ModelsYudong Li, Yuhao Feng, Wen Zhou, Zhe Zhao 0006, LinLin Shen, Cheng Hou, Xianxu Hou. 11291-11295 [doi]
- One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language ModelsHang Shao, Bei Liu, Yanmin Qian. 11296-11300 [doi]
- Meta Representation Learning Method for Robust Speaker Verification in Unseen DomainsJian-Tao Zhang, Yan Song 0001, Jin Li, Wu Guo, Hao-Yu Song, Ian McLoughlin 0001. 11301-11305 [doi]
- End-to-End Speech Translation with Mutual Knowledge DistillationHao Wang, Zhengshan Xue, Yikun Lei, Deyi Xiong. 11306-11310 [doi]
- Gradient Weighting for Speaker Verification in Extremely Low Signal-to-Noise RatioYi Ma, Kong-Aik Lee, Ville Hautamäki, Meng Ge, Haizhou Li 0001. 11311-11315 [doi]
- Enhancing Two-Stage Finetuning for Speech Emotion Recognition Using AdaptersYuan Gao, Hao Shi, Chenhui Chu, Tatsuya Kawahara. 11316-11320 [doi]
- Persona Extraction Through Semantic Similarity for Emotional Support Conversation GenerationSeunghee Han, Se Jin Park, Chae Won Kim, Yong Man Ro. 11321-11325 [doi]
- Customising General Large Language Models for Specialised Emotion Recognition TasksLiyizhe Peng, Zixing Zhang 0001, Tao Pang, Jing Han 0010, Huan Zhao, Hao Chen, Björn W. Schuller. 11326-11330 [doi]
- Chunked Attention-Based Encoder-Decoder Model for Streaming Speech RecognitionMohammad Zeineldeen, Albert Zeyer, Ralf Schlüter, Hermann Ney. 11331-11335 [doi]
- DCTTS: Discrete Diffusion Model with Contrastive Learning for Text-to-Speech GenerationZhichao Wu, Qiulin Li, Sixing Liu, Qun Yang. 11336-11340 [doi]
- Matcha-TTS: A Fast TTS Architecture with Conditional Flow MatchingShivam Mehta, Ruibo Tu, Jonas Beskow, Éva Székely, Gustav Eje Henter. 11341-11345 [doi]
- MLPs Compass: What is Learned When MLPs are Combined with PLMs?Li Zhou, Wenyu Chen, Yong Cao, Dingyi Zeng, Wanlong Liu, Hong Qu. 11346-11350 [doi]
- TDT-KWS: Fast and Accurate Keyword Spotting Using Token-and-Duration TransducerYu Xi, Hao Li, Baochen Yang, Haoyu Li, Hainan Xu, Kai Yu. 11351-11355 [doi]
- Online Speaker Diarization of Meetings Guided by Speech SeparationElio Gruttadauria, Mathieu Fontaine 0002, Slim Essid. 11356-11360 [doi]
- SEGLLM: Topic-Oriented Call Segmentation Via LLM-Based Conversation SynthesisItzik Malkiel, Uri Alon 0002, Yakir Yehuda, Shahar Keren, Oren Barkan, Royi Ronen, Noam Koenigstein. 11361-11365 [doi]
- ViLaS: Exploring the Effects of Vision and Language Context in Automatic Speech RecognitionZiyi Ni, Minglun Han, Feilong Chen, Linghui Meng 0001, Jing Shi 0003, Pin Lv, Bo Xu 0002. 11366-11370 [doi]
- Enhancing Low-Latency Speaker Diarization with Spatial Dictionary LearningWeiguang Chen, Tran The Anh, Xionghu Zhong, Eng Siong Chng. 11371-11375 [doi]
- Opine: Leveraging a Optimization-Inspired Deep Unfolding Method for Multi-Channel Speech EnhancementAndong Li, Rilin Chen, Yu Gu, Chao Weng, Dan Su. 11376-11380 [doi]
- Robust Speaker Personalisation Using Generalized Low-Rank Adaptation for Automatic Speech RecognitionArun Baby, George Joseph, Shatrughan Singh. 11381-11385 [doi]
- Glancing Future for Simultaneous Machine TranslationShoutao Guo, Shaolei Zhang, Yang Feng 0004. 11386-11390 [doi]
- Summarizing Community-Based Question-Answer Pairs with Focus RectificationMingyang Mei, Yue Hu 0002, Yifan Deng, Xingsheng Zhang, Yunpeng Li, Hao You. 11391-11395 [doi]
- Automatic Channel Selection and Spatial Feature Integration for Multi-Channel Speech Recognition Across Various Array TopologiesBingshen Mu, Pengcheng Guo, Dake Guo, Pan Zhou, Wei Chen, Lei Xie. 11396-11400 [doi]
- Assessing Vibroacoustic Sound Massage Through The Biosignal of Human Speech: Evidence of Improved WellbeingCharlotte Fooks, Oliver Niebuhr. 11401-11405 [doi]
- LabCLIP: Label-Enhanced Clip for Improving Zero-Shot Text ClassificationYongheng Zhang, Peng Wang, Qiguang Chen, Jingxuan Zhou, Yongmei Wang, Min Li 0007, Libo Qin 0001. 11406-11410 [doi]
- Adversarial Speech for Voice Privacy Protection from Personalized Speech GenerationShihao Chen, Liping Chen, Jie Zhang 0042, Kong-Aik Lee, Zhenhua Ling, Lirong Dai 0001. 11411-11415 [doi]
- Syllable Level Features for Parkinson's Disease Detection from SpeechSevada Hovsepyan, Mathew Magimai.-Doss. 11416-11420 [doi]
- Synvox2: Towards A Privacy-Friendly Voxceleb2 DatasetXiaoxiao Miao, Xin Wang 0037, Erica Cooper, Junichi Yamagishi, Nicholas W. D. Evans, Massimiliano Todisco, Jean-François Bonastre, Mickael Rouvier. 11421-11425 [doi]
- Considering Temporal Connection between Turns for Conversational Speech SynthesisKangdi Mei, Zhaoci Liu, Hui-Peng Du, Hengyu Li, Yang Ai, Liping Chen, Zhenhua Ling. 11426-11430 [doi]
- BRAVEn: Improving Self-supervised pre-training for Visual and Auditory Speech RecognitionAlexandros Haliassos, Andreas Zinonos, Rodrigo Mira, Stavros Petridis, Maja Pantic. 11431-11435 [doi]
- Self-Supervised Adaptive Pre-Training of Multilingual Speech Models for Language and Dialect IdentificationMohammed Maqsood Shaik, Dietrich Klakow, Badr M. Abdullah. 11436-11440 [doi]
- An Experimental Comparison of Noise-Robust Text-To-Speech Synthesis Systems Based On Self-Supervised RepresentationXiao-Ying Zhao, Qiushi Zhu, Yuchen Hu. 11441-11445 [doi]
- Cooking-Clip: Context-Aware Language-Image Pretraining for Zero-Shot Recipe GenerationLin Wang, Haithm M. Al-Gunid, Ammar Hawbani, Yan Xiong. 11446-11450 [doi]
- Pre-Trained Acoustic-and-Textual Modeling for End-To-End Speech-To-Text TranslationWeitai Zhang, Hanyi Zhang, Chenxuan Liu, Zhongyi Ye, Xinyuan Zhou, Chao Lin, Lirong Dai 0001. 11451-11455 [doi]
- NeuroHeed+: Improving Neuro-Steered Speaker Extraction with Joint Auditory Attention DetectionZexu Pan, Gordon Wichern, François G. Germain, Sameer Khurana, Jonathan Le Roux. 11456-11460 [doi]
- Improving Biomedical Entity Linking with Retrieval-Enhanced LearningZhenxi Lin, Ziheng Zhang, Xian Wu 0001, Yefeng Zheng 0001. 11461-11465 [doi]
- Progressive Unsupervised Domain Adaptation for ASR Using Ensemble Models and Multi-Stage TrainingRehan Ahmad, Muhammad Umar Farooq, Thomas Hain. 11466-11470 [doi]
- Noise-Robust Zero-Shot Text-to-Speech Synthesis Conditioned on Self-Supervised Speech-Representation Model with AdaptersKenichi Fujita, Hiroshi Sato, Takanori Ashihara, Hiroki Kanagawa, Marc Delcroix, Takafumi Moriya, Yusuke Ijima. 11471-11475 [doi]
- AS-pVAD: A Frame-Wise Personalized Voice Activity Detection Network with Attentive Score LossFenting Liu, Feifei Xiong, Yiya Hao, Kechenying Zhou, Chenhui Zhang, Jinwei Feng. 11476-11480 [doi]
- Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative StudyXuankai Chang, Brian Yan, KwangHee Choi, Jee-weon Jung, Yichen Lu, Soumi Maiti, Roshan S. Sharma, Jiatong Shi, Jinchuan Tian, Shinji Watanabe 0001, Yuya Fujita, Takashi Maekaku, Pengcheng Guo, Yao-Fei Cheng, Pavel Denisov, Kohei Saijo, Hsiu-Hsuan Wang. 11481-11485 [doi]
- Frame-Level Emotional State Alignment Method for Speech Emotion RecognitionQifei Li, Yingming Gao, Cong Wang, Yayue Deng, Jinlong Xue, Yichen Han, Ya Li. 11486-11490 [doi]
- Combining Conformer and Dual-Path-Transformer Networks for Single Channel Noisy Reverberant Speech SeparationWilliam Ravenscroft, Stefan Goetze, Thomas Hain. 11491-11495 [doi]
- Gradient-Based Dimensionality Reduction for Speech Emotion Recognition Using Deep NetworksHongxuan Wang, Prahlad Vadakkepat. 11496-11500 [doi]
- Creating Personalized Synthetic Voices from Articulation Impaired Speech Using Augmented Reconstruction LossYusheng Tian, Jingyu Li, Tan Lee 0001. 11501-11505 [doi]
- Phase Continuity-Aware Self-Attentive Recurrent Network with Adaptive Feature Selection for Robust VADMinjie Tang, Hao Huang, Wenbo Zhang, Liang He. 11506-11510 [doi]
- Exploring Label Hierarchy in Dialogue Intent ClassificationSimin Huang, Peijie Huang, Yuhong Xu, Jingzhou Liang, Jingde Niu. 11511-11515 [doi]
- Anchor-Guided GAN with Contrastive Loss for Low-Resource Out-of-Domain DetectionJiankai Zhu, Peijie Huang, Ziheng Ruan, Yuhui Zhu, Chaojie Liang, Yuhong Xu. 11516-11520 [doi]
- StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness AnnotationsSen Liu, Yiwei Guo, Xie Chen 0001, Kai Yu 0004. 11521-11525 [doi]
- Multimodal Modeling for Spoken Language IdentificationShikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa. 11526-11530 [doi]
- T-SOT FNT: Streaming Multi-Talker ASR with Text-Only Domain Adaptation CapabilityJian Wu 0027, Naoyuki Kanda, Takuya Yoshioka, Rui Zhao 0017, Zhuo Chen 0006, Jinyu Li 0001. 11531-11535 [doi]
- Contrastive Learning with High-Quality and Low-Quality Augmented Data for Query-Focused SummarizationShaoyao Huang, Ziqiang Cao, Luozheng Qin, Jun Gao, Jun Zhang. 11536-11540 [doi]
- Bootstrap Predictive Coding: Investigating a Non-Contrastive Self-Supervised Learning ApproachYumnah Mohamied, Peter Bell 0001. 11541-11545 [doi]
- Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed DataTakaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov. 11546-11550 [doi]
- The 2nd Clarity Prediction Challenge: A Machine Learning Challenge for Hearing Aid Intelligibility PredictionJon Barker, Michael A. Akeroyd, Will Bailey, Trevor J. Cox, John F. Culling, Jennifer Firth, Simone Graetzer, Graham Naylor. 11551-11555 [doi]
- Robust Wake Word Spotting With Frame-Level Cross-Modal Attention Based Audio-Visual ConformerHaoxu Wang, Ming Cheng, Qiang Fu, Ming Li. 11556-11560 [doi]
- SELM: Speech Enhancement using Discrete Tokens and Language ModelsZiqian Wang, Xinfa Zhu, Zihan Zhang, Yuanjun Lv, Ning Jiang, Guoqing Zhao, Lei Xie 0001. 11561-11565 [doi]
- CoQ: An Empirical Framework for Multi-hop Question Answering Empowered by Large Language ModelsQiang Huang, Feng Huang, Dehao Tao, Yuetong Zhao, Bingkun Wang, Yongfeng Huang 0001. 11566-11570 [doi]
- Local and Global: Text Matching Via Syntax Graph CalibrationLiang Li, Qisheng Liao, Meiting Lai, Di Liang, Shangsong Liang. 11571-11575 [doi]
- SkillNet-X: A Multilingual Multitask Model with Sparsely Activated SkillsZhangyin Feng, Yong Dai, Fan Zhang 0092, Duyu Tang, Xiaocheng Feng, Shuangzhi Wu, Bing Qin 0001, Yunbo Cao, Shuming Shi 0001. 11576-11580 [doi]
- Multiscale Matching Driven by Cross-Modal Similarity Consistency for Audio-Text RetrievalQian Wang, Jia-Chen Gu, Zhen-Hua Ling. 11581-11585 [doi]
- Integrating Language Models with Symbolic Formulas for First-Order Logic ReasoningYu Sheng, Linjing Li, Yifei Wang, Daniel Zeng 0001. 11586-11590 [doi]
- SR-HuBERT: An Efficient Pre-Trained Model for Speaker VerificationYishuang Li, Hukai Huang, Zhicong Chen, Wenhao Guan, Jiayan Lin, Lin Li, Qingyang Hong. 11591-11595 [doi]
- An Investigation of Distribution Alignment in Multi-Genre Speaker RecognitionZhenyu Zhou, Junhui Chen, Namin Wang, Lantian Li, Dong Wang 0013. 11596-11600 [doi]
- Modeling Pseudo-Speaker Uncertainty in Voice AnonymizationLiping Chen, Kong-Aik Lee, Wu Guo, Zhen-Hua Ling. 11601-11605 [doi]
- Significant ASR Error Detection for Conversational Voice AssistantsJohn Harvill, Rinat Khaziev, Scarlett Li, Randy Cogill, Lidan Wang, Gopinath Chennupati, Hari Thadakamalla. 11606-11610 [doi]
- GLA-GRAD: A Griffin-Lim Extended Waveform Generation Diffusion ModelHaocheng Liu, Teysir Baoueb, Mathieu Fontaine 0002, Jonathan Le Roux, Gaël Richard. 11611-11615 [doi]
- Semantic Enrichment for Video Question Answering with Gated Graph Neural NetworksChenyang Lyu, Wenxi Li, Tianbo Ji, Yi Yu 0001, Longyue Wang. 11616-11620 [doi]
- Using Clustering to Improve the Performance of few-shot LearningYanan Zhang, Chaofan Wu, Rongkun Shi, Yiying Zhang. 11621-11625 [doi]
- Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence ArchitectureGaobin Yang, Maokui He, Shutong Niu, Ruoyu Wang 0029, Yanyan Yue, Shuangqing Qian, Shilong Wu, Jun Du, Chin-Hui Lee 0001. 11626-11630 [doi]
- Improving Domain Generalization in Speech Emotion Recognition with WhisperErik Goron, Lena Asai, Elias Rut, Martin Dinov. 11631-11635 [doi]
- Improving Short Utterance Anti-Spoofing with Aasist2Yuxiang Zhang, Jingze Lu, Zengqiang Shang, Wenchao Wang, Pengyuan Zhang. 11636-11640 [doi]
- Label-Aware Auxiliary Learning for Dialogue State TrackingYuncong Liu, Lu Chen, Kai Yu. 11641-11645 [doi]
- Speech Swin-Transformer: Exploring a Hierarchical Transformer with Shifted Windows for Speech Emotion RecognitionYong Wang, Cheng Lu, Hailun Lian, Yan Zhao, Björn W. Schuller, Yuan Zong, Wenming Zheng. 11646-11650 [doi]
- EMOCONV-Diff: Diffusion-Based Speech Emotion Conversion for Non-Parallel and in-the-Wild DataNavin Raj Prabhu, Bunlong Lay, Simon Welker, Nale Lehmann-Willenbrock, Timo Gerkmann. 11651-11655 [doi]
- Open-Vocabulary Keyword-Spotting with Adaptive Instance NormalizationAviv Navon, Aviv Shamsian, Neta Glazer, Gill Hetz, Joseph Keshet. 11656-11660 [doi]
- Retrieval-Generation Synergy Augmented Large Language ModelsZhangyin Feng, Xiaocheng Feng, Dezhi Zhao, Maojin Yang, Bing Qin 0001. 11661-11665 [doi]
- Contrastive Learning with Audio Discrimination for Customizable Keyword Spotting in Continuous SpeechYu Xi, Baochen Yang, Hao Li, Jiaqi Guo, Kai Yu. 11666-11670 [doi]
- TNFormer: Single-Pass Multilingual Text Normalization with a Transformer Decoder ModelBinbin Shen, Jie Wang, Meng Meng, Yujun Wang. 11671-11675 [doi]
- Posterior Variance-Parameterised Gaussian Dropout: Improving Disentangled Sequential Autoencoders for Zero-Shot Voice ConversionYin-Jyun Luo, Simon Dixon. 11676-11680 [doi]
- Semi-Autoregressive Streaming ASR with Label ContextSiddhant Arora, George Saon, Shinji Watanabe 0001, Brian Kingsbury. 11681-11685 [doi]
- Disentanglement Network: Disentangle the Emotional Features from Acoustic Features for Speech Emotion RecognitionZhichen Yuan, C. L. Philip Chen, Shuzhen Li, Tong Zhang. 11686-11690 [doi]
- A Federated Graph to Embedding Approach for Knowledge Graph CompletionHongliang Sun, Xiaofeng Bi, Dianbo Sui, Zhiying Tu. 11691-11695 [doi]
- Improving Speaker-Independent Speech Emotion Recognition using Dynamic Joint Distribution AdaptationCheng Lu, Yuan Zong, Hailun Lian, Yan Zhao, Björn W. Schuller, Wenming Zheng. 11696-11700 [doi]
- A Novel Cascade Instruction Tuning Method for Biomedical NERJin Zhao 0004, Chao Liu, Jiaqing Liang, Zhixu Li, Yanghua Xiao. 11701-11705 [doi]
- Gmm-Resnext: Combining Generative and Discriminative Models for Speaker VerificationHui Yan, Zhenchun Lei, Changhong Liu, Yong Zhou. 11706-11710 [doi]
- Distilling Hubert with LSTMs via Decoupled Knowledge DistillationDanilo de Oliveira, Timo Gerkmann. 11711-11715 [doi]
- Improving Speed/Accuracy Tradeoff for Online Streaming ASR via Real-Valued and Trainable StridesDario Albesano, Nicola Ferri, Felix Weninger, Puming Zhan. 11716-11720 [doi]
- EMALG: An Enhanced Mandarin Lombard Grid Corpus with Meaningful SentencesBaifeng Li, Qingmu Liu, Yuhong Yang 0001, Hongyang Chen, Weiping Tu, Song Lin. 11721-11725 [doi]
- MSFR: Stance Detection Based on Multi-Aspect Semantic Feature Representation via Hierarchical Contrastive LearningXuechen Zhao, Lei Tian, Feng Xie, Bin Zhou 0004, Haiyang Wang, Hongzhou Wu, Liqun Gao. 11726-11730 [doi]
- End-To-End Real Time Tracking of Children's Reading with Pointer NetworkVishal Sunder, Beulah Karrolla, Eric Fosler-Lussier. 11731-11735 [doi]
- Improving Chinese Spelling Correction with Text-Phonetics Differentiation and Adaptive FusionWenhao Zhang, Shiyao Cui, Wenyuan Zhang, Xinghua Zhang, Tingwen Liu, Hongbo Xu. 11736-11740 [doi]
- Hubertopic: Enhancing Semantic Representation of Hubert Through Self-Supervision Utilizing Topic ModelTakashi Maekaku, Jiatong Shi, Xuankai Chang, Yuya Fujita, Shinji Watanabe 0001. 11741-11745 [doi]
- Acoustic BPE for Speech Generation with Discrete TokensFeiyu Shen, Yiwei Guo, Chenpeng Du, Xie Chen 0001, Kai Yu 0004. 11746-11750 [doi]
- Sparsely Shared Lora on Whisper for Child Speech RecognitionWei Liu, Ying Qin, Zhiyuan Peng, Tan Lee 0001. 11751-11755 [doi]
- Study of Abuse Detection in Continuous Speech for Indian LanguagesRini A. Sharon, Debdoot Mukherjee. 11756-11760 [doi]
- A Generative Adversarial Framework for Dialogue Generation with Neural Architecture SearchYi Huang, Wei Hu, Junlan Feng. 11761-11765 [doi]
- Improving Multi-Modal Emotion Recognition Using Entropy-Based Fusion and Pruning-Based Network Architecture OptimizationHaotian Wang, Jun Du, Yusheng Dai, Chin-Hui Lee, Yuling Ren, Yu Liu. 11766-11770 [doi]
- Unsupervised Multiple Choices Question Answering Via Universal CorpusQin Zhang, Hao Ge, Xiaojun Chen, Meng Fang. 11771-11775 [doi]
- Improving Long Text Understanding with Knowledge Distilled from Summarization ModelYan Liu, Yazheng Yang, Xiaokang Chen. 11776-11780 [doi]
- Robust Cross-Domain Speaker Verification with Multi-Level Domain AdaptersWen Huang, Bing Han, Shuai Wang 0016, Zhengyang Chen, Yanmin Qian. 11781-11785 [doi]
- Exploring Object-Centered External Knowledge for Fine-Grained Video Paragraph CaptioningGuorui Yu, Yimin Hu, Yiqian Xu, Yuejie Zhang, Rui Feng, Tao Zhang 0022, Shang Gao 0003. 11786-11790 [doi]
- CLAP4Emo: ChatGPT-Assisted Speech Emotion Retrieval with Natural Language SupervisionWei-Cheng Lin, Shabnam Ghaffarzadegan, Luca Bondi, Abinaya Kumar, Samarjit Das, Ho-Hsiang Wu. 11791-11795 [doi]
- Anim-400K: A Large-Scale Dataset for Automated End to End Dubbing of VideoKevin Cai, Chonghua Liu, David M. Chan. 11796-11800 [doi]
- USM-SCD: Multilingual Speaker Change Detection Based on Large Pretrained Foundation ModelsGuanlong Zhao, Yongqiang Wang, Jason Pelecanos, Yu Zhang, Hank Liao, Yiling Huang, Han Lu, Quan Wang. 11801-11805 [doi]
- Task Oriented Dialogue as a Catalyst for Self-Supervised Automatic Speech RecognitionDavid M. Chan, Shalini Ghosh, Hitesh Tulsiani, Ariya Rastrow, Björn Hoffmeister. 11806-11810 [doi]
- Semantics Driven Multi-View Knowledge Graph Embedding for Cross-Lingual Entity AlignmentXin Zhang, Yu Liu 0035, Zhehuan Zhao. 11811-11815 [doi]
- Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End ModelsRohit Prabhavalkar, Zhong Meng, Weiran Wang, Adam Stooke, Xingyu Cai, Yanzhang He, Arun Narayanan, Dongseong Hwang, Tara N. Sainath, Pedro J. Moreno 0001. 11816-11820 [doi]
- Promptformer: Prompted Conformer Transducer for ASRSergio Duarte Torres, Arunasish Sen, Aman Rana, Lukas Drude, Alejandro Gomez-Alanis, Andreas Schwarz, Leif Rädel, Volker Leutnant. 11821-11825 [doi]
- Improving Medical Dialogue Generation with Abstract Meaning RepresentationsBohao Yang, Chen Tang, Chenghua Lin. 11826-11830 [doi]
- Less Peaky and More Accurate CTC Forced Alignment by Label PriorsRuizhe Huang, Xiaohui Zhang 0007, Zhaoheng Ni, Li Sun, Moto Hira, Jeff Hwang, Vimal Manohar, Vineel Pratap, Matthew Wiesner, Shinji Watanabe 0001, Daniel Povey, Sanjeev Khudanpur. 11831-11835 [doi]
- FastInject: Injecting Unpaired Text Data into CTC-Based ASR TrainingKeqi Deng, Philip C. Woodland. 11836-11840 [doi]
- Comparing data-Driven and Handcrafted Features for Dimensional Emotion RecognitionBogdan Vlasenko, Sargam Vyas, Mathew Magimai-Doss. 11841-11845 [doi]
- Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion RecognitionYan Zhao, Jincen Wang, Cheng Lu 0005, Sunan Li, Björn W. Schuller, Yuan Zong, Wenming Zheng. 11846-11850 [doi]
- Noise-Robust DSP-Assisted Neural Pitch Estimation With Very Low ComplexityKrishna Subramani, Jean-Marc Valin, Jan Büthe, Paris Smaragdis, Mike Goodwin. 11851-11855 [doi]
- One Model to Rule Them All? Towards End-to-End Joint Speaker Diarization and Speech RecognitionSamuele Cornell, Jee-weon Jung, Shinji Watanabe 0001, Stefano Squartini. 11856-11860 [doi]
- Balancing Speaker-Rater Fairness for Gender-Neutral Speech Emotion RecognitionWoan-Shiuan Chien, Shreya G. Upadhyay, Chi-Chun Lee. 11861-11865 [doi]
- Addressing Data Scarcity in Voice Disorder Detection with Self-Supervised ModelsRijul Gupta, Catherine J. Madill, Dhanshree R. Gunjawate, Duy Duong Nguyen, Craig T. Jin. 11866-11870 [doi]
- Discriminative Training of VBx DiarizationDominik Klement, Mireia Díez, Federico Landini, Lukás Burget, Anna Silnova, Marc Delcroix, Naohiro Tawara. 11871-11875 [doi]
- Boosting End-to-End Multilingual Phoneme Recognition Through Exploiting Universal Speech Attributes ConstraintsHao Yen, Sabato Marco Siniscalchi, Chin-Hui Lee 0001. 11876-11880 [doi]
- Apollo's Unheard Voices: Graph Attention Networks for Speaker Diarization and Clustering for Fearless Steps Apollo CollectionMeena M. Chandra Shekar, John H. L. Hansen. 11881-11885 [doi]
- Geodesic Interpolation of Frame-Wise Speaker Embeddings for the Diarization of Meeting ScenariosTobias Cord-Landwehr, Christoph Böddeker, Catalin Zorila, Rama Doddipatla, Reinhold Haeb-Umbach. 11886-11890 [doi]
- Communication-Oriented Automatic Assessment System for Accented Spoken Chinese in Read-Aloud TasksHuazhen Wang, Huan Wang, Jianguo Chen, Shiyue Zhu, Hao Zhou, Yifei Zhao. 11891-11895 [doi]
- M2BART: Multilingual and Multimodal Encoder-Decoder Pre-Training for Any-to-Any Machine TranslationPeng-Jen Chen, Bowen Shi, Kelvin Niu, Ann Lee 0001, Wei-Ning Hsu. 11896-11900 [doi]
- Folding Attention: Memory and Power Optimization for On-Device Transformer-Based Streaming Speech RecognitionYang Li, Liangzhen Lai, Yuan Shangguan, Forrest N. Iandola, Zhaoheng Ni, Ernie Chang, Yangyang Shi, Vikas Chandra. 11901-11905 [doi]
- Profile-Error-Tolerant Target-Speaker Voice Activity DetectionDongmei Wang, Xiong Xiao, Naoyuki Kanda, Midia Yousefi, Takuya Yoshioka, Jian Wu. 11906-11910 [doi]
- Improving Neural Diarization through Speaker Attribute Attractors and Local Dependency ModelingDavid Palzer, Matthew Maciejewski, Eric Fosler-Lussier. 11911-11915 [doi]
- Controllable Prosody Generation with Partial InputsDan-Andrei Iliescu, Devang S. Ram Mohan, Tian Huey Teh, Zack Hodari. 11916-11920 [doi]
- Fine-Tuning Self-Supervised Models for Language Identification Using Orthonormal ConstraintAmrutha Prasad, Andrés Carofilis, Geoffroy Vanderreydt, Driss Khalil, Srikanth R. Madikeri, Petr Motlícek, Christof Schüpbach. 11921-11925 [doi]
- The Effects of Loudness and Smiling on Timbre Features: Implications for Charismatic Voices in Mandarin, German and DanishRongjie Shi, Oliver Niebuhr, Wentao Gu, Nafiseh Taghva. 11926-11930 [doi]
- Speak While You Think: Streaming Speech Synthesis During Text GenerationAvihu Dekel, Slava Shechtman, Raul Fernandez, David Haws, Zvi Kons, Ron Hoory. 11931-11935 [doi]
- Prompting Audios Using Acoustic Properties for Emotion RepresentationHira Dhamyal, Benjamin Elizalde, Soham Deshmukh, Huaming Wang, Bhiksha Raj, Rita Singh. 11936-11940 [doi]
- Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter SharingBrian Yan, Xuankai Chang, Antonios Anastasopoulos, Yuya Fujita, Shinji Watanabe 0001. 11941-11945 [doi]
- Snapshot Prompt Ensemble for Parameter-Efficient Soft Prompt TransferXinglong Wu, C. L. Philip Chen, Shuzhen Li, Tony Zhang. 11946-11950 [doi]
- AGADIR: Towards Array-Geometry Agnostic Directional Speech RecognitionJu Lin, Niko Moritz, Yiteng Huang, Ruiming Xie, Ming Sun, Christian Fuegen, Frank Seide. 11951-11955 [doi]
- MCM-CSD: Multi-Granularity Context Modeling with Contrastive Speaker Detection for Emotion Recognition in Real-Time ConversationYuan Xu, Meng Yang. 11956-11960 [doi]
- Kenet: Knowledge-Enhanced DOC-Label Attention Network for Multi-Label Text ClassificationBo Li, Yuyan Chen, Liang Zeng. 11961-11965 [doi]
- CryCeleb: A Speaker Verification Dataset Based on Infant Cry SoundsDavid Budaghyan, Charles C. Onu, Arsenii Gorin, Cem Subakan, Doina Precup. 11966-11970 [doi]
- Enhancing End-to-End Conversational Speech Translation Through Target Language Context UtilizationAmir Hussein, Brian Yan, Antonios Anastasopoulos, Shinji Watanabe 0001, Sanjeev Khudanpur. 11971-11975 [doi]
- Speech Emotion Recognition with Distilled Prosodic and Linguistic Affect RepresentationsDebaditya Shome, Ali Etemad. 11976-11980 [doi]
- SECP: A Speech Enhancement-Based Curation Pipeline for Scalable Acquisition of Clean SpeechAdam Sabra, Cyprian M. Wronka, Michelle Mao, Samer L. Hijazi. 11981-11985 [doi]
- Cross-Speaker Encoding Network for Multi-Talker Speech RecognitionJiawen Kang 0002, Lingwei Meng, Mingyu Cui, Haohan Guo, Xixin Wu, Xunying Liu, Helen Meng. 11986-11990 [doi]
- UniX-Encoder: A Universal X-Channel Speech Encoder for AD-HOC Microphone Array Speech ProcessingZili Huang, Yiwen Shao, Shi-Xiong Zhang, Dong Yu 0001. 11991-11995 [doi]
- Can LLM Find the Green Circle? Investigation and Human-Guided Tool Manipulation for Compositional GeneralizationMin Zhang, Jianfeng He, Shuo Lei, Murong Yue, Linhan Wang, Chang-Tien Lu. 11996-12000 [doi]
- Audiovisual Speaker Separation with Full- and Sub-Band Modeling in the Time-Frequency DomainVahid Ahmadi Kalkhorani, Anurag Kumar 0003, Ke Tan 0001, Buye Xu, DeLiang Wang. 12001-12005 [doi]
- Speech Collage: Code-Switched Audio Generation by Collaging Monolingual CorporaAmir Hussein, Dorsa Zeinali, Ondrej Klejch, Matthew Wiesner, Brian Yan, Shammur Absar Chowdhury, Ahmed Ali 0001, Shinji Watanabe 0001, Sanjeev Khudanpur. 12006-12010 [doi]
- Detecting Check-Worthy Claims in Political Debates, Speeches, and Interviews Using Audio DataPetar Ivanov, Ivan Koychev, Momchil Hardalov, Preslav Nakov. 12011-12015 [doi]
- Refining Text Input For Augmentative and Alternative Communication (AAC) Devices: Analysing Language Model Layers For OptimisationHussein Yusufali, Roger K. Moore, Stefan Goetze. 12016-12020 [doi]
- Longitudinal Modeling of Depression Shifts Using Speech and LanguagePaula Andrea Pérez-Toro, Judith Dineley, Agnieszka Kaczkowska, Pauline Conde, Yuezhou Zhang, Faith Matcham, Sara Siddi, Josep Maria Haro, Stuart Bruce, Til Wykes, Raquel Bailón, Srinivasan Vairavan, Richard J. B. Dobson, Andreas K. Maier, Elmar Nöth, Juan Rafael Orozco-Arroyave, Vaibhav A. Narayan, Nicholas Cummins. 12021-12025 [doi]
- Transducers with Pronunciation-Aware Embeddings for Automatic Speech RecognitionHainan Xu, Zhehuai Chen, Fei Jia, Boris Ginsburg. 12026-12030 [doi]
- Generalization of Self-Supervised Learning-Based Representations for Cross-Domain Speech Emotion RecognitionAbinay Reddy Naini, Mary A. Kohler, Elizabeth Richerson, Donita Robinson, Carlos Busso. 12031-12035 [doi]
- Dynamic Speech Emotion Recognition Using A Conditional Neural ProcessLuz Martinez-Lucas, Carlos Busso. 12036-12040 [doi]
- Stateful Conformer with Cache-Based Inference for Streaming Automatic Speech RecognitionVahid Noroozi, Somshubra Majumdar, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg. 12041-12045 [doi]
- TAROT: A Hierarchical Framework with Multitask co-pretraining on Semi-Structured Data Towards Effective Person-Job fitYihan Cao, Xu Chen 0022, Lun Du, Hao Chen, Qiang Fu, Shi Han, Yushu Du, Yanbin Kang, Guangming Lu, Zi Li. 12046-12050 [doi]
- Revisiting Self-supervised Learning of Speech Representation from a Mutual Information PerspectiveAlexander H. Liu, Sung-Lin Yeh, James R. Glass. 12051-12055 [doi]
- Retrieval Augmented End-to-End Spoken Dialog ModelsMingqiu Wang, Izhak Shafran, Hagen Soltau, Wei Han 0002, Yuan Cao 0007, Dian Yu, Laurent El Shafey. 12056-12060 [doi]
- Self-Supervised Models of Speech Infer Universal Articulatory KinematicsCheol Jun Cho, Abdelrahman Mohamed, Alan W. Black, Gopala Krishna Anumanchipalli. 12061-12065 [doi]
- Enabling Device Control Planning Capabilities of Small Language ModelSudipta Paul 0011, Lingyu Zhang, Yilin Shen, Hongxia Jin. 12066-12070 [doi]
- AugSumm: Towards Generalizable Speech Summarization Using Synthetic Labels from Large Language ModelsJee-weon Jung, Roshan S. Sharma, William Chen, Bhiksha Raj, Shinji Watanabe 0001. 12071-12075 [doi]
- SD-HuBERT: Sentence-Level Self-Distillation Induces Syllabic Organization in HuBERTCheol Jun Cho, Abdelrahman Mohamed, Shang-wen Li 0001, Alan W. Black, Gopala Krishna Anumanchipalli. 12076-12080 [doi]
- Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion RecognitionIsmail Rasim Ulgen, Zongyang Du, Carlos Busso, Berrak Sisman. 12081-12085 [doi]
- SCORE: Self-Supervised Correspondence Fine-Tuning for Improved Content RepresentationsAmit Meghanani, Thomas Hain. 12086-12090 [doi]
- Task Selection and Assignment for Multi-Modal Multi-Task Dialogue Act Classification with Non-Stationary Multi-Armed BanditsXiangheng He, Junjie Chen, Björn W. Schuller. 12091-12095 [doi]
- Improving ASR Contextual Biasing with Guided AttentionJiyang Tang, Kwangyoun Kim, Suwon Shon, Felix Wu, Prashant Sridhar. 12096-12100 [doi]
- GMM-ResNet2: Ensemble of Group ResNet Networks for Synthetic Speech DetectionZhenchun Lei, Hui Yan, Changhong Liu, Yong Zhou, Minglei Ma. 12101-12105 [doi]
- Exploring Soft Prompt Initialization Strategy for Few-Shot Continual Text ClassificationZhehao Zhang, Tong Yu 0001, Handong Zhao, Kaige Xie, Lina Yao 0001, Shuai Li 0010. 12106-12110 [doi]
- Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech RecognitionKrishna C. Puvvada, Nithin Rao Koluguri, Kunal Dhawan, Jagadeesh Balam, Boris Ginsburg. 12111-12115 [doi]
- Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and AugmentingTianTian Feng, Shrikanth Narayanan. 12116-12120 [doi]
- Turn-Taking and Backchannel Prediction with Acoustic and Large Language Model FusionJinhan Wang, Long Chen, Aparna Khare, Anirudh Raju, Pranav Dheram, Di He 0004, Minhua Wu, Andreas Stolcke, Venkatesh Ravichandran. 12121-12125 [doi]
- Learning Arousal-Valence Representation from Categorical Emotion Labels of SpeechEnting Zhou, You Zhang 0001, Zhiyao Duan. 12126-12130 [doi]
- Efficient Adapter Tuning of Pre-Trained Speech Models for Automatic Speaker VerificationMufan Sang, John H. L. Hansen. 12131-12135 [doi]
- Dynamic-SUPERB: Towards a Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark For SpeechChien-Yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan S. Sharma, Shinji Watanabe 0001, Bhiksha Ramakrishnan, Shady Shehata, Hung-yi Lee. 12136-12140 [doi]
- Self-Supervised Speaker Verification with Adaptive Threshold and Hierarchical TrainingZehua Zhou, Haoyuan Yang, Takahiro Shinozaki. 12141-12145 [doi]
- ED-TTS: Multi-Scale Emotion Modeling Using Cross-Domain Emotion Diarization for Emotional Speech SynthesisHaobin Tang, Xulong Zhang 0001, Ning Cheng 0001, Jing Xiao 0006, Jianzong Wang. 12146-12150 [doi]
- VFD-Net: Vocoder Fingerprints Detection for Fake AudioJunlong Deng, Yanzhen Ren, Tong Zhang, Hongcheng Zhu, Zongkun Sun. 12151-12155 [doi]
- SingFake: Singing Voice Deepfake DetectionYongyi Zang, You Zhang 0001, Mojtaba Heydari, Zhiyao Duan. 12156-12160 [doi]
- ESVC: Combining Adaptive Style Fusion and Multi-Level Feature Disentanglement for Expressive Singing Voice ConversionZeyu Yang, Minchuan Chen, Yanping Li, Wei Hu, Shaojun Wang, Jing Xiao, Zijian Li. 12161-12165 [doi]
- MTA: A Lightweight Multilingual Text Alignment Model for Cross-Language Visual Word Sense DisambiguationQihao Yang, Xuelin Wang, Yong Li, Lap-Kei Lee, Fu Lee Wang, Tianyong Hao. 12166-12170 [doi]
- Spontts: Modeling and Transferring Spontaneous Style for TTSHanzhao Li, Xinfa Zhu, Liumeng Xue, Yang Song, Yunlin Chen, Lei Xie. 12171-12175 [doi]
- Bridging the Gaps of Both Modality and Language: Synchronous Bilingual CTC for Speech Translation and Speech RecognitionChen Xu 0008, Xiaoqian Liu, Erfeng He, Yuhao Zhang, Qianqian Dong, Tong Xiao, Jingbo Zhu, Dapeng Man, Wu Yang 0001. 12176-12180 [doi]
- On the Importance of Neural Wiener Filter for Resource Efficient Multichannel Speech EnhancementTsun-An Hsieh, Jacob Donley, Daniel Wong, Buye Xu, Ashutosh Pandey 0004. 12181-12185 [doi]
- Enhancing Multilingual TTS with Voice Conversion Based Data Augmentation and Posterior EmbeddingHyun-Wook Yoon, Jin Seob Kim, Ryuichi Yamamoto, Ryo Terashima, Chan Ho Song, Jae Min Kim, Eunwoo Song. 12186-12190 [doi]
- Rethinking Targeted Adversarial Attacks for Neural Machine TranslationJunjie Wu 0007, Lemao Liu, Wei Bi, Dit-Yan Yeung. 12191-12195 [doi]
- VIC-KD: Variance-Invariance-Covariance Knowledge Distillation to Make Keyword Spotting More Robust Against Adversarial AttacksHeitor R. Guimarães, Arthur Pimentel, Anderson R. Avila, Tiago H. Falk. 12196-12200 [doi]
- Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of a Multilingual ASR ModelJiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli. 12201-12205 [doi]
- Decoupled Spatial and Temporal Processing for Resource Efficient Multichannel Speech EnhancementAshutosh Pandey 0004, Buye Xu. 12206-12210 [doi]
- Fact-Aware Summarization with Contrastive Learning for Few-Shot Dialogue State TrackingSijie Feng, Haoxiang Su, Hongyan Xie, Di Wu, Hao Huang 0009, Wushour Silamu. 12211-12215 [doi]
- Leveraging Large Pretrained Models for Line-by-Line Spoken Program RecognitionSadia Nowrin, Keith Vertanen. 12216-12220 [doi]
- Augmenting Conformers With Structured State-Space Sequence Models For Online Speech RecognitionHaozhe Shan, Albert Gu, Zhong Meng, Weiran Wang, Krzysztof Choromanski, Tara N. Sainath. 12221-12225 [doi]
- Max-Margin Transducer Loss: Improving Sequence-Discriminative Training Using a Large-Margin Learning StrategyRupak Vignesh Swaminathan, Grant P. Strimel, Ariya Rastrow, Sri Harish Mallidi, Kai Zhen, Hieu Duy Nguyen, Nathan Susanj, Athanasios Mouchtaris. 12226-12230 [doi]
- Leveraging Large Language Models for Exploiting ASR UncertaintyPranay Dighe, Yi Su, Shangshang Zheng, Yunshu Liu, Vineet Garg, Xiaochuan Niu, Ahmed H. Tewfik. 12231-12235 [doi]
- Multi-Objective Progressive Clustering for Semi-Supervised Domain Adaptation in Speaker VerificationZe Li, Yuke Lin, Ning Jiang, Xiaoyi Qin, Guoqing Zhao, Haiying Wu, Ming Li. 12236-12240 [doi]
- Efficient Personal Voice Activity Detection with Wake Word Reference SpeechBang Zeng, Ming Cheng, Yao Tian, Haifeng Liu, Ming Li. 12241-12245 [doi]
- Cross Modal Training for ASR Error Correction with Contrastive LearningJin Jiang, Xiaojun Wan 0001, Wei Peng, Rongjun Li, Jingyuan Yang 0008, Yanquan Zhou. 12246-12250 [doi]
- A Birgat Model for Multi-Intent Spoken Language Understanding with Hierarchical Semantic FramesHongshen Xu, Ruisheng Cao, Su Zhu, Sheng Jiang, Hanchong Zhang, Lu Chen 0002, Kai Yu 0004. 12251-12255 [doi]
- Task Vector Algebra for ASR ModelsGowtham Ramesh, Kartik Audhkhasi, Bhuvana Ramabhadran. 12256-12260 [doi]
- Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-To-Coarse AttentionZiwei He, Jian Yuan, Le Zhou, Jingwen Leng, Bo Jiang. 12261-12265 [doi]
- Small-Footprint Convolutional Neural Network with Reduced Feature Map for Voice Activity DetectionHwabyeong Chae, Sunggu Lee. 12266-12270 [doi]
- MS-SENet: Enhancing Speech Emotion Recognition Through Multi-Scale Feature Fusion with Squeeze-and-Excitation BlocksMengbo Li, Yuanzhong Zheng, Dichucheng Li, Yulun Wu, Yaoxuan Wang, Haojun Fei. 12271-12275 [doi]
- Complexity Scaling for Speech DenoisingHangting Chen, Jianwei Yu, Chao Weng. 12276-12280 [doi]
- Crowdsourced and Automatic Speech Prominence EstimationMax Morrison, Pranav Pawar, Nathan Pruyne, Jennifer Cole 0001, Bryan Pardo. 12281-12285 [doi]
- Bridging the Gap: A Self-Learning Model Using Implicit Knowledge for Chinese Spelling CorrectionWenyao Cui, Jiahao Cai, Baohua Zhang, Yongyi Huang, Huaping Zhang. 12286-12290 [doi]
- Automatic Speech Recognition Tuned for Child Speech in the ClassroomRosy Southwell, Wayne H. Ward, Viet Anh Trinh, Charis Clevenger, Clay Clevenger, Emily Watts, Jason G. Reitman, Sidney D'Mello, Jacob Whitehill. 12291-12295 [doi]
- SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross AttentionJunjie Li, Yiwei Guo, Xie Chen 0001, Kai Yu 0004. 12296-12300 [doi]
- Dual Level Intent-Slot Interaction for Improved Multi-Intent Spoken Language UnderstandingDi Wu, Liting Jiang, Lili Yin, Kai Wang, Haoxiang Su, Zhe Li, Hao Huang. 12301-12305 [doi]
- UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit NormalizationYuejiao Wang, Xixin Wu, Disong Wang, Lingwei Meng, Helen Meng. 12306-12310 [doi]
- Enhancing Pre-Trained ASR System Fine-Tuning for Dysarthric Speech Recognition Using Adversarial Data AugmentationHuimeng Wang, Zengrui Jin, Mengzhe Geng, Shujie Hu, Guinan Li, Tianzi Wang, Haoning Xu, Xunying Liu. 12311-12315 [doi]
- Stylespeech: Self-Supervised Style Enhancing with VQ-VAE-Based Pre-Training for Expressive Audiobook Speech SynthesisXueyuan Chen, Xi Wang 0016, Shaofei Zhang, Lei He 0005, Zhiyong Wu 0001, Xixin Wu, Helen Meng. 12316-12320 [doi]
- Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker VerificationHee-Soo Heo, Kihyun Nam, Bong-Jin Lee, Youngki Kwon, MinJae Lee, You Jin Kim, Joon Son Chung. 12321-12325 [doi]
- Corpus Synthesis for Zero-Shot ASR Domain Adaptation Using Large Language ModelsHsuan Su, Ting-Yao Hu, Hema Swetha Koppula, Raviteja Vemulapalli, Jen-Hao Rick Chang, Karren D. Yang, Gautam Varma Mantena, Oncel Tuzel. 12326-12330 [doi]
- A Spatial Long-Term Iterative Mask Estimation Approach for Multi-Channel Speaker Diarization and Speech RecognitionFeng Ma, Yanhui Tu, Maokui He, Ruoyu Wang 0029, Shutong Niu, Lei Sun 0010, Zhongfu Ye, Jun Du, Jia Pan, Chin-Hui Lee 0001. 12331-12335 [doi]
- Learning Contextualized Representation on Discrete Space Via Hierarchical Product QuantizationHyung Yong Kim, Byeong-Yeol Kim, Yunkyu Lim, Jihwan Park, Jinseok Park, Youshin Lim, Seung Woo Yu, Hanbin Lee. 12336-12340 [doi]
- Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech ReconstructionXueyuan Chen, Yuejiao Wang, Xixin Wu, Disong Wang, Zhiyong Wu 0001, Xunying Liu, Helen Meng. 12341-12345 [doi]
- An Anchor Learning Approach for Citation Field LearningZilin Yuan, Borun Chen, Yimeng Dai, Yinghui Li, Hai-Tao Zheng, Rui Zhang. 12346-12350 [doi]
- Diversity-Based Core-Set Selection for Text-to-Speech with Linguistic and Acoustic FeaturesKentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, Hiroshi Saruwatari. 12351-12355 [doi]
- Improving Speech Recognition for African American English with Audio ClassificationShefali Garg, Zhouyuan Huo, Khe Chai Sim, Suzan Schwartz, Mason Chua, Alëna Aksënova, Tsendsuren Munkhdalai, Levi King, Darryl Wright, Zion Mengesha, Dongseong Hwang, Tara N. Sainath, Françoise Beaufays, Pedro Moreno Mengibar. 12356-12360 [doi]
- Boosting LLMs with Ontology-Aware Prompt for NER Data AugmentationZhizhao Luo, Youchen Wang, Wenjun Ke, Rui Qi, Yikai Guo, Peng Wang. 12361-12365 [doi]
- Learning Semantic Information from Raw Audio Signal Using Both Contextual and Phonetic RepresentationsJaeyeon Kim, Injune Hwang, Kyogu Lee. 12366-12370 [doi]
- Score Calibration Based on Consistency Measure Factor for Speaker VerificationYu Zheng, Yajun Zhang, Chuanying Niu, Yibin Zhan, Yanhua Long, Dongxing Xu. 12371-12375 [doi]
- Leveraging Visual Handicaps for Text-Based Reinforcement LearningSubhajit Chaudhury, Keerthiram Murugesan, Thomas Carta, Kartik Talamadupula, Michiaki Tatsubori. 12376-12380 [doi]
- New Intent Discovery with Multi-View ClusteringHan Liu 0008, Junjie Sun, Xiaotong Zhang 0003, Hongyang Chen. 12381-12385 [doi]
- A Robust Pitch-Fusion Model for Speech Emotion Recognition in Tonal LanguagesPham Viet Thanh, Ngo Thi Thu Huyen, Pham Ngoc Quan, Nguyen Thi Thu Trang. 12386-12390 [doi]
- Towards an Interpretable Representation of Speaker Identity via Perceptual Voice QualitiesRobin Netzorg, Bohan Yu, Andrea Guzman, Peter Wu, Luna McNulty, Gopala Krishna Anumanchipalli. 12391-12395 [doi]
- Boosting Speech Enhancement with Clean Self-Supervised Features Via Conditional Variational AutoencodersYoonhyung Lee, Kyomin Jung. 12396-12400 [doi]
- Context-Guided and Syntactic Augmented Dual Graph Convolutional Network for Aspect-Based Sentiment AnalysisJia-yi, Xiaoming Wu, Xiangzhi Liu. 12401-12405 [doi]
- End-to-End Speech Recognition Contextualization with Large Language ModelsEgor Lakomkin, Chunyang Wu, Yassir Fathullah, Ozlem Kalinli, Michael L. Seltzer, Christian Fuegen. 12406-12410 [doi]
- Unsupervised Extractive Dialogue Summarization in Hyperdimensional SpaceSeongmin Park, KyungHo Kim, Jaejin Seo, Jihwa Lee. 12411-12415 [doi]
- Improving Multi-Speaker ASR With Overlap-Aware Encoding And Monotonic AttentionTao Li, Feng Wang, Wenhao Guan, Lingyan Huang, Qingyang Hong, Lin Li. 12416-12420 [doi]
- DialCLIP: Empowering CLIP As Multi-Modal Dialog RetrieverZhichao Yin, Binyuan Hui, Min Yang 0007, Fei Huang 0004, Yongbin Li. 12421-12425 [doi]
- Enhancing Quantised End-to-End ASR Models Via PersonalisationQiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng. 12426-12430 [doi]
- Comparative Study of Tokenization Algorithms for End-to-End Open Vocabulary Keyword DetectionKrishna Gurugubelli, Sahil Mohamed, Rajesh Krishna K. S. 12431-12435 [doi]
- Theme-Enhanced Hard Negative Sample Mining for Open-Domain Question AnsweringFulu Li, Zhiwen Xie, Guangyou Zhou. 12436-12440 [doi]
- Generating Persona-Aware Empathetic Responses with Retrieval-Augmented Prompt LearningZhengjie Huang, Pingsheng Liu, Gerard de Melo, Liang He, Linlin Wang. 12441-12445 [doi]
- Does Audio Deepfake Detection Rely on Artifacts?Tsu-Hsien Shih, Chin-Yuan Yeh, Ming-Syan Chen. 12446-12450 [doi]
- A Weighted-Variance Variational Autoencoder Model for Speech EnhancementAli Golmakani, Mostafa Sadeghi, Xavier Alameda-Pineda, Romain Serizel. 12451-12455 [doi]
- ConvNeXt-TTS And ConvNeXt-VC: ConvNeXt-Based Fast End-To-End Sequence-To-Sequence Text-To-Speech And Voice ConversionTakuma Okamoto, Yamato Ohtani, Tomoki Toda, Hisashi Kawai. 12456-12460 [doi]
- Dementia Assessment Using Mandarin Speech with an Attention-Based Speech Recognition EncoderZih-Jyun Lin, Yi-Ju Chen, Po-Chih Kuo, Likai Huang, Chaur-Jong Hu, Cheng-Yu Chen. 12461-12465 [doi]
- Posterior Sampling Algorithms for Unsupervised Speech Enhancement with Recurrent Variational AutoencoderMostafa Sadeghi, Romain Serizel. 12466-12470 [doi]
- Vector Quantization Knowledge Transfer for End-to-End Text Image Machine TranslationCong Ma, Yaping Zhang, Yang Zhao, Yu Zhou, Chengqing Zong. 12471-12475 [doi]
- SpeechDPR: End-To-End Spoken Passage Retrieval For Open-Domain Spoken Question AnsweringChyi-Jiunn Lin, Guan-Ting Lin, Yung-Sung Chuang, Wei-Lun Wu, Shang-wen Li 0001, Abdelrahman Mohamed, Hung-yi Lee, Lin-Shan Lee. 12476-12480 [doi]
- Unsupervised Speech Enhancement with Diffusion-Based Generative ModelsBerné Nortier, Mostafa Sadeghi, Romain Serizel. 12481-12485 [doi]
- Anonymizing Speaker Voices: Easy to Imitate, Difficult to Recognize?Jennifer Williams 0001, Karla Pizzi, Natalia Tomashenko, Sneha Das. 12491-12495 [doi]
- Concealing Medical Condition by Node Toggling in ASR for Dementia PatientsWei-Tung Hsu, Chin-Po Chen, Chi-Chun Lee. 12496-12500 [doi]
- Transfer the Linguistic Representations from TTS to Accent Conversion with Non-Parallel DataXi Chen, Jiakun Pei, Liumeng Xue, Mingyang Zhang. 12501-12505 [doi]
- Diffusion-Based Speech Enhancement with a Weighted Generative-Supervised Learning LossJean-Eudes Ayilo, Mostafa Sadeghi, Romain Serizel. 12506-12510 [doi]
- A Separation Priority Pipeline for Single-Channel Speech Separation in Noisy EnvironmentsShaoxiang Dang, Tetsuya Matsumoto, Yoshinori Takeuchi, Hiroaki Kudo. 12511-12515 [doi]
- Extending Whisper with Prompt Tuning to Target-Speaker ASRHao Ma, Zhiyuan Peng, Mingjie Shao, Jing Li, Ju Liu. 12516-12520 [doi]
- Domain-Slot Aware Contrastive Learning for Improved Dialogue State TrackingHaoxiang Su, Sijie Feng, Hongyan Xie, Di Wu, Hao Huang, Zhongjiang He, Shuangyong Song, Ruiyu Fang, Xiaomeng Huang, Wushour Silamu. 12521-12525 [doi]
- Do Learned Speech Symbols Follow Zipf's Law?Shinnosuke Takamichi, Hiroki Maeda, Joonyong Park, Daisuke Saito, Hiroshi Saruwatari. 12526-12530 [doi]
- Spoofing Attack Augmentation: Can Differently-Trained Attack Models Improve Generalisation?Wanying Ge, Xin Wang 0037, Junichi Yamagishi, Massimiliano Todisco, Nicholas W. D. Evans. 12531-12535 [doi]
- Automatic Design of Adapter Architectures for Enhanced Parameter-Efficient Fine-TuningSiya Xu, Xinyan Wen. 12536-12540 [doi]
- Fine-Grained Discrepancy Contrastive Learning for Robust Fake News DetectionJunwei Yin, Min Gao 0001, Kai Shu, Jia Wang, Yinqiu Huang, Wei Zhou 0028. 12541-12545 [doi]
- Leveraging Biases in Large Language Models: "bias-kNN" for Effective Few-Shot LearningYong Zhang, Hanzhang Li, Zhitao Li, Ning Cheng 0001, Ming Li, Jing Xiao 0006, Jianzong Wang. 12546-12550 [doi]
- VoxMM: Rich Transcription of Conversations in the WildDoyeop Kwak, Jaemin Jung, Kihyun Nam, Youngjoon Jang, Jee-weon Jung, Shinji Watanabe 0001, Joon Son Chung. 12551-12555 [doi]
- Are Deep Neural Networks Robust to Named Entities? An Adversarial Attack and Defense PerspectiveHongtao Wang 0002, Ang Li. 12556-12560 [doi]
- Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional DiscriminatorTakuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka. 12561-12565 [doi]
- VoiceLDM: Text-to-Speech with Environmental ContextYeonghyeon Lee, Inmo Yeon, Juhan Nam, Joon Son Chung. 12566-12571 [doi]
- FusDom: Combining in-Domain and Out-of-Domain Knowledge for Continuous Self-Supervised LearningAshish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha. 12572-12576 [doi]
- Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice ConversionBinzhu Sha, Xu Li 0015, Zhiyong Wu 0001, Ying Shan, Helen Meng. 12577-12581 [doi]
- Decoupling and Refilling: A Simple Data Augmentation Method for Aspect Term ExtractionJiaxiang Chen, Yu Hong, Chaoqun Liu, Qingting Xu, Guodong Zhou. 12582-12586 [doi]
- A Two-Stage Framework in Cross-Spectrum Domain for Real-Time Speech EnhancementYuewei Zhang, Huanbin Zou, Jie Zhu. 12587-12591 [doi]
- Multitask Speech Recognition and Speaker Change Detection for Unknown Number of SpeakersShashi Kumar, Srikanth R. Madikeri, Iuliia Nigmatulina, Esaú Villatoro-Tello, Petr Motlícek, Karthik Pandia, S. Pavankumar Dubagunta, Aravind Ganapathiraju. 12592-12596 [doi]
- Self-Supervised Speaker Verification Employing A Novel Clustering AlgorithmAbderrahim Fathan, Jahangir Alam 0001. 12597-12601 [doi]
- Towards Interpretability of Automatic Phoneme Analysis in Cleft Lip and Palate SpeechIlja Baumann, Dominik Wagner, Maria Schuster, Elmar Nöth, Tobias Bocklet. 12602-12606 [doi]
- Leveraging Data Collection and Unsupervised Learning for Code-Switched Tunisian Arabic Automatic Speech RecognitionAhmed Amine Ben Abdallah, Ata Kabboudi, Amir Kanoun, Salah Zaiem. 12607-12611 [doi]
- Generation-Based Target Speech Extraction with Speech Discretization and VocoderLinfeng Yu, Wangyou Zhang, Chenpeng Du, Leying Zhang, Zheng Liang, Yanmin Qian. 12612-12616 [doi]
- Probability-Aware Word-Confusion-Network-To-Text Alignment Approach for Intent ClassificationEsaú Villatoro-Tello, Srikanth R. Madikeri, Bidisha Sharma, Driss Khalil, Shashi Kumar, Iuliia Nigmatulina, Petr Motlícek, Aravind Ganapathiraju. 12617-12621 [doi]
- MIDI-Voice: Expressive Zero-Shot Singing Voice Synthesis via MIDI-Driven PriorsDong-Min Byun, Sang-Hoon Lee, Ji-Sang Hwang, Seong-Whan Lee. 12622-12626 [doi]
- On the Relation Between Internal Language Model and Sequence Discriminative Training for Neural TransducersZijian Yang, Wei Zhou 0043, Ralf Schlüter, Hermann Ney. 12627-12631 [doi]
- Seeing Through The Conversation: Audio-Visual Speech Separation Based on Diffusion ModelSuyeon Lee, Chaeyoung Jung, Youngjoon Jang, Jaehun Kim, Joon Son Chung. 12632-12636 [doi]
- Connecting Speech Encoder and Large Language Model for ASRWenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan 0019, Wei Li 0119, Lu Lu 0015, Zejun Ma, Chao Zhang 0031. 12637-12641 [doi]
- Iphonmatchnet: Zero-Shot User-Defined Keyword Spotting Using Implicit Acoustic Echo CancellationYong-Hyeok Lee, Namhyun Cho. 12642-12646 [doi]
- Relational Graph-Bridged Image-Text Interaction: A Novel Method for Multi-Modal Relation ExtractionZihao Zheng, Tao He, Ming Liu 0004, Zhongyuan Wang 0006, Ruiji Fu, Bing Qin 0001. 12647-12651 [doi]
- Contextual Biasing Methods for Improving Rare Word Detection in Automatic Speech RecognitionMrinmoy Bhattacharjee, Iuliia Nigmatulina, Amrutha Prasad, Pradeep Rangappa, Srikanth R. Madikeri, Petr Motlícek, Hartmut Helmke, Matthias Kleinert. 12652-12656 [doi]
- Enhancing Document-Level Event Extraction via Structure-Aware Heterogeneous Graph with Multi-Granularity SubsentencesYuhan Liu, Neng Gao, Yifei Zhang, Zhe Kong. 12657-12661 [doi]
- Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic PromptsShun Lei, Yixuan Zhou 0002, Liyang Chen, Dan Luo, Zhiyong Wu 0001, Xixin Wu, Shiyin Kang, Tao Jiang, Yahui Zhou, Yuxing Han 0001, Helen Meng. 12662-12666 [doi]
- Energy-Based Models for Speech SynthesisWanli Sun, Zehai Tu, Anton Ragni. 12667-12671 [doi]
- PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language DescriptionsReo Shimizu, Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana. 12672-12676 [doi]
- LLET: Lightweight Lexicon-Enhanced Transformer for Chinese NERZongcheng Ji, Yinlong Xiao. 12677-12681 [doi]
- MELS-TTS: Multi-Emotion Multi-Lingual Multi-Speaker Text-To-Speech System Via Disentangled Style TokensHeejin Choi, Jae-Sung Bae, Joun Yeop Lee, Seongkyu Mun, Jihwan Lee, Hoon-Young Cho, Chanwoo Kim 0001. 12682-12686 [doi]
- Effective Internal Language Model Training and Fusion for Factorized Transducer ModelJinxi Guo, Niko Moritz, Yingyi Ma, Frank Seide, Chunyang Wu, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer. 12687-12691 [doi]
- Modeling Intrapersonal and Interpersonal Influences for Automatic Estimation of Therapist Empathy in Counseling ConversationDehua Tao, Tan Lee 0001, Harold Chui, Sarah Luk. 12692-12696 [doi]
- PAVITS: Exploring Prosody-Aware VITS for End-to-End Emotional Voice ConversionTianhua Qi, Wenming Zheng, Cheng Lu, Yuan Zong, Hailun Lian. 12697-12701 [doi]
- Audio Deepfake Detection With Self-Supervised WavLM And Multi-Fusion Attentive ClassifierYinlin Guo, Haofan Huang, Xi Chen 0025, He Zhao, Yuehai Wang. 12702-12706 [doi]
- AdaMER-CTC: Connectionist Temporal Classification with Adaptive Maximum Entropy Regularization for Automatic Speech RecognitionSooHwan Eom, Eunseop Yoon, Hee Suk Yoon, Chanwoo Kim 0001, Mark Hasegawa-Johnson, Chang D. Yoo. 12707-12711 [doi]
- SADE: A Speaker-Aware Dual Encoding Model Based on Diagbert for Medical Triage and Pre-DiagnosisHaozhou Li, Xinyuan Wang, Hongkai Du, Wentong Sun, Qinke Peng. 12712-12716 [doi]
- Dust: Dual-Grained Syntax-Aware Transformer Network for Chinese Named Entity RecognitionYinlong Xiao, Zongcheng Ji, JianQiang Li. 12717-12721 [doi]
- TranSentence: speech-to-speech Translation via Language-Agnostic Sentence-Level Speech Encoding without Language-Parallel DataSeung-bin Kim, Sang-Hoon Lee, Seong-Whan Lee. 12722-12726 [doi]
- Memory-Augmented speech-to-text Translation with Multi-Scale Context Translation StrategyYuxuan Yuan, Yue Zhou, Xiaodong Shi. 12727-12731 [doi]
- Efficient Black-Box Speaker Verification Model Adaptation With Reprogramming And Backend LearningJingyu Li, Tan Lee 0001. 12732-12736 [doi]
- Fewer-Token Neural Speech Codec with Time-Invariant CodesYong Ren, Tao Wang 0074, Jiangyan Yi, Le Xu, Jianhua Tao 0001, Chu-Yuan Zhang, Junzuo Zhou. 12737-12741 [doi]
- Tree of Uncertain Thoughts Reasoning for Large Language ModelsShentong Mo, Miao Xin. 12742-12746 [doi]
- Exploring Adapters with Conformers for Children's Automatic Speech RecognitionThomas Rolland, Alberto Abad. 12747-12751 [doi]
- L1-Aware Multilingual Mispronunciation Detection FrameworkYassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali 0002. 12752-12756 [doi]
- Improved Children's Automatic Speech Recognition Combining Adapters and Synthetic Data AugmentationThomas Rolland, Alberto Abad. 12757-12761 [doi]
- Evaluation of an Improved Ultrasonic Imaging Helmet for Observing Articulatory DataYuxuan Li, Jianguo Wei, Qiang Fang, Xugang Lu. 12762-12766 [doi]
- Spectral Analysis of Vowels and Fricatives at Varied Levels of Dysarthria Severity for Amyotrophic Lateral SclerosisChowdam Venkata Thirumala Kumar, Tanuka Bhattacharjee, Seena Vengalil, Saraswati Nashi, Madassu Keerthipriya, Yamini Belur, Atchayaram Nalini, Prasanta Kumar Ghosh. 12767-12771 [doi]
- Enhancing Argumentative Relation Classification by Multi-Granularity Retrieval and Heterogeneous Graph ReasoningCaihua Yang, Jianzhu Bao, Bin Liang, Ruifeng Xu. 12772-12776 [doi]
- Context-Aware Dual Attention Network for Multimodal Sarcasm DetectionLiangyi Kang, Jie Liu, Dan Ye, Zhiyang Zhou. 12777-12781 [doi]
- PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic ModelYukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda. 12782-12786 [doi]
- Self-Supervised Adaptive AV Fusion Module for Pre-Trained ASR ModelsChristopher Simic, Tobias Bocklet. 12787-12791 [doi]
- A Study of Mispronunciation Detection and Diagnosis Based on Meta-LearningYukai Wan, Yuqi Shi, Binghuai Lin, Yanlu Xie. 12792-12796 [doi]
- CSNet: Contrastive Siamese Network for Robust SLUHao Yang, Min Zhang, Daimeng Wei, Jiaxin Guo. 12797-12801 [doi]
- Monte Carlo Self-Training for Speech RecognitionAnshuman Tripathi, Soheil Khorram, Han Lu, Jaeyoung Kim, Qian Zhang, Hasim Sak. 12802-12806 [doi]
- Widrow-Hoff LMS Adaline Demonstrator for Schools and CollegesStephen R. Alty, Clive Cheong Took. 12807-12810 [doi]
- GuessKT: Improving Knowledge Tracing via Considering Guess BehaviorsShuaishuai Zu, Songtao Cai, Weitao Tang, Chuyu Wang, Li Li, Jun Shen. 12811-12815 [doi]
- Fearless Steps Apollo: Team Communications Based Community Resource Development for Science, Technology, Education, and Historical PreservationJohn H. L. Hansen, Aditya Joglekar, Meena M. Chandra Shekar, Szu-Jui Chen, Xi Liu. 12816-12820 [doi]
- 3-D Near-Field Localization by Jointly Exploiting Spatial and Temporal Information Based on a Nonuniform Cross ArrayZelong Yi, Hua Chen, Zhiwei Jiang, Wei Liu, Qing Wang, Gang Wang. 12821-12825 [doi]
- Rapid Hybrid Modular Receive Beamforming Via Learned OptimizationOhad Levy, Nir Shlezinger. 12826-12830 [doi]
- IRS-Assisted Covert Communication with a BPP Distributed Warden outside a Safety ZoneZhilin Chen, Shihao Yan, Xiaobo Zhou 0004, Feng Shu 0002, Jiande Sun 0001, Derrick Wing Kwan Ng. 12831-12835 [doi]
- Model-Based Learning for Location-to-Channel MappingBaptiste Chatelier, Luc Le Magoarou, Vincent Corlay, Matthieu Crussière. 12836-12840 [doi]
- Diffusion-Based Adversarial Purification for Robust Deep MRI ReconstructionIsmail Alkhouri, Shijun Liang, Rongrong Wang, Qing Qu 0001, Saiprasad Ravishankar. 12841-12845 [doi]
- A Keyless Extraction Framework Targeting at Deep Learning Based Image-Within-Image ModelsRongxuan Peng, Xianbo Mo, Shunquan Tan, Bin Li 0011, Jiwu Huang. 12846-12850 [doi]
- Joint DOA Estimation and Distorted Sensor Detection Under Entangled Low-Rank and Row-Sparse ConstraintsHuiping Huang, Tianjian Zhang, Feng Yin, Bin Liao, Henk Wymeersch. 12851-12855 [doi]
- Towards ASR Robust Spoken Language Understanding Through in-Context Learning with Word Confusion NetworksKevin Everson, Yile Gu, Chao-Han Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-yi Lee, Ariya Rastrow, Andreas Stolcke. 12856-12860 [doi]
- Localization in Sensor Networks Using Distributed Low-Rank Matrix CompletionYufan Fan, Marius Pesavento. 12861-12865 [doi]
- Decentralized Generalized Approximate Message-Passing for Tree-Structured NetworksKeigo Takeuchi. 12866-12870 [doi]
- Multicast with Multiple Wardens in IRS-Aided Covert DFRC SystemIndrasish Ghosh, Arpan Chattopadhyay, Kumar Vijay Mishra, Athina P. Petropulu. 12871-12875 [doi]
- Tensor Reconstruction-Based Sparse Array 2-D DOA Estimation of Mixed Coherent and Uncorrelated SignalsSaidur R. Pavel, Yimin D. Zhang, Shunqiao Sun, André L. F. de Almeida. 12876-12880 [doi]
- Towards Efficient Modeling and Inference in Multi-Dimensional Gaussian Process State-Space ModelsZhidi Lin, Juan Maroñas, Ying Li, Feng Yin, Sergios Theodoridis. 12881-12885 [doi]
- Integrating Sensing, Communication, and Computation in the SkyYao Tang, Guangxu Zhu, Wei Xu 0001, Man Hon Cheung, Tat-Ming Lok, Shuguang Cui. 12886-12890 [doi]
- Sparse, Weight-Constrained Arrays With O(N) Aperture for Reduced Mutual CouplingPranav Kulkarni, P. P. Vaidyanathan. 12891-12895 [doi]
- Generative AI-aided Joint Training-free Secure Semantic Communications via Multi-modal PromptsHongyang Du, Guangyuan Liu, Dusit Niyato, Jiayi Zhang 0001, Jiawen Kang 0001, Zehui Xiong, Bo Ai 0001, Dong In Kim. 12896-12900 [doi]
- Simultaneous Positioning and Tracking Using Dynamic Factor Graphs and Geometric Average FusionHallysson Oliveira, Stiven S. Dias, Marcelo G. S. Bruno. 12901-12905 [doi]
- A Distributed Joint Integrated Probabilistic Data Association (JIPDA) Filter with Soft Object AssociationThomas Kropfreiter, Florian Meyer, Franz Hlawatsch. 12906-12910 [doi]
- Multi-Agent 3D Seismic Exploration Using Adapt-then-Combine Full Waveform Inversion in a hardware-in-the-loop SystemBan-Sok Shin, Dhruv Patel, Luis Wientgens, Dmitriy Shutin, Armin Dekorsy. 12911-12915 [doi]
- Learning Multiplex Graph With Inter-Layer CouplingChenyue Zhang, Hoi-To Wai. 12916-12920 [doi]
- A Statistical Characterization Of Communication Performance In RIS-Aided NetworksFrancesco Guidi, Anna Guerra, Emanuele Mengoli, Alberto Zanella. 12921-12925 [doi]
- Integrated Sensing And Communication In Unlicensed mmWave Bands: Joint Beamforming Training And Energy AllocationQimei Chen, Yipeng Liang, Hao Jiang 0010. 12926-12930 [doi]
- Exploiting A Quantum Multiple Kernel Learning Approach For Low-Resource Spoken Command RecognitionXianyan Fu, Xiao-Lei Zhang 0001, Chao-Han Huck Yang, Jun Qi 0002. 12931-12935 [doi]
- Decentralized Low Rank Matrix Recovery from Column-Wise Projections by Alternating GD and MinimizationShana Moothedath, Namrata Vaswani. 12936-12940 [doi]
- PaCaS-WAA: Patch-Based Contrastive Semi-Supervised Learning with Wavelet Guidance and Adaptive Augmentation for Tumour SegmentationWanqing Xiong, Zailiang Chen 0001, Qing Liu 0003, Wenjia Wu, Jian Zhang, Hailan Shen. 12941-12945 [doi]
- Mainlobe Deceptive Jammer Suppression Using FDA-MIMO Radar in the Presence of Multipath PropagationYitao Zhang, Lan Lan 0001, Guisheng Liao, Shengqi Zhu 0001, Jingwei Xu 0002, Hing-Cheung So. 12946-12950 [doi]
- Diffusion-Based Speech Enhancement with Joint Generative and Predictive DecodersHao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya 0001, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji. 12951-12955 [doi]
- A Green Learning Approach to Spoofed Speech DetectionChengwei Wei, Runqi Pang, C. C. Jay Kuo. 12956-12960 [doi]
- Sensing with Random SignalsShihang Lu, Fan Liu 0005, Fuwang Dong, Yifeng Xiong, Jie Xu 0002, Ya-Feng Liu. 12961-12965 [doi]
- Optimal BER Minimum Precoder Design for OTFS-Based ISAC SystemsJun Wu, Weijie Yuan, Zhiqiang Wei 0001, Jinjin Yan, Derrick Wing Kwan Ng. 12966-12970 [doi]
- Nonasymptotic Performance Limits of Low-Latency Secure Integrated Sensing and Communication SystemsOnur Günlü, Matthieu R. Bloch, Rafael F. Schaefer, Aylin Yener. 12971-12975 [doi]
- MADRL-Based UAVs Trajectory Design with Anti-Collision Mechanism in Vehicular NetworksLeonardo Spampinato, Enrico Testi, Chiara Buratti, Riccardo Marini. 12976-12980 [doi]
- Cross-Subject EEG Emotion Recognition Based on Interconnected Dynamic Domain AdaptationYanling An, Shaohai Hu, Shuaiqi Liu 0001, Zeyao Wang, Xinrui Wang, Xiaole Ma. 12981-12985 [doi]
- Topological Neural Networks over the AirSimone Fiorellino, Claudio Battiloro, Paolo Di Lorenzo. 12986-12990 [doi]
- Diagnosis of Autism Spectrum Disorder Based on Contrastive Functional Connectivity Graph Learning NetworkShuaiqi Liu, Siqi Wang, Beibei Liang, Bing Li, Jianpeng Xu. 12991-12995 [doi]
- Partially Observable Model-Based Learning for ISAC Resource AllocationPetteri Pulkkinen, Visa Koivunen. 12996-13000 [doi]
- A Hybrid CNN-Transformer for Focal Liver Lesion ClassificationLing Zhao, Shuaiqi Liu, Bing Li, Wenjia Cai, Ping Liang, Jie Yu, Jie Zhao. 13001-13005 [doi]
- Likelihood Consensus 2.0: Reducing Interagent Communication in Distributed Bayesian Target TrackingErik Sausa, Pavel Rajmic, Franz Hlawatsch. 13006-13010 [doi]
- Analysis of the SINR in LEO-PNT Systems with 5G PRS Multiplexing: Integration of PRS and NTNAlejandro Gonzalez-Garrido, Jorge Querol, Symeon Chatzinotas. 13016-13020 [doi]
- Enhancing Semantic Communication with Deep Generative Models: An OverviewEleonora Grassucci, Yuki Mitsufuji, Ping Zhang, Danilo Comminiello. 13021-13025 [doi]
- Energy-Efficient Decentralized Learning Via Graph SparsificationXusheng Zhang, Cho-Chun Chiu, Ting He 0001. 13026-13030 [doi]
- MIMO imaging method with iterative-based super-resolution for automotive radarBong Seok Kim, Jonghun Lee, Youngseok Jin, Sangdong Kim, Ram M. Narayanan. 13031-13035 [doi]
- Noise2one: One-Shot Image Denoising with Local Implicit LearningKwanyoung Kim, Jong Chul Ye. 13036-13040 [doi]
- Low Overhead DMG Sensing for Vital Signs DetectionSteve Blandino, Jihoon Bang, Jian Wang, Samuel Berweger, Jack Chuang, Jelena Senic, Tanguy Ropitault, Camillo Gentile, Nada Golmie. 13041-13045 [doi]
- Training Ultra-Low-Latency Spiking Neural Networks from ScratchGourav Datta, Zeyu Liu 0003, Peter A. Beerel. 13046-13050 [doi]
- Parameter-Efficient Adaptation for Computational ImagingNebiyou Yismaw, Ulugbek S. Kamilov, M. Salman Asif. 13051-13055 [doi]
- Space-Time Adaptive Processing for Radars in Connected and Automated Vehicular PlatoonsZahra Esmaeilbeig, Kumar Vijay Mishra, Mojtaba Soltanalian. 13056-13060 [doi]
- Repurposing MU-MIMO Downlink For Joint Wireless Communications And Imaging Via Virtual UsersKris Li, David Ramirez, Kumar Vijay Mishra, Ashutosh Sabharwal. 13061-13065 [doi]
- Train Long and Test Long: Leveraging Full Document Contexts in Speech ProcessingWilliam Chen, Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe 0001. 13066-13070 [doi]
- DIFFSC: Semantic Communication Framework With Enhanced Denoising Through Diffusion Probabilistic ModelsZeyu Jiang, Xiaohong Liu, Guoxing Yang, Weizhi Li, Aini Li, Guangyu Wang. 13071-13075 [doi]
- CSI-Free Over-The-Air Decentralized Learning Over Frequency Selective ChannelsNicolò Michelusi. 13076-13080 [doi]
- IHT-Inspired Neural Network for Single-Snapshot DOA Estimation with Sparse Linear ArraysYunqiao Hu, Shunqiao Sun. 13081-13085 [doi]
- ISAC Beamforming Optimization For Robust Transmission In Dynamic mmWave MIMO NetworksLei Li 0030, Tenghao Cai, Tsung-Hui Chang. 13086-13090 [doi]
- Open-Set Deepfake Detection To Fight The UnknownMichael Macedo Diniz, Anderson Rocha 0001. 13091-13095 [doi]
- Multi-Person Respiration Rate Estimation With Single Pair Of Transmit And Receive AntennaHao-Hsuan Chang, Vishnu V. Ratnam, Hao Chen 0010, Junsu Choi, Charlie Jianzhong Zhang. 13096-13100 [doi]
- Enhanced Channel Estimation in mm-Wave MIMO Systems Leveraging Integrated Communication and SensingSilvia Mura, Marouan Mizmizi, Umberto Spagnolini, Athina P. Petropulu. 13101-13105 [doi]
- A Joint Look on Lunar Satellite and Cooperative Surface PNTRobert Pöhlmann, Emanuel Staudinger, Siwei Zhang, Armin Dammann. 13106-13110 [doi]
- A Near-Field Source Localization Method for Uniform/Sparse Centrally Symmetric Rectangular ArraysXiaohuan Wu, Jiang Wang, Yazhou Liu, Jianing Li. 13111-13115 [doi]
- Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-Based ASRXugang Lu, Peng Shen, Yu Tsao 0001, Hisashi Kawai. 13116-13120 [doi]
- Uncertainty Quantification in Deep Learning Based Kalman FiltersYehonatan Dahan, Guy Revach, Jindrich Duník, Nir Shlezinger. 13121-13125 [doi]
- Benchmarking Adversarial Robustness of Image Shadow Removal with Shadow-Adaptive AttacksChong Wang, Yi Yu, Lanqing Guo, Bihan Wen. 13126-13130 [doi]
- A Robust Audio Deepfake Detection System via Multi-View FeatureYujie Yang, Haochen Qin, Hang Zhou, Chengcheng Wang, Tianyu Guo 0001, Kai Han 0002, Yunhe Wang 0001. 13131-13135 [doi]
- Diffusion Models for Audio Semantic CommunicationEleonora Grassucci, Christian Marinoni, Andrea Rodriguez, Danilo Comminiello. 13136-13140 [doi]
- Graphical Inference in Non-Markovian Linear-Gaussian State-Space ModelsEmilie Chouzenoux, Víctor Elvira. 13141-13145 [doi]
- Tensor Decomposition-Based Data Fusion for Biomarker Extraction from Multiple EEG ExperimentsK. R. Stunnenberg, Richard C. Hendriks, J. L. Vroegop, M. L. Adank, Borbála Hunyadi. 13146-13150 [doi]
- Inference of Genetic Effects via Approximate Message PassingAl Depope, Marco Mondelli, Matthew R. Robinson. 13151-13155 [doi]
- Inferring the Graph of Networked Dynamical Systems under Partial Observability and Spatially Colored NoiseAugusto Santos, Diogo Rente, Rui Seabra, José M. F. Moura. 13156-13160 [doi]
- Sparse Bayesian Synthetic Aperture Processing Based DOA Estimation with Deformed Towed ArraysJie Yang, Yixin Yang, Bin Liao. 13161-13165 [doi]
- Beamforming Design and Performance Evaluation for RIS-Aided Localization Using LEO Satellite SignalsLei Wang, Pinjun Zheng, Xing Liu 0012, Tarig Ballal, Tareq Y. Al-Naffouri. 13166-13170 [doi]
- Debris Sensing Based on LEO Constellation: An Intersatellite Channel Parameter Estimation ApproachYuan Liu, M. R. Bhavani Shankar, Linlong Wu, Björn E. Ottersten. 13171-13175 [doi]
- M3DSYNTH: A Dataset of Medical 3D Images with AI-Generated Local ManipulationsGiada Zingarini, Davide Cozzolino, Riccardo Corvi, Giovanni Poggi, Luisa Verdoliva. 13176-13180 [doi]
- Self-Supervised Path Planning in UAV-Aided Wireless Networks Based on Active InferenceAli Krayani, Khalid Khan, Lucio Marcenaro, Mario Marchese, Carlo S. Regazzoni. 13181-13185 [doi]
- Efficient Quantum Recurrent Reinforcement Learning Via Quantum Reservoir ComputingSamuel Yen-Chi Chen. 13186-13190 [doi]
- Sampling and Recovery of Signals Over Product Cell StructuresThummaluru Siddartha Reddy, Sundeep Prabhakar Chepuri. 13191-13195 [doi]
- Tracking of Multiple Spawning Targets with Heterogeneous Sensors for Seabed-To-Space Situational AwarenessDomenico Gaglione, Leonardo Maria Millefiori, Paolo Braca, Peter Willett 0001, Moe Z. Win. 13196-13200 [doi]
- Radio SLAM with Hybrid Sensing for Mixed Reflection Type EnvironmentsJaebok Lee, Hyunwoo Park, Hyeonjin Chung, Sunwoo Kim 0001. 13201-13205 [doi]
- Human Perception-Guided Meta-Training for Few-Shot NeRFBingyin Li, Xiaoyu Xu, Sheyang Tang, Li Yu 0003, Zhou Wang 0001. 13206-13210 [doi]
- On Improved Distributed Random Reshuffling over NetworksPranay Sharma, Jiarui Li, Gauri Joshi. 13211-13215 [doi]
- Kalman Filter for Tracking Network DynamicLital Dabush, Tirza Routtenberg. 13216-13220 [doi]
- Low-Rank Completion Based Normal Guided Lidar Point Cloud Up-SamplingPei-an, Di Zhu, You Yang, Jie Ma. 13221-13225 [doi]
- Deep Unfolded Annealed Stein Particle Filter for Vehicle TrackingMarco Piavanini, Luca Barbieri, Mattia Brambilla, Monica Nicoli. 13226-13230 [doi]
- An Interpretable and Generalizable Speech Detector Based on a CNN-LSTM FrameworkZijun Wan, Yunying Wu, Mohamed Baha Ben Ticha, Gaël Le Godais, Philippe Kahane, Stéphan Chabardès, Weidong Chen 0002, Shaomin Zhang, Blaise Yvert. 13231-13235 [doi]
- A Gibbs Sampler for Bayesian Nonparametric State-Space ModelsChristos Merkatas, Simo Särkkä. 13236-13240 [doi]
- Communication-Efficient Federated Optimization over Semi-Decentralized NetworksHe Wang, Yuejie Chi. 13241-13245 [doi]
- Updated Corpora and Benchmarks for Long-Form Speech RecognitionJennifer Drexler Fox, Desh Raj, Natalie Delworth, Quinn McNamara, Corey Miller, Migüel Jetté. 13246-13250 [doi]
- A Bayesian Approach to High-Order Link PredictionGeorgios Vasileios Karanikolas, Alba Pagès-Zamora, Georgios B. Giannakis. 13251-13255 [doi]
- Recent Advances in Scalable Energy-Efficient and Trustworthy Spiking Neural Networks: from Algorithms to TechnologySouvik Kundu 0002, Rui-Jie Zhu, Akhilesh Jaiswal 0001, Peter A. Beerel. 13256-13260 [doi]
- Object Trajectory Estimation with Multi-Band Wi-Fi Neural Dynamic FusionSorachi Kato, Pu Wang, Toshiaki Koike-Akino, Takuya Fujihashi, Hassan Mansour, Petros Boufounos. 13261-13265 [doi]
- Radar Perception with Scalable Connective Temporal Relations for Autonomous DrivingRyoma Yataka, Pu Wang, Petros Boufounos, Ryuhei Takahashi. 13266-13270 [doi]
- Reinforcement Learning-Guided Optogenetic Stimulation Policies for Robust Functional Network DiscoveryShoutik Mukherjee, Peter Jendrichovsky, Patrick O. Kanold, Behtash Babadi. 13271-13275 [doi]
- Inference of Time-Varying Graph Topologies via Gaussian ProcessesChen Cui, Petar M. Djuric. 13276-13280 [doi]
- Vector Approximate Message Passing for Not So Large N.I.I.D. Generalized I/O Linear ModelsZilu Zhao, Fangqing Xiao, Dirk Slock. 13281-13285 [doi]
- Near-Field Localization with 1-bit Quantized Hybrid A/D ReceptionIoannis Gavras, Italo Atzeni, George C. Alexandropoulos. 13286-13290 [doi]
- Joint Channel Estimation and Data Detection in Massive MIMO Systems Based on Diffusion ModelsNicolas Zilberstein, Ananthram Swami, Santiago Segarra. 13291-13295 [doi]
- Wi-Fi based Indoor Monitoring Enhanced by Multimodal FusionChiori Hori, Pu Wang, Mahbub Rahman, Cristian J. Vaca-Rubio, Sameer Khurana, Anoop Cherian, Jonathan Le Roux. 13296-13300 [doi]
- Joint Near-Field Target Tracking and Communications with full Duplex Holographic MIMOIoannis Gavras, George C. Alexandropoulos. 13301-13305 [doi]
- Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive StudyW. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang 0033, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath. 13306-13310 [doi]
- Are SNNs Truly Energy-efficient? - A Hardware PerspectiveAbhiroop Bhattacharjee, Ruokai Yin, Abhishek Moitra, Priyadarshini Panda. 13311-13315 [doi]
- FedLion: Faster Adaptive Federated Optimization with Fewer CommunicationZhiwei Tang, Tsung-Hui Chang. 13316-13320 [doi]
- VoxtLM: Unified Decoder-Only Models for Consolidating Speech Recognition, Synthesis and Speech, Text Continuation TasksSoumi Maiti, Yifan Peng, Shukjae Choi, Jee-weon Jung, Xuankai Chang, Shinji Watanabe 0001. 13326-13330 [doi]
- Coupled Block-Term Tensor Decomposition for Near-Field Localization in multi-static MIMO Radar SystemsLiana Khamidullina, Martin Haardt. 13331-13335 [doi]
- On Generalized Signature GraphsGerald Matz. 13336-13340 [doi]
- Dialog Modeling in Audiobook SynthesisCheng-chieh Yeh, Amirreza Shirani, Weicheng Zhang, Tuomo Raitio, Ramya Rasipuram, Ladan Golipour, David Winarsky. 13341-13345 [doi]
- Analysis of High-Order Brain Networks Resolved in Time and Frequency Using CP DecompositionAshkan Faghiri, Armin Iraji, Tülay Adali, Vince D. Calhoun. 13346-13350 [doi]
- Prompting Large Language Models with Speech Recognition AbilitiesYassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Ke Li, Jinxi Guo, Wenhan Xiong, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer. 13351-13355 [doi]
- 2M) for Neuromorphic Vision SensorsMd. Abdullah-Al Kaiser, Akhilesh R. Jaiswal. 13356-13360 [doi]
- Generalized Hole-Filling Strategy for Overlapping Hole-Existing Coprime Arrays for DOA EstimationXiang Li 0034, Feng-Gang Yan, Ming Jin 0004, Maria Sabrina Greco, Fulvio Gini. 13361-13365 [doi]
- Investigating End-to-End ASR Architectures for Long Form Audio TranscriptionNithin Rao Koluguri, Samuel Kriman, Georgy Zelenfroind, Somshubra Majumdar, Dima Rekesh, Vahid Noroozi, Jagadeesh Balam, Boris Ginsburg. 13366-13370 [doi]
- CORAAL QA: A Dataset and Framework for Open Domain Spontaneous Speech Question Answering from Long Audio FilesNatarajan Balaji Shankar, Alexander Johnson, Christina Chance, Hariram Veeramani, Abeer Alwan. 13371-13375 [doi]
- QUAPPROX: A Framework for Benchmarking the Approximability of Variational Quantum CircuitJinyang Li 0001, Ang Li 0006, Weiwen Jiang. 13376-13380 [doi]
- Monostatic DMG Passive Sensing with Hypothesis TestingPu Wang 0004, Petros Boufounos. 13381-13385 [doi]
- Data-Driven Convex Regularizers for Inverse ProblemsS. Mukherjee, Sören Dittmer, Zakhar Shumaylov, Sebastian Lunz, O. Öktem, Carola B. Schönlieb. 13386-13390 [doi]
- WIFIACT: Enhancing Human Sensing Through Environment Robust Preprocessing And Bayesian Self-Supervised LearningNiall Lyons, Avik Santra, Vikram Kumar Ramanna, Kiran Uln, Rakesh Taori, Ashutosh Pandey. 13391-13395 [doi]
- A Hybrid Slow-Time Coding Framework for Automotive MIMO RadarAboulnasr Hassanien, Elias Aboutanios. 13396-13400 [doi]
- Quantum Federated Learning with Quantum NetworksTyler Wang, Huan-Hsin Tseng, Shinjae Yoo. 13401-13405 [doi]
- Automotive Radar Interference Characterization: FMCW or PMCW?Khurram Usman Mazher, Andrew M. Graff, Nuria González Prelcic, Robert W. Heath Jr.. 13406-13410 [doi]
- High Accuracy Device Localization in Indoor mmWave Networks Exploiting Channel Sparsity and Virtual Anchor MappingJoan Palacios, Murat Bayraktar, Nuria González Prelcic, Hao Chen 0010. 13411-13415 [doi]
- Memory Efficient Corner Detection for Event-Driven Dynamic Vision SensorsPao-Sheng Vincent Sun, Arren Glover, Chiara Bartolozzi, Arindam Basu. 13416-13420 [doi]
- Can Whisper Perform Speech-Based In-Context Learning?Siyin Wang, Chao-Han Huck Yang, Ji Wu, Chao Zhang. 13421-13425 [doi]
- IRS-Assisted Joint Sensing and Communication Design for Autonomous DrivingWeitong Zhai, Xiangrong Wang 0001, Moeness G. Amin, Maria S. Greco, Fulvio Gini. 13426-13430 [doi]
- Bayesian Learning-Based Kalman Smoothing For Linear Dynamical Systems With Unknown Sparse InputsRupam Kalyan Chakraborty, Geethu Joseph, Chandra R. Murthy. 13431-13435 [doi]
- Contextual Human Object Interaction Understanding from Pre-Trained Large Language ModelJianjun Gao, Kim-Hui Yap, Kejun Wu, Duc Tri Phan, Kratika Garg, Boon Siew Han. 13436-13440 [doi]
- Uncertainty-Guided Physics-Driven Deep Learning Reconstruction via Cyclic Measurement ConsistencyChi Zhang 0057, Mehmet Akçakaya. 13441-13445 [doi]
- SC-MAD: Mixtures of Higher-Order Networks for Data AugmentationMadeline Navarro, Santiago Segarra. 13446-13450 [doi]
- Effect of Beampattern on Matrix Completion with Sparse ArraysRobin Rajamäki, Mehmet Can Hücümenoglu, Pulak Sarangi, Piya Pal. 13451-13455 [doi]
- Weakly-Supervised Crowd Counting with Token Attention and Fusion: A Simple and Effective BaselineYi Wang, Qiongyang Hu, Lap-Pui Chau. 13456-13460 [doi]
- Echocardiography Video Synthesis from End Diastolic Semantic Map Via Diffusion ModelNguyen Van Phi, Tran Minh Duc, Pham Huy Hieu, Tran Quoc Long. 13461-13465 [doi]
- Frequency Masking for Universal Deepfake DetectionChandler Timm C. Doloriel, Ngai-Man Cheung. 13466-13470 [doi]
- Estimation of Spectral Lines Using Expectation PropagationJiang Zhu 0004, Xupeng Lei, Mihai-Alin Badiu. 13471-13475 [doi]
- Hypergraph-MLP: Learning on Hypergraphs Without Message PassingBohan Tang, Siheng Chen, Xiaowen Dong 0001. 13476-13480 [doi]
- Efficient Video and Audio Processing with Loihi 2Sumit Bam Shrestha, Jonathan Timcheck, Edward Paxon Frady, Leobardo Campos-Macias, Mike Davies. 13481-13485 [doi]
- SG2SC: A Generative Semantic Communication Framework for Scene Understanding-Oriented Image TransmissionMinxi Yang, Dahua Gao, Feng Xie, Jiaxuan Li, Xiaodan Song, Guangming Shi. 13486-13490 [doi]
- Deep INCM Reconstruction for Adaptive BeamformingChengyuan He, Chengwei Zhou, Zhiguo Shi 0001, Jiming Chen 0001. 13491-13495 [doi]
- Fundamental Performance Bounds for Carrier Phase Positioning in LEO-PNT SystemsJeongwan Kang, Paulson Eberechukwu, Jeonghaeng Lee, Henk Wymeersch, Sunwoo Kim 0001. 13496-13500 [doi]
- Semantic-Preserving Image Coding Based on Conditional Diffusion ModelsFrancesco Pezone, Osman Musa, Giuseppe Caire, Sergio Barbarossa. 13501-13505 [doi]
- Language-Oriented Communication with Semantic Coding and Knowledge Distillation for Text-to-Image GenerationHyelin Nam, Jihong Park, Jinho Choi 0001, Mehdi Bennis, Seong-Lyun Kim. 13506-13510 [doi]
- Joint Signal Recovery and Graph Learning from Incomplete Time-SeriesAmirhossein Javaheri, Arash Amini, Farokh Marvasti, Daniel P. Palomar. 13511-13515 [doi]
- On Unique Localization of Uncorrelated Constant-Modulus Sources Using Sparse Linear ArraysWenlong Wang, Zai Yang, Xunmeng Wu. 13516-13520 [doi]
- SALM: Speech-Augmented Language Model with in-Context Learning for Speech Recognition and TranslationZhehuai Chen, He Huang, Andrei Andrusenko, Oleksii Hrinchuk, Krishna C. Puvvada, Jason Li, Subhankar Ghosh, Jagadeesh Balam, Boris Ginsburg. 13521-13525 [doi]
- Fast Dynamics of Brain-wide Patterns on Neuronal OscillationsLei Ding 0004, Han Yuan. 13526-13530 [doi]
- Healthy Aging is Marked by Entropy Reduction in Cortical Spontaneous ActivityDa Chang, Xi-Nian Zuo. 13531-13535 [doi]