- An End-to-End Channel-Adaptive Feature Compression Approach in Device-Edge Co-Inference Systems. Yuan Ouyang, Ping Wang, Lijun He, Fan Li. 1-6
- The Whu Wake Word Lipreading System for the 2024 Chat-Scenario Chinese Lipreading Challenge. Haoxu Wang, Cancan Li, Fei Su, Juan Liu, Hongbin Suo, Ming Li. 1-6
- Visual-Language Alignment for Background Subtraction. Jiahe Liu, Dandan Zhu, Sajid Javed. 1-7
- The NERCSLIP-USTC System for Semi-Supervised Acoustic Scene Classification of ICME 2024 Grand Challenge. Qing Wang, Guirui Zhong, Hengyi Hong, Lei Wang, Mingqi Cai, Xin Fang, Ya Jiang, Jun Du. 1-4
- MTYOLO: A Multi-Task Model to Concurrently Obtain the Vital Characteristics of Individuals or Animals. Kian Eng Ong, Sivaji Retta, Ramarajulu Srinivasan, Shawn Tan, Jun Liu. 1-4
- Optimizing Facial Landmark Estimation for Embedded Systems Through Iterative Autolabeling and Model Pruning. Yu-Hsi Chen, I-Hsuan Tai. 1-6
- A Multimodal Behavior Recognition Network with Interconnected Architectures. Nuoer Long, Kin-Seong Un, Chengpeng Xiong, Zhuolin Li, Shaobin Chen, Tao Tan, Chan-Tong Lam, Yue Sun. 1-6
- Unveiling Soil-Vegetation Interactions: Reflection Relationships and an Attention-Based Deep Learning Approach for Carbon Estimation. Dristi Datta, Manoranjan Paul, M. Manzur Murshed, Shyh Wei Teng, Leigh M. Schmidtke. 1-6
- AI-Assisted Content Creation of Naked-Eye 3D Effects on Curved LED Screen: Enhancing Artistic Expression and Creativity. Yeming Li, Junrong Song, David Kei-man Yip. 1-5
- Segmentation-Based Parametric Painting. Manuel Ladron de Guevara, Matt Fisher, Aaron Hertzmann. 1-6
- VidBot: Intelligent Video Learning Tool for Content Mining and Playback Traffic Statistics. Qinhua Xie, Weicong Liu, Fan Yuan, Jifan Shi, Ziyu Liu, Yanbing Zhang. 1-3
- 3DMIT: 3D Multi-Modal Instruction Tuning for Scene Understanding. Zeju Li, Chao Zhang, Xiaoyan Wang, Ruilong Ren, Yifan Xu, Ruifei Ma, Xiangde Liu, Rong Wei. 1-5
- "AI Life" and Human Fear: from Phenomenological Insights to Digital Creation. Jiaying Fu, Tianyue Gong, Jialin Gu, Tiange Zhou. 1-6
- Real-Time Interaction with Animated Human Figures in Chinese Ancient Paintings. Yifan Wei, Wenkang Shan, Qi Zhang, Liuxin Zhang, Jian Zhang, Siwei Ma. 1-6
- Language-Guided Zero-Shot Object Counting. Mingjie Wang, Song Yuan, Zhuohang Li, Longlong Zhu, Eric Buys, Minglun Gong. 1-6
- Leveraging Multimodal Knowledge for Spatio-Temporal Action Localization. Keke Chen, Zhewei Tu, Xiangbo Shu. 1-5
- Decoupling Classification and Localization of CLIP. Muyang Yi, Zhaozhi Xie, Yuwen Yang, Chang Liu, Yue Ding 0001, Hongtao Lu. 1-6
- Optimizing an Open VVC Encoder for Low Delay Remote Desktop Applications. Anastasia Henkel, Benjamin Bross, Jens Brandenburg, Adam Wieckowski, Detlev Marpe, Andoni Morales, Sergio Sanchez. 1-6
- Semi-Supervised Acoustic Scene Classification under Domain Shift with MixMatch and Information Bottleneck Optimization. Yongpeng Yan, Wuyang Liu, Yi Chai, Yanzhen Ren. 1-4
- Summary of the 2024 Low-Power Efficient and Accurate Facial-Landmark Detection for Embedded Systems. Yu-Shu Ni, Han-Chun Chen, Chia-Chi Tsai, Chih-Cheng Chen, Po-Yu Chen, Hsien-Kai Kuo, Jun-Ying Hunag, Po-Chi Hu, Jenq-Neng Hwang, Jiun-In Guo. 1-6
- Impact of Prioritized HTTP/3 Transport on Low-Latency Live Streaming. Ayse B. Demir, Mervegul Parlak, Zafer Gurel, Deniz Ugur, Ali C. Begen. 1-6
- Characteristics of Visual Complexity: Calligraphic Fonts vs. Printed Fonts. Yuchen Wang, Ruimin Lyu. 1-6
- LFCAVE: Interactive 3D Space with Multiple Light Field Displays. Haopeng Lu, Wenkang Shan, Yuhuai Zhang, Li Song 0001, Xinfeng Zhang, Siwei Ma, Liuxin Zhang, Wen Gao 0001. 1-2
- Dual Attribute-Spatial Relation Alignment for 3D Visual Grounding. X. Yue, Kaizhi Yang, Kai Cheng, Jiebo Luo, Xuejin Chen. 1-6
- Rate Control Optimizing Model for Constraining Over-Saturated Live Streaming Quality. Huiwen Ren, Zhao Wang, Jiexi Wang, H. Yuwen, M. Siwei, Li Zhang, Wen Gao 0001. 1-6
- Self-Supervised Learning via Multi-Transformation Classification for Action Recognition. Duc Quang Vu, Ngan Le, Jia-Ching Wang. 1-6
- Exploring Semi-Supervised, Subcategory Classification and Subwords Alignment for Visual Wake Word Spotting. Shifu Xiong, Li-Rong Dai 0001. 1-6
- An Intra- and Inter-Frame Sequence Model with Discrete Cosine Transform for Streaming Speech Enhancement. Yuewei Zhang, Huanbin Zou, Jie Zhu. 1-4
- Enhancing Video Grounding with Dual-Path Modality Fusion on Animal Kingdom Datasets. Chengpeng Xiong, Zhengxuan Chen, Nuoer Long, Kin-Seong Un, Zhuolin Li, Shaobin Chen, Tao Tan, Chan-Tong Lam, Yue Sun. 1-6
- Body-Part Guided Animal Pose Estimation. Jiyong Rao, Tianyang Xu, Xiaoning Song, Zhenhua Feng, Xiaojun Wu 0001. 1-6
- Equipped with Monocular Depth Estimation and Intelligent Wake-Up Vision Based Tracking System for a Human-Following Mobile Robot. Tsung-Han Tsai 0001, Chun-Lin Lee. 1-2
- Beyond Aligned Target Face: StyleGAN-Based Face-Swapping via Inverted Identity Learning. L. Yuanhang, Qi Mao, Libiao Jin. 1-6
- Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior. Han Wang, Xinning Chai, Yiwen Wang, Yuhong Zhang, Rong Xie, Li Song 0001. 1-6
- An Enhanced Multimodal Negative Feedback Detection Framework with Target Retrieval in Thai Spoken Audio. Pantid Chantangphol, Sattaya Singkul, Thanawat Lodkaew, Nattasit Maharattamalai, Atthakorn Petchsod, Theerat Sakdejayont, Tawunrat Chalothorn. 1-7
- Creating and Experiencing 3D Immersion Using Generative 2D Diffusion: An Integrated Framework. Ziming He, Xiaomin Zou, Pengfei Wu, Ling Fan, Xiaomei Li. 1-6
- Assistant Referee System in Da-Qiang (Pike) Competition. Chia-Chun Yen, Show-Po Guo, Tsì-Uí Ik. 1-8
- Chinese Ancient Painting Figure Face Restoration and its Application in a Q&A Interaction System. Rui Li, Yifan Wei, Haopeng Lu, Siwei Ma, Zhenyu Liu, Hui Liu, QianYing Wang, Yaqiang Wu, Jianrong Tan. 1-6
- A Micro-Expression Recognition System with Event Cameras. Peilin Xiao, Yueyi Zhang, Dachun Kai, Yansong Peng, Zheyu Zhang, Xiaoyan Sun 0001. 1-2
- SEMIPL: A Semi-Supervised Method for Event Sound Source Localization. Yue Li, Baiqiao Yin, Jinfu Liu, Jiajun Wen 0003, Jiaying Lin, Mengyuan Liu. 1-6
- SFMVIT: Slowfast Meet VIT in Chaotic World. Jiaying Lin, Jiajun Wen 0003, Mengyuan Liu, L. Yue, Jinfu Liu, Baiqiao Yin. 1-6
- Low-Complexity Video PSNR Measurement in Real-Time Communication Products. Yu-chen Sun, Jie Dong, Ahmed Fouad, Jian Zhou, Roger Zhou, Shyam Sadhwani. 1-4
- I3FNET: Instance-Aware Feature Fusion for Few-Shot Point Cloud Generation from Single Image. Pu Ching, Wen-Cheng Chen, Min-Chun Hu. 1-6
- Summary on the Chat-Scenario Chinese Lipreading (ChatCLR) Challenge. Chen-Yue Zhang, Hang Chen, Jun Du, Sabato Marco Siniscalchi, Ya Jiang, Chin-Hui Lee 0001. 1-6
- Multimodal Guidance Network for Missing-Modality Inference in Content Moderation. Zhuokai Zhao, Harish Palani, Tianyi Liu, Lena Evans, Ruth Toner. 1-4
- Towards Task-Compatible Compressible Representations. Anderson de Andrade, Ivan V. Bajic. 1-6
- Adaptive Intra Period Size for Deep Learning-Based Screen Content Video Coding. Yuyang Wu, Liang Xie, Shangkun Sun, Wei Gao 0003, Yiqiang Yan. 1-6
- The WuShu Database for Cursive Script Character and Style Recognition. Xinrui Shan, Kejun Zhang, Lyukesheng Shen, Bolin Wang. 1-6
- Semi-Supervised Acoustic Scene Classification Under Domain Shift Using an Attention Module and Angular Loss. Michael Neri, Marco Carli. 1-6
- High-Fidelity 3D Model Generation with Relightable Appearance from Single Freehand Sketches and Text Guidance. Tianrun Chen, Runlong Cao, Ankang Lu, Tao Xu, Xiaoling Zhang, Papa Mao, Min Zhang, Lingyun Sun, Ying Zang. 1-6
- A Hybrid Multi-Perspective Complementary Model for Human Skeleton-Based Action Recognition. Linze Li, Youwei Zhou, Jiannan Hu, Cong Wu, Tianyang Xu, Xiaojun Wu 0001. 1-6
- Attribute Vision Transformer for UAV-Human Re-Identification. Hao Ni, Yuke Li, Ping Lai, Pengpeng Zeng, Hangyu Guo, Lianli Gao. 1-6
- A Survey on Backbones for Deep Video Action Recognition. Zixuan Tang, Youjun Zhao, Yuhang Wen 0001, Mengyuan Liu. 1-6
- Semi-Supervised Acoustic Scene Classification with Test-Time Adaptation. Wen Huang, Anbai Jiang, Bing Han, Xinhu Zheng, Yihong Qiu, Wenxi Chen, Yuzhe Liang, Pingyi Fan, Wei-Qiang Zhang, Cheng Lu 0007, Xie Chen 0001, Jia Liu 0001, Yanmin Qian. 1-5
- Learning to Learn Multiview Detection by Camera-Aware Attention. Hung-Min Hsu, Zhongwei Cheng, Xinyu Yuan, Lin Chen. 1-4
- MemoMusic 4.0: Personalized Emotion Music Generation Conditioned by Valence and Arousal as Virtual Tokens. Luntian Mou, Yihan Sun 0006, Yunhan Tian, Ruichen He, Feng Gao 0014, Zijin Li, Ramesh C. Jain. 1-6
- Attribute-Aware Network for Pedestrian Attribute Recognition. Zesen Wu, Mang Ye, Shuoyi Chen, Bo Du 0001. 1-6
- LLM-SAP: Large Language Models Situational Awareness-Based Planning. Liman Wang, Hanyang Zhong. 1-6
- Multi-Modal Knowledge Transfer for Target Speaker Lipreading with Improved Audio-Visual Pretraining and Cross-Lingual Fine-Tuning. Genshun Wan, Zhongfu Ye. 1-6
- Learning Discriminative and Robust Representations for UAV-View Skeleton-Based Action Recognition. Shaofan Sun, Jiahang Zhang, Guo Tang, Chuanmin Jia, Jiaying Liu. 1-6
- Compression without Compromise: Optimizing Point Cloud Object Detection with Bottleneck Architectures For Split Computing. Vinay Kashyap, Nilesh A. Ahuja, Omesh Tickoo. 1-6
- HDBN: A Novel Hybrid Dual-Branch Network for Robust Skeleton-Based Action Recognition. Jinfu Liu, Baiqiao Yin, Jiaying Lin, Jiajun Wen 0003, Yue Li, Mengyuan Liu. 1-6
- Automatic Malleefowl Mound Detection Using LiDAR-based Ground and Habitat Features with Planar Terrain Modelling. Nazia Hossain, M. Manzur Murshed, Mohammad Awrangjeb, Singarayer Florentine, Marc Irvin, Shyh Wei Teng. 1-6
- Intelligent Music Chord Recognition and Evaluation Based on Convolution and Attention. Shuo Wang, L. Xiaobing, Qingwen Zhou, Yun Tie, Yan Gao, Xinran Zhang. 1-6
- Anatomically-Informed Vector Quantization Variational Auto-Encoder for Text to Motion Generation. Lian Chen, Zehai Niu, Qingyuan Liu, Jinbao Wang, Jian Xue, Ke Lu 0002. 1-6
- Optimizing Quality and Energy Efficiency in WebRTC with ML-Powered Adaptive FEC. Jason Gerard, David C. Bonilla, Abdelhak Bentaleb, Sandra Céspedes. 1-6
- Region-of-Interest-Based Video Coding for Machines. Olgierd Stankiewicz, Tomasz Grajek, Slawomir Mackowiak, Jakub Stankowski, Slawomir Rózek, Mateusz Lorkiewicz, Maciej Wawrzyniak, Marek Domanski. 1-6
- Real-Time Human Motion Transfer System for Holographic Displays. Wenkang Shan, Haopeng Lu, Chuanmin Jia, Xinfeng Zhang 0001, Siwei Ma, Yaqiang Wu, Wen Gao 0001. 1-2
- Compressive Feature Selection for Remote Visual Multi-Task Inference. Saeed Ranjbar Alvar, Ivan V. Bajic. 1-6
- An SoC Based Hardware Accelerator for Blind Assistive System. Tsung-Han Tsai, Chun-Yu Chen. 1-2
- Afc: Asymmetrical Feature Coding for Multi-Task Machine Intelligence. Yuan Zhang, Hanming Wang, Yunlong Li, Lu Yu. 1-6
- Neuproofreader: An Interactive Proofreading System with Suggestive Prompts for Connectomics. Yixiong Liu, Qihua Chen, Xuejin Chen. 1-2
- Using Large Language Models to Understand Leadership Perception and Expectation. Yundi Zhang, Xin Wang, Ziyi Zhang, Xueying Wang, Xiaohan Ma, Yingying Wu, Han-Wu-Shuang Baao, Xiyang Zhang. 1-7
- Visibility-Aware Human Mesh Recovery via Balancing Dense Correspondence and Probability Model. Yanjun Wang, Wenjia Wang, Jun Ling, Rong Xie, Li Song 0001. 1-6
- AJA-Pose: A Framework for Animal Pose Estimation Based on VHR Network Architecture. Austin Kaburia Kibaara, Joan Kabura, Antony Gitau, Ciira Maina. 1-6
- Aesthetic Assessment of Movie Still Frame for Various Field of Views. Xin Jin, Jinyu Wang, Wenbo Yuan, B. Yihang, Heng Huang, Yiran Zhang, Bao Peng, X. Peng, Xin Song, Hanbing Yang. 1-6
- Lightweight Texture-Guided Fast Partition Method for Luma and Chroma Intra Coding in VVC. Zhikai Liu, Zhidao Zhou, Fan Liang, Wei Sun. 1-6
- Joint Modal Circular Complementary Attention for Multimodal Aspect-Based Sentiment Analysis. Hao Liu, Lijun He, Jiaxi Liang. 1-6
- Popular Hooks: A Multimodal Dataset of Musical Hooks for Music Understanding and Generation. Xinda Wu, Jiaming Wang, Jiaxing Yu, Tieyao Zhang, Kejun Zhang. 1-6
- Pedestrian Attributes Recognition for UAV-Human. Hao Ni 0002, Ping Lai, Yuke Li, Pengpeng Zeng, Haonan Zhang, Jingkuan Song. 1-5
- Blender-NeRF: A Monocular Dynamic Human Body Explicit Reconstruction and Rendering Method. Shuo Chen, Wu Liu, Binbin Yan, Xinzhu Sang, Alicia Li, Xiangcheng Yi. 1-6
- Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder. He Wang, Pengcheng Guo, Xucheng Wan, Huan Zhou 0004, Lei Xie 0001. 1-6
- Improving Acoustic Scene Classification via Self-Supervised and Semi-Supervised Learning with Efficient Audio Transformer. Yuzhe Liang, Wenxi Chen, Anbai Jiang, Yihong Qiu, Xinhu Zheng, Wen Huang, Bing Han, Yanmin Qian, Pingyi Fan, Wei-Qiang Zhang, L. Cheng, Jia Liu, Xie Chen 0001. 1-6
- Dual-Phase Msqnet for Species-Specific Animal Activity Recognition. An Yu, Jeremy Varghese, Ferhat Demirkiran, Peter Buonaiuto, Xin Li, Ming-Ching Chang. 1-6
- Partclip: How Does CLIP Assist Mechanical Part Image Retrieval? Shangbo Mao, Dongyun Lin, Aiyuan Guo, Yiqun Li. 1-5
- Efficient Facial Landmark Detection for Embedded Systems. Ji-Jia Wu. 1-6
- Robust Person Re-Identification Approach with Deep Learning and Optimized Feature Extraction. Jian Ding, Linze Li 0002, L. Rongchang, W. Cong, X. Tianyang, Xiaojun Wu 0001. 1-6
- Enhancing Visual Wake Word Spotting with Pretrained Model and Feature Balance Scaling. Xuandong Huang, Shangfei Wang, Jinghao Yan, Kai Tang, Pengfei Hu 0004. 1-6
- Styleself: Style-Controllable High-Fidelity Conversational Virtual Avatars Generation. Yilin Guo, Ruoke Yan, Yaqiang Wu, Siwei Ma. 1-6
- Q-Boost: On Visual Quality Assessment Ability of Low-Level Multi-Modality Foundation Models. Zicheng Zhang, Haoning Wu 0001, Zhongpeng Ji, Chunyi Li, Erli Zhang 0001, Wei Sun 0029, Xiaohong Liu 0001, Xiongkuo Min, Fengyu Sun, Shangling Jui, Weisi Lin, Guangtao Zhai. 1-6