- A Lightweight and Efficient Model for Audio Anti-Spoofing. Qiaowei Ma, Jinghui Zhong, Yitao Yang, Weiheng Liu, Ying Gao, Wing W. Y. Ng.
- Vision-Language Navigation for Quadcopters with Conditional Transformer and Prompt-based Text Rephraser. Zhe Chen, Jiyi Li, Fumiyo Fukumoto, Peng Liu, Yoshimi Suzuki.
- Generic Attention-model Explainability by Weighted Relevance Accumulation. Yiming Huang, Aozhe Jia, Xiaodan Zhang, Jiawei Zhang.
- Music-Graph2Vec: An Efficient Method for Embedding Pitch Segment. Taiwei Wu, Jianhao Zhang, Lian Duan, Yuanzhe Cai.
- SFNet: Saliency fast Fourier convolutional Network for medical image segmentation. Shangwang Liu, Danyang Liu, Yinghai Lin, Ziqi Wei.
- A Cross-modal and Redundancy-reduced Network for Weakly-Supervised Audio-Visual Violence Detection. Yidan Fan, Yongxin Yu, Wenhuan Lu, Yahong Han.
- Moving Inside the Box: Interacting with Interpretation of Historical Artefacts Through Tangible Augmented Reality. Suzanne Kobeisse, Lars Erik Holmquist.
- Occlusion-Aware Manga Character Re-identification with Self-Paced Contrastive Learning. Ci-Yin Zhang, Wei-Ta Chu.
- Joint Coordinate Regression and Association For Multi-Person Pose Estimation, A Pure Neural Network Approach. Dongyang Yu, Yunshi Xie, Wangpeng An, Zhang Li, Yufeng Yao.
- Feature Enhancement and Foreground-Background Separation for Weakly Supervised Temporal Action Localization. Peng Liu, Chuanxu Wang, Jianwei Qin, Guocheng Lin.
- Geometric Style Transfer for Face Portraits. Miaomiao Dai, Hao Yin, Ran Yi, Lizhuang Ma.
- NeRF-IS: Explicit Neural Radiance Fields in Semantic Space. Jiansong Sha, Haoyu Zhang, Yuchen Pan, Guang Kou, Xiaodong Yi 0006.
- VLM-BCD: Unsupervised Building Change Detection. Yiyun Zhang, Zijian Wang.
- Guided Spatio-Temporal Learning Method for 4K Video Super-Resolution. Jie Liu, Qin Jiang, Qinglin Wang.
- Prior Knowledge Guided Network for Video Anomaly Detection. Zhewen Deng, Dongyue Chen, Shizhuo Deng.
- Optical Flow based Feature Prediction and Decomposed Context for Video Compression. Huashan Sun, Qian Huang, Yiming Wang, Xiaotong Guo, Ruoyu Hao.
- Global-Local GraphFormer: Towards Better Understanding of User Intentions in Sequential Recommendation. Hong Chen, Bin Huang, Xin Wang, Yuwei Zhou, Wenwu Zhu 0001.
- Developing a VR-based contextualized language learning system to Enhance Junior High School Students' Pragmatic Competence. Kuo-Yu Liu, Yuanshan Chen, Ming-Fang Lin, Li-Jung Daphne Huang, Cheah Ping Xiang.
- Multi-head Siamese Prototype Learning against both Data and Label Corruption. Peng-fei Zhang, Zi Helen Huang.
- A consulting system for guiding various image recognitions. Ryo Kawai, Noboru Yoshida, Jianquan Liu.
- Graph-Guided MLP-Mixer for Skeleton-Based Human Motion Prediction. Xinshun Wang, Qiongjie Cui, Chen Chen 0015, Shen Zhao, Mengyuan Liu.
- End-to-End Variable-Rate Image Compression with Bi-Resolution Spatial-Channel Context Aggregation. Xiaotong Guo, Qian Huang, Yiming Wang, Huashan Sun.
- Easy Travelogue: A Travelogue Editor with Automatic Image Recommendation and Insertion. Fan Yu, Huanyu Xing, Jia Bei, Tongwei Ren.
- VQ-VDM: Video Diffusion Models with 3D VQGAN. Ryota Kaji, Keiji Yanai.
- Multi-Task Self-Blended Images for Face Forgery Detection. Po-Han Huang, Yue-Hua Han, Ernie Chu, Jun-Cheng Chen, Kai-Lung Hua.
- SOFTCUTMIX: Data Augmentation and Algorithmic Enhancements for Cross-Modality Person Re-Identification. Yuxiang Wan, Banghai Wang, Lunke Fei.
- Learning Snippet-to-Motion Progression for Skeleton-based Human Motion Prediction. Xinshun Wang, Qiongjie Cui, Chen Chen 0015, Shen Zhao, Mengyuan Liu.
- DiffuseGAE: Controllable and High-fidelity Image Manipulation from Disentangled Representation. Yipeng Leng, Qiangjuan Huang, Zhiyuan Wang, Yangyang Liu, Haoyu Zhang.
- An Evaluation of Decentralized Group Formation Techniques for Flying Light Specks. Hamed Alimohammadzadeh, Heather Culbertson, Shahram Ghandeharizadeh.
- Multi-region CNN-Transformer for Micro-gesture Recognition in Face and Upper Body. Keita Suzuki, Satoshi Suzuki, Ryo Masumura, Atsushi Ando, Naoki Makishima.
- Self-supervised anomaly detection of medical images based on dual-module discrepancy. Yuqing Song 0006, Jinyong Cheng.
- Dual-domain Feature Learning and Cross Dimension Interaction Attention for Nighttime Image Dehazing. Yun Liang 0003, Shijie Peng, Xinjie Xiao, Lianghui Li.
- Object Detection via Fisheye Camera. Yi-Zeng Hsieh, Hau-Ching Chen, Yi-Hung Yeh.
- Hierarchical Multi-Scale Adaptive Conv-LSTM Network for Human Action Recognition Based on Wearable Sensors. Weiliang Xie, Qian Huang, Chang Li, Yanfang Wang, Yanwei Liu.
- Mask-based Food Image Synthesis with Cross-Modal Recipe Embeddings. Zhongtao Chen, Yuma Honbu, Keiji Yanai.
- Reducing Objective Difficulty Without Influencing Subjective Difficulty in a Video Game. Shunta Sakaue, Taiju Kimura, Hiroki Nishino.
- Facial Parameter Splicing: A Novel Approach to Efficient Talking Face Generation. Xianhao Chen, Kuan Chen, Yuzhe Mao, Linna Zhou, Weike You.
- Research on Multi-Person Pose Estimation Based on YOLO and Decoupled Multi-Level Feature Layers Fusion. Bin Zheng, He Zhang, Lu Jin.
- RGB-D Tracking via Hierarchical Modality Aggregation and Distribution Network. Boyue Xu, Yi Xu, Ruichao Hou, Jia Bei, Tongwei Ren, Gangshan Wu.
- Exploring User-oriented Social Recommendation System through Granting Users Control over a Social Group. Jeonguk Hong, Gyewon Jeon, Sangwon Lee.
- History-Detr: Optimize Query Initialization Strategy by Using Historical Information and Kinematics. Weijie Luo, Zihao Liu, Guohao Dai, Ningyi Xu.
- NeRF-SDP: Efficient Generalizable Neural Radiance Field with Scene Depth Perception. Qiuwen Wang, Shuai Guo, Haoning Wu, Rong Xie, Li Song, Wenjun Zhang.
- Targeted Transferable Attack against Deep Hashing Retrieval. Fei Zhu, Wanqian Zhang, Dayan Wu, Lin Wang, Bo Li, Weiping Wang.
- I2SRM: Intra- and Inter-Sample Relationship Modeling for Multimodal Information Extraction. Yusheng Huang, Zhouhan Lin.
- FinGuard: A Multimodal AIGC Guardrail in Financial Scenarios. Wenlong Du, Qingquan Li, Jian Zhou 0011, Xu Ding, Xuewei Wang, Zhongjun Zhou, Jin Liu.
- Learning a Contextualized Multimodal Embedding for Zero-shot Cooking Video Caption Generation. Lin Wang, Hongyi Zhang, Xingfu Wang, Yan Xiong.
- Robust Tracking via Unifying Pretrain-Finetuning and Visual Prompt Tuning. Guangtong Zhang, Qihua Liang, Ning Li, Zhiyi Mo, Bineng Zhong.
- Improve Singing Quality Prediction Using Self-supervised Transfer Learning and Human Perception Feedback. Ping-Chen Chan, Po-Wei Chen, Von-Wun Soo.
- Reprogramming Self-supervised Learning-based Speech Representations for Speaker Anonymization. Xiaojiao Chen, Sheng Li 0010, Jiyi Li, Hao Huang, Yang Cao, Liang He.
- Monocular 3D Pose Estimation of Very Small Airplane in the Air. Sung Kwon On, Songhyon Kim, Kwangjin Yang, Younggun Lee.
- GTTrack: Gaussian Transformer Tracker for Visual Tracking. Yun Liang 0003, Fumian Long, Qiaoqiao Li, Dong Wang.
- Efficient Hand Gesture Recognition using Multi-Task Multi-Modal Learning and Self-Distillation. Jie-ying Li, Herman Prawiro, Chia-Chen Chiang, Hsin-Yu Chang, Tse-Yu Pan, Chih-Tsun Huang, Min-Chun Hu 0001.
- Automatic Dataset Creation from User-generated Recipes for Ingredient-centric Food Image Analysis. Liangyu Wang, Yoko Yamakata, Kiyoharu Aizawa.
- Semantic-Aware Dynamic Feature Selection and Fusion for Object Detection in UAV Videos. Jianping Zhong, Zhaobo Qi, Weigang Zhang, Qingming Huang.
- MontageNet: Annotated Dataset of Furniture Components in Real-World Images. Iuan Kai Fang, Bo-Hao Zhang, Te Lun Liu, Hao Tan, Wei Syun Chen, Che-Rung Lee.
- RanLayNet: A Dataset for Document Layout Detection used for Domain Adaptation and Generalization. Avinash Anand, Raj Jaiswal, Mohit Gupta, Siddhesh S. Bangar, Pijush Bhuyan, Naman Lal, Rajeev Singh, Ritika Jha, Rajiv Ratn Shah, Shin'ichi Satoh 0001.
- Relevance and Irrelevance Considered Subspace Mapping Neural Networks for Remote Sensing Text-Image Retrieval. Xiu Li, Chengyu Zheng, Jie Nie, Ruoyu Zhang, Xinyue Liang, Zhiqiang Wei 0002.
- Speech Spoofing Detection Based on Graph Attention Networks with Spectral and Temporal Information. Peng Zhang, Yida Chen, Meijuan Li, Hui Zhao, Jianqiang Zhang, Fuqiang Wang, Xiaoming Wu.
- OmniScorer: Real-Time Shot Spot Analysis for Court View Basketball Videos. Yen-Pin Cheng, Tsung-Hsun Tsai, Tai-Chen Tsai, Yi-Hsuan Chiu, Hung-Kuo Chu, Min-Chun Hu 0001.
- Achieving Privacy-Preserving Multi-View Consistency with Advanced 3D-Aware Face De-identification. Jingyi Cao, Bo Liu 0001, Yunqian Wen, Rong Xie, Li Song.
- Learning Surface-awareness Network for X-Ray Prohibited Item Detection. Ying Shen, Wei Li, Zhaoquan Yuan, Xiao Wu 0001.
- Contextual Associated Triplet Queries for Panoptic Scene Graph Generation. Jingbin Xu, Junwen Chen, Keiji Yanai.
- Domain-Adaptive Mean Teacher for Category-Level Object Pose Estimation. I-Ju Hsieh, Yo-Chung Lau, Peng-Yuan Kao, Shih-Ping Hung, Yi-Ping Hung.
- MA-Net: Multi-Attention Network for Skeleton-Based Action Recognition. Jingwen Cui, Qian Huang, Chang Li 0001, Yunfei Zhang.
- One-Epoch Training for Object Detection in Fisheye Images. Yu-Hsi Chen.
- Image Cropping under Design Constraints. Takumi Nishiyasu, Wataru Shimoda, Yoichi Sato.
- An Efficient CNN-based Prediction for Reversible Data Hiding. Mingjin Wu, Shijun Xiang.
- Feature Adaptation with CLIP for Few-shot Classification. Guangxing Wu, Junxi Chen, Wentao Zhang, Ruixuan Wang.
- ADNet: An Asymmetric Dual-Stream Network for RGB-T Salient Object Detection. Yaqun Fang, Ruichao Hou, Jia Bei, Tongwei Ren, Gangshan Wu.
- GhostVec: A New Threat to Speaker Privacy of End-to-End Speech Recognition System. Xiaojiao Chen, Sheng Li 0010, Jiyi Li, Yang Cao, Hao Huang, Liang He.
- Summary of the 2023 PAIR-LITEON Competition: Embedded AI Object Detection Model Design Contest on Fish-eye Around-view Cameras. Yu-Shu Ni, Chia-Chi Tsai, Jyun-Syu Lin, Hsien-Po Meng, Po-Chi Hu, Jiun-Shiung Chen, Kun-Hung Lin, Chih-Yuan Chuang, Jiun-In Guo.
- Adaptive Fusion for Visual Question Answering: Integrating Multi-Label Classification and Similarity Matching. Zhengtao Yu, Jia Zhao, Huiling Wang, Chenliang Guo, Tong Zhou, Chongxiang Sun.
- Block based Adaptive Compressive Sensing with Sampling Rate Control. Kosuke Iwama, Ryugo Morita, Jinjia Zhou.
- Improving Class Representation for Zero-Shot Action Recognition. Lijuan Zhou, Jianing Mao.
- From Pixels to Explanations: Uncovering the Reasoning Process in Visual Question Answering. Siqi Zhang, Jing Liu, Zhihua Wei.
- Personalized Federated Learning via Backbone Self-Distillation. Pengju Wang, Bochao Liu, Dan Zeng 0001, Chenggang Yan 0001, Shiming Ge.
- A Trajectory-based Statistics and Tactics Analysis System for Table Tennis. Guan Yu Wu, Chun Ho Hung, Hsuan-Wei Chen, Wei-Ta Chu.
- Open-Vocabulary Segmentation Approach for Transformer-Based Food Nutrient Estimation. Satayu Parinayok, Yoko Yamakata, Kiyoharu Aizawa.
- Adapting Hierarchical Transformer for Scene-Level Sketch-Based Image Retrieval. Jie Yang, Aihua Ke, Bo Cai.
- Confidence-guided Boundary Adaption Network for Multimodal Fake News Detection. Jiajie Lin, Zhuopan Yang, Zhenguo Yang, Xiaoping Li, Fu Lee Wang, Wenyin Liu.
- A Multi-scale and Dense Object Detector for Tibetan Thangka Images. Gaohuan Dong, Qing Xie 0002, Jiachen Li 0002, Yanchun Ma, Yuhan Liu, Yongjian Liu.
- Reimagining 3D Visual Grounding: Instance Segmentation and Transformers for Fragmented Point Cloud Scenarios. Zehan Tan, Weidong Yang, Zhiwei Wang.
- Key Parts Spatio-Temporal Learning for Video Person Re-identification. Wei Guo, Hao Wang.
- A Spatial-Spectral Decoupling Fusion Framework for Visible and Near-Infrared Images. Zhenglin Tang, Hai-Miao Hu.
- Cross-modal Image-Recipe Retrieval via Multimodal Fusion. Lijie Li, Caiyue Hu, Haitao Zhang, Akshita Maradapu Vera Venkata Sai.
- From Global to Local: An Adaptive Environmental Illumination Estimation for Non-uniform Scattering. Huaizhuo Liu, Hai-Miao Hu.
- A Decoupled Cross-layer Fusion Network with Bidirectional Guidance for Detecting Small Logos. Songhui Zhao, Sujuan Hou, Baisong Zhang.
- FTUnet: Feature Transferred U-Net For Single HDR Image Reconstruction. Shifeng Xie, Yi Liu, Wenjing Shuai.
- EmAGAN: Embedded Blocks Search and Mask Attention GAN for Makeup Transfer. Yan Li, Shibin Wang.
- Few-Shot Learning for Word Recognition in Handwritten Seventeenth-Century Spanish American Notary Records. Nouf Alrasheed, Shraboni Sarker, Viviana Grieco, Praveen Rao 0001.
- Learning a Robust Model with Pseudo Boundaries for Noisy Temporal Action Localization. Xinyi Yuan, Liansheng Zhuang.
- Rethinking Parking Slot Detection with Rotated Bounding Box. Shengli Zhang, Shikui Wei, Shiyin Zhang, Sen Xu, Weiyan Xu, Yao Zhao 0001.
- Power Efficient Mobile VTuber Live Streaming. Zichen Zhu, Stefano Petrangeli, Viswanathan Swaminathan, Sheng Wei 0001.
- Class-aware Convolution and Attentive Aggregation for Image Classification. Zitan Chen, Zhuang Qi, Xiangxian Li, Yuqing Wang, Lei Meng, Xiangxu Meng.
- Cross-Modal Retrieval for Motion and Text via DropTriple Loss. Sheng Yan, Yang Liu, Haoqiang Wang, Xin Du, Mengyuan Liu, Hong Liu.
- Multi-Scale Superpoint Network for 3D Point Cloud Semantic Segmentation. Ft Zheng, Le Hui, Jin Xie, Haofeng Zhang.
- SASSM: Semantic Awareness and Self-Support Matching for Semi-Supervised Video Object Segmentation. Yun Liang, Ming Junhui, Jintu Zheng.
- TelEmoScatter: Enabling Remote Interaction and Emotional Connections in Virtual and Physical Music Performance. Chen-Wei Fu, Wei-Lun Huang, Pin-Xuan Liu, Yu-Hsuan Chen, Ming-Cong Su, Andrew Chen, Ping-Hsuan Han, Tse-Yu Pan.
- TrackNetV3: Enhancing ShuttleCock Tracking with Augmentations and Trajectory Rectification. Yu-Jou Chen, Yu-Shuen Wang.
- Cross-modal Consistency Learning with Fine-grained Fusion Network for Multimodal Fake News Detection. Jun Li, Yi Bin, Jie Zou, Jiwei Wei, Guoqing Wang, Yang Yang.
- Independent and Collaborative Demosaicking Neural Networks. Yan Niu, Lixue Zhang, Chenlai Li.
- Towards Representation Alignment and Uniformity in Long-tailed Classification. Yi Zheng, Zuqiang Meng.
- Adaptive Sampling for Computer Vision-Oriented Compressive Sensing. Luyang Liu, Hiroki Nishikawa, Jinjia Zhou, Ittetsu Taniguchi, Takao Onoye.
- Exploring Feature Fusion from A Contrastive Multi-Modality Learner for Liver Cancer Diagnosis. Yang Fan Chiang, Pei-Xuan Li, Ding-You Wu, Hsun-Ping Hsieh, Ching-Chung Ko.
- NuclSeg: nuclei segmentation using semi-supervised stain deconvolution. Haixin Wang, Jian Yang, Ryohei Katayama, Michiya Matsusaki, Tomoyuki Miyao, Jinjia Zhou.
- Lambda-Domain Rate Control for Neural Image Compression. Naifu Xue, Yuan Zhang.
- Towards Digital Twin of Crops for Growth Modelling using Virtual Reality. Karanvir Singh, Mukesh Saini.
- Toward Optimal Real-time Dynamic Point Cloud Streaming over Bandwidth-constrained Networks. Quang Long Nguyen, Duc Nguyen, Huong Thu Truong.
- Adapting Object Detection to Fisheye Cameras: A Knowledge Distillation with Semi-Pseudo-Label Approach. Chih-Chung Hsu, Wen-Hai Tseng, Ming-Hsuan Wu, Chia-Ming Lee, Wei-Hao Huang.
- Multi-view-enhanced modal fusion hashing for Unsupervised cross-modal retrieval. Longfei Ma, Honggang Zhao, Zheng Jiang, Mingyong Li.
- Directional Sound Source Representation Using Paired Microphone Array with Different Characteristics Suitable for Volumetric Video Capture. Shota Okubo, Tomoaki Konno, Toshiharu Horiuchi, Tatsuya Kobayashi.
- AniCropify: Image Matting for Anime-Style Illustration. Yuki Matsuura, Takahiro Hayashi.
- RecipeMeta: Metapath-enhanced Recipe Recommendation on Heterogeneous Recipe Network. Jialiang Shi, Takahiro Komamizu, Keisuke Doman, Haruya Kyutoku, Ichiro Ide.