7013 | -- | 7025 | Mengyao Liu, Ruhan Liu, Jia Shu, Qirong Liu, Yuan Zhang, Lixin Jiang. AutoDDH: A dual-attention multi-task network for grading developmental dysplasia of the hip in ultrasound images |
7027 | -- | 7047 | Lakshita Agarwal, Bindu Verma. Enriching image description generation through multi-modal fusion of VGG16, scene graphs and BiGRU |
7049 | -- | 7061 | Main Uddin, Zhangjie Fu, Xiang Zhang. Deepfake face detection via multi-level discrete wavelet transform and vision transformer |
7063 | -- | 7078 | Mengnan Hu, Qianli Zhou, Rong Wang. Bridging visible and infrared modalities: a dual-level joint align network for person re-identification |
7079 | -- | 7092 | Hao Liu, Ye Liu, Shuanglong Yao, Tongshuai Yu, Ke Gao, Pengcheng Hao, Shuqing He, Ji Chen, Xing Wang. ISTFormer: lightweight transformer for enhanced super-resolution of coal rock images via iterative feature extraction |
7093 | -- | 7108 | Zhehang Qiu, Huijuan Zhang, Jie Zhou, Jianming Zhan. Image restoration for both deblurring and dehazing based on multi-channel frequency information using deep neural network |
7109 | -- | 7121 | Xi Li, Yulong Feng, Xianguo Yu, Yirui Cong, Lili Chen. Epipolar constraint-guided differentiable keypoint detection and description |
7123 | -- | 7139 | Wei Pan, Zhe Yang 0005. A lightweight enhanced YOLOv8 algorithm for detecting small objects in UAV aerial photography |
7141 | -- | 7167 | Sung-Wook Park, Se-Hoon Jung, Chun-Bo Sim. NeXtSRGAN: enhancing super-resolution GAN with ConvNeXt discriminator for superior realism |
7169 | -- | 7184 | Yuyan Liu, Qing Zhang, Yilin Zhao, Yanjiao Shi. A dual-stream learning framework for weakly supervised salient object detection with multi-strategy integration |
7185 | -- | 7199 | Guoquan Jiang, Canyu Wang, Zhanqiang Huo, Huan Xu. Multi-channel correlated diffusion for text-driven artistic style transfer |
7201 | -- | 7214 | Lihua Yang, Jinxian Zhao, Ziming Wang, Yuheng Liu, Dazhao Chi. M-KANUNet: enhanced defect segmentation in X-ray images of copper pipe welds via multi-scale representation and Kolmogorov-Arnold Networks |
7215 | -- | 7232 | Xingyue Zou, Jiqiang Tang. Guided fusion of infrared and visible images using gradient-based attentive generative adversarial networks |
7233 | -- | 7248 | Lei Dai, Wen Gao, Chengyu Tang, Min Wang, Zhihua Chen. MTMFNet: multi-threshold and multi-scale feature fusion network for text detection |
7249 | -- | 7267 | Huaiguang Cai, Yang Yang 0056, Yongqiang Tang, Zhengya Sun, Wensheng Zhang 0002. Shapley value-based class activation mapping for improved explainability in neural networks |
7269 | -- | 7283 | Wei Song, Yaobin Huang. Adaptive feature recalibration transformer for enhancing few-shot image classification |
7285 | -- | 7302 | Jialin Zhang, Xiao Wang, Hui Wei, Kui Jiang, Nan Mu, Zheng Wang. Context-aware target texture perturbation attack for concealed object detection |
7303 | -- | 7317 | Qida Cao, Jiajun Ding, Zhenyang Liu, Zhenzhong Kuang, Yijie Shao, Yilan Shen. VC-GS: view-consistent deblurring Gaussian splatting via alternating branch optimization |
7319 | -- | 7340 | Fuqiang Gou, Yonglong Li, Yanpian Mao, Chunyao Hou, Gang Wan, Jialong Li, Haoran Wang, Yongcan Chen. Planar tunnel point cloud fine registration under multiple constraints |
7341 | -- | 7350 | Haitian Ren, Quinten Kwok, Meng Sun, Xuyan Huang, Jianlin Zhu, Haoxuan Li. Toward artificial general intelligence in health care |
7351 | -- | 7365 | Chen-Bin Feng, Qi Lai, Kangdao Liu, Houcheng Su, Hao Chen, Kaixi Luo, Chi-Man Vong. Learning few-shot semantic segmentation with error-filtered segment anything model |
7367 | -- | 7377 | Peng Zhang, Yuming Yan, Yuangao Ai, Benhong Wang, Houming Shen, Zhonghan Peng. Unet-based image segmentation and binarization for water level detection |
7379 | -- | 7397 | Manuel Silva, Antonio Seoane, Omar A. Mures, Antonio M. López 0001, José Antonio Iglesias Guitián. Exploring the effects of synthetic data generation: a case study on autonomous driving for semantic segmentation |
7399 | -- | 7415 | Ronggui Wang, Hong Chen, Juan Yang, Lixia Xue. Adaptive sparse triple convolutional attention for enhanced visual question answering |
7417 | -- | 7432 | Die Yu, Zhaoyan Fang, Yong Jiang. Alleviating category confusion in fine-grained visual classification |
7433 | -- | 7446 | Haomiao Liu, Hao Xu, Chuhuai Yue, Bo Ma. Adaptive objectness learning for enhanced unknown object detection |
7447 | -- | 7458 | Xinbiao Lu, Yisen Chen, Yudan Chen, Xing Gao, Tieliu Yang, Guiyun Chen. STIG-Net: a spatial-temporal interactive graph framework for recognizing violent behaviors in videos |
7459 | -- | 7475 | Keqi Li, Yaping Wan, Gang Zou, Wangxiu Li, Jian Yang, Changyi Xie. Enhancing facial action unit recognition through topological feature integration and relational learning |
7477 | -- | 7491 | Yuenan Wang, Hua Wang, Fan Zhang 0045. Mask autoencoder for enhanced image reconstruction with position coding offset and combined masking |
7493 | -- | 7508 | Haowei Zhu, Suqin Bai, Jinlong Shi, Jiawen Lu, Xin Zuo, Shucheng Huang, Xu Yao. Ellipsoid-SLAM: enhancing dynamic scene understanding through ellipsoidal object representation and trajectory tracking |
7509 | -- | 7520 | Daikun Qu, Hongwei Zhao, Mingzhu Zhou. Unsupervised video object segmentation with mask transformer: boosting accuracy and efficiency through feature fusion |
7521 | -- | 7533 | Cheng Zhong, Xiaomin Yu, Huan Xia, Rongdong Xie, Qingyi Xu. Restoring intricate Miao embroidery patterns: a GAN-based U-Net with spatial-channel attention |
7535 | -- | 7549 | Jinyang Wang, Jihong Wang, Haoxuan Li, Xiaojun Huang, Jun Xia, Zhen Li, Weibing Wu, Bin Sheng. Temporal goal-aware transformer assisted visual reinforcement learning for virtual table tennis agent |
7551 | -- | 7565 | Junchi Ma, Yuanqing Wang, Guangmiao Ding, Wei Cao, Xiangyun Liao, Ping Zhang, Jianping Lv. Mamba-enhanced hierarchical attention network for precise visualization of hippocampus and amygdala |
7567 | -- | 7584 | Yuhao Zhang, Jiaqi Tong, Honglin Liu. SCAP: enhancing image captioning through lightweight feature sifting and hierarchical decoding |
7585 | -- | 7601 | Yan Zhang, Xueting Sang, Yemei Sun, Shudong Liu, Shengpei Zhou. DMTNet: dual-domain adaptive multi-scale feature fusion network with transformer for small target detection |
7603 | -- | 7616 | Xiaochun Wu, Ning Guo. MGSLU-Net: a lightweight network for efficient detection of water leakage in subway tunnel linings |
7617 | -- | 7640 | Kehao Chen, Zhiping Zhou, Kewei Li, Taoyong Su, Zhaozhong Zhang, Jinhua Liu, Chenghao Ying. Red green blue-depth salient object detection based on multi-scale refinement and cross-modalities fusion network |
7641 | -- | 7656 | Fang Zhou, Tingting Yang, Liuyan Tan, Xiaolong Xu, Mengdao Xing. DAP-Net: enhancing SAR target recognition with dual-channel attention and polarimetric features |
7657 | -- | 7670 | Cheng Jiang, Pengle Zhang, Ying Ni, Xiaoli Wang, Hanghang Peng, Sen Liu, Mengdi Fei, Yuxin He, Yaxuan Xiao, Jin Huang, Xingyu Ma, Tian Yang. Multimodal retrieval-augmented generation for financial documents: image-centric analysis of charts and tables with large language models |
7671 | -- | 7685 | Zhaozhao Yang, Yuhai Yu, Yongdong Huang, Jiana Meng. Innovative approaches in image processing: enhancing feature extraction and recognition capabilities |
7687 | -- | 7702 | Yihao Li, Junyu Liu, Xiaoyu Guan, Hanming Hou, Tianyu Huang. Introducing anisotropic fields for enhanced diversity in crowd simulation |
7703 | -- | 7721 | Liming Wan, Lin Song, Ying Zhou, Chenrui Kang, Shijian Zheng, Guo Chen. Dynamic neighbourhood-enhanced UNet with interwoven fusion for medical image segmentation |
7723 | -- | 7733 | Haomou Bai, Yue Sang. Ultra-lightweight convolutional network for efficient single-image super-resolution |
7735 | -- | 7750 | Sathish Mothe, Srinivas Kankanala. Multi-stage residual network with two fold attention mechanisms for low-light image enhancement |
7751 | -- | 7766 | Xie Chengjie, Lu Shuhua, Shi Yangyu, Zheng Diwen. Joint perturbation consistency across image and feature levels for cross-domain adaptive crowd counting |
7767 | -- | 7780 | Pengyun Chen, Shuang Cui, Ning Cao, Wenhao Zhang, Pengfei Wang, Shaohui Jin, Mingliang Xu. Lightweight multi-scale feature fusion with attention guidance for passive non-line-of-sight imaging |
7781 | -- | 7798 | Wu Shili, Guo Yongkun, Qian Chao, Li Ying, Zhang Xinyou. Global attention and context encoding for enhanced medical image segmentation |
7799 | -- | 7815 | Xiang Shijie, Zhou Dong, Tian Dan. Multi-scale feature fusion network for real-time semantic segmentation of urban street scenes: enhancing detail retention and accuracy |
7817 | -- | 7838 | Hao Li, Shengkun Wu, Lei Deng, Chenhua Liu, Yifan Chen, Hanrui Chen, Heng Yu, Mingli Dong, Lianqing Zhu. Enhancing infrared and visible image fusion through multiscale Gaussian total variation and adaptive local entropy |
7839 | -- | 7854 | Duo Liu, Guoyin Zhang, Yiqi Shi, Ye Tian, Liguo Zhang. Efficient feature difference-based infrared and visible image fusion for low-light environments |
7855 | -- | 7865 | Weichen Dai 0001, Hexing Wu, Xiaoyang Weng, Wanzeng Kong. Implicit guidance for enhancing low-light optical flow estimation via channel attention networks |
7867 | -- | 7882 | Roberto Alcover-Couso, Juan C. SanMiguel, Marcos Escudero-Viñolo, Jose M. Martínez. Layer-wise model merging for unsupervised domain adaptation in segmentation tasks |
7883 | -- | 7907 | Xinzhi Li, Yong Liu, Peng Yan. Optimizing feature map matching for marine benthic organism detection |
7909 | -- | 7923 | Zhen Song, Jianhua Chen. Adaptive rate compression for distributed video sensing in wireless visual sensor networks |
7925 | -- | 7938 | Jinxing Liang, Kaifang Han, Dongsheng Li, Ruixin Gao, Jiajia Peng, Tao Peng, Xinrong Hu. Enhancing low-frequency stitch code generation for knitted fabrics: an LFSCG-E-Net approach |
7939 | -- | 7950 | Jiahao Wang, Yongqiang Wang, Congling Zhou, Jiawei Huang. LF-RTMDet: an instance segmentation algorithm for real-time detection of water-filled barriers |
7951 | -- | 7963 | Xijun Wang, Xin Zhou, Yi Wang, Songto Zeng, Xinyu Liu, Haobo Shen, Song Fei, Lei Zhu. Msu-mamba: multi-scale defocus blur detection using cross-scale fusion and state-space models |
7965 | -- | 7981 | Xite Wang, Changsheng Qin, Mei Bai, Qian Ma 0003, Guanyu Li. CAFormer: a connectivity-aware vision transformer for road extraction from remote sensing images |
7983 | -- | 7995 | Zhenghao Xie, Junfen Chen, Yingying Wang, Bojun Xie. Enhanced fine-grained relearning for skeleton-based action recognition |
7997 | -- | 8008 | Doudou Zhang, Junchi Ma, Jie Chen, Linxia Xiao, Xiangyun Liao, Yong Zhang, Weixin Si. MF-SAM: enhancing multi-modal fusion with Mamba in SAM-Med3D for GPi segmentation |
8009 | -- | 8023 | Wubin Shi, Shaoyan Gai, Feipeng Da, Zeyu Cai, Jiaoling Wang. GRPoseNet: a generalizable and robust 6D object pose estimation network using sparse RGB views |
8025 | -- | 8040 | Zongyu Ye, Hongjuan Yan, Yewang Sun, Bin Li, Lei Liu, Wenbo Wu. MSPNet: real-time semantic segmentation with large kernel and atrous convolutions |
8041 | -- | 8053 | Zhengwei Guo, Bo Wang. Enhancing sandstorm images via color-guided spatial-frequency fusion network |
8055 | -- | 8073 | Yu Pang, Yang Huang, Chenyu Weng, Jialin Lyu, Chuanyue Bai, Xiaosheng Yu. Enhanced RGB-T saliency detection via thermal-guided multi-stage attention network |
8075 | -- | 8087 | Xiang Chen, Yuanqi Yao, Zhouyu Guan, Chenyang Li, Jian Guan, Jun Pu, Ruhan Liu, Bin Sheng 0001, Shankai Yin, Yiming Qin. DSTS-GF: a dual-stream temporal-spatial transformer with gated fusion for the classification of Obstructive Sleep Apnea |
8089 | -- | 8101 | Yuanqi Yao, Zehua Jiang, Zhouyu Guan, Yilun Luxue, Seungmin Lee, Xiang Chen, Haodong Yang, Yiming Qin. A visual-language foundation model for disease diagnosis and doctor-patient co-decision |
8103 | -- | 8116 | Shigang Hu, Darong Wu, Jianxin Wang, Shijun Huang. The image super-resolution network based on dual-branch feature interaction attention mechanism |
8117 | -- | 0 | Tao Shi, Yao Ding 0012, Kui-feng Zhu, Yan-jie Su. Correction: DFP-YOLO: a lightweight machine tool workpiece defect detection algorithm based on computer vision |
8119 | -- | 0 | Sung-Wook Park, Se-Hoon Jung, Chun-Bo Sim. Correction: NeXtSRGAN: enhancing super-resolution GAN with ConvNeXt discriminator for superior realism |