Journal: Vis. Intell.

Volume 3, Issue 1

Haonan Cheng, Hanyue Liu, JuanJuan Cai, Long Ye. CLFormer: a cross-lingual transformer framework for temporal forgery localization
Yifei Deng, Zhengyu Chen, Chenglong Li 0002, Jin Tang 0001. Uncertainty-aware coarse-to-fine alignment for text-image person retrieval
Yichen Shi, Yuhao Gao, Yingxin Lai, Hongyang Wang, Jun Feng, Lei He, Jun Wan 0001, Changsheng Chen, Zitong Yu, Xiaochun Cao. SHIELD: an evaluation benchmark for face spoofing and forgery detection with multimodal large language models
Hang Zhang, Wenxiao Zhang, Haoxuan Qu, Jun Liu 0036. Enhancing human-centered dynamic scene understanding via multiple LLMs collaborated reasoning
Jiaxin Mei, Tao Zhou 0002, Kaiwen Huang, Yizhe Zhang 0001, Yi Zhou 0007, Ye Wu 0001, Huazhu Fu. A survey on deep learning for polyp segmentation: techniques, challenges and future trends
Xiaohan Fang, Peilin Chen 0001, Meng Wang 0017, Shiqi Wang 0001. Immersive video interaction system: a survey
Suyan Li, Fuxiang Huang, Lei Zhang 0038. A survey of multimodal composite editing and retrieval
Yingjia Xu, Mengxia Wu, Zixin Guo, Min Cao, Mang Ye, Jorma Laaksonen. Efficient text-to-video retrieval via multi-modal multi-tagger derived pre-screening
Xiao Wang 0014, Yuehang Li, Wentao Wu, Jiandong Jin, Yao Rong, Bo Jiang 0002, Chuanfu Li, Jin Tang 0001. Pre-training on high-resolution X-ray images: an experimental study
Ruikun Zhang, Zhiyuan Yang, Liyuan Pan. DehazeMamba: large multi-modal model guided single image dehazing via mamba
Qianggang Ding, Zhichao Shen, WeiQiang Zhu, Bang Liu. DASFormer: self-supervised pretraining for earthquake monitoring
Mingjin Zhang, Qian Xu, Yuchun Wang, Xi Li, Haojuan Yuan. MIRSAM: multimodal vision-language segment anything model for infrared small target detection
Zhe Cao 0001, Lixin Xu, Jin Zhang, Biwen Yang, Kaizheng Chen, Ruiheng Zhang. DBDB: de-bimodal defocus blur in joint infrared-visible imaging
Yuli Zhou, Guolei Sun, Yawei Li 0001, Guo-Sen Xie, Luca Benini, Ender Konukoglu. When SAM2 meets video camouflaged object segmentation: a comprehensive evaluation and adaptation
Yasheng Sun, Bohan Li, Mingchen Zhuge, Deng-Ping Fan, Salman H. Khan 0001, Fahad Shahbaz Khan, Hideki Koike. Connecting dreams with visual brainstorming instruction