Caption Unification for Multi-View Lifelogging Images Based on In-Context Learning with Heterogeneous Semantic Contents

Masaya Sato, Keisuke Maeda, Ren Togo, Takahiro Ogawa, Miki Haseyama. Caption Unification for Multi-View Lifelogging Images Based on In-Context Learning with Heterogeneous Semantic Contents. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2024, Seoul, Republic of Korea, April 14-19, 2024. Pages 8085-8089, IEEE, 2024.
