Caption Unification for Multi-View Lifelogging Images Based on In-Context Learning with Heterogeneous Semantic Contents

Masaya Sato, Keisuke Maeda, Ren Togo, Takahiro Ogawa, Miki Haseyama. Caption Unification for Multi-View Lifelogging Images Based on In-Context Learning with Heterogeneous Semantic Contents. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2024, Seoul, Republic of Korea, April 14-19, 2024. Pages 8085-8089, IEEE, 2024.

@inproceedings{SatoMT0H24,
  title = {Caption Unification for Multi-View Lifelogging Images Based on In-Context Learning with Heterogeneous Semantic Contents},
  author = {Masaya Sato and Keisuke Maeda and Ren Togo and Takahiro Ogawa and Miki Haseyama},
  year = {2024},
  doi = {10.1109/ICASSP48485.2024.10445969},
  url = {https://doi.org/10.1109/ICASSP48485.2024.10445969},
  researchr = {https://researchr.org/publication/SatoMT0H24},
  cites = {0},
  citedby = {0},
  pages = {8085--8089},
  booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2024, Seoul, Republic of Korea, April 14-19, 2024},
  publisher = {IEEE},
  isbn = {979-8-3503-4485-1},
}