Hongtao Yang, Yehui Liu, Minzheng Jia, Lu Han, Yongqiang Kong, Xin Jin, Ping Shi. Fine-grained aesthetic multi-attribute captioning with aligned vision-language representations. J. Visual Communication and Image Representation, 116:104732, 2026.