- Depth-Guided NeRF Training via Earth Mover's Distance. Anita Rau, Josiah Aklilu, F. Christopher Holsinger, Serena Yeung-Levy. 1-17
- INTRA: Interaction Relationship-Aware Weakly Supervised Affordance Grounding. Ji Ha Jang, Hoigi Seo, Se Young Chun. 18-34
- DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks. Sarah Jabbour, Gregory Kondas, Ella Kazerooni, Michael W. Sjoding, David Fouhey, Jenna Wiens. 35-51
- MEERKAT: Audio-Visual Large Language Model for Grounding in Space and Time. Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta, Jun Chen 0021, Mohamed Elhoseiny, Ruohan Gao, Dinesh Manocha. 52-70
- Diagnosing and Re-learning for Balanced Multimodal Learning. Yake Wei, Siwei Li, Ruoxuan Feng, Di Hu 0001. 71-86
- Contribution-Based Low-Rank Adaptation with Pre-training Model for Real Image Restoration. Dongwon Park, Hayeon Kim, Se Young Chun. 87-105
- Elucidating the Hierarchical Nature of Behavior with Masked Autoencoders. Lucas Stoffl, Andy Bonnetto, Stéphane d'Ascoli, Alexander Mathis. 106-125
- BeyondScene: Higher-Resolution Human-Centric Scene Generation with Pretrained Diffusion. Gwanghyun Kim, Hayeon Kim, Hoigi Seo, Dong Un Kang, Se Young Chun. 126-142
- SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views. Chao Xu 0016, Ang Li 0010, Linghao Chen, Yulin Liu, Ruoxi Shi, Hao Su 0001, Minghua Liu. 143-163
- MMEarth: Exploring Multi-modal Pretext Tasks for Geospatial Representation Learning. Vishal Nedungadi, Ankit Kariryaa, Stefan Oehmcke, Serge J. Belongie, Christian Igel, Nico Lang. 164-182
- Evolving Interpretable Visual Classifiers with Large Language Models. Mia Chiquier, Utkarsh Mall, Carl Vondrick. 183-201
- LITA: Language Instructed Temporal-Localization Assistant. De-An Huang, Shijia Liao, Subhashree Radhakrishnan, Hongxu Yin, Pavlo Molchanov 0001, Zhiding Yu, Jan Kautz. 202-218
- MARs: Multi-view Attention Regularizations for Patch-Based Feature Recognition of Space Terrain. Timothy Chase, Karthik Dantu. 219-239
- Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs. Keen You, Haotian Zhang, Eldon Schoop, Floris Weers, Amanda Swearngin, Jeffrey Nichols 0001, Yinfei Yang, Zhe Gan. 240-255
- Bridging the Pathology Domain Gap: Efficiently Adapting CLIP for Pathology Image Analysis with Limited Labeled Data. Zhengfeng Lai, Joohi Chauhan, Brittany N. Dugger, Chen-Nee Chuah. 256-273
- AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation. Yangchao Wu, Tian-Yu Liu, Hyoungseob Park, Stefano Soatto, Dong Lao, Alex Wong 0001. 274-293
- CARB-Net: Camera-Assisted Radar-Based Network for Vulnerable Road User Detection. Wei-Yu Lee, Martin D. Dimitrievski, David Van Hamme, Jan Aelterman, Ljubomir Jovanov, Wilfried Philips. 294-310
- SAH-SCI: Self-supervised Adapter for Efficient Hyperspectral Snapshot Compressive Imaging. Haijin Zeng, Yuxi Liu, Yongyong Chen, Youfa Liu, Chong Peng, Jingyong Su. 311-328
- Minimalist Vision with Freeform Pixels. Jeremy Klotz, Shree K. Nayar. 329-346
- All You Need Is Your Voice: Emotional Face Representation with Audio Perspective for Emotional Talking Face Generation. Seongho Kim, Byung Cheol Song. 347-363
- LatentEditor: Text Driven Local Editing of 3D Scenes. Umar Khalid, Hasan Iqbal, Nazmul Karim, Muhammad Tayyab, Jing Hua 0001, Chen Chen 0001. 364-380
- Single-Photon 3D Imaging with Equi-Depth Photon Histograms. Kaustubh Sadekar, David Maier 0001, Atul Ingle. 381-398
- Asynchronous Bioplausible Neuron for Spiking Neural Networks for Event-Based Vision. Sanket Kachole, Hussain Sajwani, Fariborz Baghaei Naeini, Dimitrios Makris 0001, Yahya H. Zweiri. 399-415
- Viewpoint Textual Inversion: Discovering Scene Representations and 3D View Control in 2D Diffusion Models. James Burgess, Kuan-Chieh Wang, Serena Yeung-Levy. 416-435
- POET: Prompt Offset Tuning for Continual Human Action Adaptation. Prachi Garg, K. J. Joseph, Vineeth N. Balasubramanian, Necati Cihan Camgöz, Chengde Wan, Kenrick Kin, Weiguang Si, Shugao Ma, Fernando De la Torre. 436-455
- Domain Generalization of 3D Object Detection by Density-Resampling. Shuangzhi Li 0003, Lei Ma 0003, Xingyu Li. 456-473
- IG Captioner: Information Gain Captioners Are Strong Zero-Shot Classifiers. Chenglin Yang, Siyuan Qiao, Yuan Cao 0007, Yu Zhang 0033, Tao Zhu 0005, Alan L. Yuille, Jiahui Yu. 474-490