Towards A Generalist Code Embedding Model Based On Massive Data Synthesis

Chaofan Li, Jianlyu Chen, Yingxia Shao, Defu Lian, Zheng Liu 0011. Towards A Generalist Code Embedding Model Based On Massive Data Synthesis. In Danielle Belgrave, Cheng Zhang 0005, Laura N. Montoya, Hsuan-Tien Lin, Razvan Pascanu, Piotr Koniusz, Marzyeh Ghassemi, Nancy Chen, Iván Vladimir Meza Ruíz, Arturo Loaiza-Bonilla, editors, Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, NeurIPS 2025, San Diego, CA, USA, December 2-7, 2025 / Mexico City, Mexico, November 30 - December 5, 2025. 2025.

@inproceedings{LiCSLL25,
  title = {Towards A Generalist Code Embedding Model Based On Massive Data Synthesis},
  author = {Chaofan Li and Jianlyu Chen and Yingxia Shao and Defu Lian and Zheng Liu 0011},
  year = {2025},
  url = {http://papers.nips.cc/paper_files/paper/2025/hash/da5b2141ae6563ddfb813d928377c2d9-Abstract-Datasets_and_Benchmarks_Track.html},
  researchr = {https://researchr.org/publication/LiCSLL25},
  cites = {0},
  citedby = {0},
  booktitle = {Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, NeurIPS 2025, San Diego, CA, USA, December 2-7, 2025 / Mexico City, Mexico, November 30 - December 5, 2025},
  editor = {Danielle Belgrave and Cheng Zhang 0005 and Laura N. Montoya and Hsuan-Tien Lin and Razvan Pascanu and Piotr Koniusz and Marzyeh Ghassemi and Nancy Chen and Iván Vladimir Meza Ruíz and Arturo Loaiza-Bonilla},
}