Prototypical Reward Network for Data-Efficient Model Alignment

Jinghan Zhang, Xiting Wang, Yiqiao Jin, Changyu Chen, Xinhao Zhang, Kunpeng Liu. Prototypical Reward Network for Data-Efficient Model Alignment. In Lun-Wei Ku, Andre Martins, Vivek Srikumar, editors, Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2024, Bangkok, Thailand, August 11-16, 2024. Pages 13871-13884, Association for Computational Linguistics, 2024.

@inproceedings{ZhangWJCZ024,
  title = {Prototypical Reward Network for Data-Efficient Model Alignment},
  author = {Jinghan Zhang and Xiting Wang and Yiqiao Jin and Changyu Chen and Xinhao Zhang and Kunpeng Liu},
  year = {2024},
  url = {https://aclanthology.org/2024.acl-long.748},
  researchr = {https://researchr.org/publication/ZhangWJCZ024},
  cites = {0},
  citedby = {0},
  pages = {13871-13884},
  booktitle = {Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2024, Bangkok, Thailand, August 11-16, 2024},
  editor = {Lun-Wei Ku and Andre Martins and Vivek Srikumar},
  publisher = {Association for Computational Linguistics},
  isbn = {979-8-89176-094-3},
}
