ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP

Lu Yan, Zhuo Zhang, Guanhong Tao, Kaiyuan Zhang, Xuan Chen, Guangyu Shen, Xiangyu Zhang. ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP. In Alice Oh, Tristan Naumann, Amir Globerson, Kate Saenko, Moritz Hardt, Sergey Levine, editors, Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10-16, 2023.

@inproceedings{Yan000CSZ23,
  title = {ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP},
  author = {Lu Yan and Zhuo Zhang and Guanhong Tao and Kaiyuan Zhang and Xuan Chen and Guangyu Shen and Xiangyu Zhang},
  year = {2023},
  url = {http://papers.nips.cc/paper_files/paper/2023/hash/d2b752ed4726286a4b488ae16e091d64-Abstract-Conference.html},
  researchr = {https://researchr.org/publication/Yan000CSZ23},
  cites = {0},
  citedby = {0},
  booktitle = {Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023},
  editor = {Alice Oh and Tristan Naumann and Amir Globerson and Kate Saenko and Moritz Hardt and Sergey Levine},
}
