Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique

Pala Tej Deep, Vernon Toh, Rishabh Bhardwaj, Soujanya Poria. Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique. In Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng, editors, Findings of the Association for Computational Linguistics: EMNLP 2025, Suzhou, China, November 4-9, 2025, pages 11845-11860. Association for Computational Linguistics, 2025.

Authors

Pala Tej Deep

Vernon Toh

Rishabh Bhardwaj

Soujanya Poria
