Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference

Junyan Li, Li Lyna Zhang, Jiahang Xu, Yujing Wang, Shaoguang Yan, Yunqing Xia, YuQing Yang, Ting Cao, Hao Sun, Weiwei Deng, Qi Zhang, Mao Yang. Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference. In Ambuj Singh, Yizhou Sun, Leman Akoglu, Dimitrios Gunopulos, Xifeng Yan, Ravi Kumar, Fatma Ozcan, Jieping Ye, editors, Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023, Long Beach, CA, USA, August 6-10, 2023, pages 1280-1290. ACM, 2023.

Authors

Junyan Li

Li Lyna Zhang

Jiahang Xu

Yujing Wang

Shaoguang Yan

Yunqing Xia

YuQing Yang

Ting Cao

Hao Sun

Weiwei Deng

Qi Zhang

Mao Yang