Flash-LLM: Enabling Low-Cost and Highly-Efficient Large Generative Model Inference With Unstructured Sparsity


Haojun Xia, Zhen Zheng, Yuchao Li, Donglin Zhuang, Zhongzhu Zhou, Xiafei Qiu, Yong Li, Wei Lin 0016, Shuaiwen Leon Song. Flash-LLM: Enabling Low-Cost and Highly-Efficient Large Generative Model Inference With Unstructured Sparsity. PVLDB, 17(2):211-224, 2023.
