ExeGPT: Constraint-Aware Resource Scheduling for LLM Inference

Hyungjun Oh, Kihong Kim, Jaemin Kim, Sungkyun Kim, Junyeol Lee, Du-Seong Chang, Jiwon Seo. ExeGPT: Constraint-Aware Resource Scheduling for LLM Inference. In Rajiv Gupta, Nael B. Abu-Ghazaleh, Madan Musuvathi, Dan Tsafrir, editors, Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, ASPLOS 2024, La Jolla, CA, USA, 27 April - 1 May 2024, pages 369-384. ACM, 2024.

Authors

Hyungjun Oh

Kihong Kim

Jaemin Kim

Sungkyun Kim

Junyeol Lee

Du-Seong Chang

Jiwon Seo
