Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters
Published in Empirical Methods in Natural Language Processing (EMNLP), 2024
Recommended citation: Euiin Yi*, T. Kim*, H. Jeung, D.-S. Chang, and S.-Y. Yun. (2024). "Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters." Empirical Methods in Natural Language Processing (EMNLP).
