Gabriele Oliaro
SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification
Gabriele Oliaro, Xupeng Miao, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, Xiaoxiang Shi, Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia