Gabriele Oliaro
FlexLLM: Token-Level Co-Serving of LLM Inference and Fine-Tuning with SLO Guarantee
Gabriele Oliaro, Xupeng Miao, Xinhao Cheng, Vineeth Kada, Ruohan Gao, Yingyi Huang, Remi Delacourt, April Yang, Yingcheng Wang, Mengdi Wu, Colin Unger, Zhihao Jia
AdaServe: SLO-Customized LLM Serving with Fine-Grained Speculative Decoding
Zikun Li, Zhuofu Chen, Remi Delacourt, Gabriele Oliaro, Zeyu Wang, Qinghan Chen, Shuhuai Lin, April Yang, Zhihao Zhang, Zhuoming Chen, Sean Lai, Xupeng Miao, Zhihao Jia
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
Zhengxin Zhang, Dan Zhao, Xupeng Miao, Gabriele Oliaro, Qing Li, Yong Jiang, Zhihao Jia
SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, Xiaoxiang Shi, Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia
Optimal Kernel Orchestration for Tensor Programs with Korch
Muyan Hu, Ashwin Venkatram, Shreyashri Biswas, Balamurugan Marimuthu, Bohan Hou, Gabriele Oliaro, Haojie Wang, Liyan Zheng, Xupeng Miao, Jidong Zhai, Zhihao Jia
Direct Telemetry Access
Jonatan Langlet, Ran Ben Basat, Gabriele Oliaro, Michael Mitzenmacher, Minlan Yu, Gianni Antichi
Zero-CPU Collection with Direct Telemetry Access
Jonatan Langlet, Ran Ben Basat, Sivaram Ramanathan, Gabriele Oliaro, Michael Mitzenmacher, Minlan Yu, Gianni Antichi