Jiayi Yuan
I write efficient code at xAI.
About me
I’m Jiayi Yuan ([dʒa-ˈi:], 袁加熠). I received my Ph.D. from the Department of Computer Science at Rice University, where I was advised by Dr. Xia "Ben" Hu. I aim to build efficient machine learning algorithms and systems (MLSys) through methods like quantization, sparsity, and re-parameterization while enhancing system robustness and security. My research applications span language, vision, time series, graph, and healthcare domains. Previously, I worked on:
- Efficiency problems of long-context LLMs. [BLASST] [KIVI] [KVBench] [Stop Overthinking] [AutoL2S]
- LLM post-training: fine-tuning, RL, and evaluation. [Give Me FP32] [The Science] [DHP]
- LLM agents, LLM routing, and LLM safety. [Honeypot] [Rethink Router] [RouterArena] [Taylor Unswift] [LoRATK]
Earlier, I received my bachelor’s degree in computer science from Tsinghua University, where I also studied statistics as a minor.
I lived in Beijing for 22 years and in Houston for 4 years.
Education & Experience
- Internship, 2025, NVIDIA
- Internship, 2024, Amazon
- Ph.D. in Computer Science, 2022 - 2025. Rice University
- B.Eng. in Computer Science and Technology, 2017 - 2021. Tsinghua University
Highlights
- BLASST has been integrated into TensorRT-LLM and NVIDIA Model Optimizer.
- "Give Me FP32" studies nondeterminism, which has become a hot topic; e.g., it was recently featured in a blog post by Thinking Machines Lab.
- KIVI largely inspired KV cache quantization in Hugging Face and is integrated into Transformers. Full code is available here.
- Rice News: Large language models could be the key to better patient-trial matching - Rice CS Ph.D. student wins AMIA Best Student Paper Award.
- Rice News: Rice CS' Xia Ben Hu investigates LLMs and likely applications.
News
- BLASST was accepted by MLSys 2026 and RouterArena by ICLR 2026. A good ending!
- "Give Me FP32 or Give Me Death" got accepted to NeurIPS 2025 as an Oral (77 out of 21575 submissions) — numerical precision errors have become a hot topic! Code & Talk
- I got three papers at NAACL, ACL, and EMNLP 2025 each. I wish I had gotten to visit Albuquerque, Vienna, and Suzhou this year
- One survey on efficient LLM reasoning has been accepted by TMLR! Feel free to UPVOTE
- Check out our recent insights and discussions on LLM evaluation
- Two papers accepted by EMNLP 2024 (Main + Findings). See you in Miami!
- Check out our recent benchmarking works on KV Cache compression, time series foundation models and LLM evaluation!
- KIVI and SEED-GNN got accepted by ICML 2024. See you in Vienna!
- Our LLM-PTM paper was selected as a best student paper at AMIA 2023
- One paper accepted by NeurIPS 2023
- Joined Microsoft Accelerating Foundation Models Research program
- ...