Jiayi Yuan
 
 6100 Main St
Houston, TX 77005
jy101 [at] rice.edu
About me
I’m Jiayi Yuan ([dʒa-ˈi:], 袁加熠), a Ph.D. candidate in the Department of Computer Science at Rice University, advised by Dr. Xia “Ben” Hu. I aim to build efficient machine learning algorithms and systems (MLSys) through methods like quantization, sparsity, and re-parameterization, while enhancing system robustness and security. My research applications span the language, vision, time series, graph, and healthcare domains. Recently, I’ve been working on:
-  Efficiency problems of long-context LLMs. [KIVI] [KVBench] [Stop Overthinking] [AutoL2S] 
-  LLM post-training: finetune, RL, and evaluation. [Give Me FP32] [The Science] [DHP] 
-  LLM agents, LLM routing, and LLM safety. [Honeypot] [LTSM] [Taylor Unswift] [LoRATK] 
Previously, I received my bachelor’s degree in computer science from Tsinghua University, where I also studied statistics as a minor.
I lived in Beijing for 22 years and have lived in Houston since 2022.
I am seeking full-time research scientist/engineer positions. Please feel free to contact me regarding any opportunities!
Education & Experience
-  Internship, 2025, NVIDIA 
-  Internship, 2024, Amazon 
-  Ph.D. in Computer Science, 2022 - 2026 (expected). Rice University 
-  B.Eng. in Computer Science and Technology, 2017 - 2021. Tsinghua University 
Highlights
-  “Give Me FP32” studies nondeterminism, which has become a hot topic; for example, it was recently featured in a blog post by Thinking Machines Lab. 
-  KIVI largely inspired KV cache quantization in Hugging Face and is integrated into Transformers. Full code is available here. 
-  Rice News: Large language models could be the key to better patient-trial matching - Rice CS Ph.D. student wins AMIA Best Student Paper Award. 
-  Rice News: Rice CS’ Xia Ben Hu investigates LLMs and likely applications. 
News
-  “Give Me FP32 or Give Me Death” was accepted to NeurIPS 2025 as an Oral (77 out of 21,575 submissions); numerical precision errors have become a hot topic! Code & Talk 
-  I had three papers accepted at each of NAACL, ACL, and EMNLP 2025; I wish I could have visited Albuquerque, Vienna, and Suzhou this year! 
-  Our survey on efficient LLM reasoning has been accepted by TMLR! Feel free to UPVOTE 
-  Check out our recent insights and discussions on LLM evaluation 
-  Two papers accepted by EMNLP 2024 (Main + Finding). See you in Miami! 
-  Check out our recent benchmarking work on KV cache compression, time series foundation models, and LLM evaluation! 
-  KIVI and SEED-GNN got accepted by ICML 2024. See you in Vienna! 
-  Our LLM-PTM paper was selected as a Best Student Paper at AMIA 2023! 
-  One paper accepted by NeurIPS 2023 
-  Joined Microsoft Accelerating Foundation Models Research program 
- …
Publications
Please refer to publications or Google Scholar.