Publications

2025

NSDI'25

White-Boxing RDMA with Packet-Granular Software Control.
Chenxingyu Zhao, Jaehong Min, Ming Liu and Arvind Krishnamurthy

Efficient Direct-Connect Topologies for Collective Communications.
Liangyu Zhao, Siddharth Pal, Tapan Chugh, Weiyang Wang, Jason Fantl, Prithwish Basu, Joud Khoury and Arvind Krishnamurthy

High-level Programming for Application Networks.
Xiangfeng Zhu, Yuyao Wang, Banruo Liu, Yongtong Wu, Nikola Bojanic, Jingrong Chen, Gilbert Bernstein, Arvind Krishnamurthy, Sam Kumar, Ratul Mahajan and Danyang Zhuo

Preprints

FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving.
Zihao Ye, Lequn Chen, Ruihang Lai, Wuwei Lin, Yineng Zhang, Stephanie Wang, Tianqi Chen, Baris Kasikci, Vinod Grover, Arvind Krishnamurthy and Luis Ceze
Preprint

Argos: Agentic Time-Series Anomaly Detection with Autonomous Rule Generation via Large Language Models.
Yile Gu, Yifan Xiong, Jonathan Mace, Yuting Jiang, Yigong Hu, Baris Kasikci and Peng Cheng
Preprint

EchoLM: Accelerating LLM Serving with Real-time Knowledge Distillation.
Yifan Yu, Yu Gan, Lillian Tsai, Nikhil Sarda, Jiaming Shen, Yanqi Zhou, Arvind Krishnamurthy, Fan Lai, Henry M. Levy and David E. Culler
Preprint

The Streaming Batch Model for Efficient and Fault-Tolerant Heterogeneous Execution.
Frank Sifei Luan, Ziming Mao, Ron Yifeng Wang, Charlotte Lin, Amog Kamsetty, Hao Chen, Cheng Su, Balaji Veeramani, Scott Lee, SangBin Cho, Clark Zinzow, Eric Liang, Ion Stoica and Stephanie Wang
Preprint

2024

ACM Trans. Storage'24

eZNS: Elastic Zoned Namespace for Enhanced Performance Isolation and Device Utilization.
Jaehong Min, Chenxingyu Zhao, Ming Liu and Arvind Krishnamurthy
PDF

ASPLOS'24

Limoncello: Prefetchers for Scale.
Akanksha Jain, Hannah Lin, Carlos Villavieja, Baris Kasikci, Chris Kennelly, Milad Hashemi and Parthasarathy Ranganathan
PDF

RPG²: Robust Profile-Guided Runtime Prefetch Generation.
Yuxuan Zhang, Nathan Sobotka, Soyoon Park, Saba Jamilan, Tanvir Ahmed Khan, Baris Kasikci, Gilles A. Pokam, Heiner Litz and Joseph Devietti
PDF

CC-NIC: a Cache-Coherent Interface to the NIC.
Henry N. Schuh, Arvind Krishnamurthy, David E. Culler, Henry M. Levy, Luigi Rizzo, Samira Manabi Khan and Brent E. Stephens
PDF

DSN'24

ZipChannel: Cache Side-Channel Vulnerabilities in Compression Algorithms.
Marina Minkin and Baris Kasikci
PDF

EuroSys'24

Enoki: High Velocity Linux Kernel Scheduler Development.
Samantha Miller, Anirudh Kumar, Tanay Vakharia, Ang Chen, Danyang Zhuo and Thomas E. Anderson
PDF

FPGA'24

SuperNIC: An FPGA-Based, Cloud-Oriented SmartNIC.
Will Lin, Yizhou Shan, Ryan Kosta, Arvind Krishnamurthy and Yiying Zhang
PDF

HPDC'24

Efficient all-to-all Collective Communication Schedules for Direct-connect Topologies.
Prithwish Basu, Liangyu Zhao, Jason Fantl, Siddharth Pal, Arvind Krishnamurthy and Joud Khoury
PDF

HotCarbon'24

EMPower: The Case for a Cloud Power Control Plane.
Jonggyu Park, Theano Stavrinos, Simon Peter and Thomas E. Anderson

HotStorage'24

Can Storage Devices be Power Adaptive?
Dedong Xie, Theano Stavrinos, Kan Zhu, Simon Peter, Baris Kasikci and Thomas E. Anderson
PDF

ICML'24

QUEST: Query-Aware Sparsity for Efficient Long-Context LLM Inference.
Jiaming Tang, Yilong Zhao, Kan Zhu, Guangxuan Xiao, Baris Kasikci and Song Han
PDF

IEEE Access'24

LowPaxos: State Machine Replication for Low Resource Settings.
Alex Mwotil, Thomas E. Anderson, Benjamin Kanagwa, Theano Stavrinos and Engineer Bainomugisha
PDF

ISCA'24

(MC)²: Lazy MemCopy at the Memory Controller.
Aditya K. Kamath and Simon Peter
PDF

UDP: Utility-Driven Fetch Directed Instruction Prefetching.
Surim Oh, Mingsheng Xu, Tanvir Ahmed Khan, Baris Kasikci and Heiner Litz
PDF

MICRO'24

Beehive: A Flexible Network Stack for Direct-Attached Accelerators.
Katie Lim, Matthew Giordano, Theano Stavrinos, Irene Zhang, Jacob Nelson, Baris Kasikci and Thomas E. Anderson
PDF

MLSys'24

Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving.
Yilong Zhao, Chien-Yu Lin, Kan Zhu, Zihao Ye, Lequn Chen, Size Zheng, Luis Ceze, Arvind Krishnamurthy, Tianqi Chen and Baris Kasikci
PDF

Punica: Multi-Tenant LoRA Serving.
Lequn Chen, Zihao Ye, Yongji Wu, Danyang Zhuo, Luis Ceze and Arvind Krishnamurthy
PDF

NSDI'24

Sequence Abstractions for Flexible, Line-Rate Network Monitoring.
Andrew Johnson, Ryan Beckett, Xiaoqi Chen, Ratul Mahajan and David Walker
PDF

NeurIPS'24

DataComp-LM: In search of the next generation of training sets for language models.
Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Yitzhak Gadre, Hritik Bansal, Etash Guha, Sedrick Scott Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee F. Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, Maciej Kilian, Hanlin Zhang, Rulin Shao, Sarah M. Pratt, Sunny Sanyal, Gabriel Ilharco, Giannis Daras, Kalyani Marathe, Aaron Gokaslan, Jieyu Zhang, Khyathi Raghavi Chandu, Thao Nguyen, Igor Vasiljevic, Sham M. Kakade, Shuran Song, Sujay Sanghavi, Fartash Faghri, Sewoong Oh, Luke Zettlemoyer, Kyle Lo, Alaaeldin El-Nouby, Hadi Pouransari, Alexander Toshev, Stephanie Wang, Dirk Groeneveld, Luca Soldaini, Pang Wei Koh, Jenia Jitsev, Thomas Kollar, Alex Dimakis, Yair Carmon, Achal Dave, Ludwig Schmidt and Vaishaal Shankar
PDF

SIGCOMM'24

m3: Accurate Flow-Level Performance Estimation using Machine Learning.
Chenning Li, Arash Nasr-Esfahany, Kevin Zhao, Kimia Noorbakhsh, Prateesh Goyal, Mohammad Alizadeh and Thomas E. Anderson
PDF

Relational Network Verification.
Xieyang Xu, Yifei Yuan, Zachary Kincaid, Arvind Krishnamurthy, Ratul Mahajan, David Walker and Ennan Zhai
PDF

Understanding the Host Network.
Midhul Vuppalapati, Saksham Agarwal, Henry Schuh, Baris Kasikci, Arvind Krishnamurthy and Rachit Agarwal
PDF

Principles for Internet Congestion Management.
Lloyd Brown, Albert Gran Alcoz, Frank Cangialosi, Akshay Narayan, Mohammad Alizadeh, Hari Balakrishnan, Eric J. Friedman, Ethan Katz-Bassett, Arvind Krishnamurthy, Michael Schapira and Scott Shenker
PDF

An Architecture For Edge Networking Services.
Lloyd Brown, Emily Marx, Dev Bali, Emmanuel Amaro, Debnil Sur, Ezra Kissel, Inder Monga, Ethan Katz-Bassett, Arvind Krishnamurthy, James Murphy McCauley, Tejas Narechania, Aurojit Panda and Scott Shenker
PDF

Preprints

POD-Attention: Unlocking Full Prefill-Decode Overlap for Faster LLM Inference.
Aditya K. Kamath, Ramya Prabhu, Jayashree Mohan, Simon Peter, Ramachandran Ramjee and Ashish Panwar
Preprint

Accelerator-as-a-Service in Public Clouds: An Intra-Host Traffic Management View for Performance Isolation in the Wild.
Jiechen Zhao, Ran Shu, Katie Lim, Zewen Fan, Thomas E. Anderson, Mingyu Gao and Natalie Enright Jerger
Preprint

Arcus: SLO Management for Accelerators in the Cloud with Traffic Shaping.
Jiechen Zhao, Ran Shu, Katie Lim, Zewen Fan, Thomas E. Anderson, Mingyu Gao and Natalie Enright Jerger
Preprint

Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models.
Keisuke Kamahori, Yile Gu, Kan Zhu and Baris Kasikci
Preprint

Palermo: Improving the Performance of Oblivious Memory using Protocol-Hardware Co-Design.
Haojie Ye, Yuchen Xia, Yuhan Chen, Kuan-Yu Chen, Yichao Yuan, Shuwen Deng, Baris Kasikci, Trevor N. Mudge and Nishil Talati
Preprint

BlendServe: Optimizing Offline Inference for Auto-regressive Large Models with Resource-aware Batching.
Yilong Zhao, Shuo Yang, Kan Zhu, Lianmin Zheng, Baris Kasikci, Yang Zhou, Jiarong Xing and Ion Stoica
Preprint

ForestColl: Efficient Collective Communications on Heterogeneous Network Fabrics.
Liangyu Zhao, Saeed Maleki, Ziyue Yang, Hossein Pourreza, Aashaka Shah, Changho Hwang and Arvind Krishnamurthy
Preprint

Laconic: Streamlined Load Balancers for SmartNICs.
Tianyi Cui, Chenxingyu Zhao, Wei Zhang, Kaiyuan Zhang and Arvind Krishnamurthy
Preprint

2023

ACM Trans. Embed. Comput. Syst.'23

CrossTalk: Making Low-Latency Fault Tolerance Cheap by Exploiting Redundant Networks.
Andrew D. Loveless, Linh Thi Xuan Phan, Lisa Erickson, Ronald G. Dreslinski and Baris Kasikci
PDF

Comput. Commun. Rev.'23

Can We Save the Public Internet?
Marjory Blumenthal, Ramesh Govindan, Ethan Katz-Bassett, Arvind Krishnamurthy, James Murphy McCauley, Nick Merrill, Tejas Narechania, Aurojit Panda and Scott Shenker
PDF

HotCarbon'23

The Case of Unsustainable CPU Affinity.
Jiechen Zhao, Katie Lim, Thomas E. Anderson and Natalie Enright Jerger
PDF

An Agile Pathway Towards Carbon-aware Clouds.
Pratyush Patel, Theo Gregersen and Thomas E. Anderson
PDF

HotNets'23

Application Defined Networks.
Xiangfeng Zhu, Weixin Deng, Banruo Liu, Jingrong Chen, Yongji Wu, Thomas E. Anderson, Arvind Krishnamurthy, Ratul Mahajan and Danyang Zhuo
PDF

How I Learned to Stop Worrying About CCA Contention.
Lloyd Brown, Yash Kothari, Akshay Narayan, Arvind Krishnamurthy, Aurojit Panda, Justine Sherry and Scott Shenker
PDF

IEEE Comput. Archit. Lett.'23

Towards Improved Power Management in Cloud GPUs.
Pratyush Patel, Zibo Gong, Syeda Rizvi, Esha Choukse, Pulkit A. Misra, Thomas E. Anderson and Akshitha Sriraman
PDF

MobiSys'23

Minimizing a Smartphone's TCB for Security-Critical Programs with Exclusively-Used, Physically-Isolated, Statically-Partitioned Hardware.
Zhihao Yao, Seyed Mohammadjavad Seyed Talebi, Mingyi Chen, Ardalan Amiri Sani and Thomas E. Anderson
PDF

NSDI'23

Scalable Tail Latency Estimation for Data Center Networks.
Kevin Zhao, Prateesh Goyal, Mohammad Alizadeh and Thomas E. Anderson
PDF

Test Coverage for Network Configurations.
Xieyang Xu, Weixin Deng, Ryan Beckett, Ratul Mahajan and David Walker
PDF

OSDI'23

ScaleDB: A Scalable, Asynchronous In-Memory Database.
Syed Akbar Mehdi, Deukyeon Hwang, Simon Peter and Lorenzo Alvisi
PDF

eZNS: An Elastic Zoned Namespace for Commodity ZNS SSDs.
Jaehong Min, Chenxingyu Zhao, Ming Liu and Arvind Krishnamurthy
PDF

SIGCOMM'23

Lessons from the evolution of the Batfish configuration analysis tool.
Matt Brown, Ari Fogel, Daniel Halperin, Victor Heorhiadi, Ratul Mahajan and Todd D. Millstein
PDF

Host Congestion Control.
Saksham Agarwal, Arvind Krishnamurthy and Rachit Agarwal
PDF

Unleashing SmartNIC Packet Processing Performance in P4.
Jiarong Xing, Yiming Qiu, Kuo-Feng Hsu, Songyuan Sui, Khalid Manaa, Omer Shabtai, Yonatan Piasetzky, Matty Kadosh, Arvind Krishnamurthy, T. S. Eugene Ng and Ang Chen
PDF

SOSP'23

Siloz: Leveraging DRAM Isolation Domains to Prevent Inter-VM Rowhammer.
Kevin Loughlin, Jonah Rosenblum, Stefan Saroiu, Alec Wolman, Dimitrios Skarlatos and Baris Kasikci
PDF

A Cloud-Scale Characterization of Remote Procedure Calls.
Korakit Seemakhupt, Brent E. Stephens, Samira Manabi Khan, Sihang Liu, Hassan M. G. Wassel, Soheil Hassas Yeganeh, Alex C. Snoeren, Arvind Krishnamurthy, David E. Culler and Henry M. Levy
PDF

SoCC'23

Dissecting Overheads of Service Mesh Sidecars.
Xiangfeng Zhu, Guozhen She, Bowen Xue, Yu Zhang, Yongsu Zhang, Xuan Kelvin Zou, Xiongchun Duan, Peng He, Arvind Krishnamurthy, Matthew Lentz, Danyang Zhuo and Ratul Mahajan
PDF

Anticipatory Resource Allocation for ML Training.
Tapan Chugh, Srikanth Kandula, Arvind Krishnamurthy, Ratul Mahajan and Ishai Menache
PDF

Preprints

Virtuoso: High Resource Utilization and μs-scale Performance Isolation in a Shared Virtual Machine TCP Network Stack.
Matheus Stolet, Liam Arzola, Simon Peter and Antoine Kaufmann
Preprint

MaxMem: Colocation and Performance for Big Data Applications on Tiered Main Memory Servers.
Amanda Raybuck, Wei Zhang, Kayvan Mansoorshahi, Aditya K. Kamath, Mattan Erez and Simon Peter
Preprint

Hybrid Computing for Interactive Datacenter Applications.
Pratyush Patel, Katie Lim, Kushal Jhunjhunwalla, Ashlie Martinez, Max Demoulin, Jacob Nelson, Irene Zhang and Thomas E. Anderson
Preprint

Agile Development of Linux Schedulers with Ekiben.
Samantha Miller, Anirudh Kumar, Tanay Vakharia, Thomas E. Anderson, Ang Chen and Danyang Zhuo
Preprint

TSoR: TCP Socket over RDMA Container Network for Cloud Native Computing.
Yulin Sun, Qingming Qu, Chenxingyu Zhao, Arvind Krishnamurthy, Hong Chang and Ying Xiong
Preprint

Bandwidth Optimal Pipeline Schedule for Collective Communication.
Liangyu Zhao and Arvind Krishnamurthy
Preprint

Symphony: Optimized Model Serving using Centralized Orchestration.
Lequn Chen, Weixin Deng, Anirudh Canumalla, Yu Xin, Matthai Philipose and Arvind Krishnamurthy
Preprint

Quark: A High-Performance Secure Container Runtime for Serverless Computing.
Chenxingyu Zhao, Yulin Sun, Ying Xiong and Arvind Krishnamurthy
Preprint

Bringing Reconfigurability to the Network Stack.
Akshay Narayan, Aurojit Panda, Mohammad Alizadeh, Hari Balakrishnan, Arvind Krishnamurthy and Scott Shenker
Preprint