Awesome_LLM_System-PaperList

Since the emergence of ChatGPT in 2022, accelerating large language models (LLMs) has become increasingly important. This is a list of papers on LLM inference and serving.

Survey

Framework

Serving

Operating System

Transformer Acceleration

Model Compression

Quantization

Pruning/Sparsity

Communication

Energy

Decentralized

Serverless

Last updated