LLM Inference

An opinionated and incomplete survey of LLM inference and serving runtimes, viewed through a systems and infrastructure lens.