Inference serving framework with RadixAttention for KV-cache reuse. Stable latency (4-21 ms); optimized for multi-turn chat and RAG workloads.
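To make the KV-cache-reuse idea concrete, the sketch below shows prefix-keyed caching: cached per-token KV entries are indexed by token prefixes in a trie, so a request that shares a prefix with earlier traffic (a repeated system prompt in multi-turn chat, a repeated document chunk in RAG) only computes KV for the tokens past its longest cached prefix. This is a simplified illustration of the concept, assuming a plain per-token trie with string stand-ins for KV tensors; it is not the framework's actual compressed radix tree or memory manager.

```python
from typing import Optional


class _Node:
    """One trie node per token; `kv` is a stand-in for that token's KV block."""

    def __init__(self) -> None:
        self.children: dict[int, "_Node"] = {}
        self.kv: Optional[str] = None


class PrefixKVCache:
    """Toy prefix cache: reuse KV for the longest cached token prefix."""

    def __init__(self) -> None:
        self.root = _Node()

    def longest_cached_prefix(self, tokens: list[int]) -> int:
        """Return how many leading tokens already have cached KV."""
        node, matched = self.root, 0
        for t in tokens:
            nxt = node.children.get(t)
            if nxt is None or nxt.kv is None:
                break
            node, matched = nxt, matched + 1
        return matched

    def insert(self, tokens: list[int], kv_blocks: list[str]) -> None:
        """Store one KV block per token along the token path."""
        node = self.root
        for t, kv in zip(tokens, kv_blocks):
            node = node.children.setdefault(t, _Node())
            node.kv = kv


# Usage: turn 2 of a chat repeats the turn-1 prefix, so only the new
# tokens need fresh KV computation.
cache = PrefixKVCache()
turn_1 = [101, 7, 7, 9]                      # tokens of the first turn
cache.insert(turn_1, [f"kv{t}" for t in turn_1])
turn_2 = turn_1 + [42, 43]                   # same prefix plus a new user turn
reused = cache.longest_cached_prefix(turn_2)
print(f"reuse {reused} tokens, compute {len(turn_2) - reused} new ones")
```

The same mechanism explains why latency stays stable under multi-turn and RAG traffic: the expensive prefill work for shared prefixes is paid once and amortized across requests.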
GitHub repository popularity
ThoughtWorks Technology Radar assessment
Primary framework purpose
Package manager install frequency
Community size, growth velocity, and industry usage
Integrations, plugins, and extension availability
Execution speed, latency, and resource efficiency
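Two of the dimensions above, repository popularity and package-manager install frequency, can be measured directly from public APIs. Below is a minimal sketch using the GitHub REST API and pypistats.org; it assumes the framework described at the top is SGLang (RadixAttention is SGLang's prefix-caching technique), so the repository slug `sgl-project/sglang` and the PyPI package name `sglang` are assumptions to swap for whatever project is being assessed.

```python
import requests


def github_stars(repo: str) -> int:
    """Repository popularity proxy: stargazers_count from the GitHub REST API."""
    resp = requests.get(f"https://api.github.com/repos/{repo}", timeout=10)
    resp.raise_for_status()
    return resp.json()["stargazers_count"]


def pypi_monthly_downloads(package: str) -> int:
    """Install-frequency proxy: last-month download count from pypistats.org."""
    resp = requests.get(
        f"https://pypistats.org/api/packages/{package}/recent", timeout=10
    )
    resp.raise_for_status()
    return resp.json()["data"]["last_month"]


if __name__ == "__main__":
    # Assumed identifiers; replace with the project under assessment.
    print("GitHub stars:", github_stars("sgl-project/sglang"))
    print("PyPI installs (last month):", pypi_monthly_downloads("sglang"))
```

The remaining dimensions (Technology Radar assessment, stated purpose, community growth, integrations, and measured performance) are qualitative or benchmark-driven and are not reducible to a single API call.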