Skip to main content
Log in

Serving

Our high-performance serving library provides an OpenAI-compatible REST endpoint, enabling a smooth transition from OpenAI services or other libraries like vLLM and SGLang. MAX handles the complete request lifecycle with built-in support for function calling, structured output, and more, plus a Python API for offline inference.

Guides

Tutorials