FAQ

If this page doesn't answer your question, please ask us on our Modular forum or Discord channel.

Distribution

What operating systems do you support?

You can install the Modular Platform on macOS and Linux.

For more details, see the system requirements.

What are the GPU requirements?

The Modular Platform supports both CPUs and GPUs, so you don't always need a GPU to serve a model—although some larger models do require a GPU.

For details about GPU support, see our list of compatible GPUs.

Will MAX be open-sourced?

We want to contribute a lot to open source, but we also want to do it right. Our team has decades of experience building open-source projects, and we believe it's very important to create an inclusive and vibrant community, which takes a lot of work.

We've already begun open-sourcing parts of the MAX framework, including our Python serving library, MAX model architectures, and GPU kernels.

To get the latest updates, sign up for our newsletter.

Functionality

What clouds and services can I deploy MAX onto?

You can deploy our MAX container across a variety of VM and Kubernetes-based cloud services, including AWS, GCP, and Azure. To get started with any of them, check out our tutorials using MAX Serve.
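
The MAX container serves an OpenAI-compatible API, so once a deployment is up you can talk to it with any OpenAI client. Here's a minimal sketch using the OpenAI Python client; the endpoint URL, placeholder API key, and model name are assumptions to adapt to your deployment:

```python
# Sketch: query a deployed MAX endpoint through its OpenAI-compatible API.
# Assumptions: the server is reachable at http://localhost:8000/v1 and the
# model name below is an example; substitute whatever you deployed.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # your MAX endpoint
    api_key="EMPTY",  # placeholder; adjust if your deployment enforces auth
)

response = client.chat.completions.create(
    model="modularai/Llama-3.1-8B-Instruct-GGUF",  # example model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```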

Can I run MAX locally?

Yes. MAX supports macOS and ARM hardware, so you can run it on your local laptop for exploration and testing.
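
As a quick smoke test of a local endpoint, you can list the models it's serving via the OpenAI-compatible API. A sketch, assuming the server is running on localhost port 8000:

```python
# Sketch: verify a locally running MAX endpoint by listing its models.
# Assumes the server is on localhost:8000 (an example, not a guarantee).
import requests

resp = requests.get("http://localhost:8000/v1/models", timeout=5)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model["id"])
```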

Will MAX support distributed inference of large models?

Yes, it will support executing large models that do not fit into the memory of a single device. This isn't available yet, so stay tuned!

Installation

Can I install both stable and nightly builds?

Yes, it's safe and easy to use the stable and nightly builds for different projects, each with its own virtual environment and package dependencies. For more information, read the packages guide.
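
One way to keep the two builds apart is a separate virtual environment per project. Here's a sketch using only the Python standard library; the directory names are arbitrary examples, and the exact install command for each channel is in the packages guide:

```python
# Sketch: one isolated virtual environment per build channel, so a stable
# and a nightly install never share package dependencies.
# Directory names are arbitrary examples.
import venv

venv.create("my-stable-project/.venv", with_pip=True)
venv.create("my-nightly-project/.venv", with_pip=True)
# Activate one environment at a time and install the corresponding build
# into it, following the packages guide for each channel's install command.
```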

Does the MAX SDK collect telemetry?

Yes, the MAX SDK collects basic system information, session durations, compiler events, and crash reports that enable us to identify, analyze, and prioritize issues. The MAX container for model serving also collects performance metrics such as time to first token and input processing time.

This telemetry is crucial for quickly identifying problems and improving our products for you. Without it, we would have to rely solely on user-submitted bug reports, which are far less comprehensive and give us little visibility into performance.

To disable serving telemetry, see the MAX container documentation.