FAQ

If this page doesn't answer your question, please ask us on our Modular forum or Discord channel.

Distribution

What are the system requirements?

Mac

macOS Ventura (13) or later
Apple silicon (M1/M2/M3/M4 processor)
Python 3.9 - 3.13
Xcode or Xcode Command Line Tools
We currently don't support Mac GPUs
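
One way to check a machine against the Mac requirements above is a short standard-library script. This is a minimal sketch, not Modular tooling: the helper names `python_supported` and `macos_supported` are hypothetical, the check assumes Apple silicon is reported as `arm64` (true for a native Python build, not one running under Rosetta), and it does not verify that Xcode is installed.

```python
import platform
import sys

# Version bounds taken from the requirements list above.
MIN_PYTHON = (3, 9)
MAX_PYTHON = (3, 13)
MIN_MACOS_MAJOR = 13  # macOS Ventura


def python_supported(version=None):
    """Return True if the (major, minor) version is within 3.9 - 3.13."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def macos_supported(mac_version, machine):
    """Return True for macOS 13+ on Apple silicon (assumed to report 'arm64')."""
    if not mac_version:
        return False  # platform.mac_ver() returns '' when not on macOS.
    major = int(mac_version.split(".")[0])
    return major >= MIN_MACOS_MAJOR and machine == "arm64"


if __name__ == "__main__":
    print("Python OK:", python_supported())
    print("macOS OK:", macos_supported(platform.mac_ver()[0], platform.machine()))
```

Keeping the checks as pure functions that take the version values as arguments makes them easy to test on any machine.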

What are the GPU requirements?

The Modular Platform supports both CPUs and GPUs, so you don't need a GPU to serve a model or to program with Mojo. If you do want to accelerate your model with GPUs or program for GPUs with Mojo, Modular supports many GPU types.

Because we don't test every variant of a GPU architecture, and support for new architectures will improve incrementally, we've divided our list of compatible GPUs into three tiers:

Tier 1: Fully supported

We provide full support and testing for the following data center GPUs:

NVIDIA H100 and H200 (Hopper)
NVIDIA A100 and A10 (Ampere)
NVIDIA L4 and L40 (Ada Lovelace)
AMD Instinct MI300X and MI325X (CDNA3)

Tier 2: Confirmed compatibility

We've confirmed full compatibility with the following GPUs but we currently don't maintain tests for them:

NVIDIA RTX 40XX series (Ada Lovelace)
NVIDIA RTX 30XX series (Ampere)

Tier 3: Limited compatibility

We've either confirmed or received reports that the following GPUs work for GPU programming with Mojo and can execute basic graphs with MAX APIs. However, these GPUs currently can't run some GenAI models for various reasons:

NVIDIA RTX 20XX series (Turing)
NVIDIA T4 (Turing)
NVIDIA Jetson Orin and Orin Nano (Ampere)
AMD Radeon 700M series (RDNA3)
AMD Radeon RX 7000 series (RDNA3)
AMD Radeon RX 9000 series (RDNA4)

If you've had success with any GPUs not listed here, please let us know on Discord.
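
The three tiers above can be captured as a small lookup, for example to warn users before they try to serve a model. The sketch below is illustrative only: the tier lists are transcribed from this page, but the regular expressions are our own guess at how device names map onto those lists, and `gpu_tier` is a hypothetical helper, not a Modular API. Names that only resemble a listed series may be misclassified, so treat the result as a hint.

```python
import re

# (tier, pattern) pairs transcribed from the lists above.
# The patterns are a heuristic assumption about device-name formats.
TIER_PATTERNS = [
    (1, r"\b(H100|H200|A100|A10|L4|L40|MI300X|MI325X)\b"),
    (2, r"\bRTX\s*(40|30)\d\d\b"),
    (3, r"\bRTX\s*20\d\d\b"),
    (3, r"\bT4\b"),
    (3, r"\bJetson\b.*\bOrin\b"),
    (3, r"\bRadeon\b.*\b7\d\dM\b"),
    (3, r"\bRadeon\b.*\bRX\s*[79]\d\d\d\b"),
]


def gpu_tier(name):
    """Return 1, 2, or 3 for a GPU listed above, or None if it is not listed."""
    for tier, pattern in TIER_PATTERNS:
        if re.search(pattern, name, flags=re.IGNORECASE):
            return tier
    return None


print(gpu_tier("NVIDIA H100 80GB HBM3"))    # 1
print(gpu_tier("NVIDIA GeForce RTX 4090"))  # 2
print(gpu_tier("Tesla T4"))                 # 3
print(gpu_tier("NVIDIA GeForce GTX 1080"))  # None
```

A result of `None` only means the GPU is not on this page's lists, not that it is known to be incompatible.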

Software requirements

Make sure you have the corresponding GPU software:

If you're on an NVIDIA GPU:
- NVIDIA GPU driver version 550 or higher
  - Check your NVIDIA GPU driver version using nvidia-smi
  - To update, see the NVIDIA driver docs
If you're on an AMD GPU:
- AMD GPU driver version 6.3.3 or higher
  - For data center GPUs (MI300X/MI325X), see the Ubuntu native install guide
  - For Radeon GPUs on Ubuntu, see the Linux install guide for Radeon software
  - For Radeon GPUs on WSL, see the WSL install guide for Radeon software
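
The NVIDIA driver check can also be scripted. The sketch below queries `nvidia-smi` for the driver version and compares its major version against the minimum of 550 stated above. The function names are hypothetical, and the script covers only NVIDIA GPUs; it makes no attempt to check the AMD driver version.

```python
import subprocess

MIN_NVIDIA_DRIVER_MAJOR = 550  # Minimum from the requirements above.


def driver_major(version_string):
    """Extract the major version from a string such as '550.54.15'."""
    return int(version_string.strip().split(".")[0])


def nvidia_driver_ok(version_string, minimum=MIN_NVIDIA_DRIVER_MAJOR):
    """Return True if the driver's major version meets the minimum."""
    return driver_major(version_string) >= minimum


def installed_nvidia_driver():
    """Ask nvidia-smi for the driver version; return None if unavailable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    # nvidia-smi prints one line per GPU; the driver is shared, so use the first.
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = installed_nvidia_driver()
    if version is None:
        print("nvidia-smi not found or failed; no NVIDIA driver detected")
    else:
        print(version, "OK" if nvidia_driver_ok(version) else "too old")
```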

Notes

Many GPUs are available in variants with different amounts of memory, and each AI model has different memory requirements. So even if your GPU architecture is listed as compatible, you must confirm that the available memory is sufficient for the model you're using.
Modular can serve many models on either CPU or GPU, but some models do require one or more GPUs. When you browse our model repository, you can filter by models that support either CPU or GPU.
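
As a rough first check of the memory note above, you can estimate the space taken by the model weights alone as the parameter count times the bytes per parameter. The sketch below does that arithmetic. It is a lower bound under stated assumptions: the `headroom` fraction reserved for the KV cache, activations, and runtime overhead is an assumed placeholder, and the model and GPU sizes in the example are hypothetical.

```python
GIB = 1024 ** 3


def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed for the model weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


def weights_fit(num_params, bytes_per_param, gpu_memory_gib, headroom=0.2):
    """Return True if the weights fit while leaving a fraction of memory free.

    `headroom` is an assumed safety margin for the KV cache, activations,
    and runtime overhead; the right value depends on the model and workload.
    """
    usable = gpu_memory_gib * (1.0 - headroom)
    return weight_memory_gib(num_params, bytes_per_param) <= usable


# Hypothetical example: an 8-billion-parameter model stored in 16-bit precision.
print(round(weight_memory_gib(8e9, 2), 1))     # 14.9
print(weights_fit(8e9, 2, gpu_memory_gib=24))  # True
print(weights_fit(8e9, 2, gpu_memory_gib=16))  # False
```

Passing this check does not guarantee that a model will run; it only rules out configurations where the weights clearly cannot fit.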

Why bundle Mojo with MAX?

Integrating Mojo and MAX into a single package is the best way to ensure interoperability between Mojo and MAX for all users, and avoid version conflicts that happen when installing them separately.

Moreover, we built Mojo as a core technology for MAX, and you can use it to extend MAX Engine, so MAX clearly depends on Mojo. On the other hand, writing Mojo code that runs on both CPUs and GPUs (and other accelerators) requires runtime components and orchestration logic that fall outside the domain of Mojo and into the domain of MAX. That is, MAX isn't just a framework for AI development; it's also a framework for general heterogeneous compute. As such, writing Mojo programs that can execute across heterogeneous hardware depends on MAX.

Nothing has changed for Mojo developers—you can still build and develop in Mojo like you always have. The only difference is that you're now able to seamlessly step into general-purpose GPU programming.

Will MAX be open-sourced?

We want to contribute a lot to open source, but we also want to do it right. Our team has decades of experience building open-source projects, and we believe it's very important to create an inclusive and vibrant community, which takes a lot of work.

We've already begun open-sourcing parts of the MAX framework, including our Python serving library, MAX model architectures, and GPU kernels.

To get the latest updates, sign up for our newsletter.

Functionality

What hardware does MAX support?

MAX supports a broad range of CPUs, including Intel, AMD, and ARM variants, as well as GPUs from NVIDIA and AMD. For more specifics, see the above system requirements.

What clouds and services can I deploy MAX onto?

You can deploy our MAX container across a variety of VM and Kubernetes-based cloud services, including AWS, GCP, and Azure. To get started with any of them, check out our tutorials using MAX Serve.

Can I run MAX locally?

Yes. MAX supports macOS and ARM hardware, so you can run it on your local laptop for exploration and testing.

Will MAX support distributed inference of large models?

Yes, MAX will support executing large models that don't fit into the memory of a single device. This isn't available yet, so stay tuned!

Installation

Can I install both stable and nightly builds?

Yes, it's safe and easy to use the stable and nightly builds for different projects, each with their own virtual environment and package dependencies. For more information, read the Install guide.

Does the MAX SDK collect telemetry?

Yes, the MAX SDK collects basic system information, session durations, compiler events, and crash reports that enable us to identify, analyze, and prioritize issues. The MAX container for model serving also collects performance metrics such as time to first token and input processing time.

This telemetry is crucial to help us quickly identify problems and improve our products for you. Without it, we would rely solely on user-submitted bug reports, which are limited and would severely restrict our insight into performance.

You can opt out of some telemetry, such as compiler events and crash reports. However, package install/update/uninstall events, basic system information, and session durations (the amount of time spent running MAX Engine) cannot be disabled (see the Terms of use).

To disable telemetry for compiler events and crash reports, run this command in your project environment (you must run this for each project):

magic telemetry --disable

To disable serving telemetry, see the MAX container documentation.
