Skip to main content
Log in

MAX FAQ

If this page doesn't answer your question, please ask us on our Modular forum or Discord channel.

Distribution

What are the system requirements?

  • macOS Ventura (13) or later
  • Apple silicon (M1/M2/M3/M4 processor)
  • Python 3.9 - 3.12
  • Xcode or Xcode Command Line Tools

For a list of recommended GPU cloud instances, see the MAX container documentation.

Why bundle Mojo with MAX?

Integrating Mojo and MAX into a single package is the best way to ensure interoperability between Mojo and MAX for all users, and avoid version conflicts that happen when installing them separately.

Moreover, we built Mojo as a core technology for MAX, and you can use it to extend MAX Engine, so MAX clearly depends on Mojo. On the other hand, writing Mojo code that runs on both CPUs and GPUs (and other accelerators) requires runtime components and orchestration logic that falls outside the domain of Mojo, and into the domain of MAX. That is, MAX isn't just a framework for AI development, it's also a framework for general heterogeneous compute. As such, writing Mojo programs that can execute across heterogeneous hardware depends on MAX.

Nothing has changed for Mojo developers—you can still build and develop in Mojo like you always have. The only difference is that you're now able to seamlessly step into general-purpose GPU programming (coming soon).

Will MAX be open-sourced?

We want to contribute a lot to open source, but we also want to do it right. Our team has decades of experience building open-source projects, and we believe it's very important to create an inclusive and vibrant community, which takes a lot of work. We still need to figure out the details, but as we do so, we will share more. Please stay tuned.

Functionality

What hardware does MAX support?

MAX supports a broad range of CPUs, including Intel, AMD, and ARM variants, as well as GPUs from NVIDIA and AMD (coming soon). For more specifics, see the above system requirements.

What clouds and services can I deploy MAX onto?

You can deploy our MAX container across a variety of VM and Kubernetes-based cloud services, including AWS, GCP, and Azure. To get started with any of them, check out our tutorials using MAX Serve.

Can I run MAX locally?

Yes. MAX has support for MacOS and ARM hardware, meaning it can be run on your local laptop for exploration and testing purposes.

Can I use MAX Engine without MAX Serve?

Yes, you can deploy MAX Engine in a custom server or used standalone to accelerate your model inference with our client libraries. Learn more about MAX Engine.

What types of models does MAX support and from which training frameworks?

MAX supports PyTorch, ONNX, and native MAX models, although not all formats currently work on GPU.

On CPU, MAX support TorchScript, ONNX, and native MAX models. You can also convert a wide range of other other model formats to ONNX format, including models from TensorFlow, Keras, Scikit-Learn, and more. On GPU, MAX supports PyTorch Eager and native MAX models. For more information, read about our supported model formats.

Does MAX support quantization and sparsity?

It supports some quantized models today (models with Int data types, and we are working to add support for more) and we will be adding support for sparsity soon. If you're building a model with MAX Graph, you can quantize your model with several different quantization encodings.

Will MAX support distributed inference of large models?

Yes, it will support executing large models that do not fit into the memory of a single device. This isn't available yet, so stay tuned!

Can MAX Engine's runtime be separated from model compilation (for edge deployment)?

Yes, our runtime (and our entire stack) is designed to be modular. It scales down very well, supports heterogeneous configs, and scales up to distributed settings as well. That being said, this isn't available yet, but we'll share details about more deployment scenarios that we support over time.

Installation

Can I install both stable and nightly builds?

Yes, when you use the Magic CLI, it's safe and easy to use the stable and nightly builds for different projects, each with their own virtual environment and package dependencies. For more information, read about MAX packages.

How do I uninstall everything?

See the uninstall instructions.

Does the MAX SDK collect telemetry?

Yes, the MAX SDK collects basic system information, session durations, compiler events, and crash reports that enable us to identify, analyze, and prioritize issues. The MAX container for model serving also collects performance metrics such as time to first token and input processing time.

This telemetry is crucial to help us quickly identify problems and improve our products for you. Without this telemetry, we would rely solely on user-submitted bug reports, which are limited and would severely limit our performance insights.

You can opt-out of some telemetry, such as compiler events and crash reports. However, package install/update/uninstall events, basic system information, and session durations (the amount of time spent running MAX Engine) cannot be disabled (see the MAX SDK terms).

To disable telemetry for compiler events and crash reports, run this command in your project environment (you must run this for each project):

magic telemetry --disable
magic telemetry --disable

To disable serving telemetry, see the MAX container documentation.