Known issues
We're aware of the following major issues and are working to resolve them.
If you encounter other issues, please report them on GitHub.
Error: "undefined reference" to GLIBCXX_3.4.30 when using conda

If you install `max` with conda, you might get an error when you run inference with MAX, such as this:
```
/usr/bin/ld: /home/ubuntu/miniconda3/envs/max-bare-conda/lib/libmodular-framework-common.so: undefined reference to `std::condition_variable::wait(std::unique_lock<std::mutex>&)@GLIBCXX_3.4.30'
collect2: error: ld returned 1 exit status
gmake[2]: *** [CMakeFiles/bert.dir/build.make:105: bert] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/bert.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
```
If you're on Ubuntu, you can solve it by installing these additional packages:

```shell
conda install -c conda-forge libgcc-ng libstdcxx-ng -y
```
If you're on macOS, we currently don't have a fix and instead recommend installing `max` with Magic.
For more details and updates, see GitHub issue #218.
Error: "cannot allocate memory in static TLS block"

When executing a model with the MAX Engine Python API, you might encounter an error that says `cannot allocate memory in static TLS block`.
This happens due to a bug that stems from the order in which Python modules are loaded, and it affects specific targets, including aarch64 on systems with glibc 2.31 or lower, such as Ubuntu 20.04.
When encountering this issue, try the following workarounds:

- Re-order the Python import statements so that `import max.engine` appears first (before `torch` and `transformers`).

- Or, `LD_PRELOAD` the shared library that fails to allocate memory in a static TLS block. For example, if you see this error message:

  ```
  /usr/local/lib/python3.8/dist-packages/max/lib/libgomp-9c79e370.so.1: cannot allocate memory in static TLS block
  ```

  Then re-run the command prefixed with:

  ```
  LD_PRELOAD=/usr/local/lib/python3.8/dist-packages/max/lib/libgomp-9c79e370.so.1
  ```
Both workarounds ensure that the MAX Engine library has access to static TLS block memory before it is all used up by the other modules, which may not require static TLS but still use the surplus static TLS as an optimization.
Glibc 2.32 and newer reserve 128 bytes of surplus static TLS for modules that require it (more detail), so this should not be a problem on systems with glibc >= 2.32, such as Ubuntu 22.04.
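As a sketch of the first workaround, the snippet below imports modules in the order that puts `max.engine` ahead of `torch` and `transformers`. The helper function is purely illustrative (it tolerates packages that aren't installed so the sketch runs anywhere); in real code you would simply write the three `import` statements in this order at the top of your script.

```python
# Workaround 1 sketch: import max.engine before torch/transformers so the
# MAX Engine library claims its static TLS memory first. In real code this
# is just:
#
#   import max.engine   # must come first on affected (glibc <= 2.31) systems
#   import torch
#   import transformers
#
# The helper below only demonstrates the ordering and is not part of MAX.
import importlib

IMPORT_ORDER = ["max.engine", "torch", "transformers"]

def import_in_order(names):
    """Import each module in sequence, recording which ones were found."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = "imported"
        except ImportError:
            status[name] = "not installed"
    return status

print(import_in_order(IMPORT_ORDER))
```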
MAX Serving inputs/outputs must be tensors
Using Triton with MAX Engine does not support model inputs/outputs as a dictionary or any other collection type; you must use tensors.
When converting a PyTorch model to TorchScript format, you must disable dictionary formats in the model configuration. For example:
```python
model = AutoModelForSequenceClassification.from_pretrained(HF_MODEL_NAME)
model.config.return_dict = False
```
For more detail, see this example for PyTorch BERT.
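To illustrate why this setting matters, here is a toy stand-in (not a real Hugging Face model) showing how a `return_dict` switch changes the output from a dict, which MAX Serving cannot consume, to the plain tuple form that it expects:

```python
# Toy stand-in for a Hugging Face model's forward pass (illustrative only).
# With return_dict=True the model returns a dict, which MAX Serving cannot
# consume; with return_dict=False it returns a plain tuple of outputs.
class ToyModel:
    def __init__(self, return_dict=True):
        self.return_dict = return_dict

    def forward(self, x):
        logits = [v * 2 for v in x]    # stand-in for a tensor computation
        if self.return_dict:
            return {"logits": logits}  # dict output: unsupported
        return (logits,)               # tuple output: supported

print(ToyModel(return_dict=False).forward([1, 2]))  # → ([2, 4],)
```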
MAX Engine can't load multiple model formats
MAX Engine does not allow you to load models with different formats in the same inference session or server instance. For example, you can't load one model from PyTorch and then another one from ONNX. Doing so results in a `failed to load` error.
Currently, if you want to load a different model format, you must restart the process with MAX Engine or restart MAX Serving (the Triton server).
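As a hedged sketch (the file-extension mapping and the check itself are illustrative, not part of the MAX API), a small pre-flight guard can catch mixed formats before they ever reach a session and trigger the `failed to load` error:

```python
# Pre-flight guard (illustrative): reject model lists that mix formats,
# since one MAX Engine session only supports a single model format.
# The extension-to-format mapping below is an assumption for this sketch.
import os

FORMATS = {".pt": "torchscript", ".torchscript": "torchscript", ".onnx": "onnx"}

def model_format(path):
    """Guess a model's format from its file extension (illustrative)."""
    return FORMATS.get(os.path.splitext(path)[1].lower(), "unknown")

def single_format(paths):
    """Return the common format, or raise if the list mixes formats."""
    formats = {model_format(p) for p in paths}
    if len(formats) > 1:
        raise ValueError(f"use one session per format; got {sorted(formats)}")
    return formats.pop()

print(single_format(["bert.pt", "roberta.pt"]))  # → torchscript
```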
MAX Engine Mojo API symbol lookup error
Importing the Python `torch` packages, or other libraries that transitively import them (such as `transformers`), does not interoperate with the `max.engine` Mojo library. Importing these together might result in a symbol resolution error message that starts with `mojo: symbol lookup error`.
This is a temporary issue that will be fixed, and it applies only to the Mojo API for MAX Engine—you can safely use these Python packages with the Python API for MAX Engine.
Mojo JIT session error
You might encounter certain code configurations that result in a `JIT session error`, which happens when the Mojo JIT compiler fails to find a specific symbol. We've seen this happen recently when using the MAX Engine Mojo API. In some cases, you can work around it, with a little luck, by simply rearranging the code and moving some of it to a separate function.
We're making significant changes to the way that Mojo generates code, and this is one of the known JIT issues that we're working on.
New extensibility API for custom ops
In v24.3, we released the first version of our extensibility API for custom ops, but we discovered some issues and decided to completely redesign it.
Because MAX is still a preview, we don't want to leave APIs in the platform that we have no intention to support. Stay tuned for an improved extensibility API that works on both CPUs and GPUs.
MAX Graph does not support empty graphs
Currently, graphs that directly return their inputs may return incorrect values on those returns.