Python module

weights

APIs for loading weights into a graph.

GGUFWeights

class max.graph.weights.GGUFWeights(source, tensors=None, prefix='', allocated=None)

Creates a GGUF weights reader.

Parameters:

  • source (Union [ PathLike , gguf.GGUFReader ] ) – Path to a GGUF file or a GGUFReader object.
  • tensors – List of tensors in the GGUF checkpoint.
  • prefix (str ) – Weight name or prefix.
  • allocated – Dictionary of allocated values.
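
For orientation, a minimal usage sketch; the checkpoint path and tensor name (token_embd.weight) are hypothetical placeholders, and only methods documented below are used:

```python
from pathlib import Path

from max.graph.weights import GGUFWeights

# Hypothetical path to a GGUF checkpoint.
weights = GGUFWeights(Path("model.gguf"))

# Attribute access only extends the name prefix; nothing is read yet.
embedding = weights.token_embd.weight

if embedding.exists():
    # Load the checkpoint data for this name as WeightData.
    print(embedding.data().shape)

    # Or create a graph Weight backed by the checkpoint tensor.
    weight = embedding.allocate()
```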

allocate()

allocate(dtype=None, shape=None, quantization_encoding=None, device=cpu:0)

Creates and optionally validates a new Weight.

Parameters:

Return type:

Weight

allocated_weights

property allocated_weights: dict[str, ndarray[Any, dtype[_ScalarType_co]]]

Gets the values of all weights that were allocated previously.

data()

data()

Returns data loaded from the weights at the current prefix.

Raises:

KeyError if the current prefix isn't present in the checkpoint.

Return type:

WeightData

exists()

exists()

Returns whether a weight with this exact name exists.

Return type:

bool

items()

items()

Iterate through all allocable weights that start with the prefix.

name

property name: str

The current weight name or prefix.

raw_tensor()

raw_tensor()

Returns the numpy tensor corresponding to this weights object.

Raises:

KeyError if this weights object isn't a tensor.

Return type:

ndarray[Any, dtype[Any]]

PytorchWeights

class max.graph.weights.PytorchWeights(filepath, tensor_infos=None, prefix='', allocated=None)

Parameters:

  • filepath (PathLike )
  • tensor_infos (Optional [ dict [ str , Any ] ] )
  • prefix (str )
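
A brief sketch; the checkpoint filename and weight name are placeholders:

```python
from pathlib import Path

from max.graph.weights import PytorchWeights

# Placeholder path to a PyTorch checkpoint file.
weights = PytorchWeights(Path("pytorch_model.bin"))

w = weights.model.embed_tokens.weight
if w.exists():
    # Inspect checkpoint metadata without creating a Weight.
    print(w.dtype, w.shape, w.quantization_encoding)
    weight = w.allocate()
```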

allocate()

allocate(dtype=None, shape=None, quantization_encoding=None, device=cpu:0)

Creates and optionally validates a new Weight.

Parameters:

Return type:

Weight

allocated_weights

property allocated_weights: dict[str, ndarray[Any, dtype[_ScalarType_co]]]

Gets the values of all weights that were allocated previously.

data()

data()

Return type:

WeightData

dtype

property dtype: DType

The current weight dtype, if this weight exists.

exists()

exists()

Return type:

bool

items()

items()

Iterate through all allocable weights that start with the prefix.

name

property name: str

The current weight name or prefix.

quantization_encoding

property quantization_encoding: QuantizationEncoding | None

The current weight quantization encoding, if this weight exists.

raw_tensor()

raw_tensor()

Returns the tensor corresponding to this weights object.

Raises:

KeyError if this weights object isn't a tensor.

Return type:

ndarray[Any, dtype[Any]]

shape

property shape: Shape

The current weight shape, if this weight exists.

RandomWeights

class max.graph.weights.RandomWeights(_allocated=<factory>, _prefix='')

A class that mimics a checkpoint-backed Weights implementation.

Unlike checkpoint-backed weights, this class doesn’t carry a mapping from weight names to mmap’ed numpy arrays. Instead, when .allocate is called, it generates a backing NumPy array with the requested tensor spec on the fly and stores it. This is useful for generating weights for testing and passing them to subcomponents that expect a weights implementation backed by a checkpoint.
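
A minimal testing sketch; the weight name and shape are arbitrary, and the DType import path is assumed:

```python
from max.dtype import DType  # assumed import path for DType
from max.graph.weights import RandomWeights

weights = RandomWeights()

# A random NumPy array with this dtype and shape is generated on the fly.
w = weights.layers[0].linear.weight.allocate(DType.float32, [16, 16])

# The backing arrays are recorded under their dotted names,
# e.g. "layers.0.linear.weight".
print(list(weights.allocated_weights))
```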

Parameters:

allocate()

allocate(dtype=None, shape=None, quantization_encoding=None, device=cpu:0)

Creates a Weight that can be added to a graph.

Parameters:

Return type:

Weight

allocated_weights

property allocated_weights: dict[str, ndarray]

Gets the values of all weights that were allocated previously.

data()

data()

Returns data loaded from the weights at the current prefix.

Raises:

KeyError if the current prefix isn't present in the checkpoint.

Return type:

WeightData

exists()

exists()

Returns whether a weight with this exact name exists.

Return type:

bool

items()

items()

Iterate through all allocable weights that start with the prefix.

name

property name: str

The current weight name or prefix.

raw_tensor()

raw_tensor()

Returns the numpy tensor corresponding to this weights object.

Parameters:

dtype – If specified, the returned array is cast to this dtype before it is returned.

Raises:

KeyError if this weights object isn't a tensor.

Return type:

ndarray[Any, dtype[Any]]

SafetensorWeights

class max.graph.weights.SafetensorWeights(filepaths, *, tensors=None, tensors_to_file_idx=None, prefix='', allocated=None, _st_weight_map=None)

Helper for loading weights into a graph.

A weight (max.graph.Weight) is a tensor in a graph that is backed by an external buffer or mmap. Weights are generally used to avoid recompiling the graph when new weight values are provided (for example, after finetuning). For large enough constants, weights can also be used to keep compile times fast, at the cost of a potentially less optimized graph.

Weight classes help with graph weight allocation and naming. This protocol defines the getter methods __getattr__ and __getitem__ to assist with building names. For example, weights.a.b[1].c.allocate(…) creates a weight with the name “a.b.1.c”.
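
A short sketch of the naming behavior described above, with hypothetical shard and weight names:

```python
from pathlib import Path

from max.graph.weights import SafetensorWeights

# Hypothetical sharded checkpoint.
weights = SafetensorWeights([
    Path("model-00001-of-00002.safetensors"),
    Path("model-00002-of-00002.safetensors"),
])

# Attribute and index access build the dotted name lazily.
q_proj = weights.model.layers[0].self_attn.q_proj.weight
print(q_proj.name)  # model.layers.0.self_attn.q_proj.weight

# Iterate over all allocable weights under a prefix.
for name, sub in weights.model.layers[0].items():
    print(name)
```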

Parameters:

  • filepaths (Sequence [ PathLike ] )
  • tensors (Optional [ Set [ str ] ] )
  • tensors_to_file_idx (Mapping [ str , int ] | None )
  • prefix (str )
  • _st_weight_map (dict [ str , Tensor ] )

allocate()

allocate(dtype=None, shape=None, quantization_encoding=None, device=cpu:0)

Creates a Weight that can be added to a graph.

Parameters:

Return type:

Weight

allocate_as_bytes()

allocate_as_bytes(dtype=None)

Creates a Weight that can be added to the graph. The weight has a uint8 representation instead of its original data type, and the last dimension of its shape is multiplied by the number of bytes needed to represent the original data type. For example, [512, 256] float32 weights become [512, 1024] uint8 weights. Scalar weights are interpreted as weights with shape [1].

Parameters:

dtype (DType | None )

Return type:

Weight
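
As a purely illustrative check of the shape arithmetic described for allocate_as_bytes(), using a plain NumPy view rather than this API:

```python
import numpy as np

# A float32 element is 4 bytes, so viewing the buffer as uint8 multiplies
# the last dimension by 4: [512, 256] float32 -> [512, 1024] uint8.
x = np.zeros((512, 256), dtype=np.float32)
assert x.view(np.uint8).shape == (512, 1024)
```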

allocated_weights

property allocated_weights: dict[str, ndarray[Any, dtype[_ScalarType_co]]]

Gets the values of all weights that were allocated previously.

data()

data()

Returns data loaded from the weights at the current prefix.

Raises:

KeyError if the current prefix isn't present in the checkpoint.

Return type:

WeightData

exists()

exists()

Returns whether a weight with this exact name exists.

Return type:

bool

items()

items()

Iterate through all allocable weights that start with the prefix.

name

property name: str

The current weight name or prefix.

raw_tensor()

raw_tensor()

Returns the numpy tensor corresponding to this weights object.

Raises:

KeyError if this weights object isn't a tensor.

Return type:

ndarray[Any, dtype[Any]]

WeightData

class max.graph.weights.WeightData(data, name, dtype, shape, quantization_encoding=None)

Data loaded from a checkpoint.
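
A minimal sketch of building a WeightData from a NumPy array and converting its dtype; the array contents and name are arbitrary, and the DType import path is assumed:

```python
import numpy as np

from max.dtype import DType  # assumed import path for DType
from max.graph.weights import WeightData

wd = WeightData.from_numpy(np.ones((4, 4), dtype=np.float32), "bias")

# astype() returns a new WeightData with the values converted.
wd_f16 = wd.astype(DType.float16)
print(wd_f16.name, wd_f16.dtype, wd_f16.shape)
```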

Parameters:

  • data (ndarray [ Any , dtype [ _ScalarType_co ] ] )
  • name (str )
  • dtype (DType )
  • shape (Shape )
  • quantization_encoding (QuantizationEncoding | None )

astype()

astype(dtype)

Parameters:

dtype (DType )

Return type:

WeightData

data

data: ndarray[Any, dtype[_ScalarType_co]]

dtype

dtype: DType

from_numpy()

classmethod from_numpy(arr, name)

name

name: str

quantization_encoding

quantization_encoding: QuantizationEncoding | None = None

shape

shape: Shape

view()

view(dtype)

Parameters:

dtype (DType )

Return type:

WeightData

Weights

class max.graph.weights.Weights(*args, **kwargs)

Helper for loading weights into a graph.

A weight (max.graph.Weight) is a tensor in a graph that is backed by an external buffer or mmap. Weights are generally used to avoid recompiling the graph when new weight values are provided (for example, after finetuning). For large enough constants, weights can also be used to keep compile times fast, at the cost of a potentially less optimized graph.

Weight classes help with graph weight allocation and naming. This protocol defines the getter methods __getattr__ and __getitem__ to assist with building names. For example, weights.a.b[1].c.allocate(…) creates a weight with the name “a.b.1.c”.
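
Because every loader on this page implements this protocol, model-building code can be written against Weights and stay agnostic to the checkpoint format. A hypothetical sketch:

```python
from max.graph.weights import Weights


def build_linear(weights: Weights):
    """Allocates the weights of a hypothetical linear layer.

    Works with GGUFWeights, PytorchWeights, SafetensorWeights, or
    RandomWeights, since they all implement the Weights protocol.
    """
    w = weights.weight.allocate()
    b = weights.bias.allocate() if weights.bias.exists() else None
    return w, b
```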

allocate()

allocate(dtype=None, shape=None, quantization_encoding=None, device=cpu:0)

Creates a Weight that can be added to a graph.

Parameters:

Return type:

Weight

allocated_weights

property allocated_weights: dict[str, ndarray[Any, dtype[_ScalarType_co]]]

Gets the values of all weights that were allocated previously.

data()

data()

Returns data loaded from the weights at the current prefix.

Raises:

KeyError if the current prefix isn't present in the checkpoint.

Return type:

WeightData

exists()

exists()

Returns whether a weight with this exact name exists.

Return type:

bool

items()

items()

Iterate through all allocable weights that start with the prefix.

Parameters:

self (\_Self )

Return type:

Iterator[tuple[str, _Self]]

name

property name: str

The current weight name or prefix.

raw_tensor()

raw_tensor()

Returns the numpy tensor corresponding to this weights object.

Parameters:

dtype – If specified, the returned array is cast to this dtype before it is returned.

Raises:

KeyError if this weights object isn't a tensor.

Return type:

ndarray[Any, dtype[Any]]

WeightsFormat

class max.graph.weights.WeightsFormat(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

An enum representing the format of a set of weight files: gguf, pytorch, or safetensors.

gguf

gguf = 'gguf'

pytorch

pytorch = 'pytorch'

safetensors

safetensors = 'safetensors'

load_weights()

max.graph.weights.load_weights(paths)

Loads weight paths into a Weights object.

Parameters:

paths (list [ Path ] ) – Local paths of weight files to load.

Returns:

A single Weights object containing all of the weights from the provided paths.

Raises:

  • ValueError – If an empty paths list is passed.
  • ValueError – If a path provided does not exist.

Return type:

Weights
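
A sketch with hypothetical paths; load_weights selects the appropriate reader for the given files and returns it as a single Weights object:

```python
from pathlib import Path

from max.graph.weights import load_weights

# Hypothetical sharded safetensors checkpoint.
paths = [
    Path("model-00001-of-00002.safetensors"),
    Path("model-00002-of-00002.safetensors"),
]

weights = load_weights(paths)
embedding = weights.model.embed_tokens.weight.allocate()
```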

weights_format()

max.graph.weights.weights_format(weight_paths)

Retrieves the format of the weight files at the provided paths.

Parameters:

weight_paths (list [ Path ] ) – A list of file paths containing the weights for a single model.

Returns:

A WeightsFormat enum, representing whether the weights are in gguf, safetensors or pytorch format.

Raises:

ValueError – If the weights format cannot be inferred from the paths.

Return type:

WeightsFormat
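
For example, with a hypothetical filename:

```python
from pathlib import Path

from max.graph.weights import WeightsFormat, weights_format

fmt = weights_format([Path("model.gguf")])
assert fmt is WeightsFormat.gguf
```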