Python module

conv

The conv module provides classes for performing convolution operations in various dimensions (1D, 2D, and 3D) on tensor inputs. These convolution operations are core building blocks for neural networks, especially in computer vision and sequence processing tasks.

Here’s an example demonstrating how to use a 1D convolution:

import max.nn as nn
from max.graph import Graph, ops, Weight, DeviceRef
from max.dtype import DType
import numpy as np

with Graph(name="conv_example") as graph:
    # Define dimensions
    batch_size = 2
    seq_length = 10
    in_channels = 16
    out_channels = 32
    kernel_size = 3

    # Create input tensor [batch_size, sequence_length, channels]
    x_data = np.zeros((batch_size, seq_length, in_channels), dtype=np.float32)
    x = ops.constant(x_data, dtype=DType.float32, device=DeviceRef.CPU())

    # Create weights for convolution
    filter_1d = Weight(
        name="filter_weight",
        dtype=DType.float32,
        shape=[kernel_size, in_channels, out_channels],
        device=DeviceRef.CPU(),
    )
    bias_1d = Weight(
        name="bias_weight",
        dtype=DType.float32,
        shape=[out_channels],
        device=DeviceRef.CPU(),
    )

    # Create and apply Conv1D layer
    conv1d = nn.Conv1D(
        filter=filter_1d,
        bias=bias_1d,
        stride=1,
        padding=1,
    )

    output_1d = conv1d(x)
    print(f"Conv1D output shape: {output_1d.shape}")
    # Output: Conv1D output shape: [Dim(2), Dim(10), Dim(32)]
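
With stride=1 and padding=1, the sequence length is unchanged, which is why the printed shape is [Dim(2), Dim(10), Dim(32)]. As a quick sanity check, the standard 1D convolution output-length formula can be evaluated directly; the helper below is plain illustrative arithmetic, not part of the max API:

def conv1d_output_length(length, kernel_size, stride=1, padding=0, dilation=1):
    # Standard cross-correlation output length.
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

print(conv1d_output_length(10, kernel_size=3, stride=1, padding=1))  # 10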

Conv1D

class max.nn.conv.Conv1D(kernel_size, in_channels, out_channels, dtype, stride=1, padding=0, dilation=1, num_groups=1, device=None, has_bias=False, permute=False, name=None)

A 1D convolution over an input signal composed of several input planes.

Example:

conv = nn.Conv1D(
    kernel_size=3,
    in_channels=64,
    out_channels=128,
    dtype=DType.float32,
    stride=1,
    padding=0,
    has_bias=False,
    name="conv1d_weight",
    device=DeviceRef.GPU(),
)

Initializes the Conv1D layer with weights and optional bias.

Parameters:

  • kernel_size (int) – Size of the convolving kernel (width dimension).
  • in_channels (int) – Number of channels in the input signal.
  • out_channels (int) – Number of channels produced by the convolution.
  • dtype (DType) – The data type for both weights and bias.
  • stride (int) – Stride of the convolution. Controls the step size when sliding the kernel. Default: 1
  • padding (int) – Padding added to both sides of the input sequence. Default: 0
  • dilation (int) – Spacing between kernel elements. Controls the kernel dilation rate. Default: 1
  • num_groups (int) – Number of blocked connections from input channels to output channels. Input channels and output channels are divided into groups. Default: 1
  • device (DeviceRef | None) – The target device for computation. If None, defaults to CPU. Weights are initially stored on CPU and moved to target device during computation.
  • name (Union[str, None]) – Base name for weights. If provided, weights are named {name}.weight and {name}.bias (if bias is enabled). If None, uses “weight” and “bias”.
  • has_bias (bool) – If true, adds a learnable bias vector to the layer. Defaults to False.
  • permute (bool) – If true, permutes weights from PyTorch format to MAX format. PyTorch order: (out_channels, in_channels / num_groups, kernel_size). MAX API order: (kernel_size, in_channels / num_groups, out_channels). Defaults to False.
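
When permute=True, the layer accepts checkpoint weights in PyTorch layout and rearranges them internally. Conceptually this is a single axis transpose, sketched below with NumPy purely for illustration (the array is a stand-in, and the layer performs the permutation for you):

import numpy as np

torch_filter = np.zeros((128, 64, 3), dtype=np.float32)  # PyTorch order: (out_channels, in_channels / num_groups, kernel_size)
max_filter = np.transpose(torch_filter, (2, 1, 0))  # MAX order: (kernel_size, in_channels / num_groups, out_channels)
print(max_filter.shape)  # (3, 64, 128)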

bias

bias: Weight | None = None

The optional bias vector stored on CPU with shape (out_channels,). Model init moves the bias to device if present.

device

device: DeviceRef | None

The device where matrix operations are performed.

dilation

dilation: int

Controls the dilation rate.

filter

filter: Weight

The weight matrix stored on CPU with shape (kernel_size, in_channels / num_groups, out_channels). Model init moves the weight to device.

num_groups

num_groups: int

Number of blocked connections from input channels to output channels.

padding

padding: int

Controls the amount of padding applied before and after the input.

permute

permute: bool = False

Controls whether self.filter is permuted from PyTorch order to MAX order. PyTorch order: (out_channels, in_channels / num_groups, kernel_size). MAX API order: (kernel_size, in_channels / num_groups, out_channels).

stride

stride: int

Controls the stride for the cross-correlation.

Conv1DV1

class max.nn.conv.Conv1DV1(filter, bias=None, stride=1, padding=0, dilation=1, groups=1)

A 1D convolution over an input signal composed of several input planes.

DEPRECATED: Use Conv1D instead.

bias

bias: Value[TensorType] | TensorValue | Shape | Dim | int | float | integer | floating | ndarray | None = None

dilation

dilation: int = 1

filter

filter: Value[TensorType] | TensorValue | Shape | Dim | int | float | integer | floating | ndarray

groups

groups: int = 1

padding

padding: int = 0

stride

stride: int = 1

Conv2D

class max.nn.conv.Conv2D(kernel_size, in_channels, out_channels, dtype, stride=1, padding=0, dilation=1, num_groups=1, device=None, has_bias=False, permute=False, name=None)

A 2D convolution over an input signal composed of several input planes.

Example:

conv = nn.Conv2D(
    kernel_size=3,
    in_channels=64,
    out_channels=128,
    dtype=DType.float32,
    stride=1,
    padding=0,
    has_bias=False,
    name="conv2d_weight",
    device=DeviceRef.GPU(),
)

Initializes the Conv2D layer with weights and optional bias.

Parameters:

  • kernel_size (Union[int, tuple[int, int]]) – Size of the convolving kernel. Can be a single int (square kernel) or tuple (height, width).
  • in_channels (int) – Number of channels in the input image.
  • out_channels (int) – Number of channels produced by the convolution.
  • dtype (DType) – The data type for both weights and bias.
  • stride (tuple[int, int]) – Stride of the convolution for height and width dimensions. Can be int (applied to both dimensions) or tuple (stride_h, stride_w). Default: 1
  • padding (tuple[int, int, int, int]) – Padding added to input. Can be int (applied to all sides), tuple of 2 ints (pad_h, pad_w), or tuple of 4 ints (pad_top, pad_bottom, pad_left, pad_right). Default: 0
  • dilation (tuple[int, int]) – Spacing between kernel elements for height and width dimensions. Can be int (applied to both dimensions) or tuple (dilation_h, dilation_w). Default: 1
  • num_groups (int) – Number of blocked connections from input channels to output channels. Input channels and output channels are divided into groups. Default: 1
  • device (DeviceRef | None) – The target device for computation. If None, defaults to CPU. Weights are initially stored on CPU and moved to target device during computation.
  • name (Union[str, None]) – Base name for weights. If provided, weights are named {name}.weight and {name}.bias (if bias is enabled). If None, uses “weight” and “bias”.
  • has_bias (bool) – If true, adds a learnable bias vector to the layer. Defaults to False.
  • permute (bool) – If true, permutes weights from PyTorch format to MAX format. PyTorch order: (out_channels, in_channels / num_groups, height, width). MAX API order: (height, width, in_channels / num_groups, out_channels). Defaults to False.
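
The stride, padding, and dilation values above determine the spatial output size per dimension in the usual way. A minimal sketch of that arithmetic (plain Python, not part of the API), assuming the 4-tuple padding order (pad_top, pad_bottom, pad_left, pad_right) documented above:

def conv2d_output_hw(height, width, kernel_size, stride=(1, 1), padding=(0, 0, 0, 0), dilation=(1, 1)):
    kh, kw = kernel_size
    out_h = (height + padding[0] + padding[1] - dilation[0] * (kh - 1) - 1) // stride[0] + 1
    out_w = (width + padding[2] + padding[3] - dilation[1] * (kw - 1) - 1) // stride[1] + 1
    return out_h, out_w

print(conv2d_output_hw(224, 224, kernel_size=(3, 3), padding=(1, 1, 1, 1)))  # (224, 224)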

bias

bias: Weight | None = None

The optional bias vector stored on CPU with shape (out_channels,). Model init moves the bias to device if present.

device

device: DeviceRef | None

The device where matrix operations are performed.

dilation

dilation: tuple[int, int]

Controls the dilation rate.

filter

filter: Weight

The weight matrix stored on CPU with shape (height, width, in_channels / num_groups, out_channels). Model init moves the weight to device.

num_groups

num_groups: int

Number of blocked connections from input channels to output channels.

padding

padding: tuple[int, int, int, int]

Controls the amount of padding applied before and after the input for height and width dimensions.

permute

permute: bool = False

Controls whether self.filter is permuted from PyTorch order to MAX order. PyTorch order: (out_channels, in_channels / num_groups, height, width). MAX API order: (height, width, in_channels / num_groups, out_channels).

shard()

shard(shard_idx, device)

Creates a sharded view of this Conv2D layer for a specific device.

Parameters:

  • shard_idx (int) – The index of the shard (0 to num_devices-1).
  • device (DeviceRef) – The device where this shard should reside.

Returns:

A sharded Conv2D instance.

Return type:

Conv2D
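
A minimal usage sketch, assuming two GPUs and that the layer's sharding strategy has already been configured (DeviceRef.GPU with an explicit id is used here as an assumption):

# Hypothetical two-device setup; adjust device ids to your system.
devices = [DeviceRef.GPU(id=0), DeviceRef.GPU(id=1)]
conv_shards = [
    conv.shard(shard_idx=i, device=device)
    for i, device in enumerate(devices)
]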

sharding_strategy

property sharding_strategy: ShardingStrategy | None

Get the Conv2D sharding strategy.

stride

stride: tuple[int, int]

Controls the stride for the cross-correlation.

Conv2DV1

class max.nn.conv.Conv2DV1(filter, bias=None, stride=(1, 1), padding=(0, 0, 0, 0), dilation=(1, 1), groups=1)

A 2D convolution over an input signal composed of several input planes.

DEPRECATED: Use Conv2D instead.

bias

bias: Value[TensorType] | TensorValue | Shape | Dim | int | float | integer | floating | ndarray | None = None

dilation

dilation: int | tuple[int, int] = (1, 1)

filter

filter: Value[TensorType] | TensorValue | Shape | Dim | int | float | integer | floating | ndarray

groups

groups: int = 1

padding

padding: int | tuple[int, int, int, int] = (0, 0, 0, 0)

stride

stride: int | tuple[int, int] = (1, 1)

Conv3D

class max.nn.conv.Conv3D(depth, height, width, in_channels, out_channels, dtype, stride=1, padding=0, dilation=1, num_groups=1, device=None, has_bias=False, permute=False, name=None)

A 3D convolution over an input signal composed of several input planes.

Example:

# Example kernel and channel sizes shown; substitute your own.
conv = nn.Conv3D(
    depth=3,
    height=3,
    width=3,
    in_channels=64,
    out_channels=128,
    dtype=DType.float32,
    stride=1,
    padding=0,
    has_bias=False,
    name="conv3d_weight",
    device=DeviceRef.GPU(),
)

Initializes the Conv3D layer with weights and optional bias.

Parameters:

  • depth (int) – Depth dimension of the convolution kernel (kernel_size[0]).
  • height (int) – Height dimension of the convolution kernel (kernel_size[1]).
  • width (int) – Width dimension of the convolution kernel (kernel_size[2]).
  • in_channels (int) – Number of channels in the input image.
  • out_channels (int) – Number of channels produced by the convolution.
  • dtype (DType) – The data type for both weights and bias.
  • stride (tuple[int, int, int]) – Stride of the convolution for depth, height, and width dimensions. Can be int (applied to all dimensions) or tuple of 3 ints. Default: 1
  • padding (tuple[int, int, int, int, int, int]) – Padding added to all six sides of the input in order: (pad_front, pad_back, pad_top, pad_bottom, pad_left, pad_right). Can be int (applied to all sides) or tuple of 6 ints. Default: 0
  • dilation (tuple[int, int, int]) – Spacing between kernel elements for depth, height, and width dimensions. Can be int (applied to all dimensions) or tuple of 3 ints. Default: 1
  • num_groups (int) – Number of blocked connections from input channels to output channels. Input channels and output channels are divided into groups. Default: 1.
  • device (DeviceRef | None) – The target device for computation. If None, defaults to CPU. Weights are initially stored on CPU and moved to target device during computation.
  • name (Union[str, None]) – Base name for weights. If provided, weights are named {name}.weight and {name}.bias (if bias is enabled). If None, uses “weight” and “bias”.
  • has_bias (bool) – If true, adds a learnable bias vector to the layer. Defaults to False.
  • permute (bool) – If true, permutes weights from PyTorch format to MAX format. PyTorch order: (out_channels, in_channels / num_groups, depth, height, width). MAX API order: (depth, height, width, in_channels / num_groups, out_channels). Defaults to False.
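
As with the 1D and 2D layers, permute=True converts checkpoint weights from PyTorch layout to MAX layout. The equivalent axis reordering, sketched with NumPy purely for illustration (the layer performs this internally):

import numpy as np

torch_filter = np.zeros((128, 64, 3, 3, 3), dtype=np.float32)  # PyTorch order: (out_channels, in_channels / num_groups, depth, height, width)
max_filter = np.transpose(torch_filter, (2, 3, 4, 1, 0))  # MAX order: (depth, height, width, in_channels / num_groups, out_channels)
print(max_filter.shape)  # (3, 3, 3, 64, 128)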

bias

bias: Weight | None = None

The optional bias vector stored on CPU with shape (out_channels,). Model init moves the bias to device if present.

device

device: DeviceRef | None

The device where matrix operations are performed.

dilation

dilation: tuple[int, int, int]

Controls the dilation rate for depth, height, and width dimensions.

filter

filter: Weight

The weight matrix stored on CPU with shape (depth, height, width, in_channels / num_groups, out_channels). Model init moves the weight to device.

num_groups

num_groups: int

Number of blocked connections from input channels to output channels.

padding

padding: tuple[int, int, int, int, int, int]

Controls the amount of padding applied before and after the input for depth, height, and width dimensions.

permute

permute: bool = False

Controls whether self.filter is permuted from PyTorch order to MAX order. PyTorch order: (out_channels, in_channels / num_groups, depth, height, width). MAX API order: (depth, height, width, in_channels / num_groups, out_channels).

stride

stride: tuple[int, int, int]

Controls the stride for the cross-correlation.

Conv3DV1

class max.nn.conv.Conv3DV1(filter, bias=None, stride=(1, 1, 1), padding=(0, 0, 0, 0, 0, 0), dilation=(1, 1, 1), groups=1)

A 3D convolution over an input signal composed of several input planes.

DEPRECATED: Use Conv3D instead.

bias

bias: Value[TensorType] | TensorValue | Shape | Dim | int | float | integer | floating | ndarray | None = None

dilation

dilation: int | tuple[int, int, int] = (1, 1, 1)

filter

filter: Value[TensorType] | TensorValue | Shape | Dim | int | float | integer | floating | ndarray

groups

groups: int = 1

padding

padding: int | tuple[int, int, int, int, int, int] = (0, 0, 0, 0, 0, 0)

stride

stride: int | tuple[int, int, int] = (1, 1, 1)