Python module

naive_transformer

NaiveTransformer

class max.pipelines.nn.transformer.naive_transformer.NaiveTransformer(dim: int, n_heads: int, layers: list[max.pipelines.nn.transformer.naive_transformer.NaiveTransformerBlock], norm: RMSNorm, output: Linear, theta: float, embedding: Embedding)

Max-Graph only model consisting of NaiveTransformerBlock layers.

dim

dim: int

embedding

embedding: Embedding

layers

layers: list[max.pipelines.nn.transformer.naive_transformer.NaiveTransformerBlock]

n_heads

n_heads: int

norm

norm: RMSNorm

output

output: Linear

theta

theta: float
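
The fields above suggest a conventional decoder layout: embed the tokens, run them through each layer in order, apply a final norm, then project with the output layer. The following is a minimal pure-Python sketch of that data flow, not the MAX implementation. `SketchTransformer`, `head_dim`, and `rope_frequencies` are hypothetical names introduced for illustration. It also assumes that `dim` is split evenly across `n_heads` and that `theta` is the rotary position embedding (RoPE) base, which this page does not state.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Hidden = List[List[float]]  # one vector of length `dim` per token


@dataclass
class SketchTransformer:
    """Hypothetical stand-in; field names mirror NaiveTransformer."""

    dim: int
    n_heads: int
    layers: Sequence[Callable[[Hidden], Hidden]]
    norm: Callable[[Hidden], Hidden]
    output: Callable[[Hidden], Hidden]
    theta: float
    embedding: Callable[[List[int]], Hidden]

    @property
    def head_dim(self) -> int:
        # Assumption: the model width is divided evenly among the heads.
        return self.dim // self.n_heads

    def rope_frequencies(self) -> List[float]:
        # Assumption: theta is the RoPE base, giving theta^(-2i / head_dim)
        # for each pair of channels i in a head.
        return [
            self.theta ** (-2.0 * i / self.head_dim)
            for i in range(self.head_dim // 2)
        ]

    def __call__(self, tokens: List[int]) -> Hidden:
        h = self.embedding(tokens)        # token ids -> hidden vectors
        for layer in self.layers:         # apply each block in order
            h = layer(h)
        return self.output(self.norm(h))  # final norm, then projection


# Placeholder components so the data flow can be traced by hand.
add_one = lambda h: [[v + 1.0 for v in row] for row in h]
identity = lambda h: h

model = SketchTransformer(
    dim=8,
    n_heads=2,
    layers=[add_one, add_one],
    norm=identity,
    output=identity,
    theta=10000.0,
    embedding=lambda toks: [[float(t)] * 8 for t in toks],
)
logits = model([1, 2])
```

With these placeholders, each of the two layers adds 1.0 to every element, so token id 1 ends up as a vector of 3.0 values and token id 2 as a vector of 4.0 values.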

NaiveTransformerBlock

class max.pipelines.nn.transformer.naive_transformer.NaiveTransformerBlock(attention: NaiveAttentionWithRope, mlp: MLP, attention_norm: RMSNorm, mlp_norm: RMSNorm)

Max-Graph only stack of attention, feed-forward (MLP), and RMSNorm layers.

attention

attention: NaiveAttentionWithRope

attention_norm

attention_norm: RMSNorm

mlp

mlp: MLP

mlp_norm

mlp_norm: RMSNorm
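
The four fields of the block are commonly wired as two pre-norm residual sub-layers: normalize, apply attention, add the result back; then normalize, apply the MLP, add the result back. The sketch below illustrates that arrangement in pure Python. It is an assumption about the ordering, not the MAX implementation, and `SketchBlock` and `rms_norm` are hypothetical names. The attention and MLP callables are placeholders rather than real layers.

```python
import math
from dataclasses import dataclass
from typing import Callable, List

Vector = List[float]


def rms_norm(x: Vector, weight: Vector, eps: float = 1e-6) -> Vector:
    """Divide x by its root mean square, then scale by a per-channel weight."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


@dataclass
class SketchBlock:
    """Hypothetical stand-in; field names mirror NaiveTransformerBlock."""

    attention: Callable[[Vector], Vector]
    mlp: Callable[[Vector], Vector]
    attention_norm: Callable[[Vector], Vector]
    mlp_norm: Callable[[Vector], Vector]

    def __call__(self, x: Vector) -> Vector:
        # Assumed pre-norm wiring: norm -> attention -> residual add.
        attn_out = self.attention(self.attention_norm(x))
        h = [a + b for a, b in zip(x, attn_out)]
        # Then norm -> MLP -> residual add.
        mlp_out = self.mlp(self.mlp_norm(h))
        return [a + b for a, b in zip(h, mlp_out)]


ones = [1.0, 1.0, 1.0, 1.0]
block = SketchBlock(
    attention=lambda v: [0.0 for _ in v],  # placeholder: contributes nothing
    mlp=lambda v: [2.0 * e for e in v],    # placeholder: doubles its input
    attention_norm=lambda v: rms_norm(v, ones),
    mlp_norm=lambda v: rms_norm(v, ones),
)
out = block([3.0, 3.0, 3.0, 3.0])
```

Here the attention placeholder returns zeros, so the first residual add leaves the input at 3.0 per channel. The normalized input to the MLP is approximately 1.0 per channel, the MLP placeholder doubles it, and the second residual add yields values close to 5.0.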