Lifetimes, origins, and references
The Mojo compiler includes a lifetime checker, a compiler pass that analyzes dataflow through your program. It identifies when variables are valid and inserts destructor calls when a variable's lifetime ends.
The Mojo compiler uses a special value called an origin to track the lifetime of variables and the validity of references.
Specifically, an origin answers two questions:
- What variable "owns" this value?
- Can the value be mutated using this reference?
For example, consider the following code:
```mojo
fn print_str(s: String):
    print(s)

name = String("Joan")
print_str(name)
```
The line name = String("Joan") declares a variable with an identifier (name) and logical storage space for a String value. When you pass name into the print_str() function, the function gets an immutable reference to the value. So both name and s refer to the same logical storage space, and have associated origin values that let the Mojo compiler reason about them.
Most of the time, origins are handled automatically by the compiler. However, in some cases you'll need to interact with origins directly:
- When working with references, specifically ref arguments and ref return values.
- When working with types like Pointer or Span, which are parameterized on the origin of the data they refer to.
This section also covers ref arguments and ref return values, which let functions take arguments and provide return values as references with parametric origins.
Working with origins
Mojo's origin values are unlike most other values in the language, because they're primitive values, not Mojo structs.
Likewise, because these values are mostly created by the compiler, you can't just create your own origin value—you usually need to derive an origin from an existing value.
Origin types
Mojo supplies a struct and a set of aliases that you can use to specify origin types. As the names suggest, the ImmutableOrigin and MutableOrigin aliases represent immutable and mutable origins, respectively:
```mojo
struct ImmutableRef[origin: ImmutableOrigin]:
    pass
```
Or you can use the Origin struct to specify an origin with parametric mutability:
```mojo
struct ParametricRef[
    is_mutable: Bool,
    //,
    origin: Origin[is_mutable].type
]:
    pass
```
Note that Origin isn't an origin value; it's a helper for specifying an origin type. Origin types carry the mutability of a reference as a boolean parameter value, indicating whether the origin is mutable, immutable, or has mutability that depends on a parameter specified by the enclosing API.
The is_mutable parameter here is an infer-only parameter. It's never specified directly by the user, but always inferred from context. The origin value is often inferred as well. For example, the following code creates a Pointer to an existing value, but doesn't need to specify an origin: the origin is inferred from the variable passed in to the address_of() method.
```mojo
from memory import Pointer

def use_pointer():
    a = 10
    ptr = Pointer.address_of(a)
```
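As a rough sketch of how the inferred origin is used, the snippet below reads and updates a through the pointer. It assumes that the dereference operator ([]) works as in the Pointer examples later on this page, and that a local variable in a def function yields a mutable origin. The function name use_pointer_value is hypothetical.

```mojo
from memory import Pointer

def use_pointer_value():
    a = 10
    # The origin (and its mutability) is inferred from `a`.
    ptr = Pointer.address_of(a)
    # Dereference with [] to update the original value in place.
    ptr[] += 5
    print(a)
```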
A final type of origin value is an OriginSet. As the name suggests, an OriginSet represents a group of origins.
Origin values
Most origin values are created by the compiler. As a developer, there are a few ways to specify origin values:
- Static origin. The StaticConstantOrigin alias is an origin value representing immutable values that last for the duration of the program. String literal values have a StaticConstantOrigin.
- Derived origin. The __origin_of() magic function returns the origin associated with the value (or values) passed in.
- Inferred origin. You can use inferred parameters to capture the origin of a value passed in to a function.
- Wildcard origins. The ImmutableAnyOrigin and MutableAnyOrigin aliases are special cases indicating a reference that might access any live value.
Static origins
You can use the static origin StaticConstantOrigin when you have a value that exists for the entire duration of the program.
For example, the StringLiteral method as_string_slice() returns a StringSlice pointing to the original string literal. String literals are static (they're allocated at compile time and never destroyed), so the slice is created with an immutable, static origin.
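A minimal sketch of this, assuming that a StringSlice can be passed to print() directly. The function name show_static_slice is hypothetical.

```mojo
def show_static_slice():
    # as_string_slice() is the StringLiteral method described above.
    # The resulting slice points at the literal's static storage
    # rather than owning a copy of the characters.
    slice = "Hello".as_string_slice()
    print(slice)
```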
Derived origins
Use the __origin_of(value) operator to obtain a value's origin. The argument to __origin_of() can take an arbitrary expression:
```mojo
__origin_of(self)
__origin_of(x.y)
__origin_of(foo())
```
The __origin_of() operator is analyzed statically at compile time; the expression passed to __origin_of() is never evaluated. (For example, when the compiler analyzes __origin_of(foo()), it doesn't run the foo() function.)
The following struct stores a string value using an OwnedPointer: a smart pointer that holds an owned value. The as_ptr() method returns a Pointer to the stored string, using the same origin as the original OwnedPointer.
```mojo
from memory import OwnedPointer, Pointer

struct BoxedString:
    var box: OwnedPointer[String]

    fn __init__(out self, value: String):
        self.box = OwnedPointer(value)

    fn as_ptr(self) -> Pointer[String, __origin_of(self.box)]:
        return Pointer.address_of(self.box[])
```
Inferred origins
The other common way to access an origin value is to infer it from the arguments passed to a function or method. For example, the Span type has an associated origin:
```mojo
struct Span[
    is_mutable: Bool, //,
    T: CollectionElement,
    origin: Origin[is_mutable].type,
](CollectionElementNew):
    """A non owning view of contiguous data."""
    # ... (remainder of the struct omitted)
```
One of its constructors creates a Span from an existing List, and infers its origin value from the list:
```mojo
    fn __init__(out self, ref [origin] list: List[T, *_]):
        """Construct a Span from a List.

        Args:
            list: The list to which the span refers.
        """
        self._data = list.data
        self._len = len(list)
```
Working with references
You can use the ref keyword with arguments and return values to specify a reference with parametric mutability. That is, they can be either mutable or immutable.
From inside the called function, a ref argument looks like a borrowed or inout argument.
A ref return value looks like any other return value to the calling function, but it is a reference to an existing value, not a copy.
ref arguments
The ref argument convention lets you specify an argument of parametric mutability: that is, you don't need to know in advance whether the passed argument will be mutable or immutable. There are several reasons you might want to use a ref argument:
- You want to accept an argument with parametric mutability.
- You want to tie the lifetime of one argument to the lifetime of another argument.
- You want an argument that is guaranteed to be passed in memory. This can be important for generic arguments that need an identity, irrespective of whether the concrete type is register passable.
The syntax for a ref argument is:

```mojo
ref [origin_specifier] arg_name: arg_type
```
The origin specifier passed inside the square brackets can be either:
- An origin value.
- An arbitrary expression, which is treated as shorthand for __origin_of(expression). In other words, the following declarations are equivalent:

  ```mojo
  ref [__origin_of(self)]
  ref [self]
  ```

- An underscore character (_) to indicate that the origin is unbound. You can think of the underscore as a wildcard that will accept any origin:

  ```mojo
  def add_ref(ref a: Int, b: Int) -> Int:
      return a+b
  ```
You can also name the origin explicitly. This is useful if you want to specify an ImmutableOrigin or MutableOrigin, or if you want to bind to the is_mutable parameter.
```mojo
def take_str_ref[
    is_mutable: Bool, //,
    origin: Origin[is_mutable].type
](ref [origin] s: String):
    @parameter
    if is_mutable:
        print("Mutable: " + s)
    else:
        print("Immutable: " + s)

def pass_refs(s1: String, owned s2: String):
    take_str_ref(s1)
    take_str_ref(s2)

pass_refs("Hello", "Goodbye")
```
ref return values
Like ref arguments, ref return values allow a function to return a mutable or immutable reference to a value. Like a borrowed or inout argument, these references don't need to be dereferenced.
ref return values can be an efficient way to handle updating items in a collection. The standard way to do this is by implementing the __getitem__() and __setitem__() dunder methods. These are invoked to read from and write to a subscripted item in a collection:
```mojo
value = list[a]
list[b] += 10
```
With a ref return value, __getitem__() can return a mutable reference that can be modified directly. This has pros and cons compared to using a __setitem__() method:
- The mutable reference is more efficient, because a single update isn't broken up across two methods. However, the referenced value must be in memory.
- A __getitem__()/__setitem__() pair allows for arbitrary code to be run when values are retrieved and set. For example, __setitem__() can validate or constrain input values.
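To illustrate the second point, here is a minimal sketch of a __getitem__()/__setitem__() pair whose setter constrains its input. The ClampedList type and the 0 to 100 range are hypothetical, and the sketch assumes the built-in min() and max() functions and the inout self convention for mutating methods.

```mojo
struct ClampedList:
    var values: List[Int]

    fn __init__(out self, size: Int):
        # Start with `size` zeroes.
        self.values = List[Int]()
        for _ in range(size):
            self.values.append(0)

    fn __getitem__(self, index: Int) -> Int:
        # Returns a copy of the stored value, not a reference.
        return self.values[index]

    fn __setitem__(inout self, index: Int, value: Int):
        # Constrain every stored value to the range 0 to 100.
        self.values[index] = max(0, min(value, 100))

def use_clamped_list():
    list = ClampedList(3)
    list[0] = 250  # clamped to 100 by __setitem__() before being stored
    print(list[0])
```

Because __getitem__() returns a copy here, an update such as list[0] += 5 goes through both methods, so the setter's check runs on every write.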
In the following example, NameList has a __getitem__() method that returns a reference:
```mojo
struct NameList:
    var names: List[String]

    def __init__(out self, *names: String):
        self.names = List[String]()
        for name in names:
            self.names.append(name[])

    def __getitem__(ref self, index: Int) -> ref [self.names] String:
        if index >= 0 and index < len(self.names):
            return self.names[index]
        else:
            raise Error("index out of bounds")

def use_name_list():
    list = NameList("Thor", "Athena", "Dana", "Vrinda")
    print(list[2])
    list[2] += "?"
    print(list[2])

use_name_list()
```
Note that this update succeeds, even though NameList doesn't define a __setitem__() method:

```mojo
list[2] += "?"
```
Also note that the code uses the return value directly each time, rather than assigning the return value to a variable, like this:
```mojo
name = list[2]
name += "?"
```
Since a variable needs to own its value, name would end up with an owned copy of the referenced value. Mojo doesn't currently have syntax to express that you want to keep the original reference in name. This will be added in a future release.
If you're working with an API that returns a reference, and you want to avoid copying the referenced value, you can use a Pointer to hold an indirect reference. You can assign a Pointer to a variable, but you need to use the dereference operator ([]) to access the underlying value.
```mojo
name_ptr = Pointer.address_of(list[2])
name_ptr[] += "?"
```
Similarly, when designing an API you might want to return a Pointer instead of a ref to allow users to assign the return value to a variable. For example, iterators for the standard library collections return pointers, so they can be used in for..in loops:
```mojo
nums = List(1, 2, 3)
for item in nums:  # List iterator returns a Pointer, which must be dereferenced
    print(item[])
for i in range(len(nums)):
    print(nums[i])  # List __getitem__() returns a ref
```
(You can find the code for the List iterator in the Mojo repo.)
Parametric mutability of return values
Another advantage of ref return values is the ability to support parametric mutability. For example, recall the signature of the __getitem__() method above:
```mojo
    def __getitem__(ref self, index: Int) -> ref [self.names] String:
```

Since the origin of the return value is tied to the origin of self.names, which is a field of self, the returned reference will be mutable if the method was called using a mutable reference. The method still works if you have an immutable reference to the NameList, but it returns an immutable reference:
```mojo
fn pass_immutable_list(list: NameList) raises:
    print(list[2])
    # list[2] += "?" # Error, this list is immutable

def use_name_list_again():
    list = NameList("Sophie", "Jack", "Diana")
    pass_immutable_list(list)

use_name_list_again()
```
Without parametric mutability, you'd need to write two versions of __getitem__(), one that accepts an immutable self and another that accepts a mutable self.