librt.vecs

The librt.vecs module defines the vec type, a low-level, uniform growable array type. It’s part of the librt package on PyPI.

When constructing a vec, the item type T is always explicitly given via vec[T]:

from librt.vecs import append, vec

v = vec[float]([1.0, 2.5])  # Construct vec[float] with two items

vec supports many sequence operations, though it’s not a full sequence type:

len(v)  # 2
v[0]    # 1.0
v[-1]   # 2.5
for x in v:
    print(x)

The length of each vec value is immutable. Appending an item is still a fast operation, but it returns a new vec value:

v = append(v, -0.5)
print(v)  # vec[float]([1.0, 2.5, -0.5])

vec only supports simple, uniform item types. It uses an efficient packed binary encoding for these value item types:

  • mypy_extensions.i64 (signed 64-bit integer)

  • mypy_extensions.i32 (signed 32-bit integer)

  • mypy_extensions.i16 (signed 16-bit integer)

  • mypy_extensions.u8 (unsigned byte)

  • float (64-bit float)

  • bool

int is not a valid item type, since it has an arbitrary precision, and vec is an efficiency-focused type. Use one of the fixed-length integer types instead.

Class item types (e.g. str or MyNativeClass) are represented as regular object references. Optional class item types (e.g. str | None) are supported for convenience, but arbitrary union types are not supported as item types. Nested vecs are supported, e.g. vec[vec[i64]].

A vec value is often used as an efficient alternative to list or array.array in code compiled using mypyc. Its primary advantages are an efficient packed memory representation for value item types and very fast inlined get and set item operations.

Vec instances perform runtime checking of item types. Since values of type variables are not available at runtime (they are erased), type variables can’t be used as item types.

A vec value is effectively an immutable (length, buffer) pair. This means that any operation that changes the length of a vec, including append as we saw above, returns a modified value.

Note

An immutable length allows more efficient code to be generated by mypyc, and vec values can be allocated to machine registers effectively. However, vec values must be boxed if used in a non-native context, such as if added to a list or dict.

Here are some examples of valid vec types:

Type

Item representation

vec[i32]

Packed 32-bit integers

vec[float]

Packed 64-bit floats

vec[str]

Object references

vec[vec[u8]]

Packed vec values

The vec class

class vec[T](items: Iterable[T] = ..., *, capacity: i64 = ...)

A generic growable array type. The runtime type parameter T used when calling vec[T](...) determines the element type.

The capacity parameter allows defining the minimum initial capacity of the buffer, some of which may be unused after construction. Unused capacity allows fast append and extend operations that don’t need to reallocate the buffer. Actual capacity will be larger than capacity if items has more than capacity items.

Construction from list and tuple objects is optimized. Also, for value item types, construction from an object that implements the buffer protocol is optimized (such as bytes), if the format is compatible with the vec item type.

Mypyc treats vec[T]([x] * n) as a special form. For example, vec[u8]([0] * n) constructs a zero-initialized vec object efficiently, without building an intermediate list. There are also other constructor-related special forms – see Special forms below.

It’s an error to construct a vec object without providing an item type: vec() raises an exception.

len(v) i64

Return the length of v.

v[i] T

Return item at index i (index may be negative).

v[i:j] vec[T]

Return a slice. This constructs a new vec object. i and j may be negative.

v[i] = o

Assign to an item (index may be negative).

o in v bool

Return True if v contains o.

for o in v

Iterate over items.

memoryview(v)

vec implements the buffer protocol, but only for value item types that use a packed representation.

Functions

Since the following operations return a modified value, they are module-level functions instead of methods.

append(v: vec[T], o: T) vec[T]

Return v with item o appended to it. If v has unused capacity, reuse the existing buffer. The time complexity is O(1) on average. Example:

v = vec[i32]()
v = append(v, 1)
extend(v: vec[T], it: Iterable[T]) vec[T]

Return v with all items from iterable it appended to it. If v has sufficient unused capacity, reuse the existing buffer. The time complexity is O(n) on average, where n is the length of it. Example:

v = vec[u8]()
v = extend(v, b"foo")
remove(v: vec[T], o: T) vec[T]

Return v with the first instance of item o removed. Reuse the buffer from v. Raise ValueError if value doesn’t exist. Example:

v = vec[i32]([1, 2, 3])
v = remove(v, 2)
# v has items [1, 3]
pop(v: vec[T], i: i64 = -1) tuple[vec[T], T]

Return (new_v, item), where item is the value at index i and new_v is v with that item removed. Reuse the buffer from v. Example:

v = vec[i32]([1, 2, 3])
v, x = pop(v)
# x is 3; v has items [1, 2]

Special forms

Certain combinations of operations that would be multiple separate operations in regular Python are guaranteed to be compiled by mypyc to direct operations with no unnecessary temporary objects.

Special form

Description

vec[T]()

Construct empty vec with no buffer. This doesn’t perform any dynamic allocation (at least for non-nested vecs).

vec[T]([element1, ...])

Directly construct a vec object with given items, without a temporary list.

vec[T]([element1] * n)

Directly construct a vec with length n, without any temporary list.

vec[T]([<expr> for ... in <expr>])

Vec comprehension creates no temporary list.

Thread safety

In free-threaded Python builds, it’s unsafe to write or modify an item if other threads might be concurrently accessing the same item. For example, writing v[4] is not safe to do if another thread might be reading v[4]. Similarly, two threads concurrently calling append or remove on the same vec object is not safe.

This is different from list objects, since vec is a lower-level type where implicit synchronization would have a significant performance cost. However, since vec lengths are immutable, some race conditions that lists can be susceptible to are not possible with vecs.

Implementation details

In a native context, such as in a local variable or a parameter in a native function, or in an attribute of a native class, vec values are implemented as value objects with two fields: length and buffer. The buffer is a normal Python object, but it’s not directly accessible to users. If a vec object is empty, no buffer object is required. This means that empty vecs are particularly efficient in a native context (usually 16 bytes).

A packed representation is used for buffers with supported value item types, including for nested vecs. The packed representation is much more efficient than a Python list object, and it’s also significantly more efficient than array.array for small sequences.

Multiple vec values can share the same underlying buffer. For example, assigning a vec to another variable creates an alias that refers to the same buffer:

v = vec[i32]([1, 2, 3], capacity=3)
w = v  # v and w share the same buffer

w[0] = 99
print(v[0])  # 99 -- both see the change

However, this sharing is not guaranteed to persist if there are operations that change the length (such as append). These may reallocate the buffer, breaking the sharing silently:

v = append(v, 4)  # reallocates the buffer since there is no free capacity
v[0] = 0
print(w[0])  # still 99 -- v and w no longer share a buffer

If you need independent copies, use slicing (v[:]) to explicitly create a vec with its own buffer. It’s not recommended to rely on the details of buffer reallocation, as these might change between librt releases.

Using vecs outside compiled code

vec is fully supported in non-compiled code, but vec values will be boxed in such non-native contexts. There will be always two objects, a boxed vec object and a buffer object, whereas in native contexts usually only the buffer is a dynamically allocated object. vec is primarily useful in code compiled using mypyc, and it’s been heavily optimized for this use case. There may be no performance benefit in interpreted code over using list or array.array.