Skip to content

Null Handling

Explicit nullability

In freezeFrame, nullability is a property of the column's declared type — not an inferred runtime behaviour. This is one of the most important correctness guarantees the library provides.

Annotation Nullable? Arrow nullability
score: float No not null
score: float \| None Yes nullable

Enforcement at construction

Passing a None into a non-nullable column raises SchemaValidationError immediately, at the point of construction — not silently downstream:

class Order(FrozenFrame):
    order_id: int
    amount:   float   # non-nullable

# This raises SchemaValidationError:
Order.from_dict({"order_id": [1], "amount": [None]})

Working with nullable columns

class Order(FrozenFrame):
    order_id:   int
    shipped_at: str | None  # nullable

df = Order.from_dict({
    "order_id":   [1, 2, 3],
    "shipped_at": ["2024-01-01", None, "2024-01-03"],
})

FrozenSeries[str | None] is returned for nullable columns. Arrow's native null bitmap tracks which values are valid — no sentinel values like NaN or -1 are used.

No silent coercion

freezeFrame never silently converts NaN, empty strings, or other sentinel values to None. What you put in is what the frame stores. Nulls must be represented as Python None at construction time.

This is a deliberate departure from Pandas, which silently promotes integer columns to float when nulls are introduced, and uses NaN as a null sentinel in ways that frequently cause unexpected behaviour.