Defining Schemas
Class-level type annotations
A FrozenFrame schema is a Python class. Each field is a class-level type
annotation. The metaclass reads __annotations__ at class creation time and
builds the corresponding pyarrow.Schema.
from freezeframe import FrozenFrame

class Order(FrozenFrame):
    order_id: int
    customer: str
    amount: float
    shipped: bool
Nullable columns
By default, all columns are non-nullable. A None value in a non-nullable
column raises SchemaValidationError at construction.
To allow nulls, annotate with T | None:
class Order(FrozenFrame):
    order_id: int
    customer: str
    amount: float
    shipped_at: str | None  # nullable: may be None if not yet shipped
Nullability is enforced at construction time. There is no silent coercion.
Python → Arrow type mapping
| Python annotation | Arrow type |
|---|---|
| int | pa.int64() |
| float | pa.float64() |
| str | pa.large_utf8() |
| bool | pa.bool_() |
| bytes | pa.large_binary() |
| datetime.datetime | pa.timestamp("us", tz="UTC") |
| datetime.date | pa.date32() |
| datetime.timedelta | pa.duration("us") |
| Decimal | pa.decimal128(38, 18) |
Overriding Arrow types with field()
When the default mapping isn't precise enough, use field(arrow_type=...):
import datetime

import pyarrow as pa
from freezeframe import FrozenFrame, field

class Event(FrozenFrame):
    event_id: int
    score: float = field(arrow_type=pa.float32())
    ts: datetime.datetime = field(arrow_type=pa.timestamp("ms"))
Registering custom types
Third-party types can be registered globally with register_type():
from freezeframe import register_type
import pyarrow as pa
import numpy as np
register_type(np.float32, pa.float32())
Registered types are then available as bare annotations in any FrozenFrame
subclass.
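A global registry of this kind can be pictured as a module-level dictionary keyed by Python type. The sketch below is an assumption about the general pattern, not freezeframe's code: `register` and `resolve` are hypothetical names, and plain strings stand in for Arrow types so the example needs no third-party packages.

```python
# Module-level registry: Python type -> Arrow type (strings used as stand-ins).
_REGISTRY: dict[type, object] = {}


def register(py_type, arrow_type):
    """Record a mapping so that `py_type` can be used as a bare annotation."""
    if not isinstance(py_type, type):
        raise TypeError(f"expected a class, got {py_type!r}")
    _REGISTRY[py_type] = arrow_type


def resolve(annotation):
    """Look up the Arrow type registered for a bare annotation."""
    try:
        return _REGISTRY[annotation]
    except KeyError:
        raise LookupError(f"no Arrow type registered for {annotation!r}") from None


class Celsius(float):
    """Example user-defined type (illustration only)."""


register(Celsius, "float32")
resolved = resolve(Celsius)
```

Because the registry is global, a registration made in one module affects every schema class defined afterwards, so it is usually best done once at import time.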
Schema introspection
The compiled pyarrow.Schema is available as a class attribute: