Schema
- Apply a dictionary of FieldType and DataType to encode columnar data
- Mostly relevant to SQL databases, but can also be used with MongoDB
- Schema leverages encoding functionality of contained DataType instances
Dependencies
Usage pattern
(See the DataType documentation to learn how to build a DataType.)
Vanilla usage
The table can include additional columns that don't need encoding:
from superduper import Schema

schema = Schema(
    'my-schema',
    fields={
        'img': dt_1,     # A `DataType`
        'video': dt_2,   # Another `DataType`
    }
)
db.apply(schema)
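To make the pass-through behaviour concrete, here is a minimal sketch in plain Python that does not use superduper at all. The names encode_bytes, encode_json and encode_row are hypothetical stand-ins chosen for illustration; they are not the library's DataType classes or its encoding routine. The sketch only shows the idea that columns listed in the fields dictionary are encoded, while any other column is copied unchanged.

```python
import base64
import json


# Hypothetical stand-ins for `DataType` instances (NOT superduper classes):
# each one turns a Python object into something a database column can store.
def encode_bytes(value: bytes) -> str:
    return base64.b64encode(value).decode("ascii")


def encode_json(value) -> str:
    return json.dumps(value, sort_keys=True)


def encode_row(fields: dict, row: dict) -> dict:
    """Encode the columns listed in `fields`; copy every other column as-is."""
    return {
        name: fields[name](value) if name in fields else value
        for name, value in row.items()
    }


# Toy equivalent of the `fields` dictionary passed to a schema.
toy_fields = {"img": encode_bytes, "video": encode_json}

# "label" has no entry in `toy_fields`, so it is left untouched.
row = {"img": b"\x00\x01", "video": {"frames": 2}, "label": "cat"}
encoded = encode_row(toy_fields, row)
```

Here encoded keeps the same three keys as row: img and video hold encoded strings, and label is still "cat".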
Usage with SQL
All columns should be flagged with either a DataType or a dtype:
from superduper import Schema
from superduper.backends.ibis import dtype

schema = Schema(
    'my-schema',
    fields={
        'img': dt_1,             # A `DataType`
        'video': dt_2,           # Another `DataType`
        'txt': dtype('str'),     # Plain string column
        'numer': dtype('int'),   # Plain integer column
    }
)
db.apply(schema)
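Because every SQL column needs an entry, it can help to check a fields dictionary against the table's column list before applying the schema. The following is a small sketch of such a check in plain Python; untyped_columns is a hypothetical helper, the "created" column is invented for the example, and whether superduper performs a comparable check itself is not stated here.

```python
def untyped_columns(fields: dict, columns: list) -> list:
    """Return the columns that have no entry in `fields`, in table order."""
    return [name for name in columns if name not in fields]


# Placeholder values stand in for `DataType` / `dtype` objects; only the
# keys matter for this check.
toy_fields = {"img": "dt_1", "video": "dt_2", "txt": "str", "numer": "int"}

# Hypothetical table with one column ("created") missing from the schema.
table_columns = ["img", "video", "txt", "numer", "created"]
missing = untyped_columns(toy_fields, table_columns)
```

A non-empty missing list means the schema would leave at least one SQL column without a type.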
Usage with MongoDB
In MongoDB, the non-DataType columns/fields can be omitted:
schema = Schema(
    'my-schema',
    fields={
        'img': dt_1,     # A `DataType`
        'video': dt_2,   # Another `DataType`
    }
)
db.apply(schema)
Usage with Model descendants (MongoDB only)
If used together with a Model, the model is assumed to emit tuple outputs, and each element of the tuple needs its own encoding. The Schema is applied to the columns of the output to produce something that can be saved in db.databackend.
from superduper import ObjectModel

m = ObjectModel(
    'my-model',
    object=my_object,
    output_schema=schema
)
db.apply(m) # adds model and schema
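The per-element encoding of a tuple output can be pictured with a short plain-Python sketch. It assumes that tuple elements are matched to schema fields by position, in the order the fields were declared; that pairing rule is an assumption made for illustration, and encode_output is a hypothetical helper rather than part of superduper.

```python
import json


def encode_output(fields: dict, output: tuple) -> dict:
    """Pair each tuple element with a field, in declaration order, and encode it."""
    if len(output) != len(fields):
        raise ValueError(f"expected {len(fields)} outputs, got {len(output)}")
    return {
        name: encoder(value)
        for (name, encoder), value in zip(fields.items(), output)
    }


# Hypothetical encoders standing in for the `DataType` of each field.
toy_fields = {
    "img": lambda raw: raw.hex(),
    "video": lambda meta: json.dumps(meta, sort_keys=True),
}

# A model output with one element per schema field.
saved = encode_output(toy_fields, (b"\x00\xff", {"frames": 2}))
```

The result is a flat dictionary of storable values, one key per schema field, which is the kind of record a data backend can persist.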
See also