Overview
AI functionality in superduper revolves around creating AI models and configuring them to interact with data via the datalayer.
There are many decisions to be made and configured; for this, superduper provides the Component abstraction.
The typical process is:
1. Create a component
Build your components, potentially including other subcomponents.
from superduper import <ComponentClass>

component = <ComponentClass>(
    'identifier',
    **kwargs  # can include other components
)
2. Apply the component to the datalayer
Applying the component to the datalayer db also applies all of its sub-components, so only one call is needed.
db.apply(component)
3. Reload the component (if necessary)
The .apply command saves everything necessary to reload the component from the Superduper system.
reloaded = db.load('type_id', 'identifier') # `type_id`
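Steps 2 and 3 can be pictured with a toy in-memory registry. The sketch below is not superduper's implementation: `ToyComponent`, `ToyDatalayer`, the `children` attribute, and the choice to register sub-components before their parent are all assumptions made for illustration. It only shows why a single apply call can cover a whole tree of components, and how a `(type_id, identifier)` pair is enough to look one up again.

```python
from dataclasses import dataclass, field


@dataclass
class ToyComponent:
    """Hypothetical stand-in for a component; not a superduper class."""
    type_id: str
    identifier: str
    children: list = field(default_factory=list)  # nested sub-components


class ToyDatalayer:
    """Hypothetical in-memory registry keyed by (type_id, identifier)."""

    def __init__(self):
        self._store = {}

    def apply(self, component):
        # Assumption of this sketch: sub-components are registered first,
        # so a parent is only stored once its dependencies are present.
        for child in component.children:
            self.apply(child)
        self._store[(component.type_id, component.identifier)] = component

    def load(self, type_id, identifier):
        # Look the component up by the same pair used when it was applied.
        return self._store[(type_id, identifier)]


toy_db = ToyDatalayer()
model = ToyComponent("model", "my-model")
listener = ToyComponent("listener", "my-listener", children=[model])

toy_db.apply(listener)  # one call registers the listener and its model
reloaded = toy_db.load("model", "my-model")
```

The nested model was never applied directly, yet it can be loaded on its own afterwards because the recursive walk registered it along with its parent.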
4. Export the component (to share/ migrate)
The .export command saves the entirety of the component's parameters, inline code and artifacts in a directory:
component.export('my_component')
The directory structure looks like this. It contains the meta-data of the component as well as a "mini-artifact-store". Together these items make the export completely portable.
my_component
|_component.json   // meta-data and imports of component
|_blobs            // directory of blobs used in the component
|  |_234374364783643764
|  |_574759874238947320
|_files            // directory of files used in the component
   |_182372983721897389
   |_982378978978978798
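To check what an export contains before sharing it, the layout above can be walked with the standard library alone. This is a minimal sketch that assumes only what the tree shows: a `component.json` file plus optional `blobs` and `files` subdirectories. The helper name `summarize_export` is hypothetical, and the JSON written in the demo is a placeholder rather than superduper's real schema.

```python
import json
import tempfile
from pathlib import Path


def summarize_export(path):
    """List the metadata keys, blobs and files of an exported directory."""
    root = Path(path)
    with open(root / "component.json") as f:
        metadata = json.load(f)

    def names(subdir):
        # A missing subdirectory is treated as "no entries".
        folder = root / subdir
        return sorted(p.name for p in folder.iterdir()) if folder.is_dir() else []

    return {
        "metadata_keys": sorted(metadata),
        "blobs": names("blobs"),
        "files": names("files"),
    }


# Build a throwaway directory with the same shape to exercise the helper.
with tempfile.TemporaryDirectory() as tmp:
    export_dir = Path(tmp) / "my_component"
    (export_dir / "blobs").mkdir(parents=True)
    (export_dir / "files").mkdir()
    (export_dir / "component.json").write_text(
        json.dumps({"identifier": "my_component"})  # placeholder content
    )
    (export_dir / "blobs" / "234374364783643764").write_bytes(b"\x00")
    (export_dir / "files" / "182372983721897389").write_text("placeholder")
    summary = summarize_export(export_dir)
```

Because the helper only reads names, it works the same whether the entries under `files` are plain files or directories.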
You can read about the serialization mechanism here.
Read more
📄️ Overview
In this section we re-use the datalayer variable db without further explanation.
📄️ Component
Information about the base class here.
📄️ Model
- Wrap a standard AI model with functionality necessary for superduper
📄️ Listener
- Apply a model to compute outputs on a query
📄️ VectorIndex
- Wrap a Listener so that outputs are searchable
📄️ DataType
- Convert objects which should be added to the database or model outputs to encoded bytes
📄️ Schema
- Apply a dictionary of FieldType and DataType to encode columnar data
📄️ Table
- Use a table in your databackend database, which optionally has a Schema attached
📄️ Dataset
- An immutable snapshot of a query saved to db.artifact_store
📄️ Metric
- Wrapper around a function intended to validate model outputs
📄️ Validation
- Validate a Model by attaching a Validation component
📄️ Trainer
- Train a Model by attaching a Trainer component
📄️ Template
- Wraps a Component containing placeholders flagged with ``
📄️ Plugin
- Supports a plugin system that dynamically loads Python modules and packages at runtime.
📄️ Application
- An Application ships pre-configured functionality in a compact and easy-to-understand way
📄️ CDC
- Listen for updates, inserts and deletes
📄️ Cron Job
- Iterate computations, queries and actions on a crontab