Version: 0.6

model

superduper.components.model

Source code

init_decorator​

init_decorator(func)
| Parameter | Description |
|---|---|
| func | Init function. |

Decorator to set _is_setup to True after the init method is called.
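
A minimal sketch of the documented behaviour (not the actual source): wrap the decorated init, call it, then flag the instance as set up.

import functools

def init_decorator(func):
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        result = func(self, *args, **kwargs)
        self._is_setup = True  # mark the instance as set up once init has run
        return result
    return wrapper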

method_wrapper​

method_wrapper(method,
item,
signature: 'str')
| Parameter | Description |
|---|---|
| method | Method to execute. |
| item | Item to wrap. |
| signature | Signature of the method. |

Wrap the item with the model.

serve​

serve(f)
ParameterDescription
fMethod to serve.

Decorator to serve the model on the associated cluster.

APIBaseModel​

APIBaseModel(self,
identifier: str,
upstream: Optional[List[ForwardRef('Component')]] = None,
db: dataclasses.InitVar[typing.Optional[ForwardRef('Datalayer')]] = None,
*,
datatype: 'str | None' = None,
model_update_kwargs: 't.Dict' = <factory>,
predict_kwargs: 't.Dict' = <factory>,
compute_kwargs: 't.Dict' = <factory>,
validation: 't.Optional[Validation]' = None,
metric_values: 't.Dict' = <factory>,
num_workers: 'int' = 0,
serve: 'bool' = False,
trainer: 't.Optional[Trainer]' = None,
model: 't.Optional[str]' = None,
max_batch_size: 'int' = 8,
postprocess: 't.Optional[t.Callable]' = None) -> None
| Parameter | Description |
|---|---|
| identifier | Identifier of the instance. |
| upstream | A list of upstream components. |
| db | Datalayer instance. |
| datatype | DataType instance. |
| model_update_kwargs | The kwargs to use for model update. |
| predict_kwargs | Additional arguments to use at prediction time. |
| compute_kwargs | Kwargs used for compute backend job submission. Example (Ray backend): compute_kwargs = dict(resources=...). |
| validation | The validation Dataset instances to use. |
| metric_values | The metrics to evaluate on. |
| num_workers | Number of workers to use for parallel prediction. |
| serve | Creates an HTTP endpoint and serves the model with compute_kwargs on a distributed cluster. |
| trainer | Trainer instance to use for training. |
| model | The model to use, e.g. 'text-embedding-ada-002'. |
| max_batch_size | Maximum batch size. |
| postprocess | Postprocess function to apply to the output of the API request. |

Base component for models which make a particular type of API request.

Model​

Model(self,
identifier: str,
upstream: Optional[List[ForwardRef('Component')]] = None,
db: dataclasses.InitVar[typing.Optional[ForwardRef('Datalayer')]] = None,
*,
datatype: 'str | None' = None,
model_update_kwargs: 't.Dict' = <factory>,
predict_kwargs: 't.Dict' = <factory>,
compute_kwargs: 't.Dict' = <factory>,
validation: 't.Optional[Validation]' = None,
metric_values: 't.Dict' = <factory>,
num_workers: 'int' = 0,
serve: 'bool' = False,
trainer: 't.Optional[Trainer]' = None) -> None
| Parameter | Description |
|---|---|
| identifier | Identifier of the instance. |
| upstream | A list of upstream components. |
| db | Datalayer instance. |
| datatype | DataType instance. |
| model_update_kwargs | The kwargs to use for model update. |
| predict_kwargs | Additional arguments to use at prediction time. |
| compute_kwargs | Kwargs used for compute backend job submission. Example (Ray backend): compute_kwargs = dict(resources=...). |
| validation | The validation Dataset instances to use. |
| metric_values | The metrics to evaluate on. |
| num_workers | Number of workers to use for parallel prediction. |
| serve | Creates an HTTP endpoint and serves the model with compute_kwargs on a distributed cluster. |
| trainer | Trainer instance to use for training. |

Base class for components which can predict.
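
A hedged sketch of a custom predicting component; the predict override is assumed from the subclasses documented below, and the real abstract interface may require more than this.

from superduper.components.model import Model

class UppercaseModel(Model):
    # Hypothetical subclass for illustration only.
    def predict(self, text: str) -> str:
        return text.upper()

m = UppercaseModel('uppercase')
m.predict('hello')
# 'HELLO'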

ObjectModel​

ObjectModel(self,
identifier: str,
upstream: Optional[List[ForwardRef('Component')]] = None,
db: dataclasses.InitVar[typing.Optional[ForwardRef('Datalayer')]] = None,
*,
datatype: 'str | None' = None,
model_update_kwargs: 't.Dict' = <factory>,
predict_kwargs: 't.Dict' = <factory>,
compute_kwargs: 't.Dict' = <factory>,
validation: 't.Optional[Validation]' = None,
metric_values: 't.Dict' = <factory>,
num_workers: 'int' = 0,
serve: 'bool' = False,
trainer: 't.Optional[Trainer]' = None,
object: 't.Callable',
method: 't.Optional[str]' = None) -> None
| Parameter | Description |
|---|---|
| identifier | Identifier of the instance. |
| upstream | A list of upstream components. |
| db | Datalayer instance. |
| datatype | DataType instance. |
| model_update_kwargs | The kwargs to use for model update. |
| predict_kwargs | Additional arguments to use at prediction time. |
| compute_kwargs | Kwargs used for compute backend job submission. Example (Ray backend): compute_kwargs = dict(resources=...). |
| validation | The validation Dataset instances to use. |
| metric_values | The metrics to evaluate on. |
| num_workers | Number of workers to use for parallel processing. |
| serve | Creates an HTTP endpoint and serves the model with compute_kwargs on a distributed cluster. |
| trainer | Trainer instance to use for training. |
| object | Model / computation object. |
| method | Method to call on the object. |

Model component which wraps a Python object or callable so that it becomes serializable.

# Example:
# -------
from superduper.components.model import ObjectModel

m = ObjectModel('test', object=lambda x: x + 2)
m.predict(2)
# 4
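
Once constructed, the model is typically registered with a Datalayer so it can be reused and composed; a minimal sketch, with an illustrative connection URI:

from superduper import superduper

db = superduper('mongodb://localhost:27017/test_db')  # illustrative connection URI
db.apply(m)  # serializes and registers the ObjectModel under the identifier 'test'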

QueryModel​

QueryModel(self,
identifier: str,
upstream: Optional[List[ForwardRef('Component')]] = None,
db: dataclasses.InitVar[typing.Optional[ForwardRef('Datalayer')]] = None,
*,
datatype: 'str | None' = None,
model_update_kwargs: 't.Dict' = <factory>,
predict_kwargs: 't.Dict' = <factory>,
compute_kwargs: 't.Dict' = <factory>,
validation: 't.Optional[Validation]' = None,
metric_values: 't.Dict' = <factory>,
num_workers: 'int' = 0,
serve: 'bool' = False,
trainer: 't.Optional[Trainer]' = None,
preprocess: 't.Optional[t.Callable]' = None,
postprocess: 't.Optional[t.Callable]' = None,
select: 'Query',
signature: 'Signature' = '**kwargs') -> None
| Parameter | Description |
|---|---|
| identifier | Identifier of the instance. |
| upstream | A list of upstream components. |
| db | Datalayer instance. |
| datatype | DataType instance. |
| model_update_kwargs | The kwargs to use for model update. |
| predict_kwargs | Additional arguments to use at prediction time. |
| compute_kwargs | Kwargs used for compute backend job submission. Example (Ray backend): compute_kwargs = dict(resources=...). |
| validation | The validation Dataset instances to use. |
| metric_values | The metrics to evaluate on. |
| num_workers | Number of workers to use for parallel prediction. |
| serve | Creates an HTTP endpoint and serves the model with compute_kwargs on a distributed cluster. |
| trainer | Trainer instance to use for training. |
| preprocess | Preprocess callable. |
| postprocess | Postprocess callable. |
| select | Query used to find data (can include like). |
| signature | Signature to use. |

QueryModel component.

Model which can be used to query data and return the pre-computed results of that query.
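
A minimal sketch of constructing a QueryModel, assuming a connected Datalayer db with a 'documents' table; the table name and query are illustrative.

from superduper.components.model import QueryModel

qm = QueryModel(
    'my-query',
    select=db['documents'].select(),  # query to execute; assumes `db` is a Datalayer
    postprocess=list,                 # turn the returned cursor into a list
)
# qm.predict(...) runs the stored query (filled in with any keyword arguments)
# and post-processes the results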

Trainer​

Trainer(self,
identifier: str,
upstream: Optional[List[ForwardRef('Component')]] = None,
db: dataclasses.InitVar[typing.Optional[ForwardRef('Datalayer')]] = None,
*,
key: 'st.JSON',
select: 'st.BaseType',
transform: 't.Optional[t.Callable]' = None,
metric_values: 't.Dict' = <factory>,
in_memory: 'bool' = True,
compute_kwargs: 't.Dict' = <factory>,
validation: 't.Optional[Validation]' = None) -> None
| Parameter | Description |
|---|---|
| identifier | Identifier of the instance. |
| upstream | A list of upstream components. |
| db | Datalayer instance. |
| key | Model input type key. |
| select | Model select query for training. |
| transform | (Optional) transform callable. |
| metric_values | Dictionary for metric defaults. |
| in_memory | Whether to train in memory. |
| compute_kwargs | Kwargs for compute backend. |
| validation | Validation object to measure training performance. |

Trainer component to train a model.

Training configuration object, containing all settings necessary for a particular learning use-case to be serialized and initiated. The object is callable and returns a class which may be invoked to apply training.
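
Trainer itself is a configuration base class; concrete subclasses come from the framework plugins. A hedged sketch of how a trainer is typically attached, assuming trainer is an instance of such a subclass, my_model is a Model and db is a connected Datalayer:

my_model.trainer = trainer  # trainer is a field of Model (see above)
db.apply(my_model)          # with a trainer attached, applying the model schedules the training job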

Validation​

Validation(self,
identifier: str,
upstream: Optional[List[ForwardRef('Component')]] = None,
db: dataclasses.InitVar[typing.Optional[ForwardRef('Datalayer')]] = None,
*,
metrics: 't.List[Metric]' = <factory>,
key: 'st.JSON',
datasets: 't.List[Dataset]' = <factory>) -> None
| Parameter | Description |
|---|---|
| identifier | Identifier of the instance. |
| upstream | A list of upstream components. |
| db | Datalayer instance. |
| metrics | List of metrics for validation. |
| key | Model input type key. |
| datasets | Sequence of datasets. |

Component which represents a validation definition.
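
A hedged sketch of constructing a Validation and attaching it to a model; the Metric and Dataset constructors and import paths are assumptions based on this package's layout, and db / my_model are assumed to exist already.

from superduper.components.dataset import Dataset
from superduper.components.metric import Metric
from superduper.components.model import Validation

validation = Validation(
    'my-validation',
    key=('x', 'y'),  # model input / target keys (illustrative)
    datasets=[Dataset('valid-set', select=db['documents'].select())],
    metrics=[Metric('accuracy', object=lambda preds, targets: sum(p == t for p, t in zip(preds, targets)) / len(targets))],
)
my_model.validation = validation  # validation is a field of Model (see above)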

APIModel​

APIModel(self,
identifier: str,
upstream: Optional[List[ForwardRef('Component')]] = None,
db: dataclasses.InitVar[typing.Optional[ForwardRef('Datalayer')]] = None,
*,
datatype: 'str | None' = None,
model_update_kwargs: 't.Dict' = <factory>,
predict_kwargs: 't.Dict' = <factory>,
compute_kwargs: 't.Dict' = <factory>,
validation: 't.Optional[Validation]' = None,
metric_values: 't.Dict' = <factory>,
num_workers: 'int' = 0,
serve: 'bool' = False,
trainer: 't.Optional[Trainer]' = None,
model: 't.Optional[str]' = None,
max_batch_size: 'int' = 8,
postprocess: 't.Optional[t.Callable]' = None,
url: 'str') -> None
| Parameter | Description |
|---|---|
| identifier | Identifier of the instance. |
| upstream | A list of upstream components. |
| db | Datalayer instance. |
| datatype | DataType instance. |
| model_update_kwargs | The kwargs to use for model update. |
| predict_kwargs | Additional arguments to use at prediction time. |
| compute_kwargs | Kwargs used for compute backend job submission. Example (Ray backend): compute_kwargs = dict(resources=...). |
| validation | The validation Dataset instances to use. |
| metric_values | The metrics to evaluate on. |
| num_workers | Number of workers to use for parallel prediction. |
| serve | Creates an HTTP endpoint and serves the model with compute_kwargs on a distributed cluster. |
| trainer | Trainer instance to use for training. |
| model | The model to use, e.g. 'text-embedding-ada-002'. |
| max_batch_size | Maximum batch size. |
| postprocess | Postprocess function to apply to the output of the API request. |
| url | The URL to use for the API request. |

APIModel component, used to make a particular type of API request.
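
A hedged sketch of constructing an APIModel; the endpoint, the '{...}' templating in the URL and the response shape are illustrative assumptions, not guaranteed behaviour.

from superduper.components.model import APIModel

m = APIModel(
    'my-embedder',
    model='text-embedding-ada-002',
    url='https://api.example.com/v1/embed?model={model}&text={text}',  # illustrative endpoint
    postprocess=lambda response: response['embedding'],  # assumes a JSON response with an 'embedding' field
)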

SequentialModel​

SequentialModel(self,
identifier: str,
upstream: Optional[List[ForwardRef('Component')]] = None,
db: dataclasses.InitVar[typing.Optional[ForwardRef('Datalayer')]] = None,
*,
datatype: 'str | None' = None,
model_update_kwargs: 't.Dict' = <factory>,
predict_kwargs: 't.Dict' = <factory>,
compute_kwargs: 't.Dict' = <factory>,
validation: 't.Optional[Validation]' = None,
metric_values: 't.Dict' = <factory>,
num_workers: 'int' = 0,
serve: 'bool' = False,
trainer: 't.Optional[Trainer]' = None,
models: 't.List[Model]') -> None
| Parameter | Description |
|---|---|
| identifier | Identifier of the instance. |
| upstream | A list of upstream components. |
| db | Datalayer instance. |
| datatype | DataType instance. |
| model_update_kwargs | The kwargs to use for model update. |
| predict_kwargs | Additional arguments to use at prediction time. |
| compute_kwargs | Kwargs used for compute backend job submission. Example (Ray backend): compute_kwargs = dict(resources=...). |
| validation | The validation Dataset instances to use. |
| metric_values | The metrics to evaluate on. |
| num_workers | Number of workers to use for parallel prediction. |
| serve | Creates an HTTP endpoint and serves the model with compute_kwargs on a distributed cluster. |
| trainer | Trainer instance to use for training. |
| models | A list of models to use. |

Sequential model component which wraps a list of models, applying them in sequence, so that the whole pipeline becomes serializable.
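
A minimal sketch chaining two ObjectModels; it is assumed that the output of each stage is fed to the next in list order.

from superduper.components.model import ObjectModel, SequentialModel

m = SequentialModel(
    'add-then-double',
    models=[
        ObjectModel('add', object=lambda x: x + 1),
        ObjectModel('double', object=lambda x: x * 2),
    ],
)
m.predict(3)
# 8, if the stages are applied in order: (3 + 1) * 2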