Basic insertion
Superduper supports inserting data as Python dictionaries.
These dictionaries may contain basic JSON-compatible data as well as
other data types, which are handled by DataType components.
All inserted data is wrapped with the Document wrapper:
data = ... # an iterable of dictionaries
For example, first get some sample data:
!curl -O https://superduperdb-public-demo.s3.amazonaws.com/text.json
Then load the data:
import json

with open('./text.json') as f:
    data = json.load(f)
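Since inserts expect an iterable of dictionaries, it can help to normalize the loaded JSON before inserting it. The following is a minimal sketch, not part of Superduper: the helper name to_records and the field name "txt" are hypothetical, and the sample list stands in for the contents of text.json, whose exact layout is an assumption here.

```python
import json


def to_records(raw, field="txt"):
    """Normalize loaded JSON into a list of dictionaries.

    `field` is a hypothetical key, used only when an item is not
    already a dictionary.
    """
    if not isinstance(raw, list):
        raise TypeError("expected a JSON array at the top level")
    return [item if isinstance(item, dict) else {field: item} for item in raw]


# Stand-in for the parsed contents of text.json, so the sketch runs
# without the downloaded file.
raw = json.loads('["first passage", {"txt": "second passage"}]')
data = to_records(raw)
print(data)
```

Items that are already dictionaries pass through unchanged, so the helper is safe to apply whether the file holds plain strings or records.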
Usage pattern
ids, jobs = db['collection-name'].insert(data).execute()
MongoDB
ids, jobs = db['collection-name'].insert_many(data).execute()
A Schema that differs from the standard Schema used by "collection-name" may be supplied with:
ids, jobs = db['collection-name'].insert_many(data).execute(schema=schema_component)
For details, see the Schema documentation.
SQL
ids, jobs = db['table-name'].insert(data).execute()
If no Schema has been set up for "table-name", a Schema is auto-inferred.
Data not handled by the db.databackend is encoded with pickle by default.
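To illustrate what a pickle fallback means in practice, the sketch below encodes only the values that are not JSON-compatible and then restores them. It uses the Python standard library alone and is not Superduper's internal encoding logic; the helper needs_fallback and the sample record are hypothetical.

```python
import json
import pickle


def needs_fallback(value):
    """Return True when `value` cannot be serialized as plain JSON."""
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        return True
    return False


# A set is not JSON-compatible, so it is the only field that gets pickled.
record = {"txt": "hello", "tags": {"a", "b"}}

encoded = {
    key: pickle.dumps(value) if needs_fallback(value) else value
    for key, value in record.items()
}

# Pickled values are bytes; everything else was stored unchanged.
decoded = {
    key: pickle.loads(value) if isinstance(value, bytes) else value
    for key, value in encoded.items()
}
```

Pickle round-trips arbitrary Python objects, but unpickling data from an untrusted source can execute arbitrary code, so an explicit DataType is usually preferable for data that leaves a trusted environment.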
Monitoring jobs
The second output of this command is a reference to the job computations
triggered by the Component instances already applied with db.apply(...).
If users have configured a Ray cluster, the jobs may be monitored at the
following URI:
from superduper import CFG
print(CFG.cluster.compute.uri)