Images

Splitgraph's main purpose is to build, extend, manipulate and share data images. A data image is essentially a snapshot of a PostgreSQL schema at a given point in time, much like a Docker image is a snapshot of a filesystem.

An image consists of tables. Tables are composed of content-addressable objects, so that when multiple images contain the same table (or overlapping subsets of it), Splitgraph only stores one copy of the underlying data.
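The deduplication described above can be illustrated with a toy content-addressable store. This is a minimal sketch, not Splitgraph's actual storage format: the `ObjectStore` class, the JSON serialization and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json


class ObjectStore:
    """Toy content-addressable store: each object is keyed by a hash of its content."""

    def __init__(self):
        self.objects = {}

    def put(self, rows):
        # Serialize deterministically so identical content always gets the same ID.
        payload = json.dumps(rows, sort_keys=True).encode("utf-8")
        object_id = hashlib.sha256(payload).hexdigest()
        # Storing the same content twice is a no-op: only one copy is kept.
        self.objects.setdefault(object_id, rows)
        return object_id


store = ObjectStore()
fragment_a = [[1, "apple"], [2, "banana"]]
fragment_b = [[3, "cherry"]]

# Two hypothetical images whose "fruits" tables overlap on fragment_a.
image_1 = {"fruits": [store.put(fragment_a)]}
image_2 = {"fruits": [store.put(fragment_a), store.put(fragment_b)]}

print(len(store.objects))  # number of distinct objects actually stored
```

Both images reference the first fragment by the same ID, so the store holds two objects rather than three.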

You can create Splitgraph images in multiple ways. You can “check out” any image into a PostgreSQL schema and interact with it using any PostgreSQL client. Splitgraph captures your changes to the data, which you can then commit as delta-compressed changesets and package into new images.
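To make the idea of committing changes as a changeset concrete, here is a minimal sketch that stores only the rows that differ between a checked-out table and its modified version. The functions `compute_changeset` and `apply_changeset` and the dict-based table representation are hypothetical; Splitgraph's real change tracking and object format are not shown here.

```python
def compute_changeset(old, new):
    """Return only the rows that differ between two snapshots keyed by primary key."""
    changes = {}
    # Rows present before but missing now were deleted.
    for pk in old.keys() - new.keys():
        changes[pk] = ("delete", None)
    # Rows that are new or whose values changed.
    for pk, row in new.items():
        if pk not in old:
            changes[pk] = ("insert", row)
        elif old[pk] != row:
            changes[pk] = ("update", row)
    return changes


def apply_changeset(base, changes):
    """Rebuild the newer snapshot from the older one plus the changeset."""
    result = dict(base)
    for pk, (action, row) in changes.items():
        if action == "delete":
            result.pop(pk, None)
        else:
            result[pk] = row
    return result


checked_out = {1: "apple", 2: "banana", 3: "cherry"}
modified = {1: "apple", 2: "blueberry", 4: "date"}

changeset = compute_changeset(checked_out, modified)
rebuilt = apply_changeset(checked_out, changeset)
```

The unchanged row (primary key 1) never appears in the changeset, which is why a new image built this way can be much smaller than a full copy of the table.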

You can also create Splitgraph images using the declarative Splitfile language, which offers a similar experience to Dockerfiles, including a caching system that makes rebuilds efficient by re-executing commands only when upstream sources change.
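One way such a build cache can work is sketched below. Each step is keyed by a hash of its parent image and its command text, so a step is re-executed only when something upstream of it changes. This is an assumption-laden illustration: `step_key`, `build` and the command strings are hypothetical placeholders, not Splitfile syntax or Splitgraph's actual caching logic.

```python
import hashlib


def step_key(parent, command):
    """Derive a cache key from the parent image and the command text."""
    return hashlib.sha256(f"{parent}\n{command}".encode("utf-8")).hexdigest()


def build(commands, cache, base="empty"):
    """Run commands in order, reusing cached steps; return (final image, executed commands)."""
    current = base
    executed = []
    for command in commands:
        key = step_key(current, command)
        if key not in cache:
            # Cache miss: "execute" the command and record the resulting image.
            executed.append(command)
            cache[key] = f"image-{key[:12]}"
        current = cache[key]
    return current, executed


cache = {}
recipe_v1 = ["import upstream@v1", "transform step"]  # placeholder commands
recipe_v2 = ["import upstream@v2", "transform step"]  # upstream source changed

first_image, first_run = build(recipe_v1, cache)
second_image, second_run = build(recipe_v1, cache)  # nothing changed: fully cached
third_image, third_run = build(recipe_v2, cache)    # upstream changed: rebuilt
```

Because the second step's key depends on the image produced by the first, changing the upstream source invalidates every step after it, while an unchanged recipe executes nothing.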

Under the hood, Splitgraph image metadata (including parents and provenance) is stored in the splitgraph_meta.images table, and table metadata is stored in the splitgraph_meta.tables table.