dbt

The recommended way to build Splitgraph data images is with Splitfiles, which offer Dockerfile-like caching, provenance tracking and efficient rebuilds.

However, there are plenty of other great tools for building datasets and, as long as they work with PostgreSQL, they too can benefit from Splitgraph's data versioning, packaging and sharing capabilities.

One such tool is dbt, which assembles data transformations from small building blocks and so reduces the amount of boilerplate. In a sense, dbt can be used as an advanced SQL templating engine.

Turning the source and the target schemas that dbt uses into Splitgraph repositories opens up a lot of opportunities:

Splitgraph dbt adapter

You don't need any extra plugins to use dbt with Splitgraph, since you can use dbt's native PostgreSQL support to query the Splitgraph engine. However, if you install the Splitgraph dbt adapter, you will be able to reference Splitgraph images directly from your dbt code. For example:

{{ config(materialized='table') }}

with source_data as (

    select domain, count(domain) as count
    from "splitgraph/socrata:latest".datasets
    group by domain

)

select *
from source_data

See our GitHub page for instructions on how to install and use the Splitgraph dbt adapter as well as a sample dbt project.
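
For comparison, here is a rough sketch of the same model written for dbt's native PostgreSQL support, without the Splitgraph adapter. It assumes that the splitgraph/socrata repository has already been cloned and checked out on the engine (for example with sgr clone and sgr checkout), so that its tables are available in an ordinary schema named after the repository. The column alias domain_count is only illustrative.

```sql
-- Sketch only: assumes the splitgraph/socrata image is already checked out
-- into a schema of the same name on the Splitgraph engine.
{{ config(materialized='table') }}

select domain, count(domain) as domain_count
from "splitgraph/socrata".datasets
group by domain
```

In this variant the model has no image tag in it: which version of the source data it reads depends on which image is currently checked out into the schema.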

Example

The dbt example shows how to run dbt against the Splitgraph engine, use Splitgraph to swap between different versions of the source dataset, and see how each version affects the built dbt model.