Working with Splitgraph

A sample Splitgraph query

Your application will mostly interact with Splitgraph by running SQL queries against data that you add or against public data.

Here's a sample Splitgraph query:

SELECT COUNT(*) FROM "splitgraph/socrata:20200809".datasets

Splitgraph organizes data in collections of tables called repositories. In this case, splitgraph/socrata is the repository we're querying. Repository names have two parts:

Namespace, in this case splitgraph (this is similar to a GitHub/Docker organization)
Repository, in this case socrata

Splitgraph repositories can be versioned or live.

A live repository acts as a "proxy" to a remote database. When you query a live repository, Splitgraph translates the inbound query to the remote database's query language and forwards it.

A versioned repository consists of multiple versions, or images. Each image is stored in a columnar format, inspired by modern cloud data warehouses like Snowflake.

The splitgraph/socrata repository above is versioned. The example query references one of its images by a human-readable tag (20200809) that Splitgraph attached to the image to denote its version.

If you omit the version, Splitgraph will use the latest version of the dataset. These are equivalent:

SELECT COUNT(*) FROM "splitgraph/socrata".datasets

SELECT COUNT(*) FROM "splitgraph/socrata:latest".datasets
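As a minimal sketch of the naming rules described above, the following Python helper splits a reference of the form namespace/repository[:tag] into its parts and falls back to the latest tag when none is given. The names RepositoryRef and parse_repository_ref are hypothetical and exist only for illustration; they are not part of any Splitgraph library.

```python
from typing import NamedTuple


class RepositoryRef(NamedTuple):
    """Hypothetical container for the parts of a repository reference."""
    namespace: str
    repository: str
    tag: str


def parse_repository_ref(ref: str) -> RepositoryRef:
    """Split 'namespace/repository[:tag]' into parts.

    Assumption for illustration: a missing or empty tag means 'latest',
    mirroring the equivalence of the two queries shown in the text.
    """
    # Everything after the first ':' is treated as the tag.
    name, _, tag = ref.partition(":")
    # The part before the first '/' is the namespace.
    namespace, sep, repository = name.partition("/")
    if not sep or not namespace or not repository:
        raise ValueError(f"expected 'namespace/repository[:tag]', got {ref!r}")
    return RepositoryRef(namespace, repository, tag or "latest")
```

With this helper, "splitgraph/socrata" and "splitgraph/socrata:latest" parse to the same value, while "splitgraph/socrata:20200809" keeps its explicit tag.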

If you're familiar with PostgreSQL, it might help to treat repositories as schemas (in fact, "splitgraph/socrata" is a schema in the above query).
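Because the repository name contains characters such as / and :, it has to be written as a double-quoted identifier when it is used as a schema. The sketch below shows one way an application might build such a query string in Python, using the standard PostgreSQL rule of doubling any embedded double quotes. The helper names quote_ident and count_query are hypothetical.

```python
def quote_ident(identifier: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + identifier.replace('"', '""') + '"'


def count_query(repository: str, table: str) -> str:
    """Build a row-count query that treats the repository as a schema.

    Note: quoting the table name makes it case-sensitive, which is fine
    for lowercase names such as 'datasets'.
    """
    return f"SELECT COUNT(*) FROM {quote_ident(repository)}.{quote_ident(table)}"


query = count_query("splitgraph/socrata:20200809", "datasets")
```

Here, query holds the same statement as the sample query, except that the table name is also quoted.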

Discovering data

You can attach metadata like READMEs or topics to Splitgraph repositories to make them discoverable by other people. You can also make a repository private and control who can access it.

You can use Splitgraph's data catalog to search for repositories, or add your own.

Adding data

There are multiple ways to add data to Splitgraph:

Uploading a CSV file through the web interface or the sgr CLI
Setting up one of more than 100 SaaS sources, or live queries to popular databases
Writing to the Splitgraph DDN
Pushing a data image from the sgr CLI (advanced)

Splitgraph can also run dbt for you on a schedule or on-demand, offering a simple way to transform repositories.

Once your dataset is published, you can add metadata like topics or a README file to make it easier for data consumers to discover. You can also use the splitgraph.yml format to programmatically manage your repositories.

Finally, you can manage who can access or edit a given repository using Splitgraph's sharing options.

Consuming data

Splitgraph allows you to query data using a variety of methods:

Built-in Web IDE that offers CSV downloads
Data Delivery Network (DDN), a PostgreSQL-compatible endpoint that SQL clients can connect to
HTTP API that lets you run SQL queries over HTTP
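To illustrate the HTTP option, here is a hedged Python sketch that prepares a request carrying a SQL query, without sending it. The endpoint URL and the {"sql": ...} JSON payload shape are assumptions that this page does not state; check the Splitgraph HTTP API documentation before relying on them. The function name build_sql_request is hypothetical.

```python
import json
import urllib.request

# Assumption: neither this URL nor the payload shape is given on this page.
DDN_HTTP_URL = "https://data.splitgraph.com/sql/query/ddn"


def build_sql_request(sql: str, url: str = DDN_HTTP_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST request with the SQL query as JSON."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sql_request('SELECT COUNT(*) FROM "splitgraph/socrata".datasets')
```

The prepared request could then be passed to urllib.request.urlopen to execute the query, assuming the endpoint behaves as sketched. Private repositories would presumably also need credentials, which this sketch leaves out.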

Splitgraph has been acquired by EDB! Read the blog post.
