Syntax

Splitgraph DDN is based on PostgreSQL and supports the PostgreSQL SELECT statement, DML statements (INSERT, UPDATE, DELETE), some DDL statements (CREATE/ALTER/DROP table), as well as all PostgreSQL functions and operators.

In an SQL query, you can reference Splitgraph datasets as PostgreSQL schemas using the form namespace/repository:image_hash_or_tag. If the hash or tag is missing, Splitgraph will assume the :latest tag.

This means that these are equivalent:

SELECT * FROM "splitgraph/london_wards".city_merged_2018;
SELECT * FROM "splitgraph/london_wards:latest".city_merged_2018;
SELECT * FROM "splitgraph/london_wards:34802db1d71f69973d5e5845ab358b3e339077ed4f3f3fc790d2f89e87b4f433".city_merged_2018;

You can also run JOINs across multiple images or data sources. For example:

SELECT
    cambridge_cases.date AS date,
    chicago_cases.cases_total AS chicago_daily_cases,
    cambridge_cases.new_positive_cases AS cambridge_daily_cases
FROM
    "cityofchicago/covid19-daily-cases-deaths-and-hospitalizations-naz8-j4nc".covid19_daily_cases_deaths_and_hospitalizations chicago_cases
FULL OUTER JOIN
    "cambridgema-gov/covid19-case-count-by-date-axxk-jvk8".covid19_case_count_by_date cambridge_cases
ON
    date_trunc('day', chicago_cases.lab_report_date) = cambridge_cases.date::timestamp
ORDER BY date ASC;

PostGIS

The Splitgraph DDN also supports PostGIS, so querying PostGIS-enabled datasets will pass geographical data back to the client.

For example:

SELECT name, gss_code, ST_Transform(ST_SetSRID(geom, 27700), 4326)
    FROM "splitgraph/london_wards".city_merged_2018

PostGIS data on Splitgraph SQL endpoint

Restrictions

We currently explicitly limit all queries to 10,000 rows and 30 seconds of execution. We might review these restrictions in the future.

We filter queries, and besides SELECT and EXPLAIN statements, we allow write statements such as INSERT/UPDATE/DELETE and CREATE/ALTER/DROP (of table objects only) as well as some utility statements like SHOW and SET used by some SQL clients.

Introspection queries

The DDN also supports introspection queries (to the information_schema/pg_catalog schemas). These queries show you a list of schemas that correspond to repositories that you own, as well as repositories that are featured on Splitgraph.

This means SQL clients with a UI will show you a list of Splitgraph repositories in the sidebar.

Introspection queries on Splitgraph return only a subset of available datasets available: you can query any dataset on Splitgraph using SQL, not just the ones that you can see in the client.

Splitgraph workspace example

Alternative database names

By default, Splitgraph uses the slash to separate namespaces and repositories in the schema name, which can cause problems with some clients. Instead of using the ddn database name, you can use altenative names for compatibility with some clients:

  • ddn-t: Use the ~ character to separate namespaces and repositories, for example, splitgraph~2016-election. This is useful for clients like Metabase that sometimes use schemas as part of URL paths.
  • ddn-g: Use GraphQL-compatible identifiers. This is recommended for clients like Hasura.