Ingestion

Because the Splitgraph engine is also a PostgreSQL instance, you can connect to it with any PostgreSQL client, such as DataGrip, pgAdmin, pgcli or DBeaver, to ingest and explore data. Applications can use any PostgreSQL driver to write data to Splitgraph and snapshot it as Splitgraph images.
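As a minimal sketch of the driver-based route, the function below writes rows over any Python DB-API connection. The table name readings, the helper names and every connection parameter are hypothetical placeholders, not values taken from Splitgraph; psycopg2 is one commonly used PostgreSQL driver and is assumed to be installed separately.

```python
from typing import Iterable, Tuple


def ingest_rows(conn, rows: Iterable[Tuple[int, float]], placeholder: str = "%s") -> int:
    """Create a demo table and insert rows over a DB-API 2.0 connection.

    `placeholder` is the driver's parameter marker ("%s" for psycopg2).
    Returns the number of rows written.
    """
    rows = list(rows)
    cur = conn.cursor()
    # Hypothetical demo table; real applications would use their own schema.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS readings "
        "(sensor_id INTEGER, value DOUBLE PRECISION)"
    )
    insert_sql = (
        "INSERT INTO readings (sensor_id, value) "
        f"VALUES ({placeholder}, {placeholder})"
    )
    cur.executemany(insert_sql, rows)
    conn.commit()
    cur.close()
    return len(rows)


def connect_to_engine():
    """Open a connection to the engine.

    Assumption: host, port, database, user and password below are
    placeholders; substitute the values from your own engine configuration.
    """
    import psycopg2  # third-party PostgreSQL driver

    return psycopg2.connect(
        host="localhost",
        port=5432,
        dbname="splitgraph",
        user="sgr",
        password="change-me",
    )
```

Calling ingest_rows(connect_to_engine(), [(1, 0.5), (2, 1.5)]) would write two rows, after which the schema can be snapshotted as an image with the usual Splitgraph tooling.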

However, Splitgraph also comes with features that make data ingestion easier or, in some cases, unnecessary.

Foreign Data Wrappers

Foreign Data Wrappers are a PostgreSQL feature that allows users to write a custom handler for any other database or data source. This turns the source into a set of foreign tables that act like normal tables and can be queried by any PostgreSQL client.

The sgr mount CLI command takes care of PostgreSQL foreign data wrapper boilerplate for you, creating a "mountpoint" on the engine and letting you query the remote database directly through Splitgraph, snapshot it as an image or use it in a Splitfile.
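To give a sense of the boilerplate involved, the sketch below generates the statements one would typically write by hand to expose a remote PostgreSQL schema through the stock postgres_fdw extension. It illustrates plain PostgreSQL syntax only and is not necessarily the exact sequence that sgr mount issues; the function name, the "_server" naming convention and all arguments are assumptions made for the example.

```python
from typing import List


def quote_literal(value: str) -> str:
    """Quote a string as an SQL literal (doubling single quotes)."""
    return "'" + value.replace("'", "''") + "'"


def quote_ident(name: str) -> str:
    """Quote a string as an SQL identifier (doubling double quotes)."""
    return '"' + name.replace('"', '""') + '"'


def fdw_boilerplate(
    mountpoint: str,
    host: str,
    port: int,
    dbname: str,
    remote_schema: str,
    user: str,
    password: str,
) -> List[str]:
    """Return the hand-written postgres_fdw setup for one remote schema."""
    # Hypothetical naming convention: one foreign server per mountpoint.
    server = quote_ident(mountpoint + "_server")
    local_schema = quote_ident(mountpoint)
    return [
        "CREATE EXTENSION IF NOT EXISTS postgres_fdw",
        f"CREATE SERVER {server} FOREIGN DATA WRAPPER postgres_fdw "
        f"OPTIONS (host {quote_literal(host)}, port {quote_literal(str(port))}, "
        f"dbname {quote_literal(dbname)})",
        f"CREATE USER MAPPING FOR CURRENT_USER SERVER {server} "
        f"OPTIONS (user {quote_literal(user)}, password {quote_literal(password)})",
        f"CREATE SCHEMA {local_schema}",
        # Creates one foreign table locally for each table in the remote schema.
        f"IMPORT FOREIGN SCHEMA {quote_ident(remote_schema)} "
        f"FROM SERVER {server} INTO {local_schema}",
    ]
```

Each returned string is a single statement that could be run in order through any PostgreSQL client; after the last one, the remote tables are queryable under the local schema named after the mountpoint.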

Splitgraph comes with several open-source foreign data wrappers.

Splitgraph also comes with a fork of Multicorn, an extension that allows you to write foreign data wrappers in Python and add them as custom mount handlers to Splitgraph.
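For illustration, a Multicorn foreign data wrapper is a Python class that subclasses ForeignDataWrapper and yields rows from an execute method. The toy wrapper below is a sketch only: the class name, the limit option and the n and square columns are invented for the example, and the fallback base class exists solely so the snippet can run where Multicorn is not installed. Registering the class as a custom mount handler in Splitgraph is not shown here.

```python
try:
    # Available inside a PostgreSQL instance that has Multicorn installed.
    from multicorn import ForeignDataWrapper
except ImportError:
    # Stand-in base class so the sketch can be exercised outside PostgreSQL.
    class ForeignDataWrapper:
        def __init__(self, options, columns):
            pass


class SquaresFdw(ForeignDataWrapper):
    """Toy wrapper exposing the integers 0..limit-1 and their squares."""

    def __init__(self, options, columns):
        super().__init__(options, columns)
        # Options arrive as strings from the foreign table definition.
        self.limit = int(options.get("limit", 3))

    def execute(self, quals, columns):
        # `columns` names the columns the query needs; `quals` (filters)
        # are ignored here, so PostgreSQL applies any WHERE clause itself.
        for i in range(self.limit):
            row = {"n": i, "square": i * i}
            yield {col: row.get(col) for col in columns}
```

A real wrapper would replace the loop with calls to the remote data source and could inspect quals to push filters down instead of returning every row.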

Layered querying

Layered querying is implemented as a foreign data wrapper. It allows any PostgreSQL client to query large remote Splitgraph datasets by downloading just the required table regions on the fly, using bloom filters and other metadata.

In addition, layered querying runs directly against cstore_fdw fragments without checking the data out, which can sometimes result in faster reads and lower I/O load than normal PostgreSQL tables.
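The role of bloom filters in selecting which table regions to download can be sketched as follows. This is a generic illustration of the technique, not Splitgraph's actual index format: the class, the fragment names and the sizing parameters are all assumptions. A bloom filter never yields false negatives, so a fragment whose filter rejects a key can be skipped safely, while an occasional false positive only costs an unnecessary download.

```python
import hashlib
from typing import Dict, Iterator, List


class BloomFilter:
    """Small bloom filter backed by a Python int used as a bit array."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 4) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, value: object) -> Iterator[int]:
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, value: object) -> None:
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value: object) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def fragments_to_fetch(index: Dict[str, BloomFilter], key: object) -> List[str]:
    """Return the fragments that may hold `key`; the rest can be skipped."""
    return [name for name, bloom in index.items() if bloom.might_contain(key)]
```

With one filter per fragment, a point lookup consults only this small index first and then downloads just the fragments returned by fragments_to_fetch.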

Socrata

Splitgraph has first-class support for querying datasets on the Socrata open data platform through SQL and using them in Splitfiles.

Support for Socrata is also implemented using a foreign data wrapper. This allows Splitgraph to be used as a PostgreSQL-to-Socrata connector. Any PostgreSQL application, client or dashboarding tool can query Socrata datasets through Splitgraph and even run joins on datasets from different Socrata endpoints or between Socrata datasets and Splitgraph images.

Replication

You can add a Splitgraph engine as a logical replication client of a production PostgreSQL database and periodically commit the replicated changes as new Splitgraph images. See the replication guide for an example.