Port 5432 is open...
Query 40k+ datasets with SQL

-- Join across two tables at different government data portals (Chicago and Cambridge)
-- Splitgraph will rewrite the queries into the providers' query language, get the data
-- and run the JOIN, returning the results over the PostgreSQL protocol.
SELECT
    cambridge_cases.date AS date,
    chicago_cases.cases_total AS chicago_daily_cases,
    cambridge_cases.new_positive_cases AS cambridge_daily_cases
FROM
    "cityofchicago/covid19-daily-cases-deaths-and-hospitalizations-naz8-j4nc".covid19_daily_cases_deaths_and_hospitalizations chicago_cases
FULL OUTER JOIN
    "cambridgema-gov/covid19-cumulative-cases-by-date-tdt9-vq5y".covid19_cumulative_cases_by_date cambridge_cases
ON
    date_trunc('day', chicago_cases.lab_report_date) = cambridge_cases.date
ORDER BY date ASC;
Join across tables from live datasets

Connect to the Data Delivery Network with any PostgreSQL client.

Host: data.splitgraph.com
Port: 5432
Database Name: ddn
Connection URI: postgresql://data.splitgraph.com/ddn
Username / Password: Sign up with email and password
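
As a minimal sketch, the connection parameters listed above can be assembled into a connection URI in Python. The function name `build_ddn_uri` and the placeholder credentials are assumptions made for illustration; only the host, port, and database name come from this page.

```python
from urllib.parse import quote

# Connection parameters as listed on this page.
DDN_HOST = "data.splitgraph.com"
DDN_PORT = 5432
DDN_DATABASE = "ddn"


def build_ddn_uri(username: str, password: str) -> str:
    """Return a libpq-style connection URI for the DDN.

    Credentials are percent-encoded so that characters such as '@' or '/'
    do not break the URI structure.
    """
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"postgresql://{user}:{secret}@{DDN_HOST}:{DDN_PORT}/{DDN_DATABASE}"


# Hypothetical placeholder credentials, for illustration only.
uri = build_ddn_uri("my_api_key", "my_api_secret")
print(uri)
```

A URI in this form should be accepted by clients that take libpq-style connection strings, such as `psql` or `psycopg2.connect()`.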

What is Splitgraph?
Splitgraph is an integrated data catalog and database proxy.

Discover Data

The Splitgraph catalog indexes 40k+ data sources, including both live databases and versioned data snapshots called "data images." Discover data and explore it with features like an auto-generated REST API, schema documentation, and provenance tracking.

Explore the Catalog
 

Query Data

Connect to the Data Delivery Network (DDN) to query the catalog like it's a Postgres database. The DDN is a distributed SQL caching proxy built on the PostgreSQL wire protocol. It can route queries to any data in the catalog, whether that's a live database or a specific version of a data image.

Connect Now
 

Build & Share Data

Build versioned datasets from your own data, package them, and push them to the Splitgraph catalog for other people to discover and query. Store the data as column-oriented, delta-compressed objects in an S3-compatible object store. Push the metadata to Splitgraph peers, like Splitgraph.com.

Learn more about Splitgraph Cloud.

Built Around an Open Core

Splitgraph.com is a hosted service built around Splitgraph Core.
It adds features like a public SQL proxy and data catalog.

Discover Data in the Catalog

We index 40k+ public datasets and make them queryable with SQL.

Explore over 40,000 datasets »
Connect to DDN
Read the Docs »

Build Reproducible Data Snapshots

Combine data sources into reproducible data "images" using a CI-friendly build process.

FROM demo/weather IMPORT rdu AS source_data

SQL CREATE TABLE monthly_summary AS ( \
    SELECT to_char(date, 'YYYYMM') AS month, \
        AVG(precipitation) AS average_precipitation, \
        AVG(snowfall) AS average_snowfall \
    FROM source_data \
    GROUP BY month \
    ORDER BY month ASC)
Import specific tables from upstream sources
  •  Splitfiles
     Define transformations on data using a declarative syntax that will be familiar to anyone who has written a Dockerfile. Enjoy full access to the SQL language, and reference other Splitgraph data images or foreign tables with a simple JOIN.
     Discover Splitfiles
  •  Provenance
     Datasets built with Splitfiles have all their sources recorded, meaning Splitgraph knows exactly where your data came from and when to rebuild it. Stay on top of your data, so it does not drift out of date when upstream data sources change.
     See an example of provenance in the catalog
     Learn more about provenance
  •  Caching
     Rebuild data only if the sources have changed. Integrate Splitfiles into your CI pipeline to keep your data up to date and download only the changes to upstream datasets.
     See how Splitfiles can fit in your CI pipeline
  •  Data Versioning
     Switch between different versions of your data, capture changes, and send and receive revisions, all without rewriting any of your tools, just like Git.
     Discover how change tracking works
EXAMPLE
Import data from a CSV, then reference it in a Splitfile to build a derivative image.
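
The caching idea described above (rebuild only when a source has changed) can be sketched in Python. This illustrates the general technique, not Splitgraph's actual implementation; the function names and the image hashes below are hypothetical.

```python
import hashlib
import json
from typing import Mapping


def provenance_fingerprint(sources: Mapping[str, str]) -> str:
    """Hash a mapping of source repository -> image hash into one fingerprint.

    Keys are sorted so the fingerprint does not depend on insertion order.
    """
    canonical = json.dumps(dict(sources), sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def needs_rebuild(recorded: Mapping[str, str], current: Mapping[str, str]) -> bool:
    """Return True if any upstream source differs from what the last build recorded."""
    return provenance_fingerprint(recorded) != provenance_fingerprint(current)


# Hypothetical provenance recorded by the last build, and two possible upstream states.
recorded = {"demo/weather": "aaaa1111"}
unchanged = {"demo/weather": "aaaa1111"}
updated = {"demo/weather": "bbbb2222"}

print(needs_rebuild(recorded, unchanged))
print(needs_rebuild(recorded, updated))
```

A CI job could run a check like this before each build and skip the build step when nothing upstream has changed.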

Push Data to Splitgraph

Push images to Splitgraph using an immutable and content-addressable storage format.

  •  Peer-to-Peer
     Any Splitgraph engine can act as a remote peer. Push and pull data between Splitgraph installations, or publish it to Splitgraph Cloud using the same protocol.
     Try a decentralized demo
  •  Auto-generated REST API
     Get an instant, auto-generated, OpenAPI-compatible REST API for every version of your data when you push to Splitgraph Cloud, thanks to the power of PostgREST. Query any version of your data with a simple HTTP request. More tools coming soon.
     Try the splitgraph/socrata REST API
  •  S3-Compatible Blob Storage
     Splitgraph stores your data as columnar chunks in any S3-compatible object store, and Postgres only needs to keep track of lightweight metadata until you're ready to query it. Download data only when you need it, without the need for a bulky always-on warehouse.
     Try an example of pushing to object storage
$ sgr push votes_by_state
Pushing votes_by_state to splitgraph-demo/votes_by_state on remote data.splitgraph.com
Gathering remote metadata...
No objects to upload.
Uploaded metadata for 2 images, 1 table, 0 objects and 0 tags.
Setting upstream for votes_by_state to splitgraph-demo/votes_by_state.
Push data to Splitgraph
  •  Delta Compression
     Splitgraph tables are composed of delta-compressed objects. Keep track of how your data changed through history at low storage cost and bring your datasets up to date without redownloading them.
     Learn how Splitgraph stores objects
  •  Content-addressable chunks
     Splitgraph objects are immutable and content-addressable, allowing Splitgraph to automatically deduplicate data and store multiple versions efficiently. Focus on what to put into your data warehouse, not how to store it.
     See content addressability in action
  •  Layered querying
     Don't download the whole dataset just to run one SELECT. Splitgraph lets your software query remote data by lazily downloading only the required fragments.
     Learn about Layered Querying
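
To illustrate why content-addressable chunks allow deduplication across versions, here is a minimal Python sketch. It assumes SHA-256 as the content hash and a fixed two-row chunk size; the `ChunkStore` class and `store_version` helper are hypothetical and do not reflect Splitgraph's actual object format.

```python
import hashlib


class ChunkStore:
    """A toy in-memory store that keys each chunk by the hash of its content."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        object_id = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same ID, so it is stored only once.
        self._objects.setdefault(object_id, data)
        return object_id

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id]

    def __len__(self) -> int:
        return len(self._objects)


def store_version(store: ChunkStore, rows: list, chunk_size: int = 2) -> list:
    """Split rows into fixed-size chunks and return the list of chunk IDs."""
    ids = []
    for start in range(0, len(rows), chunk_size):
        chunk = "\n".join(rows[start:start + chunk_size]).encode("utf-8")
        ids.append(store.put(chunk))
    return ids


store = ChunkStore()
v1 = store_version(store, ["a,1", "b,2", "c,3", "d,4"])
v2 = store_version(store, ["a,1", "b,2", "c,3", "d,5"])  # only the last row changed

print(len(store))      # 3 unique chunks are stored for the two versions
print(v1[0] == v2[0])  # True: the unchanged first chunk is shared
```

Each version is just a list of chunk IDs, so a second version that changes one row adds only one new chunk to the store.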

Want Splitgraph for your business?

Contact Us

We're developing a "Private Cloud" product.
Want in on the beta? Get in touch.

Read About the Beta »

Run Splitgraph Locally

Run a local Splitgraph Engine on top of Postgres to mount or clone data into tables.

Powered by Postgres

Plug into a growing ecosystem.

  •  Ingest data from anywhere
     Splitgraph "mounting" is built on Postgres Foreign Data Wrappers (FDW). You can "mount" and import data from all major databases. You can set up Splitgraph as a Postgres replication client. Or you can write a custom mount handler to cover your unique use case. Transform the data into a Splitgraph image, or leave it as-is and query it on demand.
     Read the FDW Documentation
  •  Keep Your Existing Tools
     Anything that works with Postgres will work with Splitgraph. As far as your tools are concerned, a Splitgraph image is just another Postgres database. You can adopt Splitgraph incrementally while keeping your existing workflows and benefiting from the Postgres ecosystem.
     See examples of common integrations
Explore Data »
 
Connect to the DDN

Join the Community