Installation

You can install Splitgraph locally to build and query versioned data images. However, you can get started without installing anything: our one-minute demo shows how to connect to the Splitgraph Data Delivery Network with any SQL client and get immediate access to over 40,000 datasets hosted or proxied by Splitgraph.

Requirements

A Splitgraph installation consists of two components: the Splitgraph engine and the Splitgraph client (sgr).

The client ships as a self-contained binary that works on Linux, Mac and Windows without any prerequisites. If you wish to use Splitgraph as a Python library, it can also be installed through pip.

To run the Splitgraph engine, you need to be able to run Docker. Installation instructions are available for Linux, Mac and Windows. On Windows, use Docker Desktop for Windows 10 Pro/Enterprise (15063+) or Windows 10 Home (19018+), and Docker Toolbox for earlier versions of Windows.

To check that Docker is installed and working, run:

$ docker run hello-world
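
If the hello-world check fails, it helps to tell a missing Docker client apart from a daemon that is not running. The following is a minimal sketch of such a preflight check. The helper names have_cmd and docker_ready are made up for illustration and are not part of Docker or Splitgraph.

```shell
# Hypothetical helpers, not part of Docker or sgr.

# Succeeds if the named command can be found by the shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if the docker client exists and can reach the daemon.
docker_ready() {
    if ! have_cmd docker; then
        echo "docker client not found on PATH" >&2
        return 1
    fi
    # "docker info" needs a working connection to the daemon.
    if ! docker info >/dev/null 2>&1; then
        echo "docker is installed but the daemon is not reachable" >&2
        return 1
    fi
    echo "docker is ready"
}

# Usage: docker_ready && docker run hello-world
```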

Installation

Single script

For Linux and Mac, there's a single script available:

$ bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"

This script will:

  • Download the sgr binary from the releases page
  • Set up and initialize the Splitgraph Engine
  • Register you on Splitgraph Cloud (data.splitgraph.com). This is completely optional: Splitgraph can be used in a decentralized manner, like Git.

If you're running Windows or would like to follow the steps manually, read on.
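
Passing a remote script straight to bash runs it unseen. A more cautious variant downloads the script to a file first, so you can read it before running it. This is a sketch rather than an official procedure: the URL is the one from the command above, while the fetch_installer helper and the destination path are illustrative.

```shell
# URL taken from the one-line install command above.
INSTALL_URL="https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh"

# Hypothetical helper: download the installer to the given path without running it.
fetch_installer() {
    dest="$1"
    # -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects
    curl -fsSL -o "$dest" "$INSTALL_URL"
}

# Usage:
#   fetch_installer ./install.sh
#   less ./install.sh      # read what the script is going to do
#   bash ./install.sh
```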

Manual

First, get the binary for sgr from the releases page, available for the three major platforms.

Then, create a new Splitgraph engine by running:

$ sgr engine add

This will:

  • Prompt you for a password. It protects your local engine and is stored in the generated configuration file.
  • Pull and start the latest Splitgraph engine image
  • Name the engine container splitgraph_engine_default (the default) and create two Docker volumes:

    • splitgraph_engine_default_data to store the physical data
    • splitgraph_engine_default_metadata to store the metadata about the composition of Splitgraph images
  • Initialize the engine
  • Generate a minimal .sgconfig file for you in your home directory (~/.splitgraph/.sgconfig).
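
Once sgr engine add finishes, the container and volumes named in the list above should be visible through Docker. Here is a minimal sketch for checking that, assuming the default names. The helpers engine_container_exists and engine_volumes are illustrative names, not sgr commands.

```shell
ENGINE_NAME="splitgraph_engine_default"   # default container name from the list above

# Succeeds if a container with exactly the default engine name exists,
# whether it is running or stopped.
engine_container_exists() {
    docker ps -a --filter "name=${ENGINE_NAME}" --format '{{.Names}}' \
        | grep -qx "$ENGINE_NAME"
}

# Prints the Docker volumes whose names contain the engine name.
engine_volumes() {
    docker volume ls --filter "name=${ENGINE_NAME}" --format '{{.Name}}'
}

# Usage:
#   engine_container_exists && echo "engine container found"
#   engine_volumes
```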

There are extra options in sgr engine for advanced users. To see them, run sgr engine --help or see the CLI reference.

Note for Windows users

If you're running sgr in a MINGW terminal (for example, Git Bash), you have to prefix its invocations with winpty to avoid output errors and make sure password inputs work properly.

The simplest way to do this is by adding alias sgr='winpty sgr' to your .bashrc.
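
To avoid adding the alias twice, or adding it on systems that do not need it, you could guard the edit. The sketch below appends the alias only when the line is not already present, and only when uname reports a MINGW environment. The add_sgr_alias helper is hypothetical.

```shell
# Hypothetical helper: append the winpty alias to a shell startup file once.
add_sgr_alias() {
    rcfile="$1"
    alias_line="alias sgr='winpty sgr'"
    touch "$rcfile"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq "$alias_line" "$rcfile"; then
        printf '%s\n' "$alias_line" >> "$rcfile"
    fi
}

# Only MINGW terminals such as Git Bash need the alias.
case "$(uname -s)" in
    MINGW*) add_sgr_alias "$HOME/.bashrc" ;;
esac
```

Open a new terminal, or source your .bashrc, for the alias to take effect.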

Advanced installation

sgr

You can also install the Splitgraph library from pip. This is useful if you want to use the Splitgraph Python API to manipulate images from your code:

$ pip install splitgraph

If you wish to contribute to Splitgraph itself or get the bleeding-edge version, you can follow the development instructions on our GitHub.

Engine

Docker

You can use Docker directly to pull and start the engine:

$ docker run -d \
    -e POSTGRES_PASSWORD=supersecure \
    -p 5432:5432 \
    -v "$PWD/splitgraph_data":/var/lib/splitgraph/objects \
    -v "$PWD/splitgraph_metadata":/var/lib/postgresql/data \
    splitgraph/engine

By default, sgr is configured to connect to the engine running on localhost:5432 as a superuser called sgr, with the password supersecure, against a database called splitgraph. You can change the credentials used by sgr by editing the configuration file.
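
Because the engine image is configured through POSTGRES_* variables, a standard PostgreSQL client should be able to reach it with the same defaults. The sketch below builds a connection URI from them. The ENGINE_* variable names and the engine_uri helper are illustrative and are not read by sgr. A password containing special characters would need URL-encoding, which this sketch does not do.

```shell
# Defaults described above; override through the environment if yours differ.
# These ENGINE_* names are illustrative and are not used by sgr itself.
ENGINE_HOST="${ENGINE_HOST:-localhost}"
ENGINE_PORT="${ENGINE_PORT:-5432}"
ENGINE_USER="${ENGINE_USER:-sgr}"
ENGINE_PASSWORD="${ENGINE_PASSWORD:-supersecure}"
ENGINE_DB="${ENGINE_DB:-splitgraph}"

# Prints a PostgreSQL connection URI for the engine.
engine_uri() {
    printf 'postgresql://%s:%s@%s:%s/%s\n' \
        "$ENGINE_USER" "$ENGINE_PASSWORD" "$ENGINE_HOST" "$ENGINE_PORT" "$ENGINE_DB"
}

# Usage, if psql is installed:
#   psql "$(engine_uri)" -c 'SELECT 1'
```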

To complete the installation, run:

$ sgr init
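
If you script these steps, sgr init may run before the freshly started container accepts connections. One way to handle that is to poll the port first. The sketch below is an assumption-laden example: wait_for_port is a hypothetical helper, it relies on bash's /dev/tcp feature, and an open port does not by itself guarantee that the database has finished starting.

```shell
# Hypothetical helper: wait until a TCP port accepts connections.
# Relies on bash's /dev/tcp pseudo-device, so it must run under bash.
# Arguments: host, port, number of one-second attempts (default 30).
wait_for_port() {
    host="$1"
    port="$2"
    tries="${3:-30}"
    while [ "$tries" -gt 0 ]; do
        # The subshell opens a connection and closes it again on exit.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage: wait_for_port localhost 5432 && sgr init
```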

Docker Compose

You can also use Docker Compose. A sample Compose configuration for the service is as follows:

version: "3"
services:
  engine:
    image: splitgraph/engine:${DOCKER_TAG-stable}
    ports:
      - "0.0.0.0:5432:5432"
    environment:
      - POSTGRES_USER=sgr
      - POSTGRES_PASSWORD=supersecure
      - POSTGRES_DB=splitgraph
      - SG_LOGLEVEL=INFO
      - SG_CONFIG_FILE=/.sgconfig
    expose:
      - 5432
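    volumes:
      - splitgraph_data:/var/lib/splitgraph/objects
      - splitgraph_metadata:/var/lib/postgresql/data
      - ${HOME}/.splitgraph/.sgconfig:/.sgconfig

# Named volumes referenced by a service must be declared at the top level.
volumes:
  splitgraph_data:
  splitgraph_metadata: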
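
To start the service, one option is a small wrapper like the sketch below. The start_engine name is illustrative. It assumes the configuration above is saved as docker-compose.yml in the current directory and that the Compose V2 plugin (docker compose) is installed; with the older standalone tool the command is docker-compose. It also checks that the bind-mounted configuration file exists, because Docker creates a directory at a missing bind-mount source path.

```shell
# Hypothetical wrapper around "docker compose" for the configuration above.
# Optional argument: image tag. Falls back to $DOCKER_TAG, then to "stable",
# matching the ${DOCKER_TAG-stable} default in the Compose file.
start_engine() {
    config="$HOME/.splitgraph/.sgconfig"
    if [ ! -f "$config" ]; then
        echo "missing $config; create it before starting the engine" >&2
        return 1
    fi
    DOCKER_TAG="${1:-${DOCKER_TAG:-stable}}" docker compose up -d engine
}

# Usage:
#   start_engine
```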

Upgrading

You can upgrade the single-binary sgr by running sgr upgrade, which downloads the new binary and upgrades the engine (if the engine is managed by sgr engine).

sgr engine upgrade upgrades an engine that's managed by sgr engine by deleting the current Docker container and starting a new one, keeping all data volumes intact. While the versions of sgr and of the engine do not need to be the same, we recommend you match them, as we currently test and release the engine and sgr in lockstep.

Configuration file

The Splitgraph configuration file is usually stored in the user's home directory (~/.splitgraph/.sgconfig).
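
Since the configuration file holds the engine password, you may want to locate it and restrict its permissions. The sketch below covers only the two locations mentioned on this page: the SG_CONFIG_FILE variable from the Compose example and the default path in the home directory. The lookup order sgr actually uses may differ, and the sgconfig_path helper is hypothetical.

```shell
# Hypothetical helper: print the configuration file path.
# Prefers SG_CONFIG_FILE when it is set and non-empty, otherwise uses
# the default location in the home directory.
sgconfig_path() {
    printf '%s\n' "${SG_CONFIG_FILE:-$HOME/.splitgraph/.sgconfig}"
}

# Usage:
#   ls -l "$(sgconfig_path)"
#   chmod 600 "$(sgconfig_path)"   # readable and writable by the owner only
```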

For more information on configuring Splitgraph, see the configuration reference.

Next Steps

You can take a look at the five-minute demo, which goes through the basics of building and extending data images. You can also try out some self-contained example Splitgraph projects on our GitHub.