Installation
You can install Splitgraph locally to build and query versioned data images. However, you can also get started with Splitgraph without installing anything: our one-minute demo shows how to connect to the Splitgraph Data Delivery Network with any SQL client and get immediate access to over 40,000 datasets hosted or proxied by Splitgraph.
Requirements
A Splitgraph installation consists of two components: the Splitgraph engine and the Splitgraph client (sgr).
The client ships as a self-contained binary that works on Linux, Mac and Windows without any prerequisites. If you wish to use Splitgraph as a Python library, it can also be installed through pip.
To run the Splitgraph engine, you will need to be able to run Docker. There are instructions available for installing Docker on Linux, Mac and Windows. On Windows, use Docker Desktop for Windows 10 Pro/Enterprise (build 15063+) or Windows 10 Home (build 19018+), or Docker Toolbox for earlier versions of Windows.
To check that Docker is installed and working, run:
$ docker run hello-world
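The same check can be scripted, for example as a pre-flight step before setting up the engine. The following is a minimal Python sketch and not part of Splitgraph: the helper names command_succeeds and docker_is_working are made up for illustration, and the --rm flag (which removes the hello-world container afterwards) is an addition to the command above.

```python
import shutil
import subprocess
from typing import Sequence


def command_succeeds(argv: Sequence[str], timeout: float = 120.0) -> bool:
    """Return True if the command exists and exits with status 0."""
    # Fail early if the executable is not on PATH (or is not a valid path).
    if shutil.which(argv[0]) is None:
        return False
    try:
        result = subprocess.run(
            list(argv), capture_output=True, timeout=timeout, check=False
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def docker_is_working() -> bool:
    """Hypothetical wrapper around the `docker run hello-world` check."""
    return command_succeeds(["docker", "run", "--rm", "hello-world"])
```

Calling docker_is_working() returns False both when Docker is missing and when the daemon is not reachable, so it only tells you that something needs fixing, not what.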
Installation
Single script
For Linux and macOS, there's a single installation script available:
$ bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
This script will:
- Download the sgr binary from the releases page
- Set up and initialize the Splitgraph engine
- Register you on Splitgraph Cloud (data.splitgraph.com). This is completely optional: Splitgraph can be used in a decentralized manner, like Git.
If you're running Windows or would like to follow the steps manually, read on.
Manual
First, get the binary for sgr from the releases page, available for the three major platforms.
Then, create a new Splitgraph engine:
$ sgr engine add
This will:
- Prompt you for a password: this is just the password used to protect your local engine and will be stored in the generated configuration file.
- Pull and start the latest Splitgraph engine image. By default, sgr will name the engine container splitgraph_engine_default and create two Docker volumes:
  - splitgraph_engine_default_data to store the physical data
  - splitgraph_engine_default_metadata to store the metadata about the composition of Splitgraph images
- Initialize the engine
- Generate a minimal .sgconfig file for you in your home directory (~/.splitgraph/.sgconfig).
There are extra options in sgr engine for advanced users. To see them, run sgr engine --help or see the CLI reference.
Note for Windows users
If you're running sgr in a MINGW terminal (for example, Git Bash), you have to prefix its invocations with winpty to avoid output errors and make sure password inputs work properly.
The simplest way to do this is by adding alias sgr='winpty sgr' to your .bashrc.
Advanced installation
sgr
You can also install the Splitgraph library from pip. This is useful if you want to use the Splitgraph Python API to manipulate images from your code:
$ pip install splitgraph
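To confirm that the package landed in the Python environment you intend to use, you can ask the import system whether it can find it. This is a minimal sketch using only the standard library; is_installed is a hypothetical helper, and the only Splitgraph-specific assumption is that the top-level module is named splitgraph, like the pip package.

```python
import importlib.util


def is_installed(package: str = "splitgraph") -> bool:
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    print("splitgraph importable:", is_installed())
```

Because find_spec does not execute the package, this check is cheap and has no side effects, but it also does not prove that the package imports cleanly.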
If you wish to contribute to Splitgraph itself or get the bleeding-edge version, you can follow the development instructions on our GitHub.
Engine
Docker
You can use Docker directly to pull and start the engine:
$ docker run -d \
    -e POSTGRES_PASSWORD=supersecure \
    -p 5432:5432 \
    -v "$PWD/splitgraph_data:/var/lib/splitgraph/objects" \
    -v "$PWD/splitgraph_metadata:/var/lib/postgresql/data" \
    splitgraph/engine
By default, sgr is configured to speak to the engine running on localhost:5432 with a superuser account called sgr and a password supersecure against a database called splitgraph. You can change the credentials used by sgr by editing the configuration file.
To complete the installation, run:
$ sgr init
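The default connection parameters described above (host, port, user, password and database) can be collected into a PostgreSQL-style connection URI for use with other SQL clients. The sketch below assumes that the engine accepts ordinary PostgreSQL client connections; EngineConnection and to_uri are illustrative names, not part of the Splitgraph API.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class EngineConnection:
    """Default engine credentials as listed in this section."""

    host: str = "localhost"
    port: int = 5432
    user: str = "sgr"
    password: str = "supersecure"
    database: str = "splitgraph"

    def to_uri(self) -> str:
        # Percent-encode the parts that may contain reserved characters.
        user = quote(self.user, safe="")
        password = quote(self.password, safe="")
        database = quote(self.database, safe="")
        return f"postgresql://{user}:{password}@{self.host}:{self.port}/{database}"


if __name__ == "__main__":
    print(EngineConnection().to_uri())
```

If you changed the password when starting the container, pass it explicitly, for example EngineConnection(password="...").to_uri(), and keep it in sync with the configuration file used by sgr.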
Docker Compose
You can also use Docker Compose. A sample Compose configuration for the service is as follows:
version: "3"
services:
  engine:
    image: splitgraph/engine:${DOCKER_TAG-stable}
    ports:
      - "0.0.0.0:5432:5432"
    environment:
      - POSTGRES_USER=sgr
      - POSTGRES_PASSWORD=supersecure
      - POSTGRES_DB=splitgraph
      - SG_LOGLEVEL=INFO
      - SG_CONFIG_FILE=/.sgconfig
    expose:
      - 5432
    volumes:
      - splitgraph_data:/var/lib/splitgraph/objects
      - splitgraph_metadata:/var/lib/postgresql/data
      - ${HOME}/.splitgraph/.sgconfig:/.sgconfig
# Named volumes used by a service must also be declared at the top level.
volumes:
  splitgraph_data:
  splitgraph_metadata:
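The image tag in the sample uses Compose's ${VARIABLE-default} interpolation: the default (stable) is used only when DOCKER_TAG is unset. The Python sketch below mimics that rule to make it concrete. resolve_engine_image is a hypothetical helper written for this page, and any tag value you pass in is your own choice rather than something this page prescribes.

```python
from typing import Mapping


def resolve_engine_image(environ: Mapping[str, str]) -> str:
    """Mimic `splitgraph/engine:${DOCKER_TAG-stable}` from the sample config."""
    # `${VAR-default}` falls back only when VAR is unset; an empty value is
    # kept as-is (the `${VAR:-default}` form would also replace empty values).
    tag = environ.get("DOCKER_TAG", "stable")
    return f"splitgraph/engine:{tag}"
```

With an empty environment this resolves to the stable tag; setting DOCKER_TAG before running Compose pins the engine to a specific image tag instead.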
Upgrading
You can upgrade the single-binary sgr by running sgr upgrade, which will download the new binary and upgrade the engine (if the engine is managed by sgr engine).
sgr engine upgrade upgrades an engine that's managed by sgr engine by deleting the current Docker container and starting a new one, keeping all data volumes intact.
While the versions of sgr and of the engine do not need to be the same, we recommend you match them, as we currently test and release the engine and sgr in lockstep.
Configuration file
The Splitgraph configuration file is usually stored in the user's home directory (~/.splitgraph/.sgconfig).
For more information on configuring Splitgraph, see the configuration reference.
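If you need to locate the configuration file from a script, the sketch below checks the two locations mentioned on this page: the SG_CONFIG_FILE environment variable (as set in the Compose sample) and ~/.splitgraph/.sgconfig. The lookup order is an assumption made for this example, and find_sgconfig is a hypothetical helper; the order sgr itself uses may differ, so treat the configuration reference as authoritative.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_sgconfig(
    environ: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the first existing config file, or None if there is none."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)

    candidates = []
    # Assumed to take priority: an explicit override via SG_CONFIG_FILE.
    override = environ.get("SG_CONFIG_FILE")
    if override:
        candidates.append(Path(override))
    # The usual location in the user's home directory.
    candidates.append(home / ".splitgraph" / ".sgconfig")

    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Calling find_sgconfig() with no arguments uses the real environment and home directory; the parameters exist so the function can be exercised against a temporary directory.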
Next Steps
You can take a look at the five-minute demo, which goes through the basics of building and extending data images. You can also try out some self-contained example Splitgraph projects on our GitHub.