splitgraph.engine.postgres package

Submodules

splitgraph.engine.postgres.engine module

Default Splitgraph engine: uses PostgreSQL to store metadata and actual objects, an audit stored procedure to track changes, and the Postgres FDW interface to upload/download objects to/from other Postgres engines.

class splitgraph.engine.postgres.engine.AuditTriggerChangeEngine(conn_params, name)

Bases: splitgraph.engine.postgres.engine.PsycopgEngine, splitgraph.engine.ChangeEngine

Change tracking based on an audit trigger stored procedure

discard_pending_changes(schema, table=None)

Discard recorded pending changes for a tracked schema / table

get_changed_tables(schema)

Get list of tables that have changed content

get_pending_changes(schema, table, aggregate=False)

Return pending changes for a given tracked table

Parameters
  • schema – Schema the table belongs to

  • table – Table to return changes for

  • aggregate – Whether to aggregate changes or return them completely

Returns

If aggregate is True: tuple with numbers of (added_rows, removed_rows, updated_rows). If aggregate is False: List of (primary_key, change_type, change_data)
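The two documented return shapes can be sketched with plain Python data (illustrative only; the values and the change entries below are hypothetical, and no engine connection is needed):

```python
# aggregate=True: a tuple of counts (added_rows, removed_rows, updated_rows)
aggregated = (10, 2, 3)
added, removed, updated = aggregated

# aggregate=False: a list of (primary_key, change_type, change_data) entries;
# the primary key is shown as a tuple so composite keys fit the same shape.
detailed = [
    ((1,), 0, {"name": "alice"}),  # hypothetical change entry
    ((2,), 1, {}),
]
for pk, change_type, change_data in detailed:
    print(pk, change_type, change_data)
```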

get_tracked_tables()

Return a list of tables that the audit trigger is working on.

has_pending_changes(schema)

Return True if the tracked schema has pending changes and False if it doesn’t.

track_tables(tables)

Install the audit trigger on the required tables

untrack_tables(tables)

Remove triggers from tables and delete their pending changes

class splitgraph.engine.postgres.engine.PostgresEngine(conn_params, name)

Bases: splitgraph.engine.postgres.engine.AuditTriggerChangeEngine, splitgraph.engine.ObjectEngine

An implementation of the Postgres engine for Splitgraph

apply_fragments(objects, target_schema, target_table, extra_quals=None, extra_qual_args=None, schema_spec=None)

Apply multiple fragments to a target table as a single-query batch operation.

Parameters
  • objects – List of tuples (object_schema, object_table) that the objects are stored in.

  • target_schema – Schema to apply the fragment to

  • target_table – Table to apply the fragment to

  • extra_quals – Optional, extra SQL (Composable) clauses to filter new rows in the fragment on (e.g. SQL(“a = %s”))

  • extra_qual_args – Optional, a tuple of arguments to use with extra_quals

  • schema_spec – Optional, list of (ordinal, column_name, column_type, is_pk). If not specified, uses the schema of target_table.
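A schema_spec argument is a list of (ordinal, column_name, column_type, is_pk) tuples, the same format returned by get_object_schema. A minimal sketch (the column names and types here are hypothetical examples):

```python
# Building a schema_spec for apply_fragments:
# (ordinal, column_name, column_type, is_pk)
schema_spec = [
    (1, "id", "integer", True),
    (2, "name", "character varying", False),
    (3, "created", "timestamp", False),
]

# The primary-key columns can be recovered from the spec:
pk_columns = [name for _, name, _, is_pk in schema_spec if is_pk]
print(pk_columns)
```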

delete_objects(object_ids)

Delete one or more objects from the engine.

Parameters

object_ids – IDs of objects to delete

download_objects(objects, remote_engine)

Download objects from the remote engine to the local cache

Parameters
  • objects – List of object IDs to download

  • remote_engine – A remote ObjectEngine to download the objects from.

Returns

List of object IDs that were downloaded.

dump_object(object_id, stream, schema)

Dump an object into a series of SQL statements

Parameters
  • object_id – Object ID

  • stream – Text stream to dump the object into

  • schema – Schema the object lives in

get_object_schema(object_id)

Get the schema of a given object, returned as a list of (ordinal, column_name, column_type, is_pk).

Parameters

object_id – ID of the object

get_object_size(object_id)

Return the on-disk footprint of this object, in bytes.

Parameters

object_id – ID of the object

store_fragment(inserted, deleted, schema, table, source_schema, source_table)

Store a fragment of a changed table in another table

Parameters
  • inserted – List of PKs that have been updated/inserted

  • deleted – List of PKs that have been deleted

  • schema – Schema to store the change in

  • table – Table to store the change in

  • source_schema – Schema the source table is located in

  • source_table – Name of the source table
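The inserted and deleted arguments are both lists of primary keys. A small sketch of the expected shapes, using hypothetical key values and tuple-valued keys so composite primary keys fit the same form:

```python
# PKs upserted in the source table since the last snapshot:
inserted = [(1,), (2,)]
# PKs removed from the source table:
deleted = [(3,)]

# The fragment covers the union of both key sets:
touched = set(inserted) | set(deleted)
print(sorted(touched))
```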

store_object(object_id, source_schema, source_table)

Stores a Splitgraph object located in a staging table in the actual format implemented by this engine.

At the end of this operation, the staging table must be deleted.

Parameters
  • source_schema – Schema the staging table is located in.

  • source_table – Name of the staging table

  • object_id – Name of the object

upload_objects(objects, remote_engine)

Upload objects from the local cache to the remote engine

Parameters
  • objects – List of object IDs to upload

  • remote_engine – A remote ObjectEngine to upload the objects to.

class splitgraph.engine.postgres.engine.PsycopgEngine(conn_params, name)

Bases: splitgraph.engine.SQLEngine

Postgres SQL engine backed by a Psycopg connection.

close()

Commit and close the engine’s backing connection

commit()

Commit the engine’s backing connection

property connection

Engine-internal Psycopg connection.

delete_database(database)

Helper function to drop a database using the admin connection

Parameters

database – Database name to drop

dump_table_sql(schema, table_name, stream, columns='*', where='', where_args=None, target_schema=None, target_table=None)

Dump the table contents in the SQL format.

Parameters
  • schema – Schema the table is located in

  • table_name – Name of the table

  • stream – A file-like object to write the result into

  • columns – SQL column spec. Default ‘*’

  • where – Optional, an SQL WHERE clause

  • where_args – Arguments for the optional WHERE clause

  • target_schema – Schema to create the table in (default same as schema)

  • target_table – Name of the table to insert data into (default same as table_name)
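Because stream is any file-like object, the dump can be captured in memory with io.StringIO. A sketch of the stream-based interface, using a stand-in writer in place of a live engine (the SQL emitted here is a hypothetical example of the INSERT statements a dump produces):

```python
import io

def dump_stub(stream, target_schema, target_table):
    # Stand-in for dump_table_sql: write one example INSERT statement
    # into the supplied stream.
    stream.write(
        'INSERT INTO "{}"."{}" VALUES (1, \'alice\');\n'.format(
            target_schema, target_table
        )
    )

buf = io.StringIO()
dump_stub(buf, "public", "users")
sql_text = buf.getvalue()
print(sql_text)
```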

get_primary_keys(schema, table)

Inspects the Postgres information_schema to get the primary keys for a given table.

initialize(skip_audit=False)

Create the Splitgraph Postgres database and install the audit trigger

lock_table(schema, table)

Acquire an exclusive lock on a given table, released when the transaction commits / rolls back.

rollback()

Rollback the engine’s backing connection

run_sql(statement, arguments=None, return_shape=<ResultShape.MANY_MANY: 4>, named=False)

Run an arbitrary SQL statement with some arguments, return an iterator of results. If the statement doesn’t return any results, return None. If named=True, return named tuples when possible.
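The default return_shape value shown above (ResultShape.MANY_MANY) suggests a family of shapes that collapse the full result set in different ways. A plain-Python illustration of that idea; the shape names other than MANY_MANY are assumptions inferred from the naming pattern, not confirmed API:

```python
# A full result set, as a list of row tuples:
rows = [(1, "alice"), (2, "bob")]

many_many = rows                 # all rows, all columns
many_one = [r[0] for r in rows]  # first column of every row
one_many = rows[0]               # first row only
one_one = rows[0][0]             # single scalar

print(many_one, one_many, one_one)
```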

run_sql_batch(statement, arguments, schema=None)

Run a parameterized SQL statement against multiple sets of arguments.

Parameters
  • statement – Statement to run

  • arguments – Query arguments

  • schema – Schema to run the statement in
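The arguments parameter is a sequence of argument tuples applied to one parameterized statement, executemany-style. A sketch of the expected shapes (the table and values are hypothetical; the run_sql_batch call is shown commented out since it needs a live engine):

```python
statement = "INSERT INTO users (id, name) VALUES (%s, %s)"
arguments = [(1, "alice"), (2, "bob"), (3, "carol")]

# With an engine instance available:
# engine.run_sql_batch(statement, arguments, schema="public")
print(len(arguments))
```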

splitgraph.engine.postgres.engine.make_conn(server, port, username, password, dbname)

Initializes a connection to a Splitgraph Postgres engine.

Returns

Psycopg connection object
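The five make_conn parameters map directly onto psycopg2 connection keywords. A minimal sketch that builds the keyword dict without opening a real connection (no server is assumed; the credentials are placeholder values):

```python
def conn_kwargs(server, port, username, password, dbname):
    # Map make_conn's parameters onto psycopg2.connect keywords.
    return {
        "host": server,
        "port": port,
        "user": username,
        "password": password,
        "dbname": dbname,
    }

kwargs = conn_kwargs("localhost", 5432, "sgr", "secret", "splitgraph")
# With a server available: psycopg2.connect(**kwargs)
print(kwargs["dbname"])
```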

Module contents