splitgraph.hooks.data_source package

Module contents

splitgraph.hooks.data_source.get_data_source(data_source: str) → Type[splitgraph.hooks.data_source.base.DataSource]

Returns the class for a given data source name.

splitgraph.hooks.data_source.get_data_sources() → List[str]

Returns the names of all registered data sources.

splitgraph.hooks.data_source.register_data_source(name: str, data_source_class: Type[splitgraph.hooks.data_source.base.DataSource]) → None

Registers a data source class under a given name.
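
Taken together, the three functions above behave like a name-to-class registry. The following is a minimal standalone sketch of that pattern, not Splitgraph's implementation: the `_REGISTRY` dictionary, the `KeyError` raised for an unknown name, and the `CsvSource` class are all assumptions made for illustration.

```python
from typing import Dict, List, Type


class DataSource:
    """Stand-in for splitgraph.hooks.data_source.base.DataSource."""


# Hypothetical module-level registry mapping names to classes.
_REGISTRY: Dict[str, Type[DataSource]] = {}


def register_data_source(name: str, data_source_class: Type[DataSource]) -> None:
    """Register a data source class under a given name."""
    _REGISTRY[name] = data_source_class


def get_data_source(data_source: str) -> Type[DataSource]:
    """Return the class registered under a name.

    Raising KeyError for an unknown name is an assumption of this sketch.
    """
    return _REGISTRY[data_source]


def get_data_sources() -> List[str]:
    """Return the names of all registered data sources."""
    return list(_REGISTRY)


class CsvSource(DataSource):
    """Hypothetical plugin class used only for this example."""


register_data_source("csv", CsvSource)
print(get_data_sources())
print(get_data_source("csv") is CsvSource)
```

Note that `get_data_source` returns the class itself rather than an instance, which matches the `Type[...]` return annotation in the signature.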

Submodules

splitgraph.hooks.data_source.base module

class splitgraph.hooks.data_source.base.DataSource(engine: PostgresEngine, credentials: Dict[str, Any], params: Dict[str, Any])

Bases: abc.ABC

credentials_schema: Dict[str, Any]
abstract classmethod get_description() → str
abstract classmethod get_name() → str
abstract introspect() → Dict[str, List[splitgraph.core.types.TableColumn]]
params_schema: Dict[str, Any]
supports_load = False
supports_mount = False
supports_sync = False
class splitgraph.hooks.data_source.base.LoadableDataSource(engine: PostgresEngine, credentials: Dict[str, Any], params: Dict[str, Any])

Bases: splitgraph.hooks.data_source.base.DataSource, abc.ABC

credentials_schema: Dict[str, Any]
load(repository: Repository, tables: Optional[Union[List[str], Dict[str, List[splitgraph.core.types.TableColumn]]]] = None) → str
params_schema: Dict[str, Any]
supports_load = True
class splitgraph.hooks.data_source.base.MountableDataSource(engine: PostgresEngine, credentials: Dict[str, Any], params: Dict[str, Any])

Bases: splitgraph.hooks.data_source.base.DataSource, abc.ABC

credentials_schema: Dict[str, Any]
abstract mount(schema: str, tables: Optional[Union[List[str], Dict[str, List[splitgraph.core.types.TableColumn]]]] = None, overwrite: bool = True)

Instantiate the data source as foreign tables in a schema

params_schema: Dict[str, Any]
supports_mount = True
class splitgraph.hooks.data_source.base.SyncableDataSource(engine: PostgresEngine, credentials: Dict[str, Any], params: Dict[str, Any])

Bases: splitgraph.hooks.data_source.base.LoadableDataSource, splitgraph.hooks.data_source.base.DataSource, abc.ABC

credentials_schema: Dict[str, Any]
params_schema: Dict[str, Any]
supports_load = True
supports_sync = True
sync(repository: Repository, image_hash: Optional[str], tables: Optional[Union[List[str], Dict[str, List[splitgraph.core.types.TableColumn]]]] = None) → str

splitgraph.hooks.data_source.base.get_ingestion_state(repository: Repository, image_hash: Optional[str]) → Optional[Dict[str, Any]]

splitgraph.hooks.data_source.base.getrandbits(k) → x

Generates an int with k random bits.

splitgraph.hooks.data_source.base.prepare_new_image(repository: Repository, hash_or_tag: Optional[str]) → Tuple[Optional[splitgraph.core.image.Image], str]
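
The abstract methods and `supports_*` flags listed for the base classes suggest the shape a plugin is expected to have. Here is a minimal standalone sketch using simplified stand-ins for `DataSource` and `LoadableDataSource`, with no engine or repository involved. The `Column` alias, the `InMemorySource` class and the `capabilities` helper are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple, Type

# Simplified stand-in for splitgraph.core.types.TableColumn (an assumption):
# a (column name, PostgreSQL type) pair.
Column = Tuple[str, str]


class DataSource(ABC):
    """Stand-in base class mirroring the flags and abstract methods above."""

    supports_mount = False
    supports_load = False
    supports_sync = False

    def __init__(self, credentials: Dict[str, Any], params: Dict[str, Any]):
        self.credentials = credentials
        self.params = params

    @classmethod
    @abstractmethod
    def get_name(cls) -> str: ...

    @classmethod
    @abstractmethod
    def get_description(cls) -> str: ...

    @abstractmethod
    def introspect(self) -> Dict[str, List[Column]]: ...


class LoadableDataSource(DataSource, ABC):
    supports_load = True


class InMemorySource(LoadableDataSource):
    """Hypothetical plugin that reports a fixed set of tables."""

    @classmethod
    def get_name(cls) -> str:
        return "In-memory example"

    @classmethod
    def get_description(cls) -> str:
        return "Serves a hard-coded table list"

    def introspect(self) -> Dict[str, List[Column]]:
        return {"events": [("id", "integer"), ("payload", "text")]}


def capabilities(cls: Type[DataSource]) -> List[str]:
    """List which of mount/load/sync a class advertises via its flags."""
    return [c for c in ("mount", "load", "sync") if getattr(cls, f"supports_{c}")]


source = InMemorySource(credentials={}, params={})
print(capabilities(InMemorySource), sorted(source.introspect()))
```

Because the flags are plain class attributes, a caller can check what a data source supports without instantiating it.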

splitgraph.hooks.data_source.fdw module

class splitgraph.hooks.data_source.fdw.ElasticSearchDataSource(engine: PostgresEngine, credentials: Optional[Dict[str, Any]] = None, params: Optional[Dict[str, Any]] = None)

Bases: splitgraph.hooks.data_source.fdw.ForeignDataWrapperDataSource

commandline_help: str = 'Mount an ElasticSearch instance.\n\nMount a set of tables proxying to a remote ElasticSearch index.\n\nThis uses a fork of postgres-elasticsearch-fdw behind the scenes. You can add a column\n`query` to your table and set it as `query_column` to pass advanced ES queries and aggregations.\nFor example:\n\n```\nsgr mount elasticsearch target_schema -c elasticsearch:9200 -o@- <<EOF\n    {\n      "tables": {\n        "table_1": {\n          "schema": {\n            "id": "text",\n            "@timestamp": "timestamp",\n            "query": "text",\n            "col_1": "text",\n            "col_2": "boolean"\n          },\n          "options": {\n              "index": "index-pattern*",\n              "rowid_column": "id",\n              "query_column": "query"\n          }\n        }\n      }\n    }\nEOF\n```\n'
credentials_schema: Dict[str, Any] = {'properties': {'password': {'type': ['string', 'null']}, 'username': {'type': ['string', 'null']}}, 'type': 'object'}
classmethod get_description() → str
get_fdw_name()
classmethod get_name() → str
get_server_options()
params_schema: Dict[str, Any] = {'properties': {'host': {'type': 'string'}, 'port': {'type': 'integer'}, 'tables': {'additionalProperties': {'options': {'properties': {'index': {'description': 'ES index name or pattern to use, for example, "events-*"', 'type': 'string'}, 'query_column': {'description': 'Name of the column to use to pass queries in', 'type': 'string'}, 'score_column': {'description': 'Name of the column with the document score', 'type': 'string'}, 'scroll_duration': {'description': 'How long to hold the scroll context open for, default 10m', 'type': 'string'}, 'scroll_size': {'description': 'Fetch size, default 1000', 'type': 'integer'}, 'type': {'description': 'Pre-ES7 doc_type, not required in ES7 or later', 'type': 'string'}}, 'type': 'object'}}, 'type': 'object'}}, 'required': ['host', 'port', 'tables'], 'type': 'object'}
class splitgraph.hooks.data_source.fdw.ForeignDataWrapperDataSource(engine: PostgresEngine, credentials: Optional[Dict[str, Any]] = None, params: Optional[Dict[str, Any]] = None)

Bases: splitgraph.hooks.data_source.base.MountableDataSource, splitgraph.hooks.data_source.base.LoadableDataSource, abc.ABC

commandline_help: str = ''
commandline_kwargs_help: str = ''
credentials_schema: Dict[str, Any] = {'properties': {'password': {'type': 'string'}, 'username': {'type': 'string'}}, 'required': ['username', 'password'], 'type': 'object'}
classmethod from_commandline(engine, commandline_kwargs) → splitgraph.hooks.data_source.fdw.ForeignDataWrapperDataSource

Instantiate an FDW data source from commandline arguments.

abstract get_fdw_name()
get_remote_schema_name() → str

Override this if the FDW supports IMPORT FOREIGN SCHEMA

get_server_options() → Mapping[str, str]
get_table_options(table_name: str) → Mapping[str, str]
get_table_schema(table_name: str, table_schema: List[splitgraph.core.types.TableColumn]) → List[splitgraph.core.types.TableColumn]
get_user_options() → Mapping[str, str]
introspect() → Dict[str, List[splitgraph.core.types.TableColumn]]
mount(schema: str, tables: Optional[Union[List[str], Dict[str, List[splitgraph.core.types.TableColumn]]]] = None, overwrite: bool = True)

Instantiate the data source as foreign tables in a schema

params_schema: Dict[str, Any] = {'properties': {'tables': {'additionalProperties': {'options': {'additionalProperties': {'type': 'string'}, 'type': 'object'}}, 'type': 'object'}}, 'type': 'object'}
preview(schema: Dict[str, List[splitgraph.core.types.TableColumn]]) → Dict[str, Union[str, List[Dict[str, Any]]]]
supports_load = True
supports_mount = True
class splitgraph.hooks.data_source.fdw.MongoDataSource(engine: PostgresEngine, credentials: Optional[Dict[str, Any]] = None, params: Optional[Dict[str, Any]] = None)

Bases: splitgraph.hooks.data_source.fdw.ForeignDataWrapperDataSource

commandline_help: str = 'Mount a Mongo database.\n\nMounts one or more collections on a remote Mongo database as a set of foreign tables locally.'
commandline_kwargs_help: str = 'tables: A dictionary of form\n```\n{\n    "table_name": {\n        "schema": {"col1": "type1"...},\n        "options": {"db": <dbname>, "coll": <collection>} \n    } \n}\n```\n'
classmethod get_description() → str
get_fdw_name()
classmethod get_name() → str
get_server_options()
get_table_options(table_name: str)
get_table_schema(table_name, table_schema)
get_user_options()
params_schema: Dict[str, Any] = {'properties': {'host': {'type': 'string'}, 'port': {'type': 'integer'}, 'tables': {'additionalProperties': {'options': {'properties': {'coll': {'type': 'string'}, 'db': {'type': 'string'}, 'required': ['db', 'coll']}, 'type': 'object'}, 'required': ['options']}, 'type': 'object'}}, 'required': ['host', 'port', 'tables'], 'type': 'object'}
tables: Optional[Union[List[str], Dict[str, List[splitgraph.core.types.TableColumn]]]]
class splitgraph.hooks.data_source.fdw.MySQLDataSource(engine: PostgresEngine, credentials: Optional[Dict[str, Any]] = None, params: Optional[Dict[str, Any]] = None)

Bases: splitgraph.hooks.data_source.fdw.ForeignDataWrapperDataSource

commandline_help: str = 'Mount a MySQL database.\n\nMounts a schema on a remote MySQL database as a set of foreign tables locally.'
commandline_kwargs_help: str = 'remote_schema: Remote schema name (required)\ntables: Tables to mount (default all). If a list, then will use IMPORT FOREIGN SCHEMA.\nIf a dictionary, must have the format\n    {"table_name": {"schema": {"col_1": "type_1", ...},\n                    "options": {[get passed to CREATE FOREIGN TABLE]}}}.\n        '
classmethod get_description() → str
get_fdw_name()
classmethod get_name() → str
get_remote_schema_name() → str

Override this if the FDW supports IMPORT FOREIGN SCHEMA

get_server_options()
get_table_options(table_name: str)
get_user_options()
params_schema: Dict[str, Any] = {'properties': {'host': {'type': 'string'}, 'port': {'type': 'integer'}, 'remote_schema': {'type': 'string'}, 'tables': {'additionalProperties': {'options': {'additionalProperties': {'type': 'string'}, 'type': 'object'}}, 'type': 'object'}}, 'required': ['host', 'port', 'remote_schema'], 'type': 'object'}
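
The `tables` help text above distinguishes two shapes: a list of table names, handled through IMPORT FOREIGN SCHEMA, and a dictionary with explicit schemas, handled through CREATE FOREIGN TABLE. A rough sketch of that dispatch follows. It only builds SQL strings and is not Splitgraph's code: the `plan_mount` function, the naive quoting helpers and the exact statements produced are assumptions.

```python
from typing import Dict, List, Optional, Union

# Dictionary form: {"table_name": {"schema": {...}, "options": {...}}}
TableSpecs = Dict[str, Dict[str, Dict[str, str]]]


def quote_ident(name: str) -> str:
    """Naive SQL identifier quoting, for illustration only."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Naive SQL string literal quoting, for illustration only."""
    return "'" + value.replace("'", "''") + "'"


def plan_mount(
    server: str,
    remote_schema: str,
    local_schema: str,
    tables: Optional[Union[List[str], TableSpecs]] = None,
) -> List[str]:
    """Return the SQL a mount could issue for each shape of `tables`."""
    if tables is None or isinstance(tables, list):
        # No tables or a list of names: let the FDW discover the schemas.
        limit = ""
        if tables:
            limit = " LIMIT TO (" + ", ".join(quote_ident(t) for t in tables) + ")"
        return [
            f"IMPORT FOREIGN SCHEMA {quote_ident(remote_schema)}{limit} "
            f"FROM SERVER {quote_ident(server)} INTO {quote_ident(local_schema)}"
        ]

    # Dictionary: the caller supplies each table's columns and options.
    statements = []
    for name, spec in tables.items():
        columns = ", ".join(
            f"{quote_ident(col)} {col_type}" for col, col_type in spec["schema"].items()
        )
        sql = (
            f"CREATE FOREIGN TABLE {quote_ident(local_schema)}.{quote_ident(name)} "
            f"({columns}) SERVER {quote_ident(server)}"
        )
        options = spec.get("options", {})
        if options:
            sql += " OPTIONS (" + ", ".join(
                f"{quote_ident(k)} {quote_literal(v)}" for k, v in options.items()
            ) + ")"
        statements.append(sql)
    return statements


for statement in plan_mount("srv", "shop", "local", ["orders", "customers"]):
    print(statement)
```

The list form is shorter to write, while the dictionary form is the only option for wrappers that cannot report remote schemas themselves.
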
class splitgraph.hooks.data_source.fdw.PostgreSQLDataSource(engine: PostgresEngine, credentials: Optional[Dict[str, Any]] = None, params: Optional[Dict[str, Any]] = None)

Bases: splitgraph.hooks.data_source.fdw.ForeignDataWrapperDataSource

commandline_help: str = 'Mount a Postgres database.\n\nMounts a schema on a remote Postgres database as a set of foreign tables locally.'
commandline_kwargs_help: str = 'dbname: Database name (required)\nremote_schema: Remote schema name (required)\nextra_server_args: Dictionary of extra arguments to pass to the foreign server\ntables: Tables to mount (default all). If a list, then will use IMPORT FOREIGN SCHEMA.\nIf a dictionary, must have the format\n    {"table_name": {"schema": {"col_1": "type_1", ...},\n                    "options": {[get passed to CREATE FOREIGN TABLE]}}}.\n    '
classmethod get_description() → str
get_fdw_name()
classmethod get_name() → str
get_remote_schema_name() → str

Override this if the FDW supports IMPORT FOREIGN SCHEMA

get_server_options()
get_table_options(table_name: str)
get_user_options()
params_schema: Dict[str, Any] = {'properties': {'dbname': {'description': 'Database name', 'type': 'string'}, 'host': {'description': 'Remote hostname', 'type': 'string'}, 'port': {'description': 'Port', 'type': 'integer'}, 'remote_schema': {'description': 'Remote schema name', 'type': 'string'}, 'tables': {'additionalProperties': {'options': {'additionalProperties': {'type': 'string'}, 'type': 'object'}}, 'type': 'object'}}, 'required': ['host', 'port', 'dbname', 'remote_schema'], 'type': 'object'}
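
Since `params_schema` is a JSON Schema document, a parameter dictionary can be checked against it before a data source is constructed. The sketch below uses only the standard library and checks just the top-level `required` list and the `type` of each property, which is far less than a real JSON Schema validator does. The `check_params` helper and the sample values are hypothetical.

```python
from typing import Any, Dict, List

# Top-level part of PostgreSQLDataSource.params_schema as listed above
# (descriptions and the nested "tables" entry are omitted for brevity).
PARAMS_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "dbname": {"type": "string"},
        "host": {"type": "string"},
        "port": {"type": "integer"},
        "remote_schema": {"type": "string"},
    },
    "required": ["host", "port", "dbname", "remote_schema"],
}

# JSON Schema type names mapped to Python types (only those used here).
_JSON_TYPES = {"string": str, "integer": int, "object": dict}


def check_params(params: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; empty means the check passed."""
    problems = []
    for key in schema.get("required", []):
        if key not in params:
            problems.append(f"missing required parameter: {key}")
    for key, value in params.items():
        expected = schema["properties"].get(key, {}).get("type")
        if expected and not isinstance(value, _JSON_TYPES[expected]):
            problems.append(f"{key} should be of type {expected}")
    return problems


good = {"host": "db.example.com", "port": 5432, "dbname": "app", "remote_schema": "public"}
bad = {"host": "db.example.com", "port": "5432", "dbname": "app"}
print(check_params(good, PARAMS_SCHEMA))
print(check_params(bad, PARAMS_SCHEMA))
```

Note that `port` is declared as an integer in the schema, so passing it as a string is reported as a problem.
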
splitgraph.hooks.data_source.fdw.create_foreign_table(schema: str, server: str, table_name: str, schema_spec: List[splitgraph.core.types.TableColumn], extra_options: Optional[Dict[str, str]] = None)
splitgraph.hooks.data_source.fdw.init_fdw(engine: PostgresEngine, server_id: str, wrapper: str, server_options: Optional[Mapping[str, Optional[str]]] = None, user_options: Optional[Mapping[str, str]] = None, overwrite: bool = True) → None

Sets up a foreign data server on the engine.

Parameters
  • engine – PostgresEngine

  • server_id – Name for the foreign server; must be unique. An existing server with this name is deleted (see overwrite).

  • wrapper – Name of the foreign data wrapper (must be installed as an extension on the engine)

  • server_options – Dictionary of FDW options

  • user_options – Dictionary of user options

  • overwrite – If the server already exists, delete and recreate it.
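
In plain PostgreSQL terms, setting up a foreign server from these parameters corresponds to a DROP SERVER, CREATE SERVER and CREATE USER MAPPING sequence. The sketch below builds those statements as strings to show how the parameters could map onto SQL. `fdw_setup_sql` is a hypothetical helper, and skipping options whose value is `None` is an assumption; the statements Splitgraph actually runs, including which role the user mapping targets, may differ.

```python
from typing import List, Mapping, Optional


def _ident(name: str) -> str:
    """Naive SQL identifier quoting, for illustration only."""
    return '"' + name.replace('"', '""') + '"'


def _literal(value: str) -> str:
    """Naive SQL string literal quoting, for illustration only."""
    return "'" + value.replace("'", "''") + "'"


def _options(options: Optional[Mapping[str, Optional[str]]]) -> str:
    """Render an OPTIONS clause, skipping None values (an assumption)."""
    pairs = [
        f"{_ident(key)} {_literal(value)}"
        for key, value in (options or {}).items()
        if value is not None
    ]
    return " OPTIONS (" + ", ".join(pairs) + ")" if pairs else ""


def fdw_setup_sql(
    server_id: str,
    wrapper: str,
    server_options: Optional[Mapping[str, Optional[str]]] = None,
    user_options: Optional[Mapping[str, str]] = None,
    overwrite: bool = True,
) -> List[str]:
    """Build the statements for creating a foreign server and user mapping."""
    statements = []
    if overwrite:
        # Mirrors "If the server already exists, delete and recreate it."
        statements.append(f"DROP SERVER IF EXISTS {_ident(server_id)} CASCADE")
    statements.append(
        f"CREATE SERVER {_ident(server_id)} FOREIGN DATA WRAPPER {_ident(wrapper)}"
        + _options(server_options)
    )
    if user_options:
        statements.append(
            f"CREATE USER MAPPING FOR PUBLIC SERVER {_ident(server_id)}"
            + _options(user_options)
        )
    return statements


for statement in fdw_setup_sql(
    "remote_pg",
    "postgres_fdw",
    server_options={"host": "db.example.com", "port": "5432"},
    user_options={"user": "reader", "password": "secret"},
):
    print(statement)
```

In real code the option values should be passed through the database driver's own quoting or parameter mechanism rather than the naive helpers used here.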