
splitgraph.ingestion.socrata package

Submodules

splitgraph.ingestion.socrata.fdw module

Module imported by Multicorn on the Splitgraph engine server: a foreign data wrapper that communicates with Socrata datasets using sodapy.

class splitgraph.ingestion.socrata.fdw.SocrataForeignDataWrapper(fdw_options, fdw_columns)

Bases: object

can_sort(sortkeys)
Parameters

sortkeys – List of SortKey

Returns

List of SortKey the FDW can sort on

execute(quals, columns, sortkeys=None)

Main Multicorn entry point.

explain(quals, columns, sortkeys=None, verbose=False)
get_rel_size(quals, columns)

Method called from the planner to estimate the resulting relation size for a scan. It helps the planner choose between different types of plans according to their costs.

Parameters

quals (list) – A list of Qual instances describing the filters applied to this scan.

columns (list) – The list of columns that must be returned.

Returns

A tuple of the form (expected_number_of_rows, avg_row_width), where avg_row_width is in bytes.
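
As a rough illustration of the return contract described above, the sketch below builds a (rows, width) tuple from a metadata dict. The function name estimate_rel_size, the row_count key and the 100-bytes-per-column guess are assumptions made for this example and are not taken from Splitgraph's estimation logic.

```python
from typing import Any, Dict, List, Tuple

# Assumed per-column width in bytes; a placeholder, not a value from Splitgraph.
ASSUMED_COLUMN_WIDTH = 100


def estimate_rel_size(columns: List[str], metadata: Dict[str, Any]) -> Tuple[int, int]:
    """Return (expected_number_of_rows, avg_row_width_in_bytes)."""
    # "row_count" is a hypothetical metadata key used only for this sketch.
    rows = int(metadata.get("row_count", 1000))
    width = ASSUMED_COLUMN_WIDTH * len(columns)
    return rows, width


size = estimate_rel_size(["id", "name"], {"row_count": 5000})
print(size)
```

The planner only needs relative magnitudes to compare plans, so a coarse estimate of this kind is usually enough for a sketch.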

property table_meta
splitgraph.ingestion.socrata.fdw.to_json(row, columns, column_map)

splitgraph.ingestion.socrata.mount module

Splitgraph mount handler for Socrata datasets

class splitgraph.ingestion.socrata.mount.SocrataDataSource(engine: PostgresEngine, credentials: Credentials, params: Params, tables: Optional[Union[List[str], Dict[str, Tuple[List[splitgraph.core.types.TableColumn], TableParams]]]] = None)

Bases: splitgraph.hooks.data_source.fdw.ForeignDataWrapperDataSource

credentials_schema: Dict[str, Any] = {'properties': {'app_token': {'description': 'Socrata app token', 'type': 'string'}}, 'type': 'object'}
classmethod from_commandline(engine, commandline_kwargs) splitgraph.ingestion.socrata.mount.SocrataDataSource

Instantiate an FDW data source from commandline arguments.

classmethod get_description() str
get_fdw_name()
classmethod get_name() str
get_raw_url(tables: Optional[Union[List[str], Dict[str, Tuple[List[splitgraph.core.types.TableColumn], TableParams]]]] = None, expiry: int = 3600) Dict[str, List[Tuple[str, str]]]

Get a list of public URLs for each table in this data source, e.g. to export the data as CSV. These may be temporary (e.g. pre-signed S3 URLs) but should be accessible without authentication.

Parameters

tables – A TableInfo object overriding the table params of the source

expiry – The URL should be valid for at least this many seconds

Returns

Dict of table_name -> list of (mimetype, raw URL)
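
To make the documented return shape concrete, here is a small sketch of consuming a dict of table_name -> list of (mimetype, raw URL). The helper urls_for_mimetype, the table name and the example.com URL are all hypothetical; only the shape of the value comes from the description above.

```python
from typing import Dict, List, Optional, Tuple

# Shape documented for get_raw_url: table_name -> list of (mimetype, raw URL).
RawUrls = Dict[str, List[Tuple[str, str]]]


def urls_for_mimetype(raw_urls: RawUrls, mimetype: str) -> Dict[str, Optional[str]]:
    """Pick the first URL with the given mimetype for each table, or None."""
    return {
        table: next((url for mt, url in entries if mt == mimetype), None)
        for table, entries in raw_urls.items()
    }


# Made-up value in the documented shape; the table name and URL are placeholders.
example: RawUrls = {"some_table": [("text/csv", "https://example.com/some_table.csv")]}
csv_urls = urls_for_mimetype(example, "text/csv")
print(csv_urls)
```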

get_server_options()
params_schema: Dict[str, Any] = {'properties': {'batch_size': {'default': 1000, 'description': 'Amount of rows to fetch from Socrata per request (limit parameter)', 'maximum': 50000, 'minimum': 1, 'type': 'integer'}, 'domain': {'description': 'Socrata domain, for example, data.albanyny.gov', 'type': 'string'}}, 'required': ['domain'], 'type': 'object'}
table_params_schema: Dict[str, Any] = {'properties': {'socrata_id': {'description': 'Socrata dataset ID, e.g. xzkq-xp2w', 'type': 'string'}}, 'required': ['socrata_id'], 'type': 'object'}
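
The schemas above are plain JSON Schema dicts, so mount parameters can be checked before use. The following is a hand-rolled sketch that covers only the keywords appearing in params_schema (required, type, minimum, maximum); check_params is a hypothetical helper, and a full JSON Schema validator would normally be the better choice.

```python
from typing import Any, Dict, List

# Copied from SocrataDataSource.params_schema above (descriptions omitted).
PARAMS_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "batch_size": {"type": "integer", "default": 1000, "minimum": 1, "maximum": 50000},
        "domain": {"type": "string"},
    },
    "required": ["domain"],
}

_PY_TYPES = {"integer": int, "string": str}


def check_params(params: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the params look valid."""
    problems = []
    for key in schema.get("required", []):
        if key not in params:
            problems.append(f"missing required key: {key}")
    for key, value in params.items():
        spec = schema["properties"].get(key)
        if spec is None:
            continue  # unknown keys are ignored in this sketch
        expected = _PY_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{key}: expected {spec['type']}")
            continue
        if "minimum" in spec and value < spec["minimum"]:
            problems.append(f"{key}: {value} is below minimum {spec['minimum']}")
        if "maximum" in spec and value > spec["maximum"]:
            problems.append(f"{key}: {value} is above maximum {spec['maximum']}")
    return problems


print(check_params({"domain": "data.albanyny.gov", "batch_size": 5000}, PARAMS_SCHEMA))
print(check_params({"batch_size": 0}, PARAMS_SCHEMA))
```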
splitgraph.ingestion.socrata.mount.generate_socrata_mount_queries(sought_ids, datasets, mountpoint, server_id, tables: Union[List[str], Dict[str, Tuple[List[splitgraph.core.types.TableColumn], TableParams]]])

splitgraph.ingestion.socrata.querying module

splitgraph.ingestion.socrata.querying.cols_to_socrata(cols, column_map: Optional[Dict[str, str]] = None)
splitgraph.ingestion.socrata.querying.estimate_socrata_rows_width(columns, metadata, column_map=None)

Estimate the number of rows required for a query and the width of each row from the table metadata.

splitgraph.ingestion.socrata.querying.quals_to_socrata(quals, column_map: Optional[Dict[str, str]] = None)

Convert a list of Multicorn quals to a SoQL query
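
A minimal sketch of what such a conversion can look like, assuming each qual exposes field_name, operator and value (as Multicorn's Qual does) and that column_map maps local column names to Socrata field names. This is an illustration and not the module's implementation: the namedtuple stand-in, the operator table, the literal handling and the name simple_quals_to_soql are all assumptions, and only simple scalar comparisons are covered.

```python
from collections import namedtuple
from typing import Dict, List, Optional

# Stand-in for Multicorn's Qual so the sketch runs without Multicorn installed.
Qual = namedtuple("Qual", ["field_name", "operator", "value"])

# Assumed mapping from PostgreSQL operator names to SoQL comparison operators.
OPERATOR_MAP = {"=": "=", "<>": "!=", "<": "<", "<=": "<=", ">": ">", ">=": ">="}


def to_literal(value) -> str:
    """Render a Python value as a SoQL literal (numbers and strings only)."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    # SoQL string literals use single quotes; embedded quotes are doubled.
    return "'" + str(value).replace("'", "''") + "'"


def simple_quals_to_soql(quals: List[Qual], column_map: Optional[Dict[str, str]] = None) -> str:
    """Join the supported quals into a single AND-ed SoQL filter expression."""
    column_map = column_map or {}
    clauses = []
    for qual in quals:
        soql_operator = OPERATOR_MAP.get(qual.operator)
        if soql_operator is None:
            continue  # unsupported operators are skipped in this sketch
        column = column_map.get(qual.field_name, qual.field_name)
        clauses.append(f"({column} {soql_operator} {to_literal(qual.value)})")
    return " AND ".join(clauses)


where = simple_quals_to_soql([Qual("name", "=", "O'Brien"), Qual("age", ">", 30)])
print(where)
```

Skipping a qual only widens the result set returned by the remote side, so dropping unsupported operators is the conservative choice as long as the filters are applied again locally.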

splitgraph.ingestion.socrata.querying.socrata_to_sg_schema(metadata: Dict[str, Any]) Tuple[List[splitgraph.core.types.TableColumn], Dict[str, str]]
splitgraph.ingestion.socrata.querying.sortkeys_to_socrata(sortkeys, column_map: Optional[Dict[str, str]] = None)

Module contents