wa-gov/illicit-discharge-detection-elimination-idde-67t4-gp46

Query the Data Delivery Network

The easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432 and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.

For example, you can query the illicit_discharge_detection_elimination_idde table in this repository by referencing it like:

"wa-gov/illicit-discharge-detection-elimination-idde-67t4-gp46:latest"."illicit_discharge_detection_elimination_idde"

or in a full query, like:

SELECT
    ":id", -- Socrata column ID
    "_17_other_helpful_or_relevant_information", -- This field may be used to include any other helpful or relevant information about the problem, response, investigation, and resolution process. If you can estimate the amount of pollutant in pounds or discharge in gallons that was eliminated as a result of your IDDE actions, please include it here.
    "_15_final_resolution_date", -- If in process go to 15A.
    "_11_indicator_testing", -- CHECK ALL THAT APPLY. Sharing what indicators you used to identify the problem will be valuable for understanding what indicators are most used and useful. If you wish to include field results or specific indicator values in Field 17 you may, but they will not be included in the SIDIR analysis. If you use other indicators, explain in Field 17.
    "_5b_precipitation_in_previous_24hrs", -- Estimated, in Inches
    "_14a_enforcement", -- If selected in #14
    "_9a_erts_number", -- If Selected in #9
    "_10_source_tracing_methods", -- CHECK ALL THAT APPLY. If you use other methods explain your approach in Field 16.
    "_7b_g3_notification",
    "_2a_new_or_edited_entry", -- If editing or updating, only enter fields below that you are changing.
    "_3_date_incident_initially_reported",
    "_4_location", -- This information will be helpful for mapping incidents to identify geographic clusters of various problems. If you prefer not to provide the entire address, please consider providing the zip code.
    "_4b_and_or_zip_code",
    "_15a_in_process", -- Check here if Final Resolution Date in process
    "_6_frequency",
    "_5a_temperature", -- Estimated in Degrees Farenheit
    "_5_raining_at_time_of_report",
    "_13_source_or_cause", -- CHECK ALL THAT APPLY. If the source cannot be described using the list provided, explain in Field 16. For commercial and industrial sources, provide additional, more detailed information about the activity in Field 16. More specific information will support better analysis of the information to prioritize activities for adaptive management. If more specific information is not provided in Field 16, we may follow up with a request for additional information.
    "_14_correction_elimination_method", -- CHECK ALL THAT APPLY.
    "_12_pollutant_s_identified", -- CHECK ALL THAT APPLY. If the problem cannot be described using the list provided, explain in Field 17.
    "_9_how_did_you_learn_about_the_problem", -- CHECK ALL THAT APPLY. Information about how you learned about the problem will be valuable for understanding what methods are most used and useful.
    "_7a_immediate_response",
    "_7_threat_determination_and_g3_notification", -- Required by permit condition Phase I S5.C.8.d.iv and G3, and Phase II S5.C.3.d.iv and G3. Provide documentation of any delay in Field 16. If G3 notification is required, check that it was done or provide an explanation in Field 16. Constituted a threat to human health or the environment?
    "_4c_and_or_nearest_intersection",
    "_4a_street_address",
    "_4d_is_the_structure_mapped_inventoried",
    "_8a_if_suspected_illicit_connection_investigated_within_21_days",
    "_8b_final_resolution_of_illicit_connection_within_6_months",
    "_8_investigated_within_7_days_per_program_procedures",
    "jurisdiction_name",
    "unique_identifier", -- Provide a number unique to this incident within your jurisdiction for the reporting year. You may use a work order number or an ERTS number. The unique number may be auto-generated or manually assigned. Numbering may be continual or restarted each calendar (reporting) year.
    "_13a_commercial", -- If Selected in #13
    "_16_field_notes_explanations_and_other_comments", -- Provide documentation of any delay in response or resolution in this field. Explain any required response labeled “other” in this field. If you have no notes to add, do not leave the field blank; write ‘N/A’ or ‘None’.
    "email"
FROM
    "wa-gov/illicit-discharge-detection-elimination-idde-67t4-gp46:latest"."illicit_discharge_detection_elimination_idde"
LIMIT 100;

Connecting to the DDN is easy: all you need is a SQL client that can connect to Postgres. With one ready, you'll be able to query wa-gov/illicit-discharge-detection-elimination-idde-67t4-gp46 with SQL in under 60 seconds.
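For example, a minimal sketch using psql (the username, password and database name below are placeholders; use the SQL credentials generated in your Splitgraph account settings):

psql "postgresql://YOUR_USERNAME:YOUR_API_SECRET@data.splitgraph.com:5432/ddn?sslmode=require"

Once connected, you can paste the example query above straight into the prompt.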

Query Your Local Engine

Install Splitgraph Locally
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"

Read the installation docs.

Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.

It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgr can manage the image, container and volume for you.
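For example, a minimal sketch of bringing up the engine with the sgr engine subcommands (check sgr engine --help for the options available in your version of sgr):

# Pull the engine Docker image and create a container and data volume
sgr engine add

# Stop or restart the engine later as needed
sgr engine stop
sgr engine start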

There are a few ways to ingest data into the local engine.

For external repositories, the Splitgraph Engine can "mount" upstream data sources by using sgr mount. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.
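As a rough sketch (the connection string and options below are illustrative placeholders; run sgr mount --help to see the handlers and option formats your version supports), mounting a remote Postgres schema might look like:

# Mount a remote Postgres schema as foreign tables under the local schema "remote_data"
sgr mount postgres_fdw remote_data -c "user:password@db.example.com:5432" -o '{"dbname": "source_db", "remote_schema": "public"}'

The mounted tables can then be queried like any other tables on the engine.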

For hosted datasets (like this repository), where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr clone and sgr checkout.

Cloning Data

Because wa-gov/illicit-discharge-detection-elimination-idde-67t4-gp46:latest is a Splitgraph Image, you can clone the data from Splitgraph Cloud to your local engine, where you can query it like any other Postgres database, using any of your existing tools.

First, install Splitgraph if you haven't already.

Clone the metadata with sgr clone

This will be quick, and does not download the actual data.

sgr clone wa-gov/illicit-discharge-detection-elimination-idde-67t4-gp46

Checkout the data

Once you've cloned the data, you need to "checkout" the tag that you want. For example, to checkout the latest tag:

sgr checkout wa-gov/illicit-discharge-detection-elimination-idde-67t4-gp46:latest

This will download all the objects for the latest tag of wa-gov/illicit-discharge-detection-elimination-idde-67t4-gp46 and load them into the Splitgraph Engine. Depending on your connection speed and the size of the data, you will need to wait for the checkout to complete. Once it's complete, you will be able to query the data like you would any other Postgres database.

Alternatively, use "layered checkout" to avoid downloading all the data

If the data in wa-gov/illicit-discharge-detection-elimination-idde-67t4-gp46:latest is too big to download all at once, or you only need to query a subset of it, you can use a layered checkout:

sgr checkout --layered wa-gov/illicit-discharge-detection-elimination-idde-67t4-gp46:latest

This will not download all the data; instead, it creates a schema of foreign tables that you can query as you would any other data. Splitgraph will lazily download the required objects as you query the data. In some cases, this might be faster or more efficient than a regular checkout.

Read the layered querying documentation to learn about when and why you might want to use layered queries.

Query the data with your existing tools

Once you've loaded the data into your local Splitgraph Engine, you can query it with any of your existing tools. As far as they're concerned, wa-gov/illicit-discharge-detection-elimination-idde-67t4-gp46 is just another Postgres schema.
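For example, a quick sanity check (a sketch, assuming the default behaviour where a checkout creates a schema named after the repository; point your client at your local engine rather than the DDN):

SELECT COUNT(*)
FROM "wa-gov/illicit-discharge-detection-elimination-idde-67t4-gp46"."illicit_discharge_detection_elimination_idde";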
