wa-gov/pdc-enforcement-cases-a4ma-dq6s
Loading...

Query the Data Delivery Network

Query the DDN

The easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432 and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.

For example, you can query the pdc_enforcement_cases table in this repository, by referencing it like:

"wa-gov/pdc-enforcement-cases-a4ma-dq6s:latest"."pdc_enforcement_cases"

or in a full query, like:

SELECT
    ":id", -- Socrata column ID
    "attachments", -- A JSON structured collection of all documents associated with this case. The URLs can be used to retrieve the documents. This structured data is intended primarily for readability by computer systems. The documents are more readily available by viewing the case information on the PDC website.
    "url", -- The URL of the complete case information on the PDC website.
    "updated_at", -- The last time that the case record was updated.
    "balance_due", -- If the commission has issued an order in this case that includes penalties to be paid, this is the total penalties still due by all parties. See the Penalties column for a JSON structured breakdown of the penalty information.
    "status", -- The current disposition of the case. This field will be updated as the case progresses to final disposition.
    "closed", -- If the case has been closed, the date it was closed.
    "total_penalties", -- If the commission has issued an order in this case that includes penalties to be paid, this is the total penalties for all parties. See the Penalties column for a JSON structured breakdown of the penalty information.
    "areas_of_law", -- The areas of law included in the alleged violations, usually by RCW or WAC number.
    "penalties", -- A JSON structured collection of all penalties associated with this case. The This structured data is intended primarily for readability by computer systems. The penalty information is more readily available by viewing the case information on the PDC website.
    "subject", -- A short description of the case
    "respondent", -- The party or parties subject to the complaint
    "earliest_complaint", -- The date that the earliest complaint attached to this case was perfected. This is usually the date the complaint was submitted but a complaint may be lacking information and becomes "perfected" on the date all necessary information is provided.
    "opened", -- The date the case was opened by staff. This may be later than the date a complaint was first filed.
    "case", -- The PDC internal case number. This number will not change for a given action and is unique.
    "group_enforcement", -- When a case is the result of a group enforcement action, this field will indicate the type of group enforcement. Group enforcement actions are typically part of an audit process such as a review of all candidates that should have filed a candidate registration to determine those who have not.
    "total_paid", -- If the commission has issued an order in this case that includes penalties to be paid, this is the total penalties paid for all parties. See the Penalties column for a JSON structured breakdown of the penalty information.
    "total_suspended", -- If the commission has issued an order in this case that includes suspended penalties, this is the total suspended penalties for all parties. See the Penalties column for a JSON structured breakdown of the penalty information.
    "id", -- The unique identifier for this row. The value has no meaning but it can be used to track changes to the row.
    "total_reinstated", -- If the commission has issued an order in this case that included suspended penalties and any amount has been reinstated, this is the total reinstated penalties for all parties. See the Penalties column for a JSON structured breakdown of the penalty information.
    "complainant", -- The person initiating the public complaint or "PDC Staff" when the case is initiated by the PDC.
    "sent_to_collections", -- If any of the penalties in this case have been sent to a collection agency for collection of the debit, this column will be True. There may be more than one penalty, each with a different obligation. Review the individual penalties to see which have been sent to collections.
    "description" -- A full description of the case. The description text includes HTML tags, primarily intended to be read by computer systems. The description is more readable by viewing the case information on the PDC website.
FROM
    "wa-gov/pdc-enforcement-cases-a4ma-dq6s:latest"."pdc_enforcement_cases"
LIMIT 100;

Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query wa-gov/pdc-enforcement-cases-a4ma-dq6s with SQL in under 60 seconds.

Query Your Local Engine

Install Splitgraph Locally
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
 

Read the installation docs.

Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.

It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgrcan manage the image, container and volume for you.

There are a few ways to ingest data into the local engine.

For external repositories, the Splitgraph Engine can "mount" upstream data sources by using sgr mount. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.

For hosted datasets (like this repository), where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr cloneand sgr checkout.

Cloning Data

Because wa-gov/pdc-enforcement-cases-a4ma-dq6s:latest is a Splitgraph Image, you can clone the data from Spltgraph Cloud to your local engine, where you can query it like any other Postgres database, using any of your existing tools.

First, install Splitgraph if you haven't already.

Clone the metadata with sgr clone

This will be quick, and does not download the actual data.

sgr clone wa-gov/pdc-enforcement-cases-a4ma-dq6s

Checkout the data

Once you've cloned the data, you need to "checkout" the tag that you want. For example, to checkout the latest tag:

sgr checkout wa-gov/pdc-enforcement-cases-a4ma-dq6s:latest

This will download all the objects for the latest tag of wa-gov/pdc-enforcement-cases-a4ma-dq6s and load them into the Splitgraph Engine. Depending on your connection speed and the size of the data, you will need to wait for the checkout to complete. Once it's complete, you will be able to query the data like you would any other Postgres database.

Alternatively, use "layered checkout" to avoid downloading all the data

The data in wa-gov/pdc-enforcement-cases-a4ma-dq6s:latest is 0 bytes. If this is too big to download all at once, or perhaps you only need to query a subset of it, you can use a layered checkout.:

sgr checkout --layered wa-gov/pdc-enforcement-cases-a4ma-dq6s:latest

This will not download all the data, but it will create a schema comprised of foreign tables, that you can query as you would any other data. Splitgraph will lazily download the required objects as you query the data. In some cases, this might be faster or more efficient than a regular checkout.

Read the layered querying documentation to learn about when and why you might want to use layered queries.

Query the data with your existing tools

Once you've loaded the data into your local Splitgraph Engine, you can query it with any of your existing tools. As far as they're concerned, wa-gov/pdc-enforcement-cases-a4ma-dq6s is just another Postgres schema.

Related Documentation:

Loading...