Splitgraph has been acquired by EDB!

cityofchicago/transportation-department-permits-pubx-yq2d

Query the Data Delivery Network

The easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432 and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.

For example, you can query the transportation_department_permits table in this repository, by referencing it like:

"cityofchicago/transportation-department-permits-pubx-yq2d:latest"."transportation_department_permits"

or in a full query, like:

SELECT
    ":id", -- Socrata column ID
    "lastinspectiontype", -- Code for the type of inspection generated for the issued permit. Typically this will match the type of permit although special inspection types are available for stop work.
    "location", -- Location of the start of the address range.
    "lastinsptypedescr", -- Description of the last inspection completed against this permit.
    "primarycontactmiddle", -- Middle initial of the primary applicant, if not a company.
    "waivedfees", -- If any fees were waived for this permit, the amount waived will be here. Possible reasons for waived fees include city contracts, city departments, ordinances, and franchise agreements with the City. 
    "streetclosure", -- Type of street closure, if any, associated with the location.
    "placement", -- Additional information about the permit location. This column may also include detour information.
    "streetnumberto", -- The end of the address range for the permit location. For multiple locations on a permit, there will be multiple rows in the dataset.
    "suffix", -- The street name suffix for the permit location. For multiple locations on a permit, there will be multiple rows in the dataset.
    ":@computed_region_vrxf_vc4k",
    ":@computed_region_6mkv_f3dw",
    "streetname", -- The street name for the permit location. For multiple locations on a permit, there will be multiple rows in the dataset.
    "direction", -- The street direction for the permit location. For multiple locations on a permit, there will be multiple rows in the dataset.
    "primarycontactstreet", -- Street address of the primary applicant.
    "detail", -- Detail information about the permit application.
    ":@computed_region_bdys_3d7i",
    "xcoordinate", -- X coordinate of the start of the address range in State Plane Illinois East NAD 1983 projection.
    "parkingmeterpostingorbagging", -- Indicates if any parking meter bagging or posting of no parking signs  is requested on the permit application.
    "ycoordinate", -- Y coordinate of the start of the address range in State Plane Illinois East NAD 1983 projection.
    "ward", -- The ward (City Council district) of the start of the address range.
    "applicationtype", -- Code for the type of permit. Permits may be categorized further by the WORKTYPE.
    "primarycontactcity", -- City of the primary applicant.
    "applicationdescription", -- Type of permit.
    "location_address",
    "worktype", -- Code for the sub-category, if any, of the permit application. Only one work type per application is allowed.
    "location_city",
    "primarycontactlast", -- The last name of the primary applicant for the permit. If an employee applied on behalf of a company, the company name will appear. 
    "worktypedescription", -- Sub-category, if any, of the permit application. Only one work type per application is allowed.
    "applicationstatus", -- Overall status of the application. Typically open during the application process and issued, then closed once the permit has expired.
    "primarycontactstreet2", -- Street address 2nd line of the primary applicant.
    "currentmilestone", -- The current step in the permitting life cycle.
    "location_state",
    "location_zip",
    "applicationstartdate", -- Permit start date. Most often requested by the applicant but could be updated as the permit is reviewed by CDOT.
    "streetnumberfrom", -- The start of the address range for the permit location. For multiple locations on a permit, there will be multiple rows in the dataset.
    "lastinspectionresult", -- The result of the last inspection, if completed.
    "applicationenddate", -- Permit End date. Most often requested by the applicant but could be updated as the permit is reviewed by CDOT.
    "lastinspectiondate", -- Date of the last inspection.
    "applicationnumber", -- Unique number for the application that becomes the permit number when issued.
    "uniquekey", -- A unique identifier for each record.
    ":@computed_region_rpca_8um6",
    "longitude", -- Longitude of the start of the address range.
    "latitude", -- Latitude of the start of the address range.
    "totalfees", -- Total permit fees for this permit calculated based on the work being carried out and location. See fee schedule for further information, available on the CDOT permit portal, http://www.cityofchicago.org/city/en/depts/cdot/provdrs/construction_information/svcs/online-permit-portal.html.
    "applicationprocesseddate", -- This is the date when the permit reaches the review milestone, means the date CDOT began reviewing the application.
    "applicationissueddate", -- Date the permit was issued by CDOT. Set once all the reviews, fees, etc. are paid (where applicable).
    "applicationfinalizeddate", -- Date the permit was closed (either after the permit has expired plus a grace period or after an inspector marks the work complete).
    "lastinspectionnumber", -- Once a permit is issued, typically an inspection is created for CDOT Inspectors to use onsite. Once an inspection is completed, another inspection may be generated or if the work is finished, the permit is closed. This is the number of the last inspection.
    "applicationexpiredate", -- Date the permit will expire (typically same as permit end date). Applicant may ask for date extensions where the end date is updated and will be reflected here once re-approved by CDOT.
    "emergencycontactname", -- Person to contact in an emergency as listed by the applicant during the permit application process.
    "applicationname", -- Applicants can put a project name on their applications in this field. For converted data 1/1/2013 to 11/13/2015, the old permit number will be placed in this field.
    ":@computed_region_awaf_s7ux",
    ":@computed_region_43wa_7qmu",
    "primarycontactzip", -- ZIP Code of the primary applicant.
    "primarycontactstate", -- State of the primary applicant.
    "primarycontactfirst", -- First name of the primary applicant, if not a company.
    "comments" -- Applicants can place a description of the work here.
FROM
    "cityofchicago/transportation-department-permits-pubx-yq2d:latest"."transportation_department_permits"
LIMIT 100;

Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query cityofchicago/transportation-department-permits-pubx-yq2d with SQL in under 60 seconds.

Query Your Local Engine

Install Splitgraph Locally

bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"

Read the installation docs.

Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.

It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgrcan manage the image, container and volume for you.

There are a few ways to ingest data into the local engine.

For external repositories, the Splitgraph Engine can "mount" upstream data sources by using sgr mount. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.

For hosted datasets (like this repository), where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr cloneand sgr checkout.

Cloning Data

Because cityofchicago/transportation-department-permits-pubx-yq2d:latest is a Splitgraph Image, you can clone the data from Spltgraph Cloud to your local engine, where you can query it like any other Postgres database, using any of your existing tools.

First, install Splitgraph if you haven't already.

Clone the metadata with `sgr clone`

This will be quick, and does not download the actual data.

sgr clone cityofchicago/transportation-department-permits-pubx-yq2d

Checkout the data

Once you've cloned the data, you need to "checkout" the tag that you want. For example, to checkout the latest tag:

sgr checkout cityofchicago/transportation-department-permits-pubx-yq2d:latest

This will download all the objects for the latest tag of cityofchicago/transportation-department-permits-pubx-yq2d and load them into the Splitgraph Engine. Depending on your connection speed and the size of the data, you will need to wait for the checkout to complete. Once it's complete, you will be able to query the data like you would any other Postgres database.

Alternatively, use "layered checkout" to avoid downloading all the data

The data in cityofchicago/transportation-department-permits-pubx-yq2d:latest is 0 bytes. If this is too big to download all at once, or perhaps you only need to query a subset of it, you can use a layered checkout.:

sgr checkout --layered cityofchicago/transportation-department-permits-pubx-yq2d:latest

This will not download all the data, but it will create a schema comprised of foreign tables, that you can query as you would any other data. Splitgraph will lazily download the required objects as you query the data. In some cases, this might be faster or more efficient than a regular checkout.

Read the layered querying documentation to learn about when and why you might want to use layered queries.

Query the data with your existing tools

Once you've loaded the data into your local Splitgraph Engine, you can query it with any of your existing tools. As far as they're concerned, cityofchicago/transportation-department-permits-pubx-yq2d is just another Postgres schema.

cityofchicago/transportation-department-permits-pubx-yq2d

Query the Data Delivery Network

Query Your Local Engine

Cloning Data

Clone the metadata with `sgr clone`

Checkout the data

Alternatively, use "layered checkout" to avoid downloading all the data

Query the data with your existing tools

Related Documentation:

Product

Support

Company

Community

Query the Data Delivery Network

Query Your Local Engine

Cloning Data

Clone the metadata with sgr clone

Checkout the data

Alternatively, use "layered checkout" to avoid downloading all the data

Query the data with your existing tools

Related Documentation:

Product

Support

Company

Community

Clone the metadata with `sgr clone`