Query the Data Delivery Network
Query the DDNThe easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432 and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.
For example, you can query the current_descriptive_data_of_municipal_wastewater table in this repository, by referencing it like:
"ny-gov/current-descriptive-data-of-municipal-wastewater-g5a2-qa6a:latest"."current_descriptive_data_of_municipal_wastewater"or in a full query, like:
SELECT
    ":id", -- Socrata column ID
    "spdes_number", -- Refers to the New York State Pollution Discharge Elimination System permit number; additional information may be found at http://www.dec.ny.gov/permits/6054.html. 
    "pretreatment", -- .: Not Specified;  A: AERATION;  C: CLARIFIER;  CO: COAGULATION;  G: CHLORINE GAS;  H: HYPOCHLORITE;  O: OTHER MEANS;  OA: GREASE REMOVAL DIFFUSED AIR;  OG: GREASE TRAP;  P: POTASSIUM PERMANGANATE;  SB: SEQUENCING BATCH REACTOR;  ST: SETTLING TANK. 
    "year_built", -- Refers to the year when the existing treatment plant was first constructed.
    "sludge_thickening", -- .: NOT SPECIFIED; A: FLOTATION--AIR; C: IN CLARIFIER ("CLARITHICKENER"); P: GRAVITY; SC: SLUDGE CONCENTRATOR.
    "number_employees", -- Refers to the number of employees at the facility. The term “+P/T” refers to the fact the facility also has some part time employees.
    "phone_number", -- The telephone number used to contact the chief operator.
    "grit_removal", -- .: NOT SPECIFIED; A: AERATED GRIT CHAMBER; C: GRIT CHAMBER (GRAVITY SEPARATION); CC: GRIT CLASSIFIER; CM: GRIT CHAMBER MECHANICAL GRIT REMOVAL; D: CYCLONE DEGRITTER; DT: DETRITUS TANKS; M: MECHANICALLY MIXED AIRLIFT GRIT CHAMBER; V: VORTEX GRIT CHAMBER.
    "sludge_storage", -- . :	NOT SPECIFIED; A:	HOLDING TANK-AERATED; C:	COVERED STORAGE TANKS; CM:	COVERED WITH MIXING; DT:	DECANT TANKS; L:	LAGOON; M:	WITH MIXING; O:	OPEN STORAGE TANKS.
    "lagoons", -- . :	NOT SPECIFIED; A:	AERATED LAGOONS; FA:	FACULTATIVE AERATED; FB:	FLOATING BIOLOGICAL CONTACTOR; H:	HOLDING LAGOON; O:	STABILIZATION; P:	SETTLING LAGOON; RB:	RECHARGE BED; RF:	OVERFLOW RETENTION FACILITY; S:	SEEPAGE (NO DISCHARGE); T:	EMERGENCY STORAGE ONLY.
    "disinfection", -- .: NOT SPECIFIED;  C: CONTACT TANK;  DG: DECHLORINATION;  G: CHLORINE GAS DISINFECTION;  GB: CHLORINE GAS-BACKUP SYSTEM;  GC: CHLORINE GAS – CONTACT TANK;  GP: CHLORINE GAS – IN DISCHARGE PIPE;  GS: CHLORINE GAS – SEASONAL;  H: HYPOCHLORITE DISINFECTION;  HB: HYPOCHLORITE – BACKUP SYSTEM;  HC: HYPOCHLORITE – CONTACT TANK;  HP: HYPOCHLORITE – IN DISCHARGE PIPE;  HS: HYPOCHLORITE – SEASONAL;  S: SEASONAL;  T: CHLORINE TABLETS;  TB: CHLORINE TABLETS – BACKUP SYSTEM;  TC: CHLORINE TABLETS – CONTACT TANK;  UV: ULTRAVIOLET. 
    "filters", -- .: NOT SPECIFIED;  AC: ACTIVATED CARBON COLUMNS;  DE: DIATOMACEOUS EARTH;  MC: MICROFILTRATION (CBUDS);  MF: MICROFILTRATION (MEMBRANE);  MM: MULTIMEDIA (DUAL MEDIA) FILTRATION;  P: PRESSURE FILTERS;  PF: PLATE & FRAME;  RD: ROTATING DISK;  RS: RECIRCULATING SAND FILTERS;  S: SAND FILTERS (NOT SPECIFIED);  SB: BURIED SAND FILTER;  SC: CONTINUOUS BACKWASH SAND FILTERS SI INTERMITTENT SAND FILTERS;  SR: RAPID SAND (HIGH RATE) FILTERS. 
    "facility_name", -- The facility names are given using the notations: (C) - City; (T) - Town; (V) - Village; (H) - Hamlet; SD - Sewer District; WWTP - Waste Water Treatment Plant; EFC - New York State Environmental Facilities Corporation; NYC DEP - New York City Department on Environmental Protection. Privately owned facilities and those municipally-owned treatment plants which provide service other than to the general populace have been excluded.
    "sludge_disposal", -- . :	NOT SPECIFIED; A:	ASH LAGOONS C:	COMPOSTING; CO:	COMPOSTING - OFF SITE; D:	USED FOR FERTILIZER; I:	INCINERATED (NOT SPECIFIED); IF:	INCINERATED FLUIDIZED BED; IM:	MULTIPLE HEARTH INCINERATOR; L:	LAGOON; LS:	LANDSPREADING; N:	LANDFILL; NC:	LANDFILL COVER; OS:	ON SITE; R:	LAND RECLAMATION; S:	TO ANOTHER TREATMENT PLANT; T:	SCAVENGER.
    "permittee_name", -- The permittee names are given using the notations: (C) - City; (T) - Town; (V) - Village; (H) - Hamlet; SD - Sewer District; WWTP - Waste Water Treatment Plant; EFC - New York State Environmental Facilities Corporation; NYC DEP - New York City Department on Environmental Protection. Privately owned facilities and those municipally-owned treatment plants which provide service other than to the general populace have been excluded.
    "county", -- Refers to the county in which the treatment plant is located
    "receiving_waterbody", -- Indicates the name of the receiving waters to which the facility discharges.
    "drainage_basin", -- Indicates the major basin in which the discharge is located according to the following code: 01 - Lake Erie - Niagara River; 02 - Allegheny River; 03 - Lake Ontario & Minor Tribs; 04 - Genesee River; 05 - Chemung River; 06- Susquehanna River; 07 - Seneca - Oneida - Oswego Rivers; 08 - Black River; 09 - St. Lawrence River; 10 - Lake Champlain; 11 - Upper Hudson River; 12 - Mohawk River; 13 - Lower Hudson River; 14 - Delaware River; 15 - Passaic - Newark Rivers; 16 - Housatonic River; 17 - Atlantic Ocean - Long Island Sound
    "biological_treatment", -- .: NOT SPECIFIED;  A: HIGH RATE ACTIVATED SLUDGE;  C: CONTACT STABILIZATION;  D: OXIDATION DITCH; E: EXTENDED AERATION;  G: STAGED AERATION;  MC: MIXING – COARSE BUBBLE; MF: MIXING – FINE BUBBLE;  MM: MIXING – MECHANICAL;  N: CONVENTIONAL ACTIVATED SLUDGE;  O: PURE OXYGEN ACTIVATED SLUDGE;  P: PLUG FLOW ACTIVATED SLUDGE;  R: COMPLETE MIX ACTIVATED SLUDGE;  RD: ROTATING BIOLOGICAL CONTACTOR;  S: STEP AERATION ACTIVATED SLUDGE;  SB: SEQUENCING BATCH REACTOR;  SC: SECONDARY CLARIFIER;  T2: TWO-STAGE TRICKLING FILTER;  TC: COVERED TRICKLING FILTER;  TF: TRICKLING FILTER; TH: HIGH RATE TRICKLING FILTER; TL: LOW RATE TRICKLING FILTER;  TS: SYNTHETIC MEDIA TRICKLING FILTER (BIOTOWER).
    "design_flow", -- Indicates the design flow of the treatment plant in units of million gallons per day (gpd), as indicated in the most recent SPDES permit.
    "sludge_conditioning", -- . :	NOT SPECIFIED; FC:	FERRIC CHLORIDE; FS:	FERROUS CHLORIDE; L:	LIME TREATMENT; O:	WET AIR OXIDATION; P:	POLYMER; PP:	POTASSIUM PERMANGANATE; S:	IRON SALT OR ALUM; SC:	SODIUM CHLORITE; Y:	ELUTRIATION.
    "primary_settling", -- .: NOT SPECIFIED;  D: TWO STORY (CLARIGESTER);  I: IMHOFF TANK/DOTEN TANK;  M: MECHANICALLY CLEANED CLARIFIER;  P: PLAIN CLARIFIER WITH HOPPER BOTTOM;  PR: PRECIPITATION;  S: SEPTIC TANK;  SG: SPIRAGESTER;  SI: INDIVIDUAL SEPTIC SYSTEM;  ST: SETTLING TANK;  T: TUBE SETTLER;  TS: TRAY SETTLERS;  UB: FACILITY UNBUILT.  
    "collection_system", -- “S” indicates that the tributary collection system is a separate one; “C” indicates a combined sewer system.
    "metering", -- .: NOT SPECIFIED;  DM: DOPPLER FLOWMETER;  DS: DOSING SIPHON COUNTER;  KN: KENNISON NOZZLE;  M: FLOWMETER – UNSPECIFIED;  MF: MAGNETIC FLOWMETER;  OF: OTHER FLUME TYPES;  P: PROPELLER/TURBINE;  PF: PARSHALL FLUME;  PO: PUMP OPERATION TIME;  PR: PUMP RECORDS;  TP: TRIBUTARY POPULATION;  U: ULTRASONIC;  VM: VELOCITY METER;  VT: VENTURI TUBE;  W: WEIR. 
    "sludge_dewatering", -- . :	NOT SPECIFIED; B:	BAG SYSTEM; BP:	BELT FILTER PRESS; C:	CENTRIFUGE; DD:	DRUM DRYER; GT:	GRAVITY BELT THICKENER; PF:	PLATE & FRAME PRESS; V:	VACUUM FILTER (NOT SPECIFIED); VC:	VACUUM FILTER (COIL); VF:	VACUUM FILTER (FABRIC).
    "population_served", -- Refers to the estimation of people living within the service area and connected to the facility.
    "plant_class", -- Refers to the certification required for the chief operator based on scoring of the plant’s treatment train (additional information may be found at http://www.dec.ny.gov/chemical/8707.html): Activated Sludge Treatment, with a definition of a biological treatment process in which a mixture of wastewater and activated sludge is agitated and aerated. The activated sludge is subsequently separated from the treated wastewater by sedimentation and wasted or returned to the process as needed. 4A: plant score greater than 75 points; 3A: plant score between 56 and 75 points; 2A: plant score between 31 and 55 points; 1A: plant score or less than 30 points. Any biological oxidation process other than activated sludge. 4: plant score greater than 75 points; 3: plant score between 56 and 75 points; 2: plant score between 31 and 55 points; 1: plant score or less than 30 points. Exemptions: Septic systems followed by subsurface leaching facilities with eventual discharge to ground waters, regardless of design capacity; Septic tanks followed by open or covered intermittent sand filters, with design capacity less than 50,000 gpd; and wastewater treatment plants which treat industrial wastes exclusively are exempt from having a certified chief operator or assistant/shift operator.
    "equalization", -- .: NOT SPECIFIED; A: AERATED; CS: CSO SPECIFY TREATMENT; S: SUMP; T: SURGE TANK; VF: VARIABLE FREQUENCY DRIVE PUMPS.
    "additional_treatment", -- . :	NOT SPECIFIED; A:	CARBON ASORPTION; AS:	AIR STRIPPING; B:	BREAKPOINT CHLORINATION; BM:	STORMWATER BMP; C:	CHEMICAL COAGULATION AND SETTLING; CL:	COOLING WATER TREATMENT; F:	FILTRATION; L:	POLISHING LAGOONS; M:	MICROSCREENING; N:	NITROGEN REMOVAL (NOT SPECIFIED); NA:	ONE STAGE BIOLOGICAL NITRIFICATION; NB:	TWO STATE BIOLOGICAL NITRIFICATION; ND:	DENITRIFICATION; NF:	NUTRIENT FEED (AMMONIA); OF:	OVERLAND FLOW; OS:	OIL SEPARATION; PA:	POST AERATION; PB:	PHOSPHORUS REMOVAL (BIOLOGICAL); PC:	PHOSPHORUS REMOVAL (CHEMICAL); PR:	PHOSPHORUS REMOVAL (NOT 	SPECIFIED); SC:	ADDITIONAL CLARIFICATION; SI:	SPRAY IRRIGATION; ST:	SETTLING TANK; TC:	TERTIARY CLARIFIER; TS:	TUBE SETTLER EFFLUENT POLISHING; U:	NEUTRALIZATION; WL:	ARTIFICIAL WETLAND; X:	CHEMICAL OXIDATION.
    "year_updated", -- Refers to the year in which the most recent improvement to the treatment facility were made, if any; field will be blank if no information on recent improvement is available.
    "intermediate_treatment", -- .: NOT SPECIFIED;  AM: MODIFIED AERATION;  C: INTERMEDIATE SETTLING;  F: FLOCCULATION;  G: COAGULATION;  P: PRECIPITATION;  RB: ROTATING BIOLOGICAL CONTACTOR;  RF: ROUGHING TRICKLING FILTERS.
    "sludge_digestion", -- .: NOT SPECIFIED;  A: AEROBIC SLUDGE DIGESTION MECHANICAL AERATION;  D: AEROBIC SLUDGE DIGESTION DIFFUSED AERATION;  DA: AEROBIC (NOT SPECIFIED);  DD: AEROBIC (USING AERATION TANKS);  DO: AEROBIC PURE OXYGEN; N: ANAEROBIC SLUDGE DIGESTION (UNSPECIFIED);  NP: ANAEROBIC SLUDGE DIGESTION SINGLE STAGE;  NS: ANAEROBIC SLUDGE DIGESTION TWO STAGE. 
    "stream_class", -- Indicates the water quality classification of the receiving waters.  Additional information regarding stream class designations may be found at http://www.dec.ny.gov/regs/2485.html and https://data.ny.gov/Energy-Environment/Waterbody-Classifications/8xz8-5u5u.
    "chief_operator", -- Refers to the person or business designated as chief operator by the permittee.
    "sludge_drying_beds", -- . :	NOT SPECIFIED; C:	COVERED DRYING BEDS; CA:	COVERED ASPHALT; CS:	COVERED SAND; H:	HEATED; L:	LAGOON DRYING BED; O:	OPEN DRYING BED; OA:	OPEN-ASPHALT OS:	OPEN-SAND RB:	REED BEDS V:	VACUUM ASSISTED.
    "screening" -- .: NOT SPECIFIED;  B: BAR RACK/SCREEN (1/2" TO 2" OPENINGS);  BH: BAR RACK/SCREEN (HAND CLEANED);  BM: BAR RACK/SCREEN (MECHANICALLY CLEANED);  C: COMMINUTOR/BAR MUNITOR;  CB: BAR MUNITOR;  CC: COMMUNITOR;  CI: IN PIPE COMMUNITOR;  F: FINE SCREEN (LESS THAN 1/8" OPENINGS);  FH: FINE SCREEN, HAND CLEANED;  FM: FINE SCREEN, MECHANICALLY CLEANED;  R: COARSE RACK;  RH: COARSE SCREEN-HAND CLEANED;  RM: COARSE SCREEN-MECHANICALLY CLEANED;  SR: SCREEN RACK (SELF CLEANING);  WW: WEDGE WIRE SCREEN (ROTATING/FIXED). 
FROM
    "ny-gov/current-descriptive-data-of-municipal-wastewater-g5a2-qa6a:latest"."current_descriptive_data_of_municipal_wastewater"
LIMIT 100;Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query ny-gov/current-descriptive-data-of-municipal-wastewater-g5a2-qa6a with SQL in under 60 seconds.
Query Your Local Engine
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"Read the installation docs.
Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.
It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgrcan manage the image, container and volume for you.
There are a few ways to ingest data into the local engine.
For external repositories, the Splitgraph Engine can "mount" upstream data sources by using sgr mount. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.
For hosted datasets (like this repository), where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr cloneand sgr checkout.
Cloning Data
Because ny-gov/current-descriptive-data-of-municipal-wastewater-g5a2-qa6a:latest is a Splitgraph Image, you can clone the data from Spltgraph Cloud to your local engine, where you can query it like any other Postgres database, using any of your existing tools.
First, install Splitgraph if you haven't already.
Clone the metadata with sgr clone
This will be quick, and does not download the actual data.
sgr clone ny-gov/current-descriptive-data-of-municipal-wastewater-g5a2-qa6aCheckout the data
Once you've cloned the data, you need to "checkout" the tag that you want. For example, to checkout the latest tag:
sgr checkout ny-gov/current-descriptive-data-of-municipal-wastewater-g5a2-qa6a:latestThis will download all the objects for the latest tag of ny-gov/current-descriptive-data-of-municipal-wastewater-g5a2-qa6a and load them into the Splitgraph Engine. Depending on your connection speed and the size of the data, you will need to wait for the checkout to complete. Once it's complete, you will be able to query the data like you would any other Postgres database.
Alternatively, use "layered checkout" to avoid downloading all the data
The data in ny-gov/current-descriptive-data-of-municipal-wastewater-g5a2-qa6a:latest is 0 bytes. If this is too big to download all at once, or perhaps you only need to query a subset of it, you can use a layered checkout.:
sgr checkout --layered ny-gov/current-descriptive-data-of-municipal-wastewater-g5a2-qa6a:latestThis will not download all the data, but it will create a schema comprised of foreign tables, that you can query as you would any other data. Splitgraph will lazily download the required objects as you query the data. In some cases, this might be faster or more efficient than a regular checkout.
Read the layered querying documentation to learn about when and why you might want to use layered queries.
Query the data with your existing tools
Once you've loaded the data into your local Splitgraph Engine, you can query it with any of your existing tools. As far as they're concerned, ny-gov/current-descriptive-data-of-municipal-wastewater-g5a2-qa6a is just another Postgres schema.