wa-gov/cleanup-sites-with-media-and-contaminants-pe6s-sg9w
Icon for Socrata external plugin

Query the Data Delivery Network

Query the DDN

The easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432 and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.

For example, you can query the cleanup_sites_with_media_and_contaminants table in this repository, by referencing it like:

"wa-gov/cleanup-sites-with-media-and-contaminants-pe6s-sg9w:latest"."cleanup_sites_with_media_and_contaminants"

or in a full query, like:

SELECT
    ":id", -- Socrata column ID
    "lat_nad83harn", -- Latitude (North American Datum 1983 HARN)
    "contaminantmedia", -- Media are primary environmental components that may be affected by or carriers for contaminants, and from which targets such as people, plants, and wildlife may come in contact with contaminants.  Air: The MTCA does not provide a definition for "air." A common definition is: the gaseous mixture above the soil or surface waters.  Bedrock: The MTCA does not provide a definition for "bedrock." A common definition is: The upper portion of the Earth's core, underlying soil and surface water or, at times exposed to the atmosphere, and composed solely of mineral or mineralized materials in relatively homogeneous formations.  Groundwater: The MTCA definition of ground water is found in Ch. 173-340-200: “…water in a saturated zone or stratum beneath the surface of land or below a surface water.” Under or near surface waters, it is important to establish the boundary between surface water and ground water, primarily through physical/chemical characteristic differences.  Sediment: The Sediment Management Standards definition of sediments is found at Ch. 173-204-200: “…settled particulate matter located in the predominant biologically active aquatic zone, or exposed to the water column. Sediment(s) also includes settled particulate matter exposed by human activity (e.g., dredging) to the biologically active aquatic zone or to the water column."  Soil: The MTCA definition of soil appears in Ch. 173-340-200: “…a mixture of organic and inorganic solids, air, water and the biota that exists on the earth’s surface above bedrock, including materials of anthropogenic sources such as slag, sludge, etc.” Soil is generally defined based on the particle size composition, and mineral and organic content.  Surface Water: The MTCA definition of surface water is found at Ch. 173-340-200: “…lakes, rivers, ponds, streams, inland waters, salt waters, and all other surface waters and water courses within the state of Washington or under the jurisdiction of the state of Washington.”
    "contaminant", -- Any hazardous substance that does not occur naturally or occurs at greater than natural background levels. (See WAC 173-340-200)  For a list of contaminants and their groups, see the contaminant cross reference spreadsheet.  Halogentated Organics:  Halogenated refers to a chemical compound or mixture that contains halogen atoms, i.e. fluorine, chlorine, bromine, or iodine. Bonding to carbon the halogens form a class of compounds called covalent halides, i.e. they are not electroylytic in nature. Some other terms for these compounds are halocarbons (halogenated hydrocarbons), chlorocarbons and chlorofluorocarbons (CFCs, a class of molecules containing chlorine, fluorine, and carbon).  Metals:  These derive from a number of sources, including lead in petrol, industrial effluents, and leaching of metal ions from the soil into lakes and rivers by acid rain. They are easily incorporated into biological molecules and exert their toxic effects by displacing essential metals of a lower binding power in biologically active molecules or by acting as noncompetitive inhibitors of enzymes.  Non-Halogenated Organics:  A class of organic compounds that contain at least one halogen (fluorine [F], chlorine [Cl], bromine [Br], or iodine [I]) bonded to carbon. They are subdivided into alkyl, vinylic, aryl, and acyl halides. In alkyl halides all four bonds to the carbon that bears the halogen are single bonds; in vinylic halides the carbon that bears the halogen is doubly bonded to another carbon; in aryl halides the halogen-bearing carbon is part of an aromatic ring; and in acyl halides (also called acid halides) the halogen-bearing carbon is doubly bonded to oxygen.  Other Contaminants:  Miscellaneous contaminants that do not fall under the other contaminant categories.  Pesticides:  A substance used for destroying insects or other organisms harmful to cultivated plants or to animals.  Reactive Wastes:  Chemical waste that are unstable and readily undergo violent changes or react with water to form potentially explosive mixtures or generate toxic gases, vapors, or fumes. Chemicals that contain cyanide or sulfide that can generate toxic gases, vapors, or fumes.
    "sitename", -- Cleanup Site Name
    "facilitysiteid", -- Unique identifier for the agency Facility/Site record. This number is assigned through the Facility/Site database.
    "cleanupsiteid", -- Unique identifier for the cleanup site. This number is assigned through the ISIS database.
    "sitestatus", -- The current status of the cleanup for the Site.  Awaiting Cleanup: The site has been discovered. There may have been an initial investigation, Phase I or Phase II site assessment. A remedial investigation has not been started. No independent, VCP, or Ecology supervised work has occurred toward cleaning up the site.  In general, sites will NOT be put back into "Awaiting Cleanup" status once they have started. Sites with terminated VCP projects will not be put back into "Awaiting Cleanup" status.  Cleanup Complete-Active O&M/Monitoring: Use only for sites where all construction and physical cleanup work has been done and cleanup standards met, but some active operation, maintenance, and/or monitoring work is required on an ongoing basis to maintain cleanup standards. Examples include: groundwater pump & treat, hydraulic containment, active gas or vapor control systems. This is used for sites that are essentially stable but are not eligible for delisting. (See Policy 330B, #4.)  If the construction has been completed, and the site is eligible to be delisted, then de-list the site and use the "No Further Action Required" site status.  A periodic review must be scheduled when in this category.  Cleanup Started: Site remedial investigation or cleanup work has begun. Includes completed interim actions. Includes Ecology or EPA supervised sites, VCP sites, and independent sites where an emergency action, remedial investigation, interim action, feasibility study, site characterization, or any other type of cleanup work has begun.  Construction Complete - Performance Monitoring: Cleanup construction and source control are complete. Performance monitoring is underway to “confirm that the cleanup action has attained cleanup standards, remediation levels, or other performance standards." This would be used for an unit that is undergoing natural attenuation, natural sediment recovery, or evaluating a treatment system to be sure it is going to work. This should not be used for units in confirmation monitoring, which is for evaluating the long-term effectiveness of an action once cleanup and performance standards are confirmed to be met. Refer to WAC 173-340-410 (b) and (c).  In most cases, a periodic review should be scheduled when this unit status is selected.  No Further Action Required: For sites that have received a No Further Action determination (NFA) at the Initial Investigation or Site Hazard Assessment, received an NFA letter (not ranked), or have been delisted (if ranked). If site is ranked, it must not be put into this status until the formal delisting is completed.  Do not use this status for sites that have been transferred to other programs or other agencies. Only use this for sites that have actually received an NFA.  Sites in this status may still need a periodic review, passive O & M, monitoring or inspections, or periodic repairs.  NOTE: The Site Status cannot be changed to NFA at the Site Details screen. It must be changed by using the "NFA this Cleanup Site" wizard, available under the ISIS "Workflows" link.  Tracked by EPA: For sites where EPA is the Responsible Unit and Superfund lead, cleanup occurs under CERCLA only, and the Unit Process Type is Federal supervised/conducted cleanup. The Site Manager should be listed as EPA unless there is an Ecology staff person assigned to the site to verify compliance with state ARARs.  Do not use this status for sites that are under joint authority. If the site is transitioned back to Ecology for monitoring, the site status should be changed.
    "lon_nad83harn", -- Longitude (North American Datum 1983 HARN)
    "contaminantstatus" -- Status of the contaminant found in impacted media.  Below Cleanup Levels: The contaminant was tested and found to be below cleanup levels. Generally, we would not enter each and every contaminant that was tested; for example if an SVOC analysis was done we would not enter each SVOC with a status of "below". We would use this for contaminants that were believed likely to be present but were found to be below standards when tested.  Confirmed Above Cleanup Levels: The contaminant is confirmed to be present above any cleanup level or likely cleanup level. For example - above MTCA method A, B, or C; above Sediment Quality Standards; or above a presumed site-specific cleanup level (such as human health criteria for a sediment contaminant).  Remediated - Above: The contaminant was remediated, but remains on site above the cleanup standards, such as under a cap, building, or in a containment cell. If media are Remediated-Above, the site must have institutional controls.  Remediated - Below: The contaminant was remediated, and no area of the site contains this contaminant above cleanup standards (for example - complete removal of contaminated soils).  Suspected: The contaminant is suspected to be present based on some knowledge about the history of the site, knowledge of regional contaminants, or based on other contaminants known to be present.
FROM
    "wa-gov/cleanup-sites-with-media-and-contaminants-pe6s-sg9w:latest"."cleanup_sites_with_media_and_contaminants"
LIMIT 100;

Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query wa-gov/cleanup-sites-with-media-and-contaminants-pe6s-sg9w with SQL in under 60 seconds.

This repository is an "external" repository. That means it's hosted elsewhere, in this case at data.wa.gov. When you querywa-gov/cleanup-sites-with-media-and-contaminants-pe6s-sg9w:latest on the DDN, we "mount" the repository using the socrata mount handler. The mount handler proxies your SQL query to the upstream data source, translating it from SQL to the relevant language (in this case SoQL).

We also cache query responses on the DDN, but we run the DDN on multiple nodes so a CACHE_HIT is only guaranteed for subsequent queries that land on the same node.

Query Your Local Engine

Install Splitgraph Locally
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
 

Read the installation docs.

Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.

It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgrcan manage the image, container and volume for you.

There are a few ways to ingest data into the local engine.

For external repositories (like this repository), the Splitgraph Engine can "mount" upstream data sources by using sgr mount. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.

For hosted datasets, where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr cloneand sgr checkout.

Mounting Data

This repository is an external repository. It's not hosted by Splitgraph. It is hosted by data.wa.gov, and Splitgraph indexes it. This means it is not an actual Splitgraph image, so you cannot use sgr clone to get the data. Instead, you can use the socrata adapter with the sgr mount command. Then, if you want, you can import the data and turn it into a Splitgraph image that others can clone.

First, install Splitgraph if you haven't already.

Mount the table with sgr mount

sgr mount socrata \
  "wa-gov/cleanup-sites-with-media-and-contaminants-pe6s-sg9w" \
  --handler-options '{
    "domain": "data.wa.gov",
    "tables": {
        "cleanup_sites_with_media_and_contaminants": "pe6s-sg9w"
    }
}'

That's it! Now you can query the data in the mounted table like any other Postgres table.

Query the data with your existing tools

Once you've loaded the data into your local Splitgraph engine, you can query it with any of your existing tools. As far as they're concerned, wa-gov/cleanup-sites-with-media-and-contaminants-pe6s-sg9w is just another Postgres schema.