ny-gov/draft-disadvantaged-communities-dac-2021-xj7e-q8ja
Icon for Socrata external plugin

Query the Data Delivery Network

Query the DDN

The easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432 and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.

For example, you can query the draft_disadvantaged_communities_dac_2021 table in this repository, by referencing it like:

"ny-gov/draft-disadvantaged-communities-dac-2021-xj7e-q8ja:latest"."draft_disadvantaged_communities_dac_2021"

or in a full query, like:

SELECT
    ":id", -- Socrata column ID
    "copd_ed_rate", -- Percentile ranking of the average annual age-adjusted emergency department visits for COPD in an aggregate area, with all census tracts in an aggregate area assigned the same value. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: COPD
    "unemployment_rate", -- Percentile ranking of the percentage of census tract labor force population that is unemployed. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Unemploymt
    "age_over_65", -- Percentile ranking of the percentage of census tract population over age 65. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Age_Ovr_65
    "internet_access", -- Percentile ranking of the percentage of census tract households with no internet access. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Internet
    "homes_built_before_1960", -- Percentile ranking of the percentage of homes that were built before 1960. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Homes_1960
    "mobile_homes", -- Percentile ranking of the percentage of housing units that are mobile homes. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Mobile
    "renter_percent", -- Percentile ranking of the percentage of housing units that are renter-occupied. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Rent_Pct
    "geoid", -- Census tract identifier; a concatenation of the state FIPS code, county FIPS code, and census tract code Shapefile Attribute Table Label: GEOID
    "dac_designation", -- Indicates if the community is designated or not designated as a draft disadvantaged community Shapefile Attribute Table Label: DAC_Desig
    "redc", -- Regional Economic Development Council (REDC). REDC boundaries are based on existing Empire State Development and Department of Labor regional boundaries. See https://esd.ny.gov/regions  Shapefile Attribute Table Label: REDC
    "population_no_college", -- Percentile ranking of the percentage of census tract population over age 25 without a bachelor's or professional school degree. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: No_College
    "low_vegetative_cover", -- Percentile ranking of the percentage of census tract land area classified as developed or barren land. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Low_Veg
    "burden_score", -- One of two Component Scores. The weighted average of three Factor Scores:  1.	Potential Pollution Exposures 2.	Land Use Associated with Historical Discrimination or Disinvestment 3.	Potential Climate Change Risks Shapefile Attribute Table Label: Burden_Sc
    "burden_score_percentile", -- Combined percentile ranking for the cumulative score of all indicators within the “Environmental Burden and Climate Change Risk” category Shapefile Attribute Table Label: Burden_Pct
    "percentile_rank_combined", -- The percentile rank of a tract’s Combined Score among all tracts in New York State. Reported on a scale from 0-100. Shapefile Attribute Table Label: Rank_State
    "tribal_designation", -- Indicates if the community is designated as tribal Shapefile Attribute Table Label: Trib_Desig
    "urban_rural", -- Either rural, suburban, or urban. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Urb_Rural
    "remediation_sites", -- Percentile ranking of the count of state and federal environmental remediation sites (NPL, Brownfield, Class II) Shapefile Attribute Table Label: Rem_Sites
    "household_low_count_flag", -- Indicates if the population has the potential to be below the threshold for DAC designation Shapefile Attribute Table Label: HH_Low_Cnt
    "nyc_region", -- Region of New York State; either NYC or Rest of State Shapefile Attribute Table Label: NYC_Region
    "latino_percent", -- Percentile ranking of the percentage of census tract population that is Hispanic or Latino alone or in combination with other races. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Lat_Pct
    "county", -- Name of county Shapefile Attribute Table Label: County
    "city_town", -- Name of city or town Shapefile Attribute Table Label: City_Town
    "population_count", -- Population of the community. Source: U.S. Census American Community Survey 2015-2019 Shapefile Attribute Table Label: Pop_Cnt
    "household_count", -- Number of households in the community. Source: U.S. Census American Community Survey 2015-2019 Shapefile Attribute Table Label: HH_Cnt
    "percentile_rank_combined_1", -- The percentile rank of a tract’s Combined Score among all tracts in New York City. Reported on a scale from 0-100. Null for tracts not in New York City. Shapefile Attribute Table Label: Rank_NYC
    "percentile_rank_combined_2", -- The percentile rank of a tract’s Combined Score among all tracts not in New York City. Reported on a scale from 0-100. Null for tracts in New York City. Shapefile Attribute Table Label: Rank_ROS
    "combined_score", -- The product of a tract’s two Component Scores, the Environmental Burdens and Climate Change Risks Score and the Population Characteristics and Health Vulnerabilities Score. Null for Low Population Areas. Shapefile Attribute Table Label: Comb_Sc
    "vulnerability_score_percentile", -- Combined percentile ranking for the cumulative score of all indicators within the “Population Characteristics and Health Vulnerabilities” category Shapefile Attribute Table Label: Vulner_Pct
    "vulnerability_score", -- One of two Component Scores. The weighted average of four Factor Scores:  1.	Income, Education, and Employment 2.	Race, Ethnicity, and Language 3.	Health Outcomes and Sensitivities 4.	Housing, Energy, and Communications Null for Low Population Areas. Shapefile Attribute Table Label: Vulner_Sc
    "benzene_concentration", -- Percentile ranking of the modeled annual average ambient benzene concentration based on emissions from 2014. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Benzene
    "particulate_matter_25", -- Percentile ranking of the average annual concentration of Particulate Matter 2.5 microns or less in diameter per cubic meter. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: PM25
    "traffic_truck_highways", -- Percentile ranking of the annual Average Daily Traffic count of trucks (Federal Highway Administration classes 4-13). Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Traff_Trk
    "traffic_number_vehicles", -- Percentile ranking of the count of vehicles (AADT, avg. annual daily traffic) on major roads within 500 meters of a census block centroid, divided by distance in meters, presented as the population-weighted average of blocks in each tract. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Traff_Veh
    "wastewater_discharge", -- Percentile ranking of the toxicity-weighted concentration in stream reach segments within 500 meters of a census block centroid, divided by distance in meters, presented as the population-weighted average of blocks in each tract. Point sources can include a pipe, ditch, channel, tunnel, conduit, discrete fissure, container, vessel or other floating craft, concentrated animal feeding operations, et al. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Waste_H2O
    "low_birth_weight", -- Percentile ranking of the percentage of all singleton births (births resulting in only one child) that were low birthweight (under 2500 grams). Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Birth_Wt
    "rent_percent_income", -- Percentile ranking of the gross rent as percent of income. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Rent_Inc
    "home_energy_affordability", -- Percentile ranking of the average energy costs as percentage of income. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Energy_Aff
    "housing_vacancy_rate", -- Percentile ranking of the vacant housing units as percent of total housing units. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Vacancy
    "industrial_land_use", -- Percentile ranking of the percent of census tract land area zoned as industrial, mining, or manufacturing land area Shapefile Attribute Table Label: Ind_LU
    "landfills", -- Percentile ranking of the percentage of census tract land area within 500 meters of an active landfill Shapefile Attribute Table Label: Landfills
    "oil_storage", -- Percentile ranking of the percentage of census tract land area within 500 meters of a major oil facility (including storage terminals, airports, military, manufacturing) Shapefile Attribute Table Label: Oil_Stor
    "municipal_waste_combustors", -- Percentile ranking of the percentage of census tract land area within 500 meters of a municipal waste combustor Shapefile Attribute Table Label: Waste_Com
    "power_generation_facilities", -- Percentile ranking of the percentage of census tract land area within 1 mile of a power generation facilities burning fossil fuel, includes peaker units and facilities within 1 mile of the NYS border Shapefile Attribute Table Label: Pwr_Gen
    "rmp_sites", -- Percentile ranking of the count of Regulated Management Plan (RMP) Sites (potential chemical accident) within 5 km  divided by distance and weighted by population. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: RMP_Sites
    "scrap_metal_processing", -- Percentile ranking of the count of scrap metal processing and vehicle dismantling facilities Shapefile Attribute Table Label: Scrap_Met
    "agricultural_land_use", -- Percentile ranking of the percentage of census tract land area classified as hay/pasture or cultivated crop land. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Ag_LU
    "coastal_flooding_storm_risk", -- Percentile ranking of the projected percentage of census tract area at risk of flooding and erosion from: shallow water flooding areas, areas prone to erosion, sea level rise, FEMA Risk Zones (i.e., floodplains  - V zone, 100, and 500 yr), and Category 3 Hurricane from SLOSH - projected out to year 2100 Shapefile Attribute Table Label: Coast_Fld
    "days_above_90_degrees_2050", -- Percentile ranking of the projected average annual number of days with maximum temperature above 90°F in the year 2050. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Days_90_D
    "drive_time_healthcare", -- Percentile ranking of the average drive time between the census tract centroid and its 3 nearest healthcare facilities (hospitals, nursing homes, diagnostic treatment centers, certified home health care agencies, long term home health care programs, or hospices). Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Drv_Health
    "inland_flooding_risk", -- Percentile ranking of the projected annual probability of exceeding FEMA's 100-year floodplain event in a given year, weighted by the proportion of the tract falling within the floodplain Shapefile Attribute Table Label: In_Flood
    "asian_percent", -- Percentile ranking of the percentage of census tract population that is Asian alone or in combination with other races. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Asian_Pct
    "black_african_american_percent", -- Percentile ranking of the percentage of census tract population that is Black or African American alone or in combination with other races. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Black_Pct
    "redlining_updated", -- Percentile ranking of the historic (1930) redlining 'score' from 1-4 where 4 is most hazardous and most likely to be a redlined area. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Redline
    "english_proficiency", -- Percentile ranking of the percentage of households that are limited English speaking households. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Eng_Prof
    "native_indigenous", -- Percentile ranking of the percentage of census tract population that is American Indian, Alaska native, native Hawaiian, or other Pacific Islander alone or in combination with other races. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Native_Pct
    "lmi_80_ami", -- Percentile ranking of the percentage of census tract population earning less than 80% of Area Median Income. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: LMI_80_AMI
    "lmi_poverty_federal", -- Percentile ranking of the percentage of census tract population with income less than 100% of the Federal Poverty Level threshold as determined by the census bureau. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: LMI_Fed
    "premature_deaths", -- Percentile ranking of the percentage of all deaths that occurred before age 65 for the years 2015 through 2019. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Prem_Death
    "health_insurance_rate", -- Percentile ranking of the percentage of census tract population without health insurance. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Health_Ins
    "household_single_parent", -- Percentile ranking of the percentage of census tract households with a single parent and children under age 18. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: HH_Single
    "mi_hospitalization_rate", -- Percentile ranking of the average annual age-adjusted hospitalizations for myocardial infarction (heart attacks) in an aggregate area, with all census tracts in an aggregate area assigned the same value. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: MI_Rates
    "households_disabled", -- Percentile ranking of the percentage of census tract population with a disability. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: HH_Disab
    "asthma_ed_rate" -- Percentile ranking of the average annual age-adjusted emergency department visits for asthma in an aggregate area, with all census tracts in an aggregate area assigned the same value. Blank cells represent data that were not required or are not currently available Shapefile Attribute Table Label: Asthma
FROM
    "ny-gov/draft-disadvantaged-communities-dac-2021-xj7e-q8ja:latest"."draft_disadvantaged_communities_dac_2021"
LIMIT 100;

Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query ny-gov/draft-disadvantaged-communities-dac-2021-xj7e-q8ja with SQL in under 60 seconds.

This repository is an "external" repository. That means it's hosted elsewhere, in this case at data.ny.gov. When you queryny-gov/draft-disadvantaged-communities-dac-2021-xj7e-q8ja:latest on the DDN, we "mount" the repository using the socrata mount handler. The mount handler proxies your SQL query to the upstream data source, translating it from SQL to the relevant language (in this case SoQL).

We also cache query responses on the DDN, but we run the DDN on multiple nodes so a CACHE_HIT is only guaranteed for subsequent queries that land on the same node.

Query Your Local Engine

Install Splitgraph Locally
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
 

Read the installation docs.

Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.

It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgrcan manage the image, container and volume for you.

There are a few ways to ingest data into the local engine.

For external repositories (like this repository), the Splitgraph Engine can "mount" upstream data sources by using sgr mount. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.

For hosted datasets, where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr cloneand sgr checkout.

Mounting Data

This repository is an external repository. It's not hosted by Splitgraph. It is hosted by data.ny.gov, and Splitgraph indexes it. This means it is not an actual Splitgraph image, so you cannot use sgr clone to get the data. Instead, you can use the socrata adapter with the sgr mount command. Then, if you want, you can import the data and turn it into a Splitgraph image that others can clone.

First, install Splitgraph if you haven't already.

Mount the table with sgr mount

sgr mount socrata \
  "ny-gov/draft-disadvantaged-communities-dac-2021-xj7e-q8ja" \
  --handler-options '{
    "domain": "data.ny.gov",
    "tables": {
        "draft_disadvantaged_communities_dac_2021": "xj7e-q8ja"
    }
}'

That's it! Now you can query the data in the mounted table like any other Postgres table.

Query the data with your existing tools

Once you've loaded the data into your local Splitgraph engine, you can query it with any of your existing tools. As far as they're concerned, ny-gov/draft-disadvantaged-communities-dac-2021-xj7e-q8ja is just another Postgres schema.