ny-gov/solar-electric-programs-reported-by-nyserda-3x8r-34rs
Icon for Socrata external plugin

Query the Data Delivery Network

Query the DDN

The easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432 and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.

For example, you can query the solar_electric_programs_reported_by_nyserda table in this repository, by referencing it like:

"ny-gov/solar-electric-programs-reported-by-nyserda-3x8r-34rs:latest"."solar_electric_programs_reported_by_nyserda"

or in a full query, like:

SELECT
    ":id", -- Socrata column ID
    ":@computed_region_wbg7_3whc", -- This column was automatically created in order to record in what polygon from the dataset 'New York Zip Codes' (wbg7-3whc) the point in column 'georeference' is located.  This enables the creation of region maps (choropleths) in the visualization canvas and data lens.
    "green_jobs_green_new_york_participant", -- Indicates if project accessed Green Jobs-Green New York financing.
    "zip_code", -- ZIP code for project location. Where available, standardized ZIP code is provided.
    "project_status", -- Either Complete or Pipeline. Complete indicates projects that are interconnected and operational, and closed out the project application.  Pipeline indicates projects with an active application that are not yet complete. Pipeline projects are subject to change
    "climate_and_economic_justice_screening_tool_status", -- Indicates if the project is located in a federal disadvantaged community as defined by the White House Council on Environmental Quality 
    "municipality_type", -- Type of incorporated municipality; either City or Town
    "state", -- Name of US state for project location. Where available, standardized state abbreviation is provided
    "prevailing_wage_adder", -- Dollars awarded to the project through the Prevailing Wage Adder, which provides additional funding to projects that pay prevailing wage during construction
    "community_adder", -- Dollars awarded to the project through the Community Adder, which provides additional funding to community solar projects
    "purchase_type", -- Solar photovoltaic project purchase agreement type. The purchase types are either Lease, Purchase or Power Purchase Agreement. Blank cells represent data that were not required or are not currently available
    "expanded_solar_for_all_adder", -- Dollars awarded to the project through the Expanded Solar For All program, which provides additional funding to community solar projects serving eligible offtakers in National Grid territory
    ":@computed_region_kjdx_g34t", -- This column was automatically created in order to record in what polygon from the dataset 'Counties' (kjdx-g34t) the point in column 'georeference' is located.  This enables the creation of region maps (choropleths) in the visualization canvas and data lens.
    ":@computed_region_yamh_8v7k", -- This column was automatically created in order to record in what polygon from the dataset 'NYS Municipal Boundaries' (yamh-8v7k) the point in column 'georeference' is located.  This enables the creation of region maps (choropleths) in the visualization canvas and data lens.
    ":@computed_region_n8jy_tbqr", -- This column was automatically created in order to record in what polygon from the dataset 'NYS Assembly Districts' (n8jy-tbqr) the point in column 'georeference' is located.  This enables the creation of region maps (choropleths) in the visualization canvas and data lens.
    "program_type", -- Name of program type; either Residential/Small Commercial, Commercial/Industrial (Competitive), or Commercial/Industrial (MW Block).
    "legacy_project_number", -- Unique identifier for project, assigned using a defunct NYSERDA database.  For some projects, the Legacy Project Number was provided as the Project Number in this Open NY dataset prior to August 31, 2020.  Blank cells indicate that the project was submitted after NYSERDA discontinued the older database or the Project Number remained the same 
    "county", -- Name of county for project location. Where available, standardized county name is provided
    "primary_inverter_model_number", -- Name of primary inverter model number for project. Some projects may use more than one type of inverter model. Only the primary inverter model number is provided. Blank cells represent data that were not required or are not currently available
    "census_tract", -- Census tract identifier; a concatenation of the state FIPS code, county FIPS code, and census tract code. Blank cells represent data that were not required or are not currently available
    "georeference", -- Open Data platform-generated geocoding information from supplied address components.  Point-type location is the centroid of the address components provided and does not reflect a specific address if the street address component is not provided.  Point-type location is supplied in "POINT (<geocoded longitude> <geocoded latitude>)" format.
    "street_address", -- Street address for project location for projects with a non-residential sector. Where available, standardized address is provided. Blank cells indicate that the project sector is residential, or the address is not available.
    "project_number", -- Unique identifier for project
    "affordable_solar_residential_adder", -- Dollars awarded to the project through the Affordable Solar Residential Incentive, which provides additional funding to income-eligible single family projects
    "pv_module_quantity", -- Quantity of all photovoltaic (PV) modules installed for project. Quantity provided is for all PV modules, not just the primary PV module. Blank cells represent data that were not required or are not currently available
    "longitude", -- The approximate longitude coordinate of the project. For Residential projects, the center point of the census tract area is provided, not the exact project location. For Non-Residential projects, the coordinate of the street address is provided.
    "solicitation", -- NYSERDA Program Opportunity Notification (PON) or Request for Proposal (RFP) number.
    "latitude", -- The approximate latitude coordinate of the project. For Residential projects, the center point of the census tract area is provided, not the exact project location. For Non-Residential projects, the coordinate of the street address is provided.
    "project_cost", -- Expected project installation cost in US dollars (USD), as reported by the solar project contractor. Blank cells represent data that were not required or are not currently available
    "inverter_quantity", -- Quantity of all inverters installed for project. Quantity provided is for all inverters, not just the primary inverter. Blank cells represent data that were not required or are not currently available
    "pv_module_manufacturer", -- Name of primary photovoltaic (PV) module manufacturer for project. Some projects may use more than one type of PV module manufacturer. Only the primary PV module manufacturer is provided. Blank cells represent data that were not required or are not currently available
    "minority_or_women_owned_business_enterprise_mwbe", -- Indicates if the Contractor is a Minority or Women Owned Business Enterprise (MWBE). Blank cells represent data that were not required or are not currently available
    "total_nyserda_incentive", -- Amount of project incentives paid by the program in USD. Blank cells represent data that were not required or are not currently available.  Projects that received Green Jobs-Green NY financing but no incentive show an Incentive Amount of $0
    "inclusive_community_solar_adder", -- Dollars awarded to the project through the Inclusive Community Solar Adder, which provides additional funding to community solar projects serving offtakers in disadvantaged communities
    "reporting_period", -- The time period covered by the data set
    "electric_utility", -- Name of electric utility for project location
    "canopy_adder", -- Dollars awarded to the project through the Canopy Adder, which provides additional funding to projects sited on rooftop canopies or carports in ConEdison territory
    "inverter_manufacturer", -- Name of primary inverter manufacturer for project. Some projects may use more than one type of inverter manufacturer. Only the primary inverter manufacturer is provided. Blank cells represent data that were not required or are not currently available
    "pv_module_model_number", -- Name of primary photovoltaic (PV) module model number for project. Some projects may use more than one type of PV module model. Only the primary PV module model number is provided. Blank cells represent data that were not required or are not currently available
    "sector", -- Name of project sector. The sectors in this dataset are either Residential or Non-Residential
    "nys_disadvantaged_community_status", -- Indicates if the project is located in a New York State disadvantaged community (DAC) as defined by the Climate Justice Working Group
    "brownfield_landfill_adder", -- Dollars awarded to the project through the Brownfield/Landfill Adder, which provides additional funding to projects sited on brownfield sites or landfills
    "expected_kwh_annual_production", -- Expected annual electricity production in kilowatt-hours (kWh) as a result of project
    "affordable_multifamily_housing_incentive", -- Dollars awarded to the project through the Multifamily Affordable Housing Incentive, which provides additional funding to regulated affordable housing
    "totalnameplatekwdc", -- The sum of kilowatt (kW) DC capacity ratings of the installed photovoltaic modules
    "date_application_received", -- Date project application was received by the program
    "contractor", -- Name of entity responsible for installation of the project. Blank cells represent data that were not required or are not currently available.  Contractor data is not provided for Pipeline projects
    "date_install", -- Date NYSERDA recognized the project as interconnected and operational, and closed out the project application. Blank cells represent pipeline data for projects not yet completed.
    "remote_net_metering", -- Indicates if project is Remote Net Metered. Blank cells represent data that were not required or are not currently available
    "city", -- Name of city for project location. Where available, standardized United States Postal Service (USPS) city name is provided
    "incorporated_municipality", -- Name of the incorporated city/town municipality for the  project location
    "community_distributed_generation" -- Indicates if project Community Distributed Generation (Shared Solar or Community Solar)
FROM
    "ny-gov/solar-electric-programs-reported-by-nyserda-3x8r-34rs:latest"."solar_electric_programs_reported_by_nyserda"
LIMIT 100;

Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query ny-gov/solar-electric-programs-reported-by-nyserda-3x8r-34rs with SQL in under 60 seconds.

This repository is an "external" repository. That means it's hosted elsewhere, in this case at data.ny.gov. When you queryny-gov/solar-electric-programs-reported-by-nyserda-3x8r-34rs:latest on the DDN, we "mount" the repository using the socrata mount handler. The mount handler proxies your SQL query to the upstream data source, translating it from SQL to the relevant language (in this case SoQL).

We also cache query responses on the DDN, but we run the DDN on multiple nodes so a CACHE_HIT is only guaranteed for subsequent queries that land on the same node.

Query Your Local Engine

Install Splitgraph Locally
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
 

Read the installation docs.

Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.

It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgrcan manage the image, container and volume for you.

There are a few ways to ingest data into the local engine.

For external repositories (like this repository), the Splitgraph Engine can "mount" upstream data sources by using sgr mount. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.

For hosted datasets, where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr cloneand sgr checkout.

Mounting Data

This repository is an external repository. It's not hosted by Splitgraph. It is hosted by data.ny.gov, and Splitgraph indexes it. This means it is not an actual Splitgraph image, so you cannot use sgr clone to get the data. Instead, you can use the socrata adapter with the sgr mount command. Then, if you want, you can import the data and turn it into a Splitgraph image that others can clone.

First, install Splitgraph if you haven't already.

Mount the table with sgr mount

sgr mount socrata \
  "ny-gov/solar-electric-programs-reported-by-nyserda-3x8r-34rs" \
  --handler-options '{
    "domain": "data.ny.gov",
    "tables": {
        "solar_electric_programs_reported_by_nyserda": "3x8r-34rs"
    }
}'

That's it! Now you can query the data in the mounted table like any other Postgres table.

Query the data with your existing tools

Once you've loaded the data into your local Splitgraph engine, you can query it with any of your existing tools. As far as they're concerned, ny-gov/solar-electric-programs-reported-by-nyserda-3x8r-34rs is just another Postgres schema.