pa-gov/agriculture-production-data-in-pennsylvania-1850-a5ik-vwk6
Icon for Socrata external plugin

Query the Data Delivery Network

Query the DDN

The easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432 and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.

For example, you can query the agriculture_production_data_in_pennsylvania_1850 table in this repository, by referencing it like:

"pa-gov/agriculture-production-data-in-pennsylvania-1850-a5ik-vwk6:latest"."agriculture_production_data_in_pennsylvania_1850"

or in a full query, like:

SELECT
    ":id", -- Socrata column ID
    "home_made_manufactures", -- The cash value of all articles manufactured within the year preceding the 1st day of June 1850, in or by the family, whether for home use or for sale. If the raw material has been purchased for such manufacture, the value of such raw material should not be included: the object being to ascertain the value of manufacturers by the family from their own productions, or the value of the labor expended on the production of others. This discrimination is important. 
    "beeswax_honey", -- Beeswax Honey measured in whole number of pounds raised on the farm during the year ending the 1st day of June 1850. 
    "maple_sugar", -- Maple Sugar measured in whole number of pounds raised on the farm during the year ending the 1st day of June 1850. 
    "silk_cocoons", -- Silk Cocoons measured in whole number of pounds raised on the farm during the year ending the 1st day of June 1850. 
    "hemp", -- Hemp measured in whole number of tons raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848.
    "hops", -- Hops measured in whole number of pounds raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848.
    "hay", -- Hay measured in whole number of tons raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848.
    "butter", -- Butter measured in pounds on the farm during the year ending the 1st day of June. 
    "buckwheat", -- Buckwheat measured in whole number of bushels raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848.
    "barley", -- Barley measured in whole number of bushels raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848.
    "sweet_potatoes", -- Sweet Potatoes measured in whole number of bushels raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848.
    "irish_potatoes", -- Irish Potatoes measured in whole number of bushels raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848.
    "indian_corn", -- Indian Corn measured in whole number of bushels raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848. 
    "rye", -- Rye measured in whole number of bushels raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848. 
    "live_stock_value", -- The cash value of all live stock on hand on the 1st day of June 1850. 
    "sheep", -- Whole number of Sheep which belong to the farm on the 1st day of June 1850, which are one year old and older.
    "other_cattle", -- Whole number of other cattle (the number of all cattle not before enumerated) which belong to the farm on the 1st day of June 1850, which are one year old and older.
    "working_oxen", -- Whole number of Working Oxen which belong to the farm on the 1st day of June 1850, which are one year old and older.
    "milch_cows", -- Whole number of milch cows which belong to the farm on the 1st day of June 1850, which are one year old and older. Milch Cows are Cattle that are reared for their milk. More commonly called a Dairy or Milk cow today. 
    "horses", -- Whole number of Horses which belong to the farm on the 1st day of June 1850, which are one year old and older. 
    "implements_machinery", -- Aggregate value of all farming or planting implements and machinery used to cultivate and produce crops including wagons, thrashing machines, cotton gins and sugar mills. All implements and machinery used to cultivate and produce crops and fit the same for market or consumption on the 1st day of June 1850.
    "unimproved_land", -- Number of acres or unimproved land on the 1st day of June 1850 such as a wood lot or other land, the timber or range of which is used for farm purposes. It is not necessary that is should be contiguous to the improved land. 
    "improved_land", -- Number of acres of improved land on the 1st day of June 1850, by which is meant cleared and used for grazing, grass or tillage or which is now fallow, connected with or belonging to the farm which the assistant marshal is reporting. It is not necessary that is should be contiguous, but it must be owned or managed by the person whose name is inserted in the column. 
    "farms", -- The returns of all farms or plantations on the 1st day of June 1850, the produce of which amounts to $100 in value, are to be included in this schedule; but it is not intended to include the returns of small lots, owned or worked by persons following mechanical or other pursuits, where the productions are not $100 in value. In 1850, $100 is roughly equal to $3,000 in 2017.  
    "type", -- Type of PA Municipality. Such as Township, Borough, city, etc. 
    "municipality", -- Name of the Pennsylvania Municipality. In 2017 there are 2,561 municipalities. 
    "county", -- Name of the Pennsylvania County. Pennsylvania has 67 counties.  Counties omitted from the Data and reason:  Cameron - Did not exist until March 29, 1860; created from parts of Clinton, Elk, McKean, and Potter Counties Forest - No farms were recorded in this county; mostly forest area Lackawanna - Did not exist until August 13, 1878; created from the eastern half of Luzerne County Snyder - Did not exist until March 2, 1855; created from the southern half of Union County
    "clover_seed", -- Clover Seed measured in whole number of bushels raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848.
    "peas_beans", -- Peas and Beans measured in whole number of bushels raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been shown in 1848.
    "wool", -- Wool measured in whole number of pounds raised on the farm during the year ending the 1st day of June 1850. 
    "tobacco", -- Tobacco measured in whole number of pounds raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848. 
    "wheat", -- Wheat measured in whole number of bushels raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848. 
    "swine", -- Whole number of Swine which belong to the farm on the 1st day of June 1850, which are one year old and older.
    "asses_mules", -- Whole number of Asses and Mules which belong to the farm on the 1st day of June 1850, which are one year old and older.
    "cheese", -- Cheese measured in pounds on the farm during the year ending the 1st day of June. 
    "orchard_products_value", -- The cash value of all Orchard Products on hand on the 1st day of June 1850. 
    "animals_slaughtered", -- The cash value of animals slaughtered raised on the farm during the year ending the 1st day of June 1850. 
    "other_seeds", -- Other Seeds measured in whole number of bushels raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848.
    "flax", -- Flax measured in whole number of pounds raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848.
    "molasses", -- Molasses measured in whole number of gallons raised on the farm during the year ending the 1st day of June 1850. 
    "oats", -- Oats measured in whole number of bushels raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848. 
    "flaxseed", -- Flaxseed measured in pounds raised on the farm during the year ending the 1st day of June 1850. The grain which is gathered in that year is meant, though it may have been sown in 1848.
    "wine", -- Wine measured in gallons on the farm during the year ending the 1st day of June 1850. 
    "garden_produce_value", -- The cash value of all Garden Produce on hand on the 1st day of June 1850. 
    "cash_value" -- Actual cash value of whole number of improved and unimproved acres on the 1st day of June 1850.
FROM
    "pa-gov/agriculture-production-data-in-pennsylvania-1850-a5ik-vwk6:latest"."agriculture_production_data_in_pennsylvania_1850"
LIMIT 100;

Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query pa-gov/agriculture-production-data-in-pennsylvania-1850-a5ik-vwk6 with SQL in under 60 seconds.

This repository is an "external" repository. That means it's hosted elsewhere, in this case at data.pa.gov. When you querypa-gov/agriculture-production-data-in-pennsylvania-1850-a5ik-vwk6:latest on the DDN, we "mount" the repository using the socrata mount handler. The mount handler proxies your SQL query to the upstream data source, translating it from SQL to the relevant language (in this case SoQL).

We also cache query responses on the DDN, but we run the DDN on multiple nodes so a CACHE_HIT is only guaranteed for subsequent queries that land on the same node.

Query Your Local Engine

Install Splitgraph Locally
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
 

Read the installation docs.

Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.

It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgrcan manage the image, container and volume for you.

There are a few ways to ingest data into the local engine.

For external repositories (like this repository), the Splitgraph Engine can "mount" upstream data sources by using sgr mount. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.

For hosted datasets, where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr cloneand sgr checkout.

Mounting Data

This repository is an external repository. It's not hosted by Splitgraph. It is hosted by data.pa.gov, and Splitgraph indexes it. This means it is not an actual Splitgraph image, so you cannot use sgr clone to get the data. Instead, you can use the socrata adapter with the sgr mount command. Then, if you want, you can import the data and turn it into a Splitgraph image that others can clone.

First, install Splitgraph if you haven't already.

Mount the table with sgr mount

sgr mount socrata \
  "pa-gov/agriculture-production-data-in-pennsylvania-1850-a5ik-vwk6" \
  --handler-options '{
    "domain": "data.pa.gov",
    "tables": {
        "agriculture_production_data_in_pennsylvania_1850": "a5ik-vwk6"
    }
}'

That's it! Now you can query the data in the mounted table like any other Postgres table.

Query the data with your existing tools

Once you've loaded the data into your local Splitgraph engine, you can query it with any of your existing tools. As far as they're concerned, pa-gov/agriculture-production-data-in-pennsylvania-1850-a5ik-vwk6 is just another Postgres schema.