Query the Data Delivery Network
Query the DDNThe easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432
and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.
For example, you can query the feed_the_future_malawi_interim_survey_in_the_zone
table in this repository, by referencing it like:
"datahub-usaid-gov/feed-the-future-malawi-interim-survey-in-the-zone-c5qj-pmev:latest"."feed_the_future_malawi_interim_survey_in_the_zone"
or in a full query, like:
SELECT
":id", -- Socrata column ID
"h19a", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any liver, kidney, heart, or other organ meats from domesticated animals such as beef, pork, lamb, goat, chicken, duck, or pigeon?
"h03", -- Please tell me how old you are. What was your age your last birthday?
"w_mdd", -- Calculated where 1 w_foodsum10 5 . This is the Feed the Future new indicator for women achieving a minimum dietary diversity, or 5 of the 10 food groups.
"h18b", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any other fruits like bananas, apples, avocados, pineapples, berries, baobab fruit, etc.?
"h19b", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any meat from domesticated animals, such as beef, pork, lamb, goat, chicken, duck, or pigeon?
"w_nuts", -- This measure indicates that the women consumed nuts or seeds.
"h24b", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any foods made from other nuts or seeds? (EXCLUDE FOODS MADE FROM GROUNDNUTS OR SESAME SEEDS WHICH BELONG IN ABOVE CATEGORIES.)
"w_nutsbeans", -- This measure indicates that the women consumed beans, other types of legumes, nuts, or seeds.
"project", -- Identifies whether the record is used to report on Feed the Future (FTF), Food for Peace (FFP), Catholic Relief Services (CRS), or Project Concern International (PCI) programming. Due to coordinated survey collection, the same record may be used to evalu
"w_othveg", -- This measure indicates that the women consumed other vegetables.
"pbs_id", -- Administrative variable for identifying households
"h14", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Bread, savory biscuits, porridge, crackers, pasta, noodles, rice, or other foods made ??from grains such as corn, wheat, millet, sorghum, bulgar, wheat, barley?
"w_foodsum10", -- Calculated as the sum of 10 food groups: w_grains, w_beans, w_nuts, w_dairy, w_eggs, w_meatfish, w_lfygrn, w_vita, w_othveg, w_othfrt
"h26", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any oil, fats, or butter, or foods made with any of these? (INCLUDE GROUNDNUT OIL AND SESAME OIL.)
"h28", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Condiments for flavor, such as chilies, spices, herbs, fish powder, curry, or bicarbonate soda ash used for cooking?
"h20a", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any liver, kidney, heart, or other organ meats from wild animals such as warthogs, buck, kudu, impala, antelopes, crocodile, cats, monkeys, alligators, or mice?
"h08", -- Measured height in centimeters
"h15a", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Orange-fleshed sweet potatoes or foods made from orange-fleshed sweet potatoes such as porridge, flitters, or stew, for example?
"np_wrasample", -- This flag variable indicates that the woman is aged 15 to 49 years and not pregnant, based on question H06.
"h23a", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any food made from groundnut or groundnut products such as groundnut flour, peanut butter, roasted groundnuts, boiled groundnut snack, sauces, groundnut biscuits?
"h02_m", -- In what month and year were you born?
"weight", -- Women s weight in kilograms as reported in h07.
"survey", -- Two surveys were administered. This variable identifies whether the survey was the Interim Feed the Future survey or the Baseline Food for Peace survey.
"h21", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Eggs? (chicken, turkey, fowl, duck)
"h04", -- Are you between the ages of 15 and 49 years old?
"h23b", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any foods made from soy or soy products such as soya bean flour, soy milk, soy mash relish, soy flitters, or soy soup?
"h16", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD White potatoes, white yams, cassava, plantains or any other foods made from roots?
"h15b", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any other dark yellow or orange fleshed roots, tubers, or vegetables such as yellow-fleshed sweet potatoes, pumpkin, carrots, or squash?
"h17b", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any other vegetables such as green beans, tomatoes, mushrooms, cabbage, cauliflower, broccoli etc.?
"urbrur", -- Location type (urban rural)
"bmicat", -- Underweight (BMI 18.5), Normal (18.5 BMI 25), Overweight Obese (BMI 25)
"w_foodmiss", -- This is a flag variable that indicates that questions H14 - H30 are all missing. This variable is created to exclude records missing the dietary information from analyses.
"h23d", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any other foods made from beans, peas, lentils, or other legumes?
"w_foodsum9", -- Calculated as the sum of 9 food groups: w_grains, w_nutsbeans, w_dairy, w_eggs, w_organmeat, w_flesh, w_lfygrn, w_vita, w_othfrtveg. The weighted mean of this variable is a Feed the Future indicator of women s dietary diversity.
"h24a", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any foods made from sesame or sesame flour?
"wrasample", -- This flag variable indicates that the woman is aged 15 to 49 years.
"h05", -- interviewer determined: Is the respondent between the ages of 15 and 49?
"w_meatfish", -- This variable indicates that the women consumed organ or flesh meat from wild or domesticated animals.
"w_beans", -- This measure indicates that the women consumed beans or other types of legumes.
"h25", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Milk, soured milk, cheese, yogurt, or other milk products?
"h27", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any sugary foods such as chocolates, sweets, candies, pastries, doughnuts, cakes, sweet biscuits, or sugar cane?
"w_nrvcc2", -- This measure identifies whether targeted nutrient-rich value chain commodity was consumed in the prior day. The weighted mean of this measure will produce the Feed the Future prevalence indicator. NRVCC Any food made from groundnut or groundnut produ
"w_nrvcc_any", -- This measure identifies whether the any of the targeted nutrient-rich value chain commodities were consumed in the prior day. The weighted mean of this measure will produce the Feed the Future prevalence indicator.
"h02_y", -- In what month and year were you born?
"h20b", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any flesh from wild animals, such as warthogs, buck, kudu, impala, antelopes, crocodile, cats, monkeys, alligators, or mice?
"h29", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Edible insects, mopane worms, grasshoppers or flying ants?
"underwght", -- Prevalence of underweight (BMI 18.5) among non-pregnant women aged 15-49. This is a Feed the Future indicator.
"h07", -- Measured weight in kilograms
"w_eggs", -- This measure indicates that the women consumed eggs.
"h30", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Foods made with red palm oil, red palm nut, or red palm nut pulp sauce?
"subgroup3", -- The 7-district FTF FEEDBACK ZOI which will include district level data from rural areas only of Michinji, Lilongwe, Dedza, Mangochi, Ntcheu, Balaka, Machinga.
"bmi", -- Women s BMI calculated as height weight (squared)
"cluster", -- Cluster number
"w_flesh", -- This measure indicates that the women consumed flesh meat from wild or domesticated animals.
"h22", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Fresh or dried fish, shellfish, crabs, or seafood?
"w_organmeat", -- This measure indicates that the women consumed organ meat from wild or domesticated animals.
"w_lfygrn", -- This measure indicates that the women consumed dark green leafy vegetables.
"w_dairy", -- This measure indicates that the women consumed milk, cheese, yogurt or other foods made from animal milk.
"h06", -- Are you currently pregnant?
"w_othfrt", -- This measure indicates that the women consumed other fruits.
"h17a", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Any dark green leafy vegetables such as spinach, kale, okra, pumpkin leaves, amaranth leaves or moringa leaves?
"w_grains", -- This measure indicates that the women consumed grains or starchy root vegetables.
"strata3", -- Stratification for subgroup 3
"women_wt3", -- Individual sampling weight for women aged 15 to 49 years. This variable should be used for subgroup 3.
"height", -- Women s height in meters, based on reported height in centimeters in h08.
"w_vita", -- This measure indicates that the women consumed orange-fleshed fruits or vegetables, red palm products, or other foods high in vitamin A.
"h18a", -- Yesterday during the day or night, did you drink eat any FOOD ? FOOD Ripe mangoes, ripe papayas, apricots, cantaloupe melons, pulp from African locust bean, or other fruits that are dark yellow or orange inside?
"w_nrvcc1", -- This measure identifies whether targeted nutrient-rich value chain commodity was consumed in the prior day. The weighted mean of this measure will produce the Feed the Future prevalence indicator. NRVCC Any foods made from soy or soy products such as
"zoi", -- Was the data collected by Westat or ICF?
"w_othfrtveg", -- This variable indicates that the women consumed other fruits or vegetables.
"ghht", -- Derived gendered household type
"idcode", -- Woman s ID code
"today", -- The day of interview as a string variable, MM-DD-YYYY
"w_nrvcc3", -- This measure identifies whether targeted nutrient-rich value chain commodity was consumed in the prior day. The weighted mean of this measure will produce the Feed the Future prevalence indicator. NRVCC Orange-fleshed sweet potatoes or foods made fro
"country" -- These data were collected in Malawi
FROM
"datahub-usaid-gov/feed-the-future-malawi-interim-survey-in-the-zone-c5qj-pmev:latest"."feed_the_future_malawi_interim_survey_in_the_zone"
LIMIT 100;
Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query datahub-usaid-gov/feed-the-future-malawi-interim-survey-in-the-zone-c5qj-pmev
with SQL in under 60 seconds.
Query Your Local Engine
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
Read the installation docs.
Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.
It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr
, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgr
can manage the image, container and volume for you.
There are a few ways to ingest data into the local engine.
For external repositories, the Splitgraph Engine can "mount" upstream data sources by using sgr mount
. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.
For hosted datasets (like this repository), where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr clone
and sgr checkout
.
Cloning Data
Because datahub-usaid-gov/feed-the-future-malawi-interim-survey-in-the-zone-c5qj-pmev:latest
is a Splitgraph Image, you can clone the data from Spltgraph Cloud to your local engine, where you can query it like any other Postgres database, using any of your existing tools.
First, install Splitgraph if you haven't already.
Clone the metadata with sgr clone
This will be quick, and does not download the actual data.
sgr clone datahub-usaid-gov/feed-the-future-malawi-interim-survey-in-the-zone-c5qj-pmev
Checkout the data
Once you've cloned the data, you need to "checkout" the tag that you want. For example, to checkout the latest
tag:
sgr checkout datahub-usaid-gov/feed-the-future-malawi-interim-survey-in-the-zone-c5qj-pmev:latest
This will download all the objects for the latest
tag of datahub-usaid-gov/feed-the-future-malawi-interim-survey-in-the-zone-c5qj-pmev
and load them into the Splitgraph Engine. Depending on your connection speed and the size of the data, you will need to wait for the checkout to complete. Once it's complete, you will be able to query the data like you would any other Postgres database.
Alternatively, use "layered checkout" to avoid downloading all the data
The data in datahub-usaid-gov/feed-the-future-malawi-interim-survey-in-the-zone-c5qj-pmev:latest
is 0 bytes. If this is too big to download all at once, or perhaps you only need to query a subset of it, you can use a layered checkout.:
sgr checkout --layered datahub-usaid-gov/feed-the-future-malawi-interim-survey-in-the-zone-c5qj-pmev:latest
This will not download all the data, but it will create a schema comprised of foreign tables, that you can query as you would any other data. Splitgraph will lazily download the required objects as you query the data. In some cases, this might be faster or more efficient than a regular checkout.
Read the layered querying documentation to learn about when and why you might want to use layered queries.
Query the data with your existing tools
Once you've loaded the data into your local Splitgraph Engine, you can query it with any of your existing tools. As far as they're concerned, datahub-usaid-gov/feed-the-future-malawi-interim-survey-in-the-zone-c5qj-pmev
is just another Postgres schema.