austintexas-gov/hiv-priority-populations-28rb-stgr
Icon for Socrata external plugin

Query the Data Delivery Network

Query the DDN

The easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432 and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.

For example, you can query the hiv_priority_populations table in this repository, by referencing it like:

"austintexas-gov/hiv-priority-populations-28rb-stgr:latest"."hiv_priority_populations"

or in a full query, like:

SELECT
    ":id", -- Socrata column ID
    "percent_of_total_population_2", -- Number Virally Suppressed
    "number_unmet_need", -- Number Unmet Need- In the most recent calendar year, the number of people living with diagnosed HIV infection in the jurisdiction based on most recent known address without any CD4 or VL test result. 
    "percent_of_total_population_1", -- Percent of Total Population Unmet Need
    "met_need", -- Met Need-In the most recent calendar year, the number of people living with diagnosed HIV infection in the jurisdiction based on most recent known address with a CD4 test result or VL test result or outpatient/ambulatory health services (OAHS) visit.
    "data_type", -- Data Type
    "number_of_new_diagnosis", -- Number of New Diagnosis- The number of people in the jurisdiction with HIV diagnosed in the most recent calendar year based on residence at time of diagnosis. 
    "in_care_not_virally_suppressed", -- Percent of Total Population In Care, Not Virally Suppressed
    "percent_not_retained_in_care", -- Percent of people with HIV who have not had  two or more CD4 or viral load tests, performed at least three months apart.
    "location", -- Location
    "percent_people_retained_in_care", -- Percent of people who had two or more CD4 or viral load tests, performed at least three months apart in the Austin TGA.
    "percent_late_diagnosed", -- Percent Late Diagnosed
    "in_care_virally_supressed_1", -- Percent In Care, Not Virally Suppressed
    "in_care_virally_supressed", -- In Care,  Virally Suppressed-In the most recent calendar year, the number of people living with diagnosed HIV infection in the jurisdiction who are in care whose most recent viral load test result was below 200 copies/mL.
    "start_date", -- Start Date
    "number_not_retained_in_care", -- number of people with HIV who do not have two or more CD4 or viral load tests, performed at least three months apart.
    "unmet_need", -- Percent Unmet Need
    "number_of_people_retained_in_care", -- People with HIV who had two or more CD4 or viral load tests, performed at least three months apart
    "end_date", -- End Date
    "year", -- Year
    "population_description", -- Population Description
    "percent_of_total_population", -- Percent of Total Population Late Diagnosed
    "number_in_care_not_virally", -- Number In Care, Not Virally Suppressed-In the most recent calendar year, the number of people living with diagnosed HIV infection in the jurisdiction who are in care whose most recent viral load test result was ≥200 copies/mL.
    "number_late_diagnosed", -- Number Late Diagnosed-The number of people with late diagnosed HIV in the most recent calendar year in the jurisdiction based on residence at time of diagnosis. Late diagnosed HIV is based on the first CD4 test result (<200 cells/mL or a CD4 percentage of total lymphocytes of <14) or documentation of an AIDS-defining condition ≤3 months after a diagnosis of HIV infection. If ≥2 events occurred during the same month and could thus qualify as “first,” apply the same conditions applied by CDC.7
    "people_living_with_diagnosed", -- People Living with Diagnosed HIV infection
    "estimated_not_yet_diagnosed" -- Estimated Not yet diagnosed. Estimate number provided by Texas Department of State Health Services
FROM
    "austintexas-gov/hiv-priority-populations-28rb-stgr:latest"."hiv_priority_populations"
LIMIT 100;

Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query austintexas-gov/hiv-priority-populations-28rb-stgr with SQL in under 60 seconds.

This repository is an "external" repository. That means it's hosted elsewhere, in this case at data.austintexas.gov. When you queryaustintexas-gov/hiv-priority-populations-28rb-stgr:latest on the DDN, we "mount" the repository using the socrata mount handler. The mount handler proxies your SQL query to the upstream data source, translating it from SQL to the relevant language (in this case SoQL).

We also cache query responses on the DDN, but we run the DDN on multiple nodes so a CACHE_HIT is only guaranteed for subsequent queries that land on the same node.

Query Your Local Engine

Install Splitgraph Locally
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
 

Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.

It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgrcan manage the image, container and volume for you.

There are a few ways to ingest data into the local engine.

For external repositories (like this repository), the Splitgraph Engine can "mount" upstream data sources by using sgr mount. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.

For hosted datasets, where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr cloneand sgr checkout.

Mounting Data

This repository is an external repository. It's not hosted by Splitgraph. It is hosted by data.austintexas.gov, and Splitgraph indexes it. This means it is not an actual Splitgraph image, so you cannot use sgr clone to get the data. Instead, you can use the socrata adapter with the sgr mount command. Then, if you want, you can import the data and turn it into a Splitgraph image that others can clone.

First, install Splitgraph if you haven't already.

Mount the table with sgr mount

sgr mount socrata \
  "austintexas-gov/hiv-priority-populations-28rb-stgr" \
  --handler-options '{
    "domain": "data.austintexas.gov",
    "tables": {
        "hiv_priority_populations": "28rb-stgr"
    }
}'

That's it! Now you can query the data in the mounted table like any other Postgres table.

Query the data with your existing tools

Once you've loaded the data into your local Splitgraph engine, you can query it with any of your existing tools. As far as they're concerned, austintexas-gov/hiv-priority-populations-28rb-stgr is just another Postgres schema.

Related Documentation: