Query the Data Delivery Network
Query the DDNThe easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432
and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.
For example, you can query the erate_supplemental_entity_information
table in this repository, by referencing it like:
"datahub-usac/erate-supplemental-entity-information-7i5i-83qf:latest"."erate_supplemental_entity_information"
or in a full query, like:
SELECT
":id", -- Socrata column ID
"status", -- Indicates if an organization is active (open) or closed.
"is_school_library_independent", -- A school/library is listed as independent (Yes) if their EPC entity profile indicates the school/library is not part of a school district/library system. All school districts, library systems, and consortia are listed as independent (Yes).
"esa_consortium", -- An established ESA serving as a consortium leader, aggregating demand and assisting in bulk purchases on behalf of some or all of the school districts and/or libraries in its service area.
"non_profit_purchasing_group", -- Indicates if the consortium is part of a non-profit purchasing group.
"state_wide", -- Indicates if the consortium is a state-wide consortium.
"state_library_agency_consortium", -- Indicates if the consortium is a state library agency.
"main_branch", -- Indicates if the library is the main branch of the library system in which it is located. Main branch will also be listed as “Yes” for independent libraries.
"public_library", -- Indicates if the library is public.
"research", -- Indicates if the library is a research library.
"bie", -- A school operated by the Bureau of Indian Education (BIE).
"pre_k", -- A Pre-K school.
"public_school", -- A public school.
"swing_space", -- An institution that temporarily houses students from a school which is considered the "main entity" or original location of the student population.
"urban_rural_status", -- System-generated urban/rural status for the entity based on the physical address, latitude, and longitude.
"category_two_discount_rate", -- Discount rate for Category Two services, per the entity’s EPC profile. Services discounted by this rate include Internal Connections, Managed Internal Broadband Services (MIBS) and/or Basic Maintenance of Internal Connections. Only applicable to non-consortium independent entities.
"private_library", -- Indicates if the library is private.
"state_library_agency_library", -- Indicates if the library is part of a state library agency.
"does_this_organization_have_an_endowment", -- Indicates whether the organization has an endowment.
"dormitory", -- A dormitory for housing students.
"esa_school", -- A school operated by an Educational Service Agency to provide specialized services such as Vocational or Special Education.
"head_start", -- A facility with a comprehensive child development program that serves preschool-age children and their families.
"form498_status", -- The status of the FCC Form 498 submitted for the entity. Every service provider is required to have a 498 ID in order to participate in any universal service program and/or receive payments from USAC.
"physical_county", -- County where the entity is located.
"mailing_state", -- State of the entity’s mailing address.
"phone_number", -- Entity’s phone number.
"category_one_discount_rate", -- Discount rate for Category One services, per the entity's EPC profile. Services discounted by this rate include Internet Access Services and/or Data Transmission but do not include Voice Services. Only applicable to non-consortium independent entities.
"c2_school_student_count", -- Number of full time students for independent or dependent schools’ C2 budget calculations based on the entity profile or an approved FCC Form 471.
"fscs_key", -- Indicates the FSCS of the library.
"nces_public_building_code", -- Code assigned by NCES for the public school building.
"physical_zipcode_ext", -- Four-digit ZIP code extension if applicable.
"mailing_address", -- Mailing address of the entity.
"user_entered_urban_rural_status", -- User-entered urban/rural status for the entity.
"c2_student_count_reporting_type", -- Indicates how the school district reports the student count for its C2 budget calculation.
"academic", -- Indicates if the library is part of an educational institution.
"tribal_library", -- An E-Rate applicant may self-identify as a Tribal entity in their EPC entity profile if the majority of library patrons served are Tribal members; the entity is located partially or entirely on Tribal land; or the entity is a library operated by a Tribal Nation.
"community_eligibility_program_cep", -- Indicates if the school is enrolled in the Community Eligibility Program (CEP).
"juvenile_justice", -- A NIF which is (and should be) labeled as a school in EPC, but usually has within it a primary or secondary education program.
"new_construction_school", -- A school that has yet to be fully constructed, often requiring estimated student counts.
"tribal_affiliation", -- Selected name of the federally recognized Tribal Nation that the entity is affiliated with from the entity's EPC profile listed drop down options.
"other_tribe_name", -- Typed name of the federally recognized Tribal Nation that the entity is affiliated with when "Other" is selected for Tribal Affiliation.
"public_school_district", -- Indicates if school district is a public school district.
"other_state_agency", -- Indicates if the consortium is a state agency other than the state agencies listed above.
"tribal_school", -- An E-Rate applicant may self-identify as a Tribal entity in their EPC entity profile if the majority of students served are Tribal members; the entity is located partially or entirely on Tribal land; the entity is a school operated by or receiving funding from the Bureau of Indian Education (BIE); or the entity is a school by a Tribal Nation.
"adult_education", -- An institution for adult education.
"parent_entity_type", -- The parent entity type including consortium, school district, and/or library system. The parent entity type is listed as a school or library if there is a child NIF.
"c2_district_student_count", -- Number of full time students for a school district’s C2 Budget calculation based on the entity profile or an approved FCC Form 471.
"kiosk", -- Indicates if the library is a kiosk.
"new_construction_library", -- Indicates if the library has not been fully constructed.
"entity_number", -- The number of the entity assigned by USAC.
"parent_entity_name", -- This field is blank if the entity does not have a parent. If an entity has more than one parent, the entity is listed multiple times.
"fcc_registration_number", -- Ten-digit number assigned by the FCC to each organization for all transactions with the FCC. Use the FCC's Commission Registration System (CORES) to register or search for an FCC registration number.
"physical_zipcode", -- Zip code for the entity.
"mailing_address_2", -- Additional address information if applicable.
"mailing_city", -- City of the entity’s mailing address.
"email", -- Email address for the entity’s EPC profile.
"website_url", -- Entity’s website.
"longitude", -- System-generated longitude coordinate of the entity based on the physical address.
"private_school_district", -- Indicates if school district is a private school district.
"charter_school_district", -- Indicates if a school district consists exclusively of charter schools.
"main_library_branch_school_district_number", -- Indicates the entity number of the school district in which the main library branch is located.
"state_school_code", -- Code assigned by the state for a school.
"nces_public_state_code", -- Code assigned by the National Center for Education Statistics (NCES) for the state.
"nces_private_school_id", -- Code assigned by NCES for private schools.
"number_of_full_time_students", -- The total number of full-time students for the school.
"total_number_of_part_time_students", -- The total number of part-time students for the school.
"peak_number_of_part_time_students", -- The peak number of part-time students for the school.
"number_of_nslp_students", -- Number of students or calculated number of students eligible for the National School Lunch Program (NSLP).
"esa_school_district_with_no_schools", -- An Educational State Agency (ESA) comprised entirely of non-instructional facilities (NIFs), from which its staff is based. It may run education programs in facilities that are not recognized as elementary or secondary schools. Classifying such an entity as a school in EPC is necessary to be able to enter the total student count for all of its member school districts.
"physical_state", -- U.S. state or territory where the entity is located.
"mailing_zipcode_ext", -- Four-digit ZIP code extension if applicable.
"account_administrator_name", -- Name of the account administrator in EPC.
"esa_school_district", -- Indicates if a school district identifies as having an ESA support its member schools.
"main_library_branch_school_district_name", -- Indicates the name of the school district in which the main library branch is located.
"locale_code", -- Code assigned by the Institute for Museum and Library Services (IMLS) that identifies the population density of the library’s location. Libraries that are located in the IMLS locale code of 11-City, Large; 12-City, Midsize; or 21-Suburb are eligible for a higher pre-discount budget than libraries with other locale codes.
"square_footage", -- Total interior area of the library measured in square footage.
"nces_public_district_code", -- Code assigned by NCES for the school district.
"entity_name", -- Name of entity.
"parent_entity_number", -- The number of the parent entity.
"fcc_form_498_form_number", -- Unique number assigned to the form by USAC for an Applicant FCC Form 498.
"fcc_form_498_status_date_time", -- Date and time the last status was updated for the FCC Form 498 for the applicant.
"physical_address", -- Street address where the entity is located.
"physical_address_2", -- Additional address information if applicable.
"physical_city", -- City where the entity is located.
"mailing_county", -- County of the entity’s mailing address.
"mailing_zipcode", -- ZIP code of the entity’s mailing address.
"general_contact_name", -- General contact of the entity.
"general_contact_email", -- Email address for the general contact.
"latitude", -- System-generated latitude coordinate of the entity based on the physical address.
"private_library_system", -- Indicates if the library system consists exclusively of private libraries.
"fscs_seq", -- Library FSCS sequence code is a unique 3-digit suffix that distinguishes outlets associated with a library system.
"state_lea_code", -- LEA code for the school district.
"cep_percentage", -- Indicates the approved percentage of a school in the Community Eligibility Program (CEP).
"entity_last_modified_by", -- Indicates the e-mail address of the person who last modified the entity’s information in EPC.
"last_updated_date", -- Indicates the date and time the profile of the entity was last modified in EPC.
"public_library_system", -- Indicates if the library system consists exclusively of public libraries.
"state_education_agency", -- Indicates if the consortium is a state education agency.
"charter_school", -- A charter school.
"detention_center", -- A facility that is treated in the same way as a juvenile justice facility.
"general_use_school", -- A school that offers instruction to students drawn from other schools, and student counts can change throughout the year.
"alternative_discount_method", -- Indicates the school’s alternative discount method such as sibling match, survey, or combination (not CEP).
"bookmobile", -- Indicates if the library is a mobile library.
"private_school", -- A private school.
"entity_type" -- The type of the entity including school district, school, library system, library, consortium, and non-instructional facility (NIF).
FROM
"datahub-usac/erate-supplemental-entity-information-7i5i-83qf:latest"."erate_supplemental_entity_information"
LIMIT 100;
Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query datahub-usac/erate-supplemental-entity-information-7i5i-83qf
with SQL in under 60 seconds.
Query Your Local Engine
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
Read the installation docs.
Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.
It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr
, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgr
can manage the image, container and volume for you.
There are a few ways to ingest data into the local engine.
For external repositories, the Splitgraph Engine can "mount" upstream data sources by using sgr mount
. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.
For hosted datasets (like this repository), where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr clone
and sgr checkout
.
Cloning Data
Because datahub-usac/erate-supplemental-entity-information-7i5i-83qf:latest
is a Splitgraph Image, you can clone the data from Spltgraph Cloud to your local engine, where you can query it like any other Postgres database, using any of your existing tools.
First, install Splitgraph if you haven't already.
Clone the metadata with sgr clone
This will be quick, and does not download the actual data.
sgr clone datahub-usac/erate-supplemental-entity-information-7i5i-83qf
Checkout the data
Once you've cloned the data, you need to "checkout" the tag that you want. For example, to checkout the latest
tag:
sgr checkout datahub-usac/erate-supplemental-entity-information-7i5i-83qf:latest
This will download all the objects for the latest
tag of datahub-usac/erate-supplemental-entity-information-7i5i-83qf
and load them into the Splitgraph Engine. Depending on your connection speed and the size of the data, you will need to wait for the checkout to complete. Once it's complete, you will be able to query the data like you would any other Postgres database.
Alternatively, use "layered checkout" to avoid downloading all the data
The data in datahub-usac/erate-supplemental-entity-information-7i5i-83qf:latest
is 0 bytes. If this is too big to download all at once, or perhaps you only need to query a subset of it, you can use a layered checkout.:
sgr checkout --layered datahub-usac/erate-supplemental-entity-information-7i5i-83qf:latest
This will not download all the data, but it will create a schema comprised of foreign tables, that you can query as you would any other data. Splitgraph will lazily download the required objects as you query the data. In some cases, this might be faster or more efficient than a regular checkout.
Read the layered querying documentation to learn about when and why you might want to use layered queries.
Query the data with your existing tools
Once you've loaded the data into your local Splitgraph Engine, you can query it with any of your existing tools. As far as they're concerned, datahub-usac/erate-supplemental-entity-information-7i5i-83qf
is just another Postgres schema.