datahub-transportation-gov/2014-its-world-congress-connected-vehicle-test-bed-3trh-hz5x
Icon for Socrata external plugin

Query the Data Delivery Network

Query the DDN

The easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire procotol. Any Splitgraph user can connect to it at data.splitgraph.com:5432 and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.

For example, you can query the 2014_its_world_congress_connected_vehicle_test_bed table in this repository, by referencing it like:

"datahub-transportation-gov/2014-its-world-congress-connected-vehicle-test-bed-3trh-hz5x:latest"."2014_its_world_congress_connected_vehicle_test_bed"

or in a full query, like:

SELECT 
    ":id", -- Socrata column ID
    "spat_int_states_stateconfidence", -- This field describes the confidence of the current signal phase and its estimated time values.  Values are enumerated as follows: 0 – unknownEstimate, 1 – minTime, 2- maxTime, 3 - timeLikelyToChange
    "spat_int_states_yellstateconfidence", -- This field describes the confidence of the yellow signal phase and its estimated time values.  Values are enumerated as follows: 0 – unknownEstimate, 1 – minTime, 2- maxTime, 3 - timeLikelyToChange
    "spat_timestamp_minute", -- Minute when the SPaT message was generated in GMT.
    "spat_int_states_currstate", -- This attribute defines the current state of a particular known movement and depends on the type of lane that the currState applies to. 
    "spat_int_states_laneset", -- This attribute represents movement within lanes at an intersection.  Data is provided in hexadecimal.  Each movement/lane pair is provided as a double-octet (four character) hex string.  The first two characters represent the movement, while the last two characters represent the lane.  The movement is obtained by converting the hexadecimal value to an 8 bit binary string with bits 76543210.  A movement is present with a logical 1 for any given bit.  Bit 0 represents straight motion, bit 1 represents a left turn, bit 2 represents a right turn, and bit 3 represents a U-turn.  No other bits are used.  For example, a laneSet string of 0x0703 has the movement characters 07, which are written as 00000111 in binary.  Bits 0, 1, and 2 are in the logical 1 position, indicating straight, left, and right movement.  The string has lane characters 03 which directly indicate lane 3.  Therefore, a laneSet of 0x0703 indicates straight/left/right movement on lane 3.  There is no limit to the number of movement lane/pairs possible.  For example, a laneSet of 0x08040302 indicates a U-Turn on lane 4 (with movement string 00001000 in binary) and straight/left on lane 2 (with movement string 00000011 in binary). Values in decimal should be converted back to the full binary bit string in order to interpret the laneSet field.
    "mapdata_int_refpoint_long", -- This field contains the longitude of a reference point at the described intersection, expressed in 1/10th integer microdegrees.  The value 1800000001 shall be used when unavailable.
    "seqid", -- This field communicate the message content type.  All entries in this instance will be “data” indicating that the data portion of the transmitted message is presented. 
    "map_int_approach_drivinglanes_laneattributes", -- The laneAttributes data element relates the allowed (possible) movements from a motorized vehicle lane. Note that in practice these values may be further restricted by vehicle class, local regulatory environment and other changing conditions.  There are 17 enumerated values for the Lane Attributes data element.
    "bundlenumber", -- This field contains the bundle number of the ISD message
    "spat_timestamp_second", -- Second when the SPaT message was generated in GMT.
    "map_int_approach_name", -- This field contains an optional human readable name for the approach lanes to the intersection.
    "serviceregion_nwcorner_lat", -- This field contains the latitude of the northwest corner of the rectangle that the SDW presides over, expressed in 1/10th integer microdegrees. The value 900000001 shall be used when unavailable.
    "map_int_name", --  This field contains an optional human readable name for the intersection.
    "spat_timestamp_hour", -- Hour when the SpaT message was generated in GMT.
    "serviceregion_secorner_long", -- This field contains the longitude of the southeast corner of the rectangle that the SDW presides over, expressed in 1/10th integer microdegrees. The value 1800000001 shall be used when unavailable.
    "timetolive", -- This field contains the time remaining before the ISD message expires.  This field is enumerated as follows: 0 – 1 minute, 1 – 30 minutes, 2- one day, 3 – one week, 4 – one month, 5 – one year
    "serviceregion_secorner_lat", -- This field contains the latitude of the southeast corner of the rectangle that the SDW presides over, expressed in 1/10th integer microdegrees. The value 900000001 shall be used when unavailable
    "spat_int_states_timetochange", -- This attribute specifies the time remaining, in tenths of a second, before the signal phase will change to the next phase.  The maximum time remaining is 12000 (120.0 seconds = 2 minutes).  0 indicates no time remaining.  A special value of 12001 indicates indefinite time remaining (greater than 2 minutes) and a special value of 12002 indicates an unknown amount of time remaining.
    "map_msgcnt", -- This field contains a sequence number within a stream of messages with the same msgID and from the same sender. A sender may initialize this element to any value in the range 0-127 when sending the first message with a given msgID, or if the sender has changed identity since sending the most recent message with that msgID. Two further use cases exist when the sender has not changed identity: When the rest of the message content to be sent changes, the msgCnt shall be set equal to one greater than the value used in the most recent message sent with the same msgID. When the message content has not changed, the msgCnt is not changed. For this element the value after 127 is zero. The receipt of a non-sequential msgCnt value (from the same sending device and message type) implies that one or more messages from that sending device may have been lost, unless msgCnt has been reinitialized due to an identity change
    "map_int__approach_drivinglanes_lanenumber", -- This field contains a unique index value for a lane used to refer to that lane by other objects in the intersection map data structure. Lanes may be ingress (inbound traffic) or egress (outbound traffic) in nature, as well as barriers and other types of specialty lanes. All lanes are numbered. The laneNumber, in conjunction with the intersections_id, forms a regionally unique way to address a specific lane in that intersection.
    "spat_timestamp_year", -- Year when the SPaT message was generated in GMT.
    "map_int_refpoint_lat", -- This field contains the latitude of a reference point at the described intersection, expressed in 1/10th integer microdegrees.  The value 900000001 shall be used when unavailable.
    "serviceregion_nwcorner_long", -- This field contains the longitude of the northwest corner of the rectangle that the SDW presides over, expressed in 1/10th integer microdegrees. The value 1800000001 shall be used when unavailable.
    "map_crc", -- This attribute is a two byte data element calculated over the payload bytes of the message (starting with the initial sequence and ending with the last data element before the CRC itself and including all tag, length, and values bytes found in between). It is always placed as the very last data element in the message. The generating polynomial used is the "CRC-CCITT" commonly expressed as x16 + x12 + x5 + 1. An initial seed value of zero shall be used. Note that because the first byte of every DSRC message is never zero (it is 0x30), framing errors due to incorrectly clocking initial zero values cannot occur. Note that the MSB byte is always transmitted first, following the typical ASN bytes order. When a well formed DSRC message (including its last two bytes holding the CRC value) is decoded and input to the CRC process, the resulting CRC should always be the value zero. 
    "map_msgid", -- The msgID is a data element used in each message to define which type of message follows from the message set defined by this Standard. This data element is always the first value inside the message and is used to tell the receiving application how to interpret the remaining bytes (i.e. what message structure has been used). For the ISD dataset, this data element takes only one value: “mapData” 
    "map_int_approaches_approach_drivinglanes_", -- This field contains the width of a lane in in centimeters. The maximum value would be a lane of over 327 centimeters. 
    "map_int_id", -- This field contains an ID (number) that globally and uniquely defines an intersection within a country or region. Assignment rules for this value are established elsewhere and may use regional assignment schemas that vary.
    "map_int_approach_id", -- This field contains a unique index value for an approach or egress in an intersection for the convenience of human users. It is typically used along with an optional human readable string name for the object. 
    "spat_timestamp_day", -- Day when the SPaT message was generated in GMT.
    "map_layertype", -- This field specifies the type of information found in a layer of a geographic map fragment (such as an intersection). For the ISD dataset, this data element takes only one value: “intersectionData”
    "spat_int_id", -- The unique ID number for the SPaT message.
    "map_int_approach_drivinglanes_nodelist", -- This field contains the sequence of signed offset values for determining the Xs and Ys (and, possibly Width or Zs when present) using the current intersection position to build a path for the enclosing reference lane relating to a lane in the current intersection.  Please consult page 77 of the SAE J2735 Standard for more information.
    "dialogid", -- This field contains the ID of the type of data being transmitted.  In this instance all entries will be “intersectionSitDataDep” indicating that intersection situation data (ISD) is being transmitted / presented here.
    "spat_timestamp_month", -- Month when the SPaT message was generated in GMT.
    "requestid", -- This field contains the ID (number) of the ISD message.
    "spat_int_status" -- This attribute contains the Advance Traffic Controller status information on the intersection.  Data is encoded in hexadecimal.  Converting the hexadecimal to binary gives an 8 bit unsigned integer with bits 76543210.  Each bit has value 0 or 1 and indicates whether or not a condition is present at an intersection.  For the ISD dataset, this data element takes only one value: {0}  which indicates the intersection is operating normally.
FROM
    "datahub-transportation-gov/2014-its-world-congress-connected-vehicle-test-bed-3trh-hz5x:latest"."2014_its_world_congress_connected_vehicle_test_bed"
LIMIT 100;

Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query datahub-transportation-gov/2014-its-world-congress-connected-vehicle-test-bed-3trh-hz5x with SQL in under 60 seconds.

This repository is an "external" repository. That means it's hosted elsewhere, in this case at datahub.transportation.gov. When you querydatahub-transportation-gov/2014-its-world-congress-connected-vehicle-test-bed-3trh-hz5x:latest on the DDN, we "mount" the repository using the socrata mount handler. The mount handler proxies your SQL query to the upstream data source, translating it from SQL to the relevant language (in this case SoQL).

We also cache query responses on the DDN, but we run the DDN on multiple nodes so a CACHE_HIT is only guaranteed for subsequent queries that land on the same node.

Query Your Local Engine

Install Splitgraph Locally
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
 

Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.

It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgrcan manage the image, container and volume for you.

There are a few ways to ingest data into the local engine.

For external repositories (like this repository), the Splitgraph Engine can "mount" upstream data sources by using sgr mount. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.

For hosted datasets, where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr cloneand sgr checkout.

Mounting Data

This repository is an external repository. It's not hosted by Splitgraph. It is hosted by datahub.transportation.gov, and Splitgraph indexes it. This means it is not an actual Splitgraph image, so you cannot use sgr clone to get the data. Instead, you can use the socrata adapter with the sgr mount command. Then, if you want, you can import the data and turn it into a Splitgraph image that others can clone.

First, install Splitgraph if you haven't already.

Mount the table with sgr mount

sgr mount socrata \
  "datahub-transportation-gov/2014-its-world-congress-connected-vehicle-test-bed-3trh-hz5x" \
  --handler-options '{
    "domain": "datahub.transportation.gov",
    "tables": {
        "2014_its_world_congress_connected_vehicle_test_bed": "3trh-hz5x"
    }
}'

That's it! Now you can query the data in the mounted table like any other Postgres table.

Query the data with your existing tools

Once you've loaded the data into your local Splitgraph engine, you can query it with any of your existing tools. As far as they're concerned, datahub-transportation-gov/2014-its-world-congress-connected-vehicle-test-bed-3trh-hz5x is just another Postgres schema.

Related Documentation: