cdc-gov/cdc-text-corpora-for-learners-mmwr-eid-and-pcd-7rih-tqi5
 
Readme
Updated over 1 year ago
Indexed 11 months ago

CDC Text Corpora for Learners: MMWR, EID, and PCD Article Metadata

This landing page is part of the <a href="https://github.com/cmheilig/harvest-cdc-journals">CDC Text Corpora for Learners</a> program, which includes 33,576 compiled <a href="https://data.cdc.gov/National-Center-for-State-Tribal-Local-and-Territo/CDC-Text-Corpora-for-Learners-HTML-Mirrors-of-MMWR/ut5n-bmc3/about_data">HTML mirrors</a> of three publications: <a href="https://www.cdc.gov/mmwr/">Morbidity and Mortality Weekly Report</a> (MMWR), including its series <i>Weekly Reports</i>, <i>Recommendations and Reports</i>, <i>Surveillance Summaries</i>, <i>Supplements</i>, and <i>Notifiable Diseases</i> (a subset of <i>Weekly Reports</i>, constructed ad hoc); <a href="https://www.cdc.gov/eid/">Emerging Infectious Diseases</a> (EID); and <a href="https://www.cdc.gov/pcd/">Preventing Chronic Disease</a> (PCD).

The data presented here is the tabulated <a href="https://github.com/cmheilig/harvest-cdc-journals/blob/main/README.md#metadata-fields">metadata</a> for the combined 33,567 articles in the <a href="https://github.com/cmheilig/harvest-cdc-journals?tab=readme-ov-file#collections">MMWR, EID, and PCD collections</a>. The contents of each collection are organized into three ZIP-archived JSON files. The output formats of the JSON values include UTF-8 HTML, UTF-8 Markdown, and ASCII plain text.

The <a href="https://github.com/cmheilig/harvest-cdc-journals?tab=readme-ov-file#collections">JSON files</a> are located in the <a href="https://github.com/cmheilig/harvest-cdc-journals">program's repository</a>. This version was constructed on 2024-03-01 using source content retrieved on 2024-01-09.
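
As a rough illustration of working with ZIP-archived JSON files like those described above, the following Python sketch parses every JSON member of an archive. The helper name `load_json_members` and the member name `example.json` are hypothetical, and the internal layout of the real files is not described on this page, so the demonstration uses a small in-memory archive rather than the actual corpus.

```python
import io
import json
import zipfile


def load_json_members(archive) -> dict:
    """Parse every .json member of a ZIP archive, keyed by member name.

    `archive` may be a file path or a binary file-like object.
    """
    parsed = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".json"):
                with zf.open(name) as member:
                    # json.load accepts the binary stream (UTF-8 or ASCII).
                    parsed[name] = json.load(member)
    return parsed


if __name__ == "__main__":
    # Hypothetical stand-in archive; the real archive names and JSON
    # structure are assumptions that should be checked in the repository.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("example.json", json.dumps({"title": "placeholder"}))
    buffer.seek(0)
    print(load_json_members(buffer))
```

To use it on a downloaded archive, pass the local file path instead of the in-memory buffer.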

Querying over HTTP

Splitgraph serves as an HTTP API that lets you run SQL queries directly on this data to power Web applications. For example:

curl https://data.splitgraph.com/sql/query/ddn \
    -H "Content-Type: application/json" \
    -d@-<<EOF
{"sql": "
    SELECT *
    FROM \"cdc-gov/cdc-text-corpora-for-learners-mmwr-eid-and-pcd-7rih-tqi5\".\"cdc_text_corpora_for_learners_mmwr_eid_and_pcd\"
    LIMIT 100 
"}
EOF

See the Splitgraph documentation for more information.
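
For a Web application backend, the same request can be sketched in Python using only the standard library. The endpoint, repository, and table name are copied from the curl example above; the function names (`build_sql`, `build_request`, `run_query`) are illustrative, and the shape of the JSON response is an assumption, since it is not described on this page.

```python
import json
import urllib.request

# Values taken from the curl example above.
ENDPOINT = "https://data.splitgraph.com/sql/query/ddn"
REPOSITORY = "cdc-gov/cdc-text-corpora-for-learners-mmwr-eid-and-pcd-7rih-tqi5"
TABLE = "cdc_text_corpora_for_learners_mmwr_eid_and_pcd"


def build_sql(limit: int = 100) -> str:
    """Return the SELECT statement from the curl example with a custom LIMIT."""
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return f'SELECT * FROM "{REPOSITORY}"."{TABLE}" LIMIT {limit}'


def build_request(sql: str) -> urllib.request.Request:
    """Wrap a SQL string in the JSON POST body used by the curl example."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql: str, timeout: float = 30.0):
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(sql), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    result = run_query(build_sql(limit=5))
    # The response layout is an assumption, so print the start of it as-is.
    print(json.dumps(result, indent=2)[:2000])
```

Building the body with `json.dumps` avoids the manual quote escaping needed in the shell version.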

 