Splitgraph is a data API to power your analytics, data visualizations and other read-intensive applications.
repositories:
- namespace: CHANGEME
  repository: airbyte-bigquery
  # Catalog-specific metadata for the repository. Optional.
  metadata:
    readme:
      text: Readme
    description: Description of the repository
    topics:
    - sample_topic
  # Data source settings for the repository. Optional.
  external:
    # Name of the credential that the plugin uses. This can also be a credential_id if the
    # credential is already registered on Splitgraph.
    credential: airbyte-bigquery
    plugin: airbyte-bigquery
    # Plugin-specific parameters matching the plugin's parameters schema
    params:
      project_id: '' # REQUIRED. Project ID. The GCP project ID for the project containing the target BigQuery dataset.
      normalization_mode: basic # Post-ingestion normalization. Whether to normalize raw Airbyte tables. `none` is no normalization, `basic` is Airbyte's basic normalization, `custom` is a custom dbt transformation on the data. One of none, basic, custom
      normalization_git_branch: master # dbt model Git branch. Branch or commit hash to use for the normalization dbt project.
      dataset_id: '' # Default Dataset ID. The dataset ID to search for tables and views. If you are only loading data from one dataset, setting this option could result in much faster schema discovery.
    tables:
      sample_table:
        # Plugin-specific table parameters matching the plugin's schema
        options:
          airbyte_cursor_field: [] # Cursor field(s). Fields in this stream to be used as a cursor for incremental replication (overrides Airbyte configuration's cursor_field)
          airbyte_primary_key_field: [] # Primary key field(s). Fields in this stream to be used as a primary key for deduplication (overrides Airbyte configuration's primary_key)
        # Schema of the table, a list of objects with `name` and `type`. If set to `[]`, will infer.
        schema: []
    # Whether live querying is enabled for the plugin (creates a "live" tag in the
    # repository proxying to the data source). The plugin must support live querying.
    is_live: false
    # Ingestion schedule settings. Disable this if you're using GitHub Actions or other methods
    # to trigger ingestion.
    schedule:
credentials:
  airbyte-bigquery: # This is the name of this credential that "external" sections can reference.
    plugin: airbyte-bigquery
    # Credential-specific data matching the plugin's credential schema
    data:
      credentials_json: '' # REQUIRED. Credentials JSON. The contents of your Service Account Key JSON file. See the docs (https://docs.airbyte.io/integrations/sources/bigquery#setup-the-bigquery-source-in-airbyte) for more information on how to obtain this key.
      normalization_git_url: '' # dbt model Git URL. For `custom` normalization, a URL to the Git repo with the dbt project, for example, `https://uname:pass_or_token@github.com/organisation/repository.git`.
Use our splitgraph.yml format to check your Splitgraph configuration into version control, trigger ingestion jobs and manage your data stack like your code.
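Because the file lives in version control, it can be checked before it is committed. The following is a minimal sketch of such a check, not part of Splitgraph's tooling: it assumes the file has already been parsed into a Python dictionary, and the names `SAMPLE_CONFIG` and `find_problems` are hypothetical. It only flags the fields that the sample above marks as REQUIRED, the `CHANGEME` placeholder, and the three documented values of `normalization_mode`.

```python
# Hypothetical in-memory copy of the sample splitgraph.yml above (trimmed),
# shaped the way a YAML parser would return it.
SAMPLE_CONFIG = {
    "credentials": {
        "airbyte-bigquery": {
            "plugin": "airbyte-bigquery",
            "data": {"credentials_json": "", "normalization_git_url": ""},
        }
    },
    "repositories": [
        {
            "namespace": "CHANGEME",
            "repository": "airbyte-bigquery",
            "external": {
                "credential": "airbyte-bigquery",
                "plugin": "airbyte-bigquery",
                "params": {"project_id": "", "normalization_mode": "basic"},
            },
        }
    ],
}

# Values listed in the sample's comment for normalization_mode.
ALLOWED_MODES = ("none", "basic", "custom")


def find_problems(config):
    """Return human-readable messages for fields that still need attention."""
    problems = []

    # Credentials: credentials_json is marked REQUIRED in the sample.
    for name, cred in (config.get("credentials") or {}).items():
        data = cred.get("data") or {}
        if not data.get("credentials_json"):
            problems.append(f"credentials.{name}.data.credentials_json is required")

    # Repositories: placeholder namespace, required project_id, valid mode.
    for repo in config.get("repositories") or []:
        label = f"{repo.get('namespace')}/{repo.get('repository')}"
        if repo.get("namespace") == "CHANGEME":
            problems.append(f"{label}: namespace is still the CHANGEME placeholder")

        external = repo.get("external")
        if not external:
            continue  # the external section is optional
        params = external.get("params") or {}
        if not params.get("project_id"):
            problems.append(f"{label}: external.params.project_id is required")
        mode = params.get("normalization_mode", "basic")
        if mode not in ALLOWED_MODES:
            problems.append(f"{label}: normalization_mode must be one of {ALLOWED_MODES}")

    return problems


if __name__ == "__main__":
    for message in find_problems(SAMPLE_CONFIG):
        print(message)
```

Run against the unedited sample, the check reports the empty `credentials_json`, the placeholder namespace and the empty `project_id`. In practice the dictionary would come from a YAML parser reading the real file.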
Splitgraph connects your vast, unrelated data sources and puts them in a single, accessible place.
Splitgraph handles data integration, storage, transformation and discoverability for you. All that remains is adding a BI client.
Focus on building data-driven applications without worrying about where the data will come from.
Splitgraph supports data ingestion from over 100 SaaS services, as well as data federation to over a dozen databases. These are all made queryable over a PostgreSQL-compatible interface.
Splitgraph stores data in a columnar format. This accelerates analytical queries and makes it perfect for dashboards, blogs and other read-intensive use cases.
Read more about Splitgraph’s support for Google BigQuery, including its documentation and sample queries you can run on Google BigQuery data with Splitgraph.
Splitgraph has a PostgreSQL-compatible endpoint that most BI clients can connect to.
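As a rough illustration of what a client needs to connect, the sketch below builds a standard PostgreSQL connection URI and a quoted table reference. Everything specific in it is an assumption rather than something stated on this page: the host `data.splitgraph.com`, port `5432`, database name `ddn`, the use of an API key and secret as username and password, and the `"namespace/repository".table` naming scheme should all be checked against the Splitgraph documentation. The helper names `build_dsn` and `table_ref` are hypothetical.

```python
from urllib.parse import quote


def build_dsn(api_key, api_secret,
              host="data.splitgraph.com",  # assumed endpoint host
              port=5432,                   # assumed port
              database="ddn"):             # assumed database name
    """Build a PostgreSQL connection URI, percent-encoding the credentials."""
    user = quote(api_key, safe="")
    password = quote(api_secret, safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


def table_ref(namespace, repository, table):
    """Quote a repository and table as PostgreSQL identifiers.

    Assumes the repository is addressed as a schema named
    "namespace/repository"; embedded double quotes are doubled.
    """
    schema = f"{namespace}/{repository}".replace('"', '""')
    name = table.replace('"', '""')
    return f'"{schema}"."{name}"'


if __name__ == "__main__":
    # Placeholder credentials; real values would come from the environment.
    print(build_dsn("my-api-key", "my-api-secret"))
    print(f"SELECT * FROM {table_ref('CHANGEME', 'airbyte-bigquery', 'sample_table')} LIMIT 10")
```

The resulting URI can be pasted into any BI client or PostgreSQL driver that accepts connection strings. Percent-encoding the credentials keeps characters such as `/` or `@` in a secret from breaking the URI.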