Splitgraph has been acquired by EDB! Read the blog post.
 
Previous Post
Mapping query results with Mapbox
Jul 11, 2022· By Peter Neumark, Artjoms Iškovs
TIME REQUIRED: 15 min

Exploring Google Search terms from BigQuery using Splitgraph

Query data from Google BigQuery using Splitgraph.

What you will build in this guide

time required: 15 minutes

Ever wonder how ideas travel from country to country? Do they move to neighboring countries, or "hop" between distant countries sharing the same language? Search Trends are a pretty good indicator of what people living in a certain area are interested in. By analysing the top-ranked search terms for a given country, we can probably find answers to these questions and more!

Google provides a public BigQuery dataset of popular search terms which can be registered as an external repository and queried with Splitgraph.

Read on to find out how! All you'll need is a Splitgraph account (sign up for free if you don't have one yet) and a GCP account (the generous free tier will be more than enough).

How it works

There are many ways to add data to Splitgraph:

In this guide, we'll use the last option.

Preparation in Google Cloud

Let's start by enabling access to the search trends dataset. Running the example query in BigQuery is an easy way to verify that the GCP project you are using has access.

To authenticate with BigQuery, Splitgraph requires a Service Account with the BigQuery Admin role. You can create one on the IAM and admin dashboard. Once the Service Account is ready, you need to generate a JSON-format private key on the Service Accounts page.

Splitgraph does not yet support reading datasets which belong to a different GCP project than the Service Account used for authentication. The public datasets belong to the bigquery-public-data project, but you can work around this limitation by creating a BigQuery dataset within your own project called search_trends and adding a latest_international_top_terms view defined by the following query:

SELECT
  country_name,
  week,
  term,
  MIN(rank) as rank
FROM
    `bigquery-public-data.google_trends.international_top_terms`
WHERE
    refresh_date =
        (SELECT
            MAX(refresh_date)
        FROM
        `bigquery-public-data.google_trends.international_top_terms`)
GROUP BY
    country_name, week, term

Registering the external repository

The next step is to connect the external repository in Splitgraph.

After clicking on the link above, you should be greeted by the following form:

Form to connect a BigQuery dataset to Splitgraph

Fill out the form with the following:

fieldvalue
Credential namebq
GCP credentials[the contents of the downloaded JSON service account key]
GCP project name[the name of your GCP project]
Big Query datasetsearch_trends

Once the form is filled out, click Continue and wait a few seconds for Splitgraph to read the BigQuery view. A few preview rows are displayed. I named the repository google-search-trends, but you can choose any name, just remember to update the qualified table names in the queries throughout the remainder of this guide. Click on Create repository and load data immediately.

Within a few seconds, the tables should be ready for querying in Splitgraph!

Note that Splitgraph also supports live querying BigQuery instead of importing data during repository creation. To connect the repository this way, click on the chevron to the right of the Create repository and load data immediately button, and from the resulting dropdown menu, select Create repository without loading data.

If you receive a 404 "not found" error, simply reloading the page will solve the problem.

On to querying!

The introduction contains the question,

Do [ideas] move to neighboring countries, or "hop" between distant countries sharing the same language?

If we consider search terms as a proxies for ideas, then an idea "moving to another country" means it's rank diminishes in the source and grows in the destination country. With ranking, the lower the number the more popular the search term, so the rank field would actually increase in value in the source and decrease the destination during the "move".

Splitgraph co-founder Artjoms Iškovs came up with the following query:

Loading...

The migration_strength fields is the sum of ranking change in the source and destination country. A high value indicates that interest in the search term significantly plummeted at the source the same week it skyrocketed in the target country. The query above returns pairs of countries where this occurred most frequently.

The top 10 results:

sourcedestinationneighboringsame language
EgyptSaudi Arabiayesyes
SwitzerlandAustriayesyes
South AfricaAustralianoyes1
MalaysiaPhilippinesyesyes1
United KingdomNigerianoyes
AustriaGermanyyesyes
FranceBelgiumyesyes
MexicoColombianoyes
DenmarkNorwayyesno
CanadaDenmarknono

So it seems that common language is more important than being neighbors.

Country hopping ideas

How can we be sure that the query above actually represents the migration of ideas (as recorded by search terms)? We can look at some of these, for example to see which search terms migrated from the UK to Nigeria we could run:

Loading...

The result is a single term: Joe Aribo, a football player who relocated from the UK to Nigeria.

Unfortunately, not all of the search terms that migrate between countries were obvious. Why was Howard Carpendale becoming less popular in Austria and more so in Germany this May? I have absolutely no idea!

Are there terms going in the opposite direction, from Germany to Austria? Yes, the same 4 terms, but in a different order. The search term Galatasaray (the name of a Turkish football club) was in decline in Germany and increasing in Austria 94 times in the last 5 years. It only moved in the opposite direction once.

These search terms are a gold mine of unexpected insights into what captures nations' attention at specific times. Querying this data in Splitgraph was technically easy, but the underlying reasons for these trends are challenging to understand!

Get back to us!

Have you figured out something surprising looking at the search terms repository? Let us know!


  1. Countries with multiple languages that share at least one also count.
Querying Stripe data with Splitgraph