BigQuery Public Data

Created by Ben Deverman, Modified on Wed, 6 Mar at 1:21 AM by Alsabana Sahubarali

What is BigQuery Public data?

With your PAD account, you can access over 200 publicly available datasets from Google. These datasets are hosted by BigQuery and allow maximum flexibility to access and integrate them into your applications or projects. Google pays for the storage of these datasets; you only pay for the queries you perform on the data. The public datasets contain helpful information, such as:

  • U.S. Census geography data
  • International Census data
  • FEC campaign finance data
  • GDP and Income data

 

How do I find the data?

To find this data, search for the word ‘public’  in the search bar on your PAD project page. This will pull up a list of the datasets under the `bigquery-public-data` project:

image1.png

To ensure you don't have to keep searching for this data, we recommend “starring” the project so it sits alongside your other PAD datasets.

 

Sample queries

Below are some sample queries you can try using the public data in BigQuery!

-- Which committees supported the greatest number of candidates in 2020?

SELECT
cmte_nm as Committee_Name,
COUNT(DISTINCT(linkage_id)) as Number_of_Candidates,
STRING_AGG(DISTINCT cand_name) as Candidates_Supported
FROM
(SELECT
linkage_id,
cand_id,
cmte_id
FROM
`bigquery-public-data.fec.ccl20`) link
INNER JOIN
`bigquery-public-data.fec.cn20` cand ON cand.cand_id=link.cand_id
INNER JOIN
`bigquery-public-data.fec.cm20` cmte ON cmte.cmte_id=link.cmte_id
GROUP BY cmte.cmte_nm
ORDER BY 2 desc
LIMIT 10;
-- Which committees raised the most money through small dollar donations in 2020?

SELECT
cmte_nm AS Committee_Name,
SUM(transaction_amt)/1000000 AS Total_Small_Dollar_Donations_in_Millions
FROM
(SELECT
cmte_id,
transaction_amt
FROM
`bigquery-public-data.fec.indiv20`
WHERE
transaction_amt>0
AND transaction_amt<200) indiv
INNER JOIN
`bigquery-public-data.fec.cm20` cmte
ON
cmte.cmte_id=indiv.cmte_id
GROUP BY
cmte_nm
ORDER BY
2 desc
LIMIT
20;
-- What are the ten most populous zip codes according the 2010 census?

SELECT
zipcode,
population
FROM
`bigquery-public-data.census_bureau_usa.population_by_zip_2010`
ORDER BY
population DESC
LIMIT
10;

Helpful links and articles

To learn more about BigQuery public datasets, you can find more details in Google’s help documentation here.

To see the entire list of the datasets, including a description of each one, you can find them here

Was this article helpful?

That’s Great!

Thank you for your feedback

Sorry! We couldn't be helpful

Thank you for your feedback

Let us know how can we improve this article!

Select at least one of the reasons
CAPTCHA verification is required.

Feedback sent

We appreciate your effort and will try to fix the article