Version: 0.12.1

Configuring Your BigQuery Connector to DataHub

Now that you have created a Service Account and Service Account Key in BigQuery in the prior step, it's time to set up a connection via the DataHub UI.

Configure Secrets

  1. Within DataHub, navigate to the Ingestion tab in the top-right corner of your screen

Navigate to the "Ingestion Tab"

note

If you do not see the Ingestion tab, ask your DataHub admin to grant you the required permissions.

  2. Navigate to the Secrets tab and click Create new secret

Secrets Tab

  3. Create a Private Key secret

This will securely store your BigQuery Service Account Private Key within DataHub

  • Enter a name like BIGQUERY_PRIVATE_KEY - we will use this later to refer to the secret
  • Copy and paste the private_key value from your Service Account Key
  • Optionally add a description
  • Click Create

Private Key Secret

  4. Create a Private Key ID secret

This will securely store your BigQuery Service Account Private Key ID within DataHub

  • Click Create new secret again
  • Enter a name like BIGQUERY_PRIVATE_KEY_ID - we will use this later to refer to the secret
  • Copy and paste the private_key_id value from your Service Account Key
  • Optionally add a description
  • Click Create

Private Key Id Secret
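
The two values stored as secrets above come from the JSON key file downloaded for the Service Account. As a minimal sketch, assuming the key file uses the usual Google service account key layout with `private_key` and `private_key_id` fields, the Python snippet below pulls out the values to paste. The function name `extract_secret_values` and the sample key contents are illustrative only and are not part of DataHub.

```python
import json


def extract_secret_values(key_json: str) -> dict:
    """Return the two key-file fields that are stored as DataHub secrets.

    Raises KeyError if the key file is missing either field.
    """
    key = json.loads(key_json)
    return {
        # Secret names suggested in the steps above.
        "BIGQUERY_PRIVATE_KEY": key["private_key"],
        "BIGQUERY_PRIVATE_KEY_ID": key["private_key_id"],
    }


# Placeholder key contents for illustration; a real key file holds real values.
SAMPLE_KEY_JSON = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "private_key_id": "0123456789abcdef",
    "private_key": "-----BEGIN PRIVATE KEY-----\nplaceholder\n-----END PRIVATE KEY-----\n",
    "client_email": "datahub@my-project.iam.gserviceaccount.com",
    "client_id": "123456789",
})

if __name__ == "__main__":
    for name, value in extract_secret_values(SAMPLE_KEY_JSON).items():
        # Print only a short prefix so the key is not echoed in full.
        print(f"{name}: {value[:12]}...")
```

In practice you would read the key file from disk (for example with `open(path).read()`) instead of using the sample string, and avoid printing the full private key to a shared terminal or log.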

Configure Recipe

  1. Navigate to the Sources tab and click Create new source

Click "Create new source"

  2. Select BigQuery

Select BigQuery from the options

  3. Fill out the BigQuery Recipe

You can find the following details in your Service Account Key file:

  • Project ID
  • Client Email
  • Client ID

Populate the Secret Fields by selecting the Private Key and Private Key ID secrets you created earlier under Configure Secrets.

Fill out the BigQuery Recipe

  4. Click Test Connection

This step will ensure you have configured your credentials accurately and confirm you have the required permissions to extract all relevant metadata.

Test BigQuery connection

After you have successfully tested your connection, click Next.
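
For reference, the form filled out in this section corresponds to an ingestion recipe. As a rough sketch only, the Python snippet below assembles an equivalent recipe as a dictionary. It assumes the BigQuery source accepts a `credential` block with the field names shown and that secrets are referenced with `${SECRET_NAME}` syntax; compare it against the recipe your DataHub version generates before relying on it. The helper `build_bigquery_recipe` and the sample values are hypothetical.

```python
import json


def build_bigquery_recipe(project_id, client_email, client_id,
                          private_key_secret="BIGQUERY_PRIVATE_KEY",
                          private_key_id_secret="BIGQUERY_PRIVATE_KEY_ID"):
    """Build a recipe-shaped dict from the Service Account Key details.

    The private key fields hold secret references, not the raw values,
    so the key material never appears in the recipe itself.
    """
    return {
        "source": {
            "type": "bigquery",
            "config": {
                # Assumed field names; verify against your DataHub version.
                "credential": {
                    "project_id": project_id,
                    "client_email": client_email,
                    "client_id": client_id,
                    "private_key": "${" + private_key_secret + "}",
                    "private_key_id": "${" + private_key_id_secret + "}",
                },
            },
        },
    }


if __name__ == "__main__":
    recipe = build_bigquery_recipe(
        project_id="my-project",  # placeholder values for illustration
        client_email="datahub@my-project.iam.gserviceaccount.com",
        client_id="123456789",
    )
    print(json.dumps(recipe, indent=2))
```

Referencing the secrets by name keeps the recipe safe to share or store, since the actual key values stay in DataHub's secret store.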

Schedule Execution

Now it's time to schedule a recurring ingestion pipeline to regularly extract metadata from your BigQuery instance.

  1. Decide how often this ingestion should run (every minute, hour, day, month, or year) and select the interval from the dropdown

    schedule selector

  2. Ensure you've selected the correct timezone

    timezone_selector

  3. Click Next when you are done
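
Recurring schedules like these are commonly written as cron expressions. As an illustration only, assuming the standard five-field cron format (minute, hour, day of month, month, day of week), the snippet below maps each interval named above to an expression that fires at the start of that interval. The `CRON_BY_INTERVAL` table and the `cron_for` helper are hypothetical and not part of DataHub.

```python
# Five-field cron expressions: minute hour day-of-month month day-of-week.
CRON_BY_INTERVAL = {
    "minute": "* * * * *",  # every minute
    "hour": "0 * * * *",    # at minute 0 of every hour
    "day": "0 0 * * *",     # at 00:00 every day
    "month": "0 0 1 * *",   # at 00:00 on the 1st of every month
    "year": "0 0 1 1 *",    # at 00:00 on January 1st
}


def cron_for(interval: str) -> str:
    """Return the cron expression for a named interval (case-insensitive)."""
    try:
        return CRON_BY_INTERVAL[interval.lower()]
    except KeyError:
        raise ValueError(f"unknown interval: {interval!r}") from None


if __name__ == "__main__":
    for name in CRON_BY_INTERVAL:
        print(f"{name:>6}: {cron_for(name)}")
```

The times in a cron expression are interpreted in the timezone you select, which is why the timezone step matters: `0 0 * * *` means midnight in that zone, not necessarily midnight UTC.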

Finish Up

  1. Name your ingestion source, then click Save and Run

    Name your ingestion

You will now find your new ingestion source running

ingestion_running

Validate Ingestion Runs

  1. View the latest status of ingestion runs on the Ingestion page

ingestion succeeded

  2. Click the plus sign to expand the full list of historical runs and outcomes; click Details to see the outcomes of a specific run

ingestion_details

  3. From the Ingestion Run Details page, pick View All to see which entities were ingested

ingestion_details_view_all

  4. Pick an entity from the list to manually validate that it contains the details you expected

ingestion_details_view_all

Congratulations! You've successfully set up BigQuery as an ingestion source for DataHub!

Need more help? Join the conversation in Slack!