Using GitHub Actions for Metadata Ingestion on DataHub

Original Slack Thread

<@U06TM7M40RK> Can ingestion be done with GitHub Actions?

Yes, you can run ingestion from GitHub Actions. This suits push-based integration: the workflow re-ingests metadata whenever the repository changes, so DataHub stays in sync with it. The steps below show how to set it up:

Step-by-Step Guide

  1. Create a GitHub Action Workflow File:
    Create a workflow file in the .github/workflows directory of your repository. Below is a sample GitHub Actions workflow file for LookML metadata ingestion:

    name: lookml metadata upload
    on:
      push:
        branches:
          - main
      release:
        types: [published, edited]
      workflow_dispatch:
    
    jobs:
      lookml-metadata-upload:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3
          - uses: actions/setup-python@v4
            with:
              python-version: "3.10"
          - name: Run LookML ingestion
            run: |
              pip install 'acryl-datahub[lookml,datahub-rest]'
              cat << EOF > lookml_ingestion.yml
              source:
                type: "lookml"
                config:
                  base_folder: ${{ github.workspace }}
                  parse_table_names_from_sql: true
                  github_info:
                    repo: ${{ github.repository }}
                    branch: ${{ github.ref }}
                  api:
                    client_id: ${LOOKER_CLIENT_ID}
                    client_secret: ${LOOKER_CLIENT_SECRET}
                    base_url: ${LOOKER_BASE_URL}
              sink:
                type: datahub-rest
                config:
                  server: ${DATAHUB_GMS_URL}
                  token: ${DATAHUB_GMS_TOKEN}
              EOF
              datahub ingest -c lookml_ingestion.yml
            env:
              DATAHUB_GMS_URL: ${{ secrets.DATAHUB_GMS_URL }}
              DATAHUB_GMS_TOKEN: ${{ secrets.DATAHUB_GMS_TOKEN }}
              LOOKER_BASE_URL: ${{ secrets.LOOKER_BASE_URL }}
              LOOKER_CLIENT_ID: ${{ secrets.LOOKER_CLIENT_ID }}
              LOOKER_CLIENT_SECRET: ${{ secrets.LOOKER_CLIENT_SECRET }}
    
  2. Set Up Secrets in GitHub:
    The workflow reads the following secrets, so add them to your GitHub repository before running it:

    • DATAHUB_GMS_URL: The endpoint where your DataHub host is running.
    • DATAHUB_GMS_TOKEN: An authentication token provisioned for DataHub ingestion.
    • LOOKER_BASE_URL: The base URL where your Looker assets are hosted.
    • LOOKER_CLIENT_ID: A provisioned Looker Client ID.
    • LOOKER_CLIENT_SECRET: A provisioned Looker Client Secret.
  3. Run the Workflow:
    The workflow runs automatically on pushes to the main branch and when a release is published or edited, and it can also be triggered manually (workflow_dispatch). It installs the necessary dependencies, writes the ingestion recipe to lookml_ingestion.yml, and runs the ingestion with the datahub ingest command.
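
If one of the secrets from step 2 is missing, GitHub passes an empty string to the job and the ingestion only fails later with a less obvious error. A minimal sketch of a preflight check that could run before datahub ingest is shown below. The function name check_required_env is hypothetical and not part of DataHub or GitHub Actions; the variable names it would be called with are the ones from the workflow's env block.

```shell
# Hypothetical helper: report every named environment variable that is unset
# or empty, and return a non-zero status if any is missing.
check_required_env() {
  # Plain POSIX sh has no local variables, so these names are prefixed to
  # make clashes with the caller's variables unlikely.
  _cre_missing=0
  for _cre_name in "$@"; do
    # Look up the variable whose name is stored in _cre_name.
    # Only call this with trusted, hard-coded names, since it uses eval.
    eval "_cre_value=\${$_cre_name:-}"
    if [ -z "$_cre_value" ]; then
      echo "Missing required environment variable: $_cre_name" >&2
      _cre_missing=1
    fi
  done
  return "$_cre_missing"
}
```

Placed at the top of the run step, a call such as check_required_env DATAHUB_GMS_URL DATAHUB_GMS_TOKEN LOOKER_BASE_URL LOOKER_CLIENT_ID LOOKER_CLIENT_SECRET || exit 1 would stop the job with one message per missing secret instead of letting the ingestion start with an incomplete recipe.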




Would you like more details on any specific step?
