Discussion about Vulnerabilities in the DataHub Actions Module

Original Slack Thread

Good Morning.

We have installed DataHub in our environment. When we scanned the DataHub code, we saw a lot of vulnerabilities. We also looked at the detailed summary and found that the Actions module has most of them.

Are these vulnerabilities present in other versions as well?

Currently, we deployed the actions version - 0.0.14
Datahub version - 0.12.0
Thanks

<@U0657KMJJKC>

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Which DataHub version are you using? (e.g. 0.12.0)

  2. How are you deploying DataHub? (e.g. Helm, Quickstart, etc)

Datahub version - 0.12.0
We installed these components using Helm.

What scanning tools are you using, and which vulnerabilities are you seeing? We are aware of some vulnerabilities from transitive dependencies on the images, but want to make sure we’re aware of the ones you’re seeing as well so we can get them prioritized.

<@U0657KMJJKC>, Can you respond to this question?

Hello <@UV5UEC3LN>, vulnerability scanning is run on the image (datahub-actions:v0.0.14) stored in Azure Container Registry, using an Azure DevOps pipeline.
Here is the list of vulnerabilities.

Severity Name
Medium Go (Go) Security Update for golang.org/x/net (GHSA-2wrh-6pvc-2jm9)
Medium Go (Go) Security Update for golang.org/x/net (GHSA-4374-p667-p6c8)
Medium Go (Go) Security Update for golang.org/x/net (GHSA-qppj-fm5r-hxr3)

High Java (Maven) Security Update for com.google.protobuf:protobuf-java (GHSA-4gg5-vx3j-xwc7)
High Java (Maven) Security Update for com.google.protobuf:protobuf-java (GHSA-g5ww-5jh7-63cx)
High Java (Maven) Security Update for com.google.protobuf:protobuf-kotlin (GHSA-wrvw-hg22-4m67)
High Java (Maven) Security Update for com.google.protobuf:protobuf-parent (GHSA-77rm-9x9h-xj3g)
High Java (Maven) Security Update for commons-net:commons-net (GHSA-cgp8-4m63-fhh5)
High Java (Maven) Security Update for net.minidev:json-smart (GHSA-fg2v-w576-w4v3)
High Java (Maven) Security Update for org.apache.mesos:mesos (GHSA-95q3-pppp-r683)
High Java (Maven) Security Update for org.eclipse.jetty.http2:http2-hpack (GHSA-wgh7-54f2-x98r)
High Java (Maven) Security Update for org.xerial.snappy:snappy-java (GHSA-55g7-9cwv-5qfv)
Medium Java (Maven) Security Update for com.fasterxml.woodstox:woodstox-core (GHSA-3f7h-mf4q-vrm4)
Medium Java (Maven) Security Update for com.google.guava:guava (GHSA-mvr2-9pj6-7w5j)
Medium Java (Maven) Security Update for org.apache.avro:avro (GHSA-rhrv-645h-fjfh)
Medium Java (Maven) Security Update for org.apache.commons:commons-compress (GHSA-cgwf-w82q-5jrr)
Medium Java (Maven) Security Update for org.apache.zookeeper:zookeeper (GHSA-7286-pgfv-vxvh)
Medium Java (Maven) Security Update for org.eclipse.jetty:jetty-http (GHSA-cj7v-27pg-wf7q)
Medium Java (Maven) Security Update for org.eclipse.jetty:jetty-http (GHSA-hmr7-m48g-48f6)
Medium Java (Maven) Security Update for org.eclipse.jetty:jetty-xml (GHSA-58qw-p7qm-5rvh)
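
For anyone triaging a report like the one above, a small script can tally findings by severity before deciding what to prioritize. The following is a minimal sketch, assuming each finding follows the "Severity Language (Ecosystem) Security Update for package (GHSA-id)" shape seen in this thread. The names `FINDING_RE`, `parse_findings`, `count_by_severity`, and `SAMPLE` are hypothetical and not part of any scanner's output format or API.

```python
import re
from collections import Counter

# Assumed line shape, taken from the report pasted in this thread:
# "High Java (Maven) Security Update for group:artifact (GHSA-xxxx-xxxx-xxxx)"
FINDING_RE = re.compile(
    r"^(?P<severity>\w+)\s+(?P<language>\w+)\s+\((?P<ecosystem>[^)]+)\)\s+"
    r"Security Update for\s+(?P<package>\S+)\s+\((?P<advisory>GHSA-[0-9a-z-]+)\)$",
    re.IGNORECASE,
)


def parse_findings(text):
    """Parse report lines into dicts; lines that do not match are skipped."""
    findings = []
    for line in text.splitlines():
        match = FINDING_RE.match(line.strip())
        if not match:
            continue
        finding = match.groupdict()
        # Normalize casing so "maven" and "Maven" are counted together.
        finding["ecosystem"] = finding["ecosystem"].lower()
        # Slack may render links as "url|label"; keep only the label part.
        finding["package"] = finding["package"].split("|")[-1]
        findings.append(finding)
    return findings


def count_by_severity(findings):
    """Count findings per severity label."""
    return Counter(finding["severity"] for finding in findings)


# A three-line excerpt of the report, used only to demonstrate the parser.
SAMPLE = """\
Medium GO (Go) Security Update for golang.org/x/net (GHSA-2wrh-6pvc-2jm9)
High Java (Maven) Security Update for com.google.protobuf:protobuf-java (GHSA-4gg5-vx3j-xwc7)
High Java (maven) Security Update for net.minidev:json-smart (GHSA-fg2v-w576-w4v3)
"""

if __name__ == "__main__":
    for severity, count in count_by_severity(parse_findings(SAMPLE)).items():
        print(severity, count)
```

The same parsed records could be grouped by package instead, which makes it easier to see that several advisories may be closed by a single dependency upgrade.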

<@UV5UEC3LN>, do you have any update on these vulnerabilities?

CC: <@U0657KMJJKC>

We have them tracked, but some are blocked on upstream dependencies and mediums in general are not something we immediately prioritize over other work. We’ll happily accept contributions though :slightly_smiling_face:

Hi <@UV5UEC3LN>,

We found the code for the Actions component (https://github.com/acryldata/datahub-actions/blob/v0.0.14/docker/datahub-actions/Dockerfile), but it contains an image reference to datahub-ingestion-base.

Do you have the code for datahub-ingestion-base?

CC: <@U0657KMJJKC>, <@U05RNCBRQ5A>

The ingestion base image is in the main repo

<@UV5UEC3LN>, do you have the code for the ingestion base? We have the image, but we don’t have the code. We would like to reproduce the same vulnerabilities from the code as well.

https://github.com/datahub-project/datahub/blob/master/docker/datahub-ingestion-base/Dockerfile

The ingestion base image is built from code within the main repo; the Dockerfile linked above is the build script that creates it. It compiles code within the metadata-ingestion submodule of datahub-project/datahub.
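
When tracing where an image's dependencies come from, as in the exchange above, it can help to list the base images a Dockerfile references. Below is a minimal sketch, assuming plain `FROM` instructions; the function name `base_images` and the sample Dockerfile text are hypothetical and do not reproduce the contents of the real datahub-actions or datahub-ingestion-base Dockerfiles.

```python
def base_images(dockerfile_text):
    """Return external image references from FROM instructions.

    References to earlier build stages (FROM <alias>) are skipped,
    and ARG variables such as ${BASE_IMAGE} are returned unexpanded.
    """
    images = []
    stage_aliases = set()
    for raw_line in dockerfile_text.splitlines():
        tokens = raw_line.strip().split()
        if not tokens or tokens[0].upper() != "FROM":
            continue
        # Drop flags such as --platform=linux/amd64.
        args = [token for token in tokens[1:] if not token.startswith("--")]
        if not args:
            continue
        image = args[0]
        # Record the stage alias from "FROM image AS alias", if present.
        if len(args) >= 3 and args[1].upper() == "AS":
            stage_aliases.add(args[2])
        if image not in stage_aliases:
            images.append(image)
    return images


# Hypothetical Dockerfile used only for illustration.
SAMPLE = """\
# illustrative content, not a real DataHub Dockerfile
FROM --platform=linux/amd64 example/ingestion-base:1.0 AS base
RUN echo hello
FROM base AS final
"""

if __name__ == "__main__":
    print(base_images(SAMPLE))
```

Each reference returned this way points at another image whose own Dockerfile would need to be located and scanned in turn.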