Exporting Statistics Results in DataHub

Original Slack Thread

<@U06TM7M40RK> can i export stats result

To export results from DataHub, you can use the Lineage Impact Analysis feature, which exports the full list of an entity's dependencies to a CSV file. This feature is available through the DataHub UI and the GraphQL endpoints. To export the results:

  1. Access the Lineage Tab: On a given Entity Page, select the Lineage tab.
  2. Toggle Dependencies: Switch between Upstream and Downstream dependencies.
  3. Filter Dependencies: Choose the Degree of Dependencies you are interested in and apply filters by Entity Type, Platform, Owner, and more to isolate the relevant dependencies.
  4. Export to CSV: Export the full list of dependencies to a CSV file.

This process allows you to view and export the filtered set of dependencies, including details about assigned ownership, domain, tags, terms, and quick links back to those entities within DataHub.
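For the GraphQL route mentioned above, the following is a minimal sketch rather than an official recipe. It assumes the `searchAcrossLineage` query accepts the input fields shown and returns `searchResults` with `degree` and `entity { urn type }`; please check these names against the GraphQL schema of your DataHub version. The helper names (`lineage_payload`, `lineage_rows`, `to_csv`) are hypothetical, and sending the payload to your instance's GraphQL endpoint is left to whichever HTTP client you prefer.

```python
import csv
import io

# Assumed query shape: verify the field names against your DataHub GraphQL schema.
LINEAGE_QUERY = """
query impact($urn: String!, $direction: LineageDirection!) {
  searchAcrossLineage(
    input: {urn: $urn, direction: $direction, query: "*", start: 0, count: 100}
  ) {
    total
    searchResults { degree entity { urn type } }
  }
}
"""


def lineage_payload(urn, direction="DOWNSTREAM"):
    """Build the JSON body to POST to the GraphQL endpoint (hypothetical helper)."""
    return {"query": LINEAGE_QUERY, "variables": {"urn": urn, "direction": direction}}


def lineage_rows(response):
    """Flatten a decoded GraphQL response into (urn, type, degree) tuples."""
    results = response["data"]["searchAcrossLineage"]["searchResults"]
    return [(r["entity"]["urn"], r["entity"]["type"], r["degree"]) for r in results]


def to_csv(header, rows):
    """Render a header and rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()
```

Once you have the decoded JSON response, `to_csv(["urn", "type", "degree"], lineage_rows(response))` gives CSV text you can write to a file. The sketch fetches only the first 100 results; larger lineage graphs would need paging with `start` and `count`.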

For more detailed information, you can refer to the <https://datahubproject.io/docs/0.14.0/act-on-metadata/impact-analysis|Lineage Impact Analysis documentation>.

Would you like more detailed steps or additional information on this process?

<@U06TM7M40RK> i need to download the results on the stats tab

To download the results from the “Stats” tab in DataHub, you need to ensure that the relevant usage statistics have been ingested and are available for the dataset. The “Stats” tab provides insights into dataset usage, such as the top users and queries referencing the dataset.

However, the available documentation does not describe a direct way to export the “Stats” tab results to a file such as a CSV.

If you are looking to programmatically access and potentially export this data, you might consider using the DataHub GraphQL API to query the relevant usage statistics. The GraphQL API provides objects for dataset profiling and usage statistics, which you can query to retrieve the data displayed in the “Stats” tab.
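As a rough illustration of that approach, here is a minimal sketch that queries a dataset's usage statistics and writes the top users to CSV. It assumes the `Dataset` object exposes `usageStats(range: ...)` with `aggregations { users { userEmail count } }`; treat these field names, along with the endpoint URL and token in the usage note, as assumptions to verify against your own instance. The helper names `post_graphql` and `top_users_csv` are hypothetical.

```python
import csv
import io
import json
import urllib.request

# Assumed query shape: verify the field names against your DataHub GraphQL schema.
USAGE_QUERY = """
query stats($urn: String!) {
  dataset(urn: $urn) {
    usageStats(range: MONTH) {
      aggregations {
        users { userEmail count }
      }
    }
  }
}
"""


def post_graphql(url, token, query, variables):
    """POST a GraphQL request and return the decoded JSON response."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


def top_users_csv(response):
    """Render the per-user usage counts from a decoded response as CSV text."""
    usage = response["data"]["dataset"]["usageStats"] or {}
    aggregations = usage.get("aggregations") or {}
    users = aggregations.get("users") or []
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["user_email", "query_count"])
    for entry in users:
        writer.writerow([entry.get("userEmail"), entry.get("count")])
    return buf.getvalue()
```

A call might look like `top_users_csv(post_graphql(url, token, USAGE_QUERY, {"urn": dataset_urn}))`, where `url` is your instance's GraphQL endpoint and `token` is a personal access token. If no usage data has been ingested for the dataset, the function returns only the header row.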

For more information on how to use the GraphQL API for this purpose, you can refer to the <https://datahubproject.io/docs/api/graphql/overview|GraphQL API overview> and the DataHub GraphQL objects reference for dataset profiling.

If you need more detailed guidance on using the GraphQL API to access and export stats data, please let me know!

Hi :slight_smile:! Did this interaction resolve your issue, or do you still require assistance?

Please tag @RunLLM if you still need further assistance!

Or tag @Datahub Community Support if you require a person to take a look :slight_smile:

Hope this helps!