Documenting Dimensions and Metrics in DataHub UI using Tags

Original Slack Thread

<@U06TM7M40RK> how can I document which fields are dimensions and which are metrics in the DataHub UI?

I’m not able to find any data sources to answer your question. You might want to provide more details about the specific UI components or features you are using in DataHub. If this doesn’t help, please reach out to the DataHub support team for further assistance.

By the way, I found these links that might be relevant to what you’re looking for. Let me know if this helps:
- <https://datahubproject.io/docs/api/tutorials/forms/|Compliance Forms | DataHub>
- <https://help.doit.com/docs/datahub/import-data/upload-csv|CSV ingestion - DoiT Help Center>
- <https://forum.datahubproject.io/t/best-practices-for-documentation-in-datahub-editing-in-ui-vs-csv-ingestion/234|Best Practices for Documentation in Datahub: Editing in UI vs. CSV …>

<@U06TM7M40RK> I mean in certain tables there are columns which represent the data groups, and columns which represent the aggregated metrics for those groups. What is the best practice for documenting these? Should I use tags?

Tags are likely a suitable way to document which columns in your tables represent dimensions (data groups) and which represent metrics (aggregated data) in DataHub. According to the <https://datahubproject.io/docs/0.14.0/tags|DataHub documentation on tags>, tags are informal, loosely controlled labels that can be added to datasets, dataset schemas, or containers. They help categorize entities and aid search and discovery.

Tags can be applied at the dataset level or at the schema (column) level, and the schema level is the one that fits your use case. You could create tags such as “Dimension” and “Metric” and apply them to the relevant columns in your tables. That would let you identify and search for columns by their role in the dataset.

Additionally, the <https://datahubproject.io/docs/0.14.0/api/tutorials/tags|DataHub documentation> provides guidance on how to add and manage tags, including the necessary permissions and setup.
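As a rough illustration of tagging columns programmatically, here is a minimal Python sketch that only builds the request payloads and sends nothing. It assumes the `addTags` GraphQL mutation and `AddTagsInput` type as described in the tags API tutorial, so please verify both against the GraphQL schema of your DataHub version. The table `sales_summary`, the `hive` platform, the column names, and the helper functions are all hypothetical, and the “Dimension” and “Metric” tags are assumed to exist already.

```python
import json

# GraphQL document for attaching tags to one column.
# ASSUMPTION: the mutation name and input type (addTags / AddTagsInput)
# follow the DataHub tags tutorial; check them against your version's schema.
ADD_TAGS_MUTATION = """
mutation addTags($input: AddTagsInput!) {
  addTags(input: $input)
}
""".strip()


def tag_urn(name: str) -> str:
    """Build a tag URN such as urn:li:tag:Dimension."""
    return f"urn:li:tag:{name}"


def dataset_urn(platform: str, table: str, env: str = "PROD") -> str:
    """Build a dataset URN in DataHub's standard URN format."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{table},{env})"


def build_column_tag_request(dataset: str, column: str, tag: str) -> dict:
    """Return a GraphQL request body that tags a single column."""
    return {
        "query": ADD_TAGS_MUTATION,
        "variables": {
            "input": {
                "tagUrns": [tag_urn(tag)],
                "resourceUrn": dataset,
                # DATASET_FIELD targets a schema field rather than the dataset.
                "subResourceType": "DATASET_FIELD",
                "subResource": column,
            }
        },
    }


def build_requests(dataset: str, roles: dict) -> list:
    """Build one request body per column, in the order the roles are given."""
    return [build_column_tag_request(dataset, col, tag) for col, tag in roles.items()]


# Hypothetical table: two grouping columns and one aggregated column.
COLUMN_ROLES = {
    "region": "Dimension",
    "order_date": "Dimension",
    "total_revenue": "Metric",
}

if __name__ == "__main__":
    urn = dataset_urn("hive", "sales_summary")
    for request in build_requests(urn, COLUMN_ROLES):
        print(json.dumps(request["variables"]["input"]))
```

Each request body could then be sent as JSON in a POST to your DataHub GraphQL endpoint (typically `/api/graphql`) with a personal access token. For a handful of columns, adding the same tags by hand on the dataset's schema view in the UI achieves the same result.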

Would you like more detailed guidance on how to implement this using DataHub’s tagging feature?

ok, tags look like the way to go then