Understanding Data Products, Domains, and Environments in Data Mesh Architecture

Original Slack Thread

Hi team,
What is the difference between a data product and a domain? Why should one create these? Also, what is the relevance of the environment in URNs? Can one filter entities in the UI by environment value?

Yes, you can filter entities (e.g. datasets) by environment, as you can see in the screenshot from the tests I'm currently running against S3.
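To make the role of the environment in the URN concrete: a DataHub dataset URN has the form `urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)`, so the environment (for example `PROD` or `DEV`) is the last component. Below is a minimal sketch in plain Python that builds such URNs and filters them by environment. The helper names (`build_dataset_urn`, `env_of`, `filter_by_env`) and the bucket path are made up for illustration; this is not the DataHub SDK and not how the UI filter is implemented.

```python
import re

# Layout of a dataset URN: urn:li:dataset:(<platform urn>,<name>,<environment>)
DATASET_URN = re.compile(
    r"^urn:li:dataset:\(urn:li:dataPlatform:(?P<platform>[^,]+),"
    r"(?P<name>.+),(?P<env>[A-Z]+)\)$"
)


def build_dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Hypothetical helper: assemble a dataset URN from its three parts."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def env_of(urn: str) -> str:
    """Return the environment component (the last field) of a dataset URN."""
    match = DATASET_URN.match(urn)
    if match is None:
        raise ValueError(f"not a dataset URN: {urn!r}")
    return match.group("env")


def filter_by_env(urns: list[str], env: str) -> list[str]:
    """Keep only the URNs whose environment equals `env`."""
    return [urn for urn in urns if env_of(urn) == env]


# The same S3 path registered in two environments (illustrative names only).
urns = [
    build_dataset_urn("s3", "my-bucket/events", "PROD"),
    build_dataset_urn("s3", "my-bucket/events", "DEV"),
]
prod_only = filter_by_env(urns, "PROD")
```

Because the environment is part of the URN, the same physical path in `PROD` and `DEV` yields two distinct entities, which is what makes filtering by environment possible.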

Domain and Data Product reflect ideas from the Data Mesh concept. There is a lot of material on the web about Data Mesh; here is one blog article that summarizes the concept quite well: https://www.starburst.io/blog/data-mesh-and-starburst-domain-oriented-ownership-architecture/. Domains help define clear ownership of all the data, so in DataHub you can add assets to a domain. Doing so declares that the data belongs to that domain, but on its own the data is then “only” available internally, within the domain. A Data Product is the idea of exposing parts of that data (a data API with a contract) so that other domains can consume it. A Domain is the owner and producer of one or more Data Products.
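The relationship described above can be sketched as a small conceptual model. This is only an illustration of the Data Mesh idea, not DataHub's data model or API: the `Domain` and `DataProduct` classes, the `publish` method, and the example names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A named subset of a domain's assets that is exposed to other domains."""
    name: str
    assets: set[str] = field(default_factory=set)


@dataclass
class Domain:
    """Owns assets and produces zero or more data products from them."""
    name: str
    assets: set[str] = field(default_factory=set)
    data_products: dict[str, DataProduct] = field(default_factory=dict)

    def add_asset(self, urn: str) -> None:
        # Adding an asset declares that the domain owns it.
        self.assets.add(urn)

    def publish(self, product_name: str, urns: set[str]) -> DataProduct:
        # A domain can only expose assets that it owns.
        missing = set(urns) - self.assets
        if missing:
            raise ValueError(f"{self.name} does not own: {sorted(missing)}")
        product = DataProduct(product_name, set(urns))
        self.data_products[product_name] = product
        return product

    def exposed_assets(self) -> set[str]:
        # Everything other domains can consume; the rest stays internal.
        exposed: set[str] = set()
        for product in self.data_products.values():
            exposed |= product.assets
        return exposed


sales = Domain("sales")
sales.add_asset("orders_raw")
sales.add_asset("orders_curated")
sales.publish("orders", {"orders_curated"})
internal_only = sales.assets - sales.exposed_assets()
```

Here `orders_raw` stays internal to the sales domain, while `orders_curated` is what other domains consume through the `orders` data product.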