Types of Entity Types in DataHub

Original Slack Thread

<@U06TM7M40RK> what are all the types of entity types in datahub?

DataHub supports a wide range of entity types that model various data assets and components within a data ecosystem. According to the <https://datahubproject.io/docs/0.14.0/metadata-modeling/metadata-model#the-core-entities|DataHub Metadata Model documentation>, the core entity types include:

  1. Data Platform: Represents external systems involved in processing, storing, or visualizing data assets, such as MySQL, Snowflake, Redshift, and S3.
  2. Dataset: A collection of data, including tables, views, streams, document collections, and files.
  3. Chart: A single data visualization derived from a dataset.
  4. Dashboard: A collection of charts for visualization.
  5. Data Job (Task): An executable job that processes data assets.
  6. Data Flow (Pipeline): An executable collection of data jobs with dependencies among them.
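
To make the core entities above more concrete: each one is addressed in DataHub by a URN of the form `urn:li:<entityType>:<key>`. The following is a minimal sketch that assembles such URN strings by hand for the six core types. The helper names (`dataset_urn`, `entity_type_of`, and so on) and the example identifiers are hypothetical and exist only for illustration; in practice the official `datahub` Python package ships its own URN builders, which should be preferred.

```python
# Illustrative helpers only: these function names are NOT part of DataHub.
# They build URN strings by hand to show how the core entity types are keyed.

def data_platform_urn(platform: str) -> str:
    # Data Platform, e.g. "snowflake" or "mysql"
    return f"urn:li:dataPlatform:{platform}"


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    # Dataset: keyed by platform URN, dataset name, and environment
    return f"urn:li:dataset:({data_platform_urn(platform)},{name},{env})"


def chart_urn(tool: str, chart_id: str) -> str:
    # Chart: keyed by the dashboarding tool and a chart id
    return f"urn:li:chart:({tool},{chart_id})"


def dashboard_urn(tool: str, dashboard_id: str) -> str:
    # Dashboard: keyed by the dashboarding tool and a dashboard id
    return f"urn:li:dashboard:({tool},{dashboard_id})"


def data_flow_urn(orchestrator: str, flow_id: str, cluster: str = "prod") -> str:
    # Data Flow (pipeline): keyed by orchestrator, flow id, and cluster
    return f"urn:li:dataFlow:({orchestrator},{flow_id},{cluster})"


def data_job_urn(flow_urn: str, job_id: str) -> str:
    # Data Job (task): keyed by its parent flow URN and a job id
    return f"urn:li:dataJob:({flow_urn},{job_id})"


def entity_type_of(urn: str) -> str:
    # The entity type is the third colon-separated segment of the URN.
    parts = urn.split(":", 3)
    if len(parts) < 4 or parts[0] != "urn" or parts[1] != "li":
        raise ValueError(f"not a DataHub-style URN: {urn!r}")
    return parts[2]


if __name__ == "__main__":
    flow = data_flow_urn("airflow", "daily_sales")      # example ids are made up
    for urn in (
        data_platform_urn("snowflake"),
        dataset_urn("snowflake", "analytics.public.orders"),
        chart_urn("looker", "orders_by_day"),
        dashboard_urn("looker", "sales_overview"),
        flow,
        data_job_urn(flow, "load_orders"),
    ):
        print(entity_type_of(urn), "->", urn)
```

Note how a Data Job URN embeds its parent Data Flow URN, and a Dataset URN embeds its Data Platform URN, which mirrors the containment described in the list.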

Additionally, the <https://github.com/datahub-project/datahub/blob/master/datahub-graphql-core/src/main/java/com/linkedin/datahub/graphql/types/entitytype/EntityTypeMapper.java|EntityTypeMapper> in the DataHub codebase provides a comprehensive list of entity types, including:

  • Domain
  • CorpUser
  • CorpGroup
  • ERModelRelationship
  • Notebook
  • Tag
  • Glossary Term
  • Glossary Node
  • Container
  • ML Model
  • ML Model Group
  • ML Feature Table
  • ML Feature
  • ML Primary Key
  • Ingestion Source
  • Execution Request
  • Assertion
  • Data Process Instance
  • Data Platform Instance
  • Access Token
  • Test
  • DataHub Policy
  • DataHub Role
  • Post
  • Schema Field
  • DataHub View
  • Query
  • Data Product
  • Custom Ownership Type
  • Incident
  • Role
  • Structured Property
  • Form
  • Data Type
  • Entity Type
  • Restricted
  • Business Attribute
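
As a rough illustration of what a mapper like this does, here is a small Python sketch of a two-way lookup between GraphQL-style enum values and the entity-name strings used in URNs. It is not a port of the Java class: only a handful of entries are shown, the function names `get_name` and `get_type` are chosen for this example, and the fallback to `OTHER` for unknown names is an assumption of this sketch. Check the linked source for the authoritative list and behavior.

```python
from enum import Enum


class EntityType(Enum):
    # Illustrative subset of entity types; the real list is much longer.
    DATASET = "DATASET"
    CORP_USER = "CORP_USER"
    CORP_GROUP = "CORP_GROUP"
    TAG = "TAG"
    GLOSSARY_TERM = "GLOSSARY_TERM"
    DOMAIN = "DOMAIN"
    CONTAINER = "CONTAINER"
    OTHER = "OTHER"  # assumed catch-all for names this sketch does not know


# Forward mapping: enum value -> entity name as it appears in URNs.
_TYPE_TO_NAME = {
    EntityType.DATASET: "dataset",
    EntityType.CORP_USER: "corpuser",
    EntityType.CORP_GROUP: "corpGroup",
    EntityType.TAG: "tag",
    EntityType.GLOSSARY_TERM: "glossaryTerm",
    EntityType.DOMAIN: "domain",
    EntityType.CONTAINER: "container",
}

# Reverse mapping is derived so the two directions cannot drift apart.
_NAME_TO_TYPE = {name: etype for etype, name in _TYPE_TO_NAME.items()}


def get_name(entity_type: EntityType) -> str:
    """Return the entity name for an enum value, or raise if unmapped."""
    try:
        return _TYPE_TO_NAME[entity_type]
    except KeyError:
        raise ValueError(f"no entity name mapped for {entity_type}") from None


def get_type(name: str) -> EntityType:
    """Return the enum value for an entity name, falling back to OTHER."""
    return _NAME_TO_TYPE.get(name, EntityType.OTHER)


if __name__ == "__main__":
    print(get_name(EntityType.GLOSSARY_TERM))
    print(get_type("corpuser"))
    print(get_type("somethingUnknown"))
```

Deriving the reverse dictionary from the forward one keeps the mapping defined in a single place, so adding a new entity type only requires one new entry.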

These entities are part of DataHub’s metadata model, which is designed to support data discovery, observability, and governance.

Would you like more details on any specific entity type or how they are used within DataHub?
