Adding Custom Properties to DataJob in DataHub

Original Slack Thread

Hi, team! I’m trying to add properties when running a job, but they don’t show up in DataHub, even though the value I pass is a dictionary. What am I doing wrong?

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Which DataHub version are you using? (e.g. 0.12.0)
  2. Please post any relevant error logs on the thread!

![attachment: image.png](https://slack-files.com/TUMKD5EGJ-F0708SWATQC-64882644f4)

version: 0.12.1

  1. Create a DataHub emitter
  2. Create a DataFlow object
  3. Create a DataJob object and add properties
  4. Create a DataProcessInstance from the DataFlow
  5. Emit start/stop events

How do I add properties for the task?

With acryl-datahub 0.13.1.3:

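```python
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import (
    AzkabanJobTypeClass,
    ChangeTypeClass,
    DataJobInfoClass,
)

# custom_properties (a Dict[str, str]), datajob_name, datajob_urn and gms are defined elsewhere.
datajob_info = DataJobInfoClass(
    customProperties=custom_properties,
    type=AzkabanJobTypeClass.COMMAND,
    name=datajob_name,
)
# Wrap the dataJobInfo aspect in a MetadataChangeProposalWrapper for the DataJob
mcpw = MetadataChangeProposalWrapper(
    entityUrn=datajob_urn,
    aspect=datajob_info,
    changeType=ChangeTypeClass.UPSERT,
)
rest_emitter = DatahubRestEmitter(gms_server=gms)
rest_emitter.emit(mcpw)
```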

Thanks, but I needed something else. I already found a solution: just assign a dictionary with the necessary data to the properties attribute of the DataJob object.

At first I tried to put them in the properties of the DataProcessInstance object, but that didn’t work.
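A minimal sketch of that fix, assuming the DataFlow and DataJob helper classes from acryl-datahub's datahub.api.entities.datajob module. The helper names (to_custom_properties, emit_job_with_properties), the IDs, the orchestrator, and the env value are hypothetical and only illustrate the idea. Custom properties are a string-to-string map, so the helper coerces values to strings before assigning them.

```python
from typing import Any, Dict, Mapping


def to_custom_properties(raw: Mapping[str, Any]) -> Dict[str, str]:
    """Coerce keys and values to str, since custom properties map strings to strings."""
    return {str(key): str(value) for key, value in raw.items()}


def emit_job_with_properties(gms_server: str, raw_properties: Mapping[str, Any]) -> str:
    """Emit a DataFlow and a DataJob whose custom properties come from raw_properties."""
    # Imported lazily so the helper above works without acryl-datahub installed.
    from datahub.api.entities.datajob import DataFlow, DataJob
    from datahub.emitter.rest_emitter import DatahubRestEmitter

    emitter = DatahubRestEmitter(gms_server=gms_server)

    # Hypothetical identifiers; replace with your own orchestrator, flow id and env.
    flow = DataFlow(orchestrator="airflow", id="example_flow", env="PROD")
    flow.emit(emitter)

    job = DataJob(id="example_job", flow_urn=flow.urn, name="Example job")
    # The properties attribute of the DataJob becomes its custom properties.
    job.properties = to_custom_properties(raw_properties)
    job.emit(emitter)
    return str(job.urn)
```

Calling emit_job_with_properties("http://localhost:8080", {"retries": 3}) would need a reachable GMS endpoint; the URL here is an assumption, not something taken from the thread.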

DataProcessInstance represents a single run (instance) of a DataJob; it carries run, input, and output information.

![attachment: image.png](https://slack-files.com/TUMKD5EGJ-F0706QLDE1H-198155695c)

properties: Custom properties to set for the DataProcessInstance

Yes, that’s correct: those properties apply to the DataProcessInstance, not to the DataJob.
There is a separate aspect, DataProcessInstanceProperties, that holds the properties of a DataProcessInstance.
https://github.com/datahub-project/datahub/blob/934ab03d16dc52f992a807a2002e9949cc6f95fa/metadata-ingestion/src/datahub/api/entities/dataprocess/dataprocess_instance.py#L234C20-L234C49

https://datahubproject.io/docs/generated/metamodel/entities/dataprocessinstance/
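
To illustrate the distinction discussed in this thread, here is a hedged sketch that sets job-level properties on the DataJob and run-level properties on the DataProcessInstance. It assumes the DataProcessInstance.from_datajob, emit_process_start, and emit_process_end helpers from acryl-datahub's Python SDK. The function names, IDs, and property keys are hypothetical. Signatures may differ between SDK versions, so check them against the version you have installed.

```python
import time
from typing import Dict


def run_properties(attempt: int, triggered_by: str) -> Dict[str, str]:
    """Build run-level custom properties (string-to-string) for a single execution."""
    return {"attempt": str(attempt), "triggered_by": triggered_by}


def emit_job_and_run(gms_server: str) -> None:
    """Emit a DataJob with job-level properties and one run with run-level properties."""
    # Imported lazily so run_properties works without acryl-datahub installed.
    from datahub.api.entities.datajob import DataFlow, DataJob
    from datahub.api.entities.dataprocess.dataprocess_instance import (
        DataProcessInstance,
        InstanceRunResult,
    )
    from datahub.emitter.rest_emitter import DatahubRestEmitter

    emitter = DatahubRestEmitter(gms_server=gms_server)

    # Hypothetical identifiers, for illustration only.
    flow = DataFlow(orchestrator="airflow", id="example_flow", env="PROD")
    flow.emit(emitter)

    job = DataJob(id="example_job", flow_urn=flow.urn, name="Example job")
    job.properties = {"team": "data-platform"}  # shown on the DataJob (task)
    job.emit(emitter)

    run = DataProcessInstance.from_datajob(datajob=job, id="example_run_001")
    run.properties = run_properties(1, "scheduler")  # shown on the run only

    start_ms = int(time.time() * 1000)
    # emit_template=False: the job was already emitted above with its own properties.
    run.emit_process_start(emitter, start_ms, emit_template=False)
    run.emit_process_end(emitter, start_ms + 1000, result=InstanceRunResult.SUCCESS)
```

Properties set on the run end up in the DataProcessInstanceProperties aspect, which is why they do not appear on the DataJob itself.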