Hi everyone,
I have a question about retrieving the inputs and outputs of an already defined dataJob. I assume the best way to do this is the GET API endpoint /entities/v1/latest. This works fine for datasets and dataFlows when I pass their URNs, but it doesn’t seem to work for dataJobs. For reference, I created a dataJob with the following URN (which I can verify in the URL of the DataHub UI), and using it with the API gives me a 500 response code: urn:li:dataJob:(urn:li:dataFlow:(Airflow,test_flow_id,DEV),TestETL)
The returned message is the following:
Failes to batch get entities with urns: [urn:li:dataJob:(urn:li:dataFlow:(Airflow,test_flow_id,DEV),TestETL)]
Could it be that the API doesn’t support the retrieval of dataJob objects?
Edit: I’m using version 0.12.0
Edit2: It seems that this issue only occurs when I request the dataJobInfo aspect; that is the one that breaks the call. When I request only dataJobInputOutput, it works fine. How can this be the case? I think it has something to do with the type parameter when instantiating DataJobInfoClass. I use the plain string “AIRFLOW”, but this seems to cause the issue?
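
For anyone who wants to reproduce this, here is a minimal sketch of how the two aspects can be requested one at a time to see which one triggers the 500. It only builds the request URLs and does not call the server. The base URL (http://localhost:8080/openapi) and the query parameter names urns and aspectNames are my assumptions about the OpenAPI endpoint, and latest_entity_url is just a hypothetical helper name, so adjust them to your deployment.

```python
from urllib.parse import urlencode

# Assumption: address of the DataHub OpenAPI root; change it for your setup.
BASE_URL = "http://localhost:8080/openapi"

JOB_URN = "urn:li:dataJob:(urn:li:dataFlow:(Airflow,test_flow_id,DEV),TestETL)"


def latest_entity_url(urn: str, aspect: str, base_url: str = BASE_URL) -> str:
    """Build a GET URL for /entities/v1/latest that asks for a single aspect.

    Hypothetical helper; the parameter names "urns" and "aspectNames"
    are assumed, not confirmed by this thread.
    """
    # urlencode percent-encodes the colons, commas and parentheses in the URN,
    # so the nested dataFlow URN survives as one query value.
    query = urlencode({"urns": urn, "aspectNames": aspect})
    return f"{base_url}/entities/v1/latest?{query}"


if __name__ == "__main__":
    # One URL per aspect, so a failure can be tied to a specific aspect.
    for aspect in ("dataJobInputOutput", "dataJobInfo"):
        print(latest_entity_url(JOB_URN, aspect))
```

Sending a GET to each printed URL separately (with curl or a browser) should show whether only the dataJobInfo request fails, which is what I am seeing.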