Using Python OpenAPI to Add Validations in DataHub

Original Slack Thread

Dear all,
I’m new to DataHub.
I’d like to use Python and the OpenAPI to interact with it.
In this case I need to add validations to some columns in a table.
My goal is to make a POST request and add some validations to a dataset.
Could you please provide an example? Thanks

here is an example that submits data quality assertions in our managed service. you would need to modify it a bit to load your own custom ones.

https://gist.github.com/mrjefflewis/9eb8ba15b4b2315bed0a49cb67b2bd90
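For comparison, a raw OpenAPI request for a column-level check might be shaped roughly like the sketch below, using only the Python standard library. The endpoint path `/openapi/entities/v1/`, the `__type` discriminator, and the `AssertionInfo` field names are assumptions based on DataHub's entities OpenAPI and metadata model; they differ between releases, so check them against the Swagger UI of your own instance. The server URL, token, table, column, and assertion id are hypothetical placeholders.

```python
import json
import urllib.request

# Assumed endpoint path; confirm it in your instance's Swagger UI.
ENTITIES_PATH = "/openapi/entities/v1/"


def build_assertion_request(assertion_id, dataset_urn, column, min_value, max_value):
    """Build the JSON body for one 'column value between min and max' assertion."""
    field_urn = f"urn:li:schemaField:({dataset_urn},{column})"
    return [
        {
            "entityType": "assertion",
            "entityUrn": f"urn:li:assertion:{assertion_id}",
            "aspect": {
                "__type": "AssertionInfo",  # assumed aspect discriminator
                "type": "DATASET",
                "datasetAssertion": {
                    "dataset": dataset_urn,
                    "scope": "DATASET_COLUMN",
                    "fields": [field_urn],
                    "operator": "BETWEEN",
                    "nativeType": "column_value_between",  # free-form label
                    "parameters": {
                        "minValue": {"type": "NUMBER", "value": str(min_value)},
                        "maxValue": {"type": "NUMBER", "value": str(max_value)},
                    },
                },
            },
        }
    ]


def post_assertion(server, token, body):
    """POST the body to DataHub and return the HTTP status code."""
    request = urllib.request.Request(
        server.rstrip("/") + ENTITIES_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


body = build_assertion_request(
    assertion_id="orders-amount-range",  # hypothetical id
    dataset_urn="urn:li:dataset:(urn:li:dataPlatform:postgres,shop.public.orders,PROD)",
    column="amount",  # hypothetical column
    min_value=0,
    max_value=10000,
)
print(json.dumps(body, indent=2))
# post_assertion("http://localhost:8080", "<token>", body)  # needs a running instance
```

The request is built separately from the network call so the payload can be inspected or unit-tested before anything is sent.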

Hi, thanks for the answer. Just one newbie question: do I have to use the Python SDK in this case instead of OpenAPI?

i highly recommend it. OpenAPI is definitely an option, but you might quickly find yourself writing helper classes and functions that replicate what the python SDK already does :shrug:
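To make the point about helpers concrete: with raw OpenAPI you end up assembling URN strings by hand, which the SDK's `datahub.emitter.mce_builder` module already covers (for example with `make_dataset_urn`). A minimal sketch of hand-rolled equivalents follows; the function names, table, and column here are hypothetical and are not part of the SDK.

```python
def dataset_urn(platform, name, env="PROD"):
    """Build a dataset URN such as urn:li:dataset:(urn:li:dataPlatform:postgres,db.tbl,PROD)."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def schema_field_urn(parent_urn, field_path):
    """Build the URN of one column (schema field) inside a dataset."""
    return f"urn:li:schemaField:({parent_urn},{field_path})"


orders = dataset_urn("postgres", "shop.public.orders")  # hypothetical table
amount = schema_field_urn(orders, "amount")  # hypothetical column
print(orders)
print(amount)
```

Every assertion on a column needs both URNs, so helpers like these tend to be the first thing rewritten when skipping the SDK.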

i would start with the developer docs linked below and make sure you use Java 11. once you get it installed i recommend using this one-liner in your ~/.bashrc or equivalent to set JAVA_HOME (the java_home tool it calls is macOS-specific)

export JAVA_HOME=$(/usr/libexec/java_home -v 11)

https://datahubproject.io/docs/developers

ok, I have deployed it using docker (datahub docker quickstart)
But I’m not familiar with Java, only Python. Can I use it?
Or do you mean using Java only for the deployment and then Python for interactions?

gradle is used to build the python virtualenv, in this case. it will build some stuff you don’t need but it’s the fastest way to get your venv going

you won’t do any actual java when doing integrations, just python

ok, thanks a lot :slightly_smiling_face:

great questions, keep ’em coming