Hi <@U07FV3UHE4D> - I am struggling to get DataHub to ingest data from SQL Server over a secure connection. I have followed the instructions in this <https://forum.datahubproject.io/t/seeking-help-with-datahub-ingestion-through-mssql-integration/1360/3|link> but it doesn’t work. Can anyone help with a link that describes the detailed process of enabling a secure SQL Server connection? I have also tried installing the SQL Server drivers, but I am unable to. My environment is deployed in Azure AKS.
Hey there! 
<@U06TM7M40RK> Can you help with the questions in this thread?
Please continue to tag RunLLM (e.g. <@U06TM7M40RK>) to have it continue generating responses!
To enable a secure connection for DataHub ingestion from a SQL Server, you need to ensure that encryption is properly configured. Based on the information from the <https://datahubproject.io/docs/0.14.0/generated/ingestion/sources/mssql|DataHub MSSQL documentation>, there are two main approaches to achieve this: using the pytds library or the pyodbc library.
- Using the pytds library: You need to specify the CA file and enable host validation in your ingestion configuration. Here is an example configuration:
source:
  type: mssql
  config:
    host_port: your_host:1433
    database: your_database
    username: your_username
    password: your_password
    include_views: true
    include_tables: true
    profiling:
      enabled: true
      profile_table_level_only: true
    stateful_ingestion:
      enabled: true
    options:
      connect_args:
        cafile: /path/to/your/ca.pem
        validate_host: true
- Using the pyodbc library: This method is often recommended for environments like Azure SQL where encryption is mandatory. You need to ensure that the ODBC driver is installed and configure your ingestion recipe as follows:
source:
  type: mssql-odbc
  config:
    host_port: your_host:1433
    database: your_database
    username: your_username
    password: your_password
    use_odbc: "True"
    uri_args:
      driver: "ODBC Driver 17 for SQL Server"
      Encrypt: "yes"
      TrustServerCertificate: "Yes"
      ssl: "True"
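To make the role of `uri_args` more concrete, here is a minimal Python sketch of how such key/value pairs could end up in a SQLAlchemy-style `mssql+pyodbc` URL. The helper `build_mssql_odbc_url` is hypothetical and written only for illustration; it is not DataHub's actual code, and the assumption is that the arguments are URL-encoded into the query string and passed on to the ODBC driver.

```python
from urllib.parse import quote_plus, urlencode


def build_mssql_odbc_url(host_port, database, username, password, uri_args):
    """Hypothetical helper: assemble an mssql+pyodbc URL from recipe-like fields."""
    # Percent-encode credentials so special characters survive inside the URL.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    # Assumption: uri_args become the URL query string, in the order given.
    query = urlencode(uri_args)
    return f"mssql+pyodbc://{credentials}@{host_port}/{database}?{query}"


url = build_mssql_odbc_url(
    "your_host:1433",
    "your_database",
    "your_username",
    "your_password",
    {
        "driver": "ODBC Driver 17 for SQL Server",
        "Encrypt": "yes",
        "TrustServerCertificate": "Yes",
        "ssl": "True",
    },
)
print(url)
```

Note that `TrustServerCertificate: "Yes"` encrypts the connection but skips validation of the server certificate. If the server presents a certificate that the client trusts, setting it to `"no"` is the stricter option.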
For installing the ODBC driver in an Azure AKS environment, you might need to create a custom Docker image that includes the necessary dependencies. Here is a basic Dockerfile example:
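# Start from the DataHub ingestion image
FROM linkedin/datahub-ingestion:latest
# unixodbc-dev provides the unixODBC driver manager and headers that pyodbc needs
RUN apt-get update && apt-get install -y unixodbc-dev
# Note: the "ODBC Driver 17 for SQL Server" package (msodbcsql17) is not part of unixodbc-dev and must be installed separately from Microsoft's package repository
RUN pip install pyodbc
RUN pip install 'acryl-datahub[mssql]'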
Build the Docker image with:
docker build -t datahub-ingestion-mssql .
Then, update your Helm chart to use this custom image.
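Before pointing the Helm chart at the new image, it may help to confirm that the ODBC driver is actually visible inside it. The following is a minimal sketch, assuming it is run with the Python interpreter inside the custom image. `pyodbc.drivers()` is a real pyodbc call, while the helper names `installed_odbc_drivers` and `has_driver` are hypothetical.

```python
def installed_odbc_drivers():
    """Return the ODBC driver names pyodbc can see, or None if pyodbc cannot be imported."""
    try:
        # The import fails if pyodbc or the unixODBC shared library is missing.
        import pyodbc
    except ImportError:
        return None
    return list(pyodbc.drivers())


def has_driver(drivers, wanted="ODBC Driver 17 for SQL Server"):
    """True only if the driver list is available and contains the wanted name."""
    return drivers is not None and wanted in drivers


if __name__ == "__main__":
    drivers = installed_odbc_drivers()
    if drivers is None:
        print("pyodbc could not be imported; check unixODBC and pyodbc installation")
    elif has_driver(drivers):
        print("ODBC Driver 17 for SQL Server is available")
    else:
        print(f"Driver not found; pyodbc sees: {drivers}")
```

If the driver name does not appear in the list, the `driver` value in `uri_args` will not resolve, and ingestion is likely to fail at connection time.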
If you need more detailed guidance on any of these steps, please let me know!
Sources:
- <https://datahubproject.io/docs/0.14.0/generated/ingestion/sources/mssql|DataHub MSSQL Documentation>
Hint: Mention @runllm in the thread for followups.
<@U06TM7M40RK> - I have tried using the pytds library, but I get an error message during ingestion saying that no arguments are allowed as options.
<@U06TM7M40RK> - Can I please have detailed information on how to build a custom Docker image for Azure AKS?
Hi! Did this interaction resolve your issue or do you still require assistance?
Please tag @RunLLM if you still need further assistance!
Or tag @Datahub Community Support if you require a person to take a look.
Hope this helps!
<@U07FV3UHE4D> - I have tried using the pytds library, but I get an error message during ingestion saying that no arguments are allowed as options.
<@U07FV3UHE4D> - Can I please have detailed information on how to build a custom Docker image for Azure AKS?
Thank you for letting us know you need a person to take a look at this thread.
Either a community member or one of our team members will try to assist when they have a moment; responses unfortunately might not be immediate. We hope this helps speed things along.