Hi Team,
im trying to ingest SQL server data to datahub local instance. Im using Azure SQL server. i tried to connect with mssql and mssql-odbc. both of them doesn’t work. since the documentation says, mssql-odbc should be used for managed sql server instances, i followed the steps and installed the pre requisites pyodbc(can see in pip list output) and Microsoft ODBC Driver 17 for sql server(can see in control panel installed programs). I still get error saying driver not found. see the error below.
(Background on this error at: <https://sqlalche.me/e/14/dbapi>)```
what am i missing?
To address the issue you’re encountering with the ODBC driver for SQL Server in your DataHub ingestion setup, it seems that the driver is not being found by the system. Here are some steps to troubleshoot and resolve the problem:
Verify ODBC Driver Installation:
Ensure that the ODBC Driver 17 for SQL Server is correctly installed on your system. You can check the installation by running the following command in your terminal:
odbcinst -j
This command will show the configuration files used by unixODBC. Make sure the driver is listed in the odbcinst.ini file.
Update Dockerfile:
If the driver is not installed in your Docker container, you need to update your Dockerfile to include the installation of the ODBC driver. Here is an example Dockerfile that installs the ODBC driver:
FROM python:3.9-slim
# Install unixODBC and the ODBC Driver for SQL Server
RUN apt-get update && \
apt-get install -y curl apt-transport-https gnupg && \
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - && \
curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list && \
apt-get update && \
ACCEPT_EULA=Y apt-get install -y msodbcsql17 unixodbc-dev && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Install DataHub
RUN pip install acryl-datahub
# Set environment variables
ENV ODBCINI=/etc/odbc.ini
ENV ODBCSYSINI=/etc
# Copy your DataHub recipe
COPY recipe.yml /app/recipe.yml
# Run DataHub ingestion
CMD ["datahub", "ingest", "-c", "/app/recipe.yml"]
Update Ingestion Recipe:
Ensure your ingestion recipe is correctly configured. Here is an example configuration for the mssql-odbc source type:
Given that you are using Docker Desktop on a Windows 11 machine, the issue likely stems from the Docker container not having the necessary ODBC driver installed. Here are the steps to resolve this:
Create a Custom Docker Image:
You need to create a custom Docker image that includes the Microsoft ODBC Driver for SQL Server. Below is an example Dockerfile that you can use:
FROM python:3.9-slim
# Install unixODBC and the ODBC Driver for SQL Server
RUN apt-get update && \
apt-get install -y curl apt-transport-https gnupg && \
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - && \
curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list && \
apt-get update && \
ACCEPT_EULA=Y apt-get install -y msodbcsql17 unixodbc-dev && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Install DataHub
RUN pip install acryl-datahub
# Set environment variables
ENV ODBCINI=/etc/odbc.ini
ENV ODBCSYSINI=/etc
# Copy your DataHub recipe
COPY recipe.yml /app/recipe.yml
# Run DataHub ingestion
CMD ["datahub", "ingest", "-c", "/app/recipe.yml"]
Build the Docker Image:
Save the above Dockerfile in a directory and build the Docker image using the following command:
docker build -t datahub-ingestion-mssql .
Run the Docker Container:
Run the Docker container with the newly built image:
docker run --rm datahub-ingestion-mssql
Update Ingestion Recipe:
Ensure your ingestion recipe is correctly configured. Here is an example configuration for the mssql-odbc source type:
If you have already installed the ODBC driver in your Docker container and are still encountering the same error, it might be due to the driver not being correctly registered or the environment not being properly configured. Here are some additional steps to troubleshoot and resolve the issue:
Verify Driver Installation in Docker Container:
Ensure that the ODBC driver is correctly installed and registered in the Docker container. You can do this by running the following commands inside the container:
To resolve the issue of the Docker build failing due to the app/recipe.yml file not being found, you need to ensure that the recipe.yml file is present in the same directory as your Dockerfile. Here are the steps to follow:
Ensure recipe.yml is in the Same Directory:
Make sure that the recipe.yml file is in the same directory as your Dockerfile. The directory structure should look like this:
Update Dockerfile:
Ensure your Dockerfile is correctly referencing the recipe.yml file. Here is the updated Dockerfile:
FROM python:3.9-slim
# Install unixODBC and the ODBC Driver for SQL Server
RUN apt-get update && \
apt-get install -y curl apt-transport-https gnupg && \
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - && \
curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list && \
apt-get update && \
ACCEPT_EULA=Y apt-get install -y msodbcsql17 unixodbc-dev && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Configure ODBC
RUN echo "[ODBC Driver 17 for SQL Server]\nDescription=Microsoft ODBC Driver 17 for SQL Server\nDriver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.6.so.1.1" > /etc/odbcinst.ini
RUN echo "[ODBC Data Sources]\nMSSQL=ODBC Driver 17 for SQL Server\n\n[MSSQL]\nDriver=ODBC Driver 17 for SQL Server\nServer=your_host\nDatabase=your_database" > /etc/odbc.ini
# Install DataHub
RUN pip install acryl-datahub
# Set environment variables
ENV ODBCINI=/etc/odbc.ini
ENV ODBCSYSINI=/etc
# Copy your DataHub recipe
COPY recipe.yml /app/recipe.yml
# Run DataHub ingestion
CMD ["datahub", "ingest", "-c", "/app/recipe.yml"]
Build the Docker Image:
Navigate to the directory containing your Dockerfile and recipe.yml file, then build the Docker image using the following command:
docker build -t datahub-ingestion-mssql .
Run the Docker Container:
Run the Docker container with the newly built image:
docker run --rm datahub-ingestion-mssql
Would you like more detailed guidance on any of these steps?
If the Docker build process is still unable to find the recipe.yml file despite having the correct directory structure, it might be due to the context in which the Docker build command is being run. Here are some steps to ensure the file is correctly included in the Docker build context:
Verify Directory Structure:
Ensure your directory structure looks like this:
Run Docker Build Command from the Correct Directory:
Navigate to the directory containing your Dockerfile and recipe.yml file before running the Docker build command:
cd /path/to/your/project
docker build -t datahub-ingestion-mssql .
Check Docker Ignore File:
Ensure there is no .dockerignore file that might be excluding the recipe.yml file from the build context. If there is a .dockerignore file, make sure it does not contain an entry that excludes recipe.yml.
Verify Dockerfile and Build Context:
Ensure the COPY command in your Dockerfile is correctly referencing the recipe.yml file. Here is the Dockerfile again for reference:
FROM python:3.9-slim
# Install unixODBC and the ODBC Driver for SQL Server
RUN apt-get update && \
apt-get install -y curl apt-transport-https gnupg && \
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - && \
curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list && \
apt-get update && \
ACCEPT_EULA=Y apt-get install -y msodbcsql17 unixodbc-dev && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Configure ODBC
RUN echo "[ODBC Driver 17 for SQL Server]\nDescription=Microsoft ODBC Driver 17 for SQL Server\nDriver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.6.so.1.1" > /etc/odbcinst.ini
RUN echo "[ODBC Data Sources]\nMSSQL=ODBC Driver 17 for SQL Server\n\n[MSSQL]\nDriver=ODBC Driver 17 for SQL Server\nServer=your_host\nDatabase=your_database" > /etc/odbc.ini
# Install DataHub
RUN pip install acryl-datahub
# Set environment variables
ENV ODBCINI=/etc/odbc.ini
ENV ODBCSYSINI=/etc
# Copy your DataHub recipe
COPY recipe.yml /app/recipe.yml
# Run DataHub ingestion
CMD ["datahub", "ingest", "-c", "/app/recipe.yml"]
Debugging the Build Context:
To debug the build context, you can create a simple Dockerfile that lists the contents of the build context:
FROM alpine:latest
COPY . /app
RUN ls -la /app
Build and run this Dockerfile to see if recipe.yml is being copied correctly:
docker build -t debug-context .
docker run --rm debug-context