Troubleshooting MSSQL Ingestion Failure in DataHub

Original Slack Thread

Hi, I'm having trouble with ingestion from a local MSSQL…

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Are you using UI or CLI for ingestion?
  2. Which DataHub version are you using? (e.g. 0.12.0)
  3. What data source(s) are you integrating with DataHub? (e.g. BigQuery)

<@U06TM7M40RK> Can you help with the questions in this thread?



Sure, I can help with that! To assist you effectively, I need a bit more information:

  1. Are you using the UI or CLI for ingestion?
  2. Which DataHub version are you using? (e.g., 0.12.0)
  3. What data source(s) are you integrating with DataHub? (e.g., BigQuery, MSSQL)

Additionally, if you could provide any error messages or issues you’re encountering, that would be very helpful.


  1. UI
  2. DataHub CLI version: 0.13.2.4
  3. MSSQL 2019

The metadata arrived in DataHub, but the ingestion jobs are still marked as “Failed”…

<@U06TM7M40RK>

Thank you for providing the details. Based on your description, it seems that the metadata ingestion from MSSQL to DataHub is partially successful, but the ingestion jobs are marked as “Failed.” Here are some steps and considerations to help troubleshoot and resolve the issue:

Debugging Steps

  1. Check Logs for Detailed Errors:

    • Review the logs from the DataHub ingestion executor and the DataHub GMS (Generalized Metadata Service) for specific error messages. These usually show why a run is marked as failed even when metadata was written.
    • You can access the logs by clicking on the Details button of the failed ingestion run in the UI.
  2. Common Issues:

    • Recipe Misconfiguration: Ensure that your ingestion recipe is correctly configured. Refer to the <https://datahubproject.io/docs/metadata-ingestion|Metadata Ingestion Framework> source docs for the required configurations.
    • Connectivity / Network Reachability: Verify that DataHub can reach your MSSQL instance. Network issues can cause ingestion failures.
    • Authentication: Ensure that the credentials provided in your ingestion recipe are correct and have the necessary permissions to access the MSSQL database.
  3. Specific Error Messages:

    • If you encounter encoding errors, such as the ‘utf-16-le’ codec issue, you may need to set the character encoding and the ODBC driver options explicitly in your ingestion recipe. For example:
      source:
        type: mssql
        config:
          host_port: localhost:1433
          database: DemoDatabase
          username: admin        # placeholder credentials
          password: password
          encoding: "utf-16-le"
          use_odbc: true         # connect through an ODBC driver (pyodbc)
          uri_args:
            driver: "ODBC Driver 17 for SQL Server"
            Encrypt: "yes"
            TrustServerCertificate: "yes"
            ssl: "True"
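
For the connectivity item above, a quick reachability check can rule out network problems before digging into the recipe. The sketch below is only an illustration and not part of DataHub: `can_reach` is a hypothetical helper, and the host and port are assumed to match the `host_port` value in the recipe. If DataHub runs in Docker, `localhost` may resolve inside the executor container rather than on your machine, so it is worth running the check from wherever the ingestion actually executes.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Assumed values, copied from the recipe's host_port field.
    host, port = "localhost", 1433
    status = "reachable" if can_reach(host, port) else "NOT reachable"
    print(f"{host}:{port} is {status}")
```

A successful check only shows that the port is open. Authentication and permission problems would still have to be diagnosed from the ingestion logs.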
      
