Troubleshooting SQL Query Metadata Ingestion in DataHub with Tagging RunLLM

Original Slack Thread

Hi,
I am trying to ingest metadata through the DataHub UI using the sql-queries source, and when I provide the query as {"query": "SELECT* FROM sys.schemaName"} it throws this error:

{
  "title": "No metadata was produced by the source",
  "message": "Please check the source configuration, filters, and permissions.",
  "context": []
}
So, do we need to provide the database schema name, or each table name in the schema, to produce metadata?
Also, for the optional fields in the query file format (timestamp, upstream, downstream): what should we provide? Should the timestamp be the current timestamp, and what data do we need to give for upstream and downstream?

Hey there! :wave:

<@U06TM7M40RK> Can you help with the questions in this thread?

:information_source: Please continue to tag RunLLM (e.g. <@U06TM7M40RK>) to have it continue generating responses!

To address your issue with the SQL query ingestion in DataHub, let’s break down the requirements and the error message you’re encountering.

Error Message
The error message "No metadata was produced by the source" suggests that the ingestion process did not generate any metadata. This could be due to several reasons, such as incorrect query format, missing configuration, or lack of permissions.

Query File Format
According to the <https://datahubproject.io/docs/0.14.0/generated/ingestion/sources/sql-queries|DataHub documentation>, the query file should be a newline-delimited JSON file where each line is a separate JSON object. Here’s a breakdown of the fields:

  • query: The SQL query to parse. Ensure that your query is correctly formatted and valid for the SQL dialect you are using.
  • timestamp (optional): The timestamp of the query, in seconds since the epoch. Ideally this is the time the query was executed; if that is not known, the current timestamp is acceptable.
  • user (optional): The user who ran the query. This should be a string representing the user.
  • operation_type (optional): Platform-specific operation type, used if the operation type can’t be parsed.
  • downstream_tables (optional): A list of tables that the query writes to. This is used if the query can’t be parsed to determine the downstream tables.
  • upstream_tables (optional): A list of tables the query reads from. This is used if the query can’t be parsed to determine the upstream tables.

Example Query File
Here is an example of how your query file should look:

{"query": "SELECT * FROM sys.schemaName", "timestamp": 1689232738, "user": "user_a", "upstream_tables": ["your_database.your_schema.your_table"]}
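If the parser cannot work out lineage from the SQL alone, you can state it explicitly per query. Here is a hedged sketch of a two-line query file; the database, schema, table, and user names are placeholders you would replace with your own, and the timestamps are just example epoch values:

{"query": "SELECT col_a, col_b FROM your_database.your_schema.source_table", "timestamp": 1689232738, "user": "user_a", "upstream_tables": ["your_database.your_schema.source_table"]}
{"query": "INSERT INTO your_database.reporting.daily_summary SELECT * FROM your_database.your_schema.source_table", "timestamp": 1689236338, "user": "user_b", "upstream_tables": ["your_database.your_schema.source_table"], "downstream_tables": ["your_database.reporting.daily_summary"]}

Each line is a complete JSON object, and lineage is emitted between the upstream and downstream tables named in (or parsed from) each entry.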

Troubleshooting Steps

  1. Check Query Syntax: Ensure that your SQL query is valid and correctly formatted for the SQL dialect you are using. In your example, sys.schemaName appears to be a placeholder rather than a real table; the query needs to reference actual tables for the parser to emit lineage.
  2. Provide Schema and Table Names: If the query cannot be parsed on its own, populate the upstream_tables and downstream_tables fields with fully qualified table names, including the database and schema.
  3. Verify Configuration: Double-check your ingestion recipe to ensure it is correctly set up, including the platform, the path to the query file, and the default database and schema (see the example recipe after this list).
  4. Permissions: Ensure that the user running the ingestion has the necessary permissions to access the metadata.
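
For reference, here is a minimal sketch of what an ingestion recipe for the sql-queries source can look like. It assumes a SQL Server (mssql) platform, a local query file, and a local DataHub instance; the database, schema, and server values are placeholders, and the exact field names should be verified against the sql-queries source documentation linked above:

source:
  type: sql-queries
  config:
    platform: mssql              # platform whose SQL dialect the queries use
    default_db: your_database    # assumed database for unqualified table names
    default_schema: your_schema  # assumed schema for unqualified table names
    query_file: ./queries.json   # newline-delimited JSON file described above
sink:
  type: datahub-rest
  config:
    server: http://localhost:8080  # your DataHub GMS endpoint

The platform, default_db, and default_schema values are what let the parser resolve partially qualified table names in your queries to full dataset URNs.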

Would you like more detailed steps on any specific part of this process?


Hi, how did my answer work out for you? Need any more assistance? You can also tag <@U07FV3UHE4D> to let your fellow community members know you’re still looking for guidance.

Hi :slight_smile:! Did this interaction resolve your issue, or do you still require assistance?

Please tag @RunLLM if you still need further assistance!

Or tag @Datahub Community Support if you require a person to take a look :slight_smile:

Hope this helps! ><