Hi, I’m facing a lineage parse error caused by a control character in the JSON:
```
error class <class 'pydantic.error_wrappers.ValidationError'>: 1 validation error for UpstreamLineageEdge
{
"query_id": "xxxxxx",
"query_text": "xxxxxxx",
"start_time": " 2024-11-26 14:05:20.043 Z"
}
]; pos=1022; lineno=5; colno=20)
```
This prevents me from visualising the lineage for the metadata I'm ingesting. Can you help me resolve this error?
The error you’re encountering, Invalid control character, means the JSON decoder found a raw control character (a code point below U+0020, such as a tab or newline) inside a JSON string, where it must be escaped. This is often due to improper formatting or encoding issues in the JSON string.
Here are some steps you can take to resolve this issue:
Check for Control Characters:
Review the JSON data you are trying to ingest, especially around the specified position (line 5, column 20). Look for any control characters (like tabs, newlines, etc.) that might not be properly escaped.
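As a rough illustration of this check, here is a minimal Python sketch that scans a JSON document and reports raw control characters found inside string literals. The function name `find_control_chars_in_strings` and the sample payload are hypothetical and not part of DataHub; the sketch assumes you can get hold of the raw JSON text that fails to parse.

```python
def find_control_chars_in_strings(doc: str) -> list[tuple[int, int, str]]:
    """Return (lineno, colno, codepoint) for each raw control character
    (below U+0020) that appears inside a JSON string literal.
    Line and column numbers are 1-based."""
    hits = []
    in_string = False   # currently inside a "..." literal
    escaped = False     # previous character was a backslash
    lineno, colno = 1, 0
    for ch in doc:
        colno += 1
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            elif ord(ch) < 0x20:
                hits.append((lineno, colno, f"U+{ord(ch):04X}"))
        elif ch == '"':
            in_string = True
        if ch == "\n":
            lineno += 1
            colno = 0
    return hits


# Hypothetical sample: a raw tab inside the query text on line 2.
sample = '{\n"query_text": "select\t1"\n}'
print(find_control_chars_in_strings(sample))
```

Whitespace between JSON tokens is legal, so the sketch only flags characters that sit inside quoted strings.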
Validate JSON Format:
Use a JSON validator tool to check the format of your JSON data. This can help identify any syntax errors or invalid characters.
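If an online validator is not convenient, Python's standard `json` module can serve the same purpose. The sketch below is only an illustration (the helper name `validate_json` is made up); it returns the same kind of position details (`pos`, `lineno`, `colno`) that appear in the error above.

```python
import json


def validate_json(doc: str):
    """Return None if doc parses as strict JSON, otherwise a dict
    describing where the decoder gave up."""
    try:
        json.loads(doc)
    except json.JSONDecodeError as exc:
        return {
            "msg": exc.msg,
            "lineno": exc.lineno,
            "colno": exc.colno,
            "pos": exc.pos,
        }
    return None


# Hypothetical payload with a raw tab inside a string value.
bad = '{"query_text": "select\t1"}'
print(validate_json(bad))
```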
Escape Special Characters:
Ensure that any special characters in your JSON string are properly escaped. For example, replace tabs with \t and newlines with \n.
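As a sketch of what escaping looks like in practice, assuming you control the code that produces or pre-processes the JSON (this is not a DataHub setting, and the payload is hypothetical): Python's `json.loads` accepts `strict=False`, which tolerates raw control characters inside strings, and `json.dumps` then writes them back out properly escaped.

```python
import json

# Hypothetical payload: the query text contains a raw (unescaped) tab.
raw = '{"query_text": "select\t1"}'

# strict=False lets the decoder accept control characters inside strings.
parsed = json.loads(raw, strict=False)

# Re-serializing emits the tab as the two-character escape \t,
# so the result parses under the default strict rules.
clean = json.dumps(parsed)
print(clean)
```

Building the JSON with a serializer such as `json.dumps` in the first place, rather than by string concatenation, avoids the problem at the source.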
Review Ingestion Configuration:
Check your ingestion configuration to ensure that the data source is correctly set up to handle the JSON format. This includes verifying any transformations or preprocessing steps that might introduce control characters.
Update DataHub Version:
If you are using an older version of DataHub, consider updating to a newer version. There might be bug fixes or improvements related to JSON parsing in newer releases.
Debugging:
Enable debug logging to get more detailed information about where the error is occurring. This can help pinpoint the exact location and cause of the issue.
Would you like more detailed steps on any of these points or assistance with a specific part of this process?
I’m not able to find any data sources to answer your question. You might want to provide details about the specific JSON structure or the context in which it’s being used. If this doesn’t help, please reach out to the DataHub support team for further assistance. Hint: Mention @runllm in the thread for followups.
I’m not able to find any data sources to answer your question. You might want to provide information about the tools or libraries you’re using to process the JSON data. If this doesn’t help, please reach out to the DataHub support team for further assistance. Hint: Mention @runllm in the thread for followups.
To reach out to the DataHub support team, you can join the DataHub Slack community, which is a great place to get real-time support and interact with other DataHub users and contributors. Here are the steps to join and engage with the community:
Once you’re a member, you can post your question in relevant channels such as #troubleshoot or #office-hours. Make sure to include detailed information about your issue, such as error messages, logs, and any steps you’ve already taken to troubleshoot.
Use Threads:
When posting, use threads to keep conversations organized, especially if you’re including long messages or code snippets.
Office Hours:
DataHub has office hours where you can get live support from the Core DataHub Team. Check the #office-hours channel for more details on the schedule.
Thank you for letting us know you need a person to take a look at this thread.
Either a community member or one of our team members will try to assist when they have a moment. Responses unfortunately might not be immediate. We hope this helps speed things along.
Was just confirming this is a supported DataHub recipe and not a custom process sending the lineage metadata in. Could you share the full recipe? Not sure what would be causing this without understanding more about the source system you’re connecting to, etc.