I tested ingesting manifest.json and catalog.json files generated with dbt Core in the DataHub Quickstart Docker environment. It works fine and I can see the model and column documentation written in dbt. The Quickstart environment runs DataHub v1.0.0rc3.
When ingesting the same files on my DataHub deployment on Kubernetes, which runs DataHub v1.0.0, I don't get the documentation on models and columns.
How can I check whether I am using the same version of the dbt integration in both environments?
I used the same recipe in both environments:
source:
  type: dbt
  config:
    manifest_path: /dbt-files/manifest.json
    catalog_path: /dbt-files/catalog.json
    run_results_path: /dbt-files/run_results.json
    target_platform: snowflake
sink:
  type: datahub-rest
  config:
    server: 'http://datahub-gms:8080'
To address the issue of missing documentation on models and columns when ingesting manifest.json and catalog.json files in your Kubernetes deployment of DataHub v1.0.0, here are some steps and considerations:
Verify File Paths and Permissions:
Ensure that the paths to the manifest.json and catalog.json files are correctly specified in your ingestion recipe. The files should be accessible to the DataHub ingestion process running in your Kubernetes environment.
Check Ingestion Recipe Configuration:
Double-check your ingestion recipe to ensure it is correctly configured. The paths to the manifest.json and catalog.json files should be accurate, and all necessary configurations should be included.
Compare Configurations:
Compare the configurations between your Quickstart Docker environment and your Kubernetes deployment. Ensure that any environment-specific settings, such as paths or network configurations, are correctly set in your Kubernetes deployment.
Check for Errors in Logs:
Review the logs of the DataHub ingestion process in your Kubernetes deployment for any errors or warnings that might indicate issues with the ingestion process.
Verify DataHub Version Compatibility:
Ensure that the version of DataHub you are using in your Kubernetes deployment is compatible with the DBT files you are ingesting. Sometimes, newer versions might have changes that affect how metadata is processed.
Check for Known Issues:
Look for any known issues or bug reports related to the version of DataHub you are using. There might be specific issues with metadata ingestion that have been reported and addressed in newer versions.
Run Ingestion with Debug Logs:
Enable debug logging for the DataHub ingestion process to get more detailed information about what might be going wrong. This can help identify any issues with the ingestion process.
Ensure Correct Deployment Configuration:
Verify that your Kubernetes deployment is correctly configured, including any necessary environment variables or configurations that might affect the ingestion process.
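Before comparing the two environments, it can help to confirm how much documentation the manifest.json actually carries, so you know what to expect in the DataHub UI. The following is a minimal sketch, assuming the usual dbt manifest layout (a top-level nodes mapping whose entries have resource_type, description, and columns). The function names summarize_descriptions and summarize_file are hypothetical helpers, not part of DataHub or dbt.

```python
import json
from pathlib import Path


def summarize_descriptions(manifest: dict) -> dict:
    """Count dbt models and columns that carry a non-empty description."""
    summary = {
        "models": 0,
        "models_documented": 0,
        "columns": 0,
        "columns_documented": 0,
    }
    for node in manifest.get("nodes", {}).values():
        # Only models are counted; tests, seeds, and snapshots are skipped.
        if node.get("resource_type") != "model":
            continue
        summary["models"] += 1
        if (node.get("description") or "").strip():
            summary["models_documented"] += 1
        for column in node.get("columns", {}).values():
            summary["columns"] += 1
            if (column.get("description") or "").strip():
                summary["columns_documented"] += 1
    return summary


def summarize_file(path: str) -> dict:
    """Load a manifest.json from disk and summarize its descriptions."""
    manifest = json.loads(Path(path).read_text(encoding="utf-8"))
    return summarize_descriptions(manifest)


# Example (path taken from the recipe above):
# print(summarize_file("/dbt-files/manifest.json"))
```

If the counts are non-zero for the file mounted in Kubernetes, the descriptions are present in the input and the difference is more likely on the ingestion side.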
Would you like more detailed steps or further assistance on any specific part of this process?
<@U06TM7M40RK> the paths are correct, as the rest of the information coming from dbt is ingested. How can I check the version of the dbt ingestion module in the Docker and Kubernetes environments to compare them?
The DataHub’s documentation I have ingested does not mention how to check the version of the DBT ingestion module specifically within Docker and Kubernetes environments. You might want to ensure that both environments are using the same version of the DataHub CLI and ingestion plugins, as discrepancies could lead to differences in behavior. If this doesn’t help, please reach out to the DataHub support team for further assistance.
To check the version of the DataHub CLI and the installed ingestion plugins, you can follow these steps:
Check DataHub CLI Version:
You can check the version of the DataHub CLI by running the following command in your terminal:
datahub version
This command will display the current version of the DataHub CLI installed on your system.
Check Installed Plugins:
To list the installed plugins and their versions, you can use the datahub check plugins command. This will show you the plugins that are currently installed and available for use:
datahub check plugins --verbose
This command will provide a detailed list of all the plugins, including their versions, that are installed in your DataHub environment.
These commands should help you verify the versions of the DataHub CLI and the ingestion plugins in your environment. If you need further assistance or more detailed steps, feel free to ask!
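As a complement to the CLI commands, the version can also be read from Python inside the same environment. This is a small sketch, assuming the dbt source ships as part of the acryl-datahub PyPI package, so the version of that package is the one to compare. The helper name installed_version is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "acryl-datahub") -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Example: run this in each environment and compare the two results.
# print(installed_version())
```

Running it in both the Quickstart container and the Kubernetes pod gives two version strings you can compare directly.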
To run the commands for checking the DataHub CLI version and installed plugins in both Docker and Kubernetes environments, follow these steps:
Docker Environment
Access the DataHub CLI Container:
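First, you need to access the container where the DataHub CLI is installed. You can do this by using the docker exec command to open a shell inside the container. Once inside, run the datahub version and datahub check plugins --verbose commands shown above.
Kubernetes Environment
Access the DataHub CLI Pod:
Use the kubectl exec command to open a shell inside the pod where the DataHub CLI is installed. Replace <datahub-cli-pod-name> with the actual name of the pod running the DataHub CLI.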
Run the Commands:
Once inside the pod, you can run the same commands to check the DataHub CLI version and list the installed plugins:
datahub version
datahub check plugins --verbose
These steps will help you verify the versions of the DataHub CLI and the installed plugins in both Docker and Kubernetes environments. If you need more detailed instructions or encounter any issues, feel free to ask!
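The same checks can be run without opening an interactive shell. Below is a rough sketch that builds the docker exec and kubectl exec command lines for both datahub commands. The container name, pod name, and namespace are placeholders you would need to replace, and the helper names docker_command and kubectl_command are hypothetical.

```python
import shlex
from typing import List, Optional

# The two checks mentioned in this thread.
DATAHUB_CHECKS = (
    ["datahub", "version"],
    ["datahub", "check", "plugins", "--verbose"],
)


def docker_command(container: str, inner: List[str]) -> List[str]:
    """Build a `docker exec` invocation that runs `inner` in a container."""
    return ["docker", "exec", container, *inner]


def kubectl_command(
    pod: str, inner: List[str], namespace: Optional[str] = None
) -> List[str]:
    """Build a `kubectl exec` invocation that runs `inner` in a pod."""
    cmd = ["kubectl", "exec"]
    if namespace:
        cmd += ["-n", namespace]
    return cmd + [pod, "--", *inner]


def print_commands(container: str, pod: str, namespace: Optional[str] = None) -> None:
    """Print the command lines to run in each environment."""
    for check in DATAHUB_CHECKS:
        print(shlex.join(docker_command(container, check)))
        print(shlex.join(kubectl_command(pod, check, namespace)))


# Example with placeholder names (replace with your own):
# print_commands("<datahub-cli-container-name>", "<datahub-cli-pod-name>")
```

To execute the commands rather than print them, each list can be passed to subprocess.run, and the two outputs can then be compared side by side.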