Hello, I can no longer see BigQuery table query stats. I’m not sure when this became a problem, but I know it used to work and I never changed the ingestion recipe.
```
type: bigquery
config:
  include_table_lineage: true
  include_tables: true
  include_views: true
  convert_urns_to_lowercase: true
  profiling:
    enabled: true
    profile_table_level_only: true
  stateful_ingestion:
    enabled: true
```
Does anything seem off in here? Based on the documentation, `include_usage_statistics` still defaults to true so I'm not sure what's going on.
Hey there! Make sure your message includes the following information if relevant, so we can help more effectively!
Are you using UI or CLI for ingestion?
Which DataHub version are you using? (e.g. 0.12.0)
What data source(s) are you integrating with DataHub? (e.g. BigQuery)
<@U06TM7M40RK> Can you help with the questions in this thread?
We’re experimenting with a new AI assistant for questions in the <#CUMUWQU66> channel to help you track down helpful resources more quickly. Keep in mind that responses may not always be spot-on; feel free to ask it follow-up questions, and don’t forget to give its responses a thumbs-up or thumbs-down so other folks in the Community know if you’re still looking for help on the topic!
To address the issue of missing BigQuery table query stats, let’s first ensure that all necessary configurations are correctly set in your ingestion recipe. Based on the provided data sources, here are a few points to consider:
Include Usage Statistics: Although `include_usage_statistics` defaults to true, setting it explicitly in your configuration removes any doubt about which value is in effect.
Usage Configuration: Ensure that the usage configuration is correctly set. This includes parameters like `include_top_n_queries`, `include_operational_stats`, and `include_read_operational_stats`.
Profiling Configuration: Your profiling settings look mostly correct, but ensure that `profile_table_level_only` is set according to your needs. Additionally, check whether `enable_stateful_profiling` is affecting the stats.
Stateful Ingestion: Ensure that stateful ingestion is correctly configured and not causing any issues with the stats.
Here is an updated version of your ingestion recipe with explicit settings for usage statistics:
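As a rough sketch of the explicit settings described above, the recipe can be written as a Python dictionary with the same shape as the YAML file. The `source` wrapper and the values chosen for the three `usage` options are assumptions made for illustration; the remaining keys are copied from the recipe earlier in the thread.

```python
import json

# Illustrative recipe only: the "usage" values below are assumptions,
# not settings confirmed anywhere in this thread.
recipe = {
    "source": {
        "type": "bigquery",
        "config": {
            "include_table_lineage": True,
            "include_tables": True,
            "include_views": True,
            "convert_urns_to_lowercase": True,
            # Set explicitly, even though the documentation lists true as the default.
            "include_usage_statistics": True,
            "usage": {
                "include_top_n_queries": True,
                "include_operational_stats": True,
                "include_read_operational_stats": True,
            },
            "profiling": {
                "enabled": True,
                "profile_table_level_only": True,
            },
            "stateful_ingestion": {"enabled": True},
        },
    },
}

# Print the structure so it can be compared against the YAML recipe by eye.
print(json.dumps(recipe, indent=2))
```

Writing the options out this way changes nothing if the defaults already apply; it only makes the effective values visible when comparing runs.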
```
'num_usage_total_log_entries': {},
'num_usage_parsed_log_entries': {},
'usage_error_count': {},
'num_usage_resources_dropped': 0,
'num_usage_operations_dropped': 0,
'operation_dropped': [],
'usage_failed_extraction': [],
```
I ran with the above config and still do not see usage stats.
DataHub 0.13.3 and CLI 0.13.2.4
<@U06TM7M40RK>
Given that your ingestion configuration appears correct and you are still not seeing usage stats, there are a few additional steps you can take to troubleshoot the issue:
Steps to Troubleshoot:
Check Permissions: Ensure that the service account used for ingestion has the necessary permissions to access BigQuery usage logs. Missing permissions can prevent usage statistics from being captured.
Examine Logs: Review the ingestion logs for any errors or warnings related to usage statistics. The logs you provided show that no usage log entries were parsed (`'num_usage_parsed_log_entries': {}`), which suggests that the logs might not be accessible or are empty.
Enable Detailed Debug Logs: Increase the logging level to debug to capture more detailed information about the ingestion process. This can help identify where the issue might be occurring.
Check BigQuery Audit Logs: Verify that BigQuery audit logs are being generated and contain the expected usage information. If the audit logs are not being generated, DataHub cannot extract usage statistics.
Review Stateful Ingestion: Ensure that stateful ingestion is not causing issues. Sometimes, stateful ingestion settings can interfere with the proper capture of usage statistics.
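The "Examine Logs" step can be sketched in a few lines of Python. The snippet below takes the usage counters from the report excerpt posted earlier in the thread and lists what they imply. `diagnose_usage_report` is a hypothetical helper written for this illustration, not part of DataHub, and it assumes the report is available as a plain dictionary.

```python
from typing import Any, Dict, List

# Counters copied from the report excerpt earlier in the thread.
report: Dict[str, Any] = {
    "num_usage_total_log_entries": {},
    "num_usage_parsed_log_entries": {},
    "usage_error_count": {},
    "num_usage_resources_dropped": 0,
    "num_usage_operations_dropped": 0,
    "operation_dropped": [],
    "usage_failed_extraction": [],
}


def diagnose_usage_report(report: Dict[str, Any]) -> List[str]:
    """Return human-readable findings about the usage part of a report.

    Hypothetical helper for illustration; it is not a DataHub API.
    """
    findings: List[str] = []
    # An empty or missing counter means no audit log entries were seen.
    if not report.get("num_usage_total_log_entries"):
        findings.append("no usage log entries were read")
    if not report.get("num_usage_parsed_log_entries"):
        findings.append("no usage log entries were parsed")
    # Non-empty error fields point at extraction failures rather than empty logs.
    if report.get("usage_error_count") or report.get("usage_failed_extraction"):
        findings.append("usage extraction reported errors")
    return findings


for finding in diagnose_usage_report(report):
    print(finding)
```

For the posted report this flags that nothing was read or parsed and that no errors were recorded. That combination is consistent with the permission and audit-log checks above, although it does not prove either one. Rerunning with the CLI's debug flag (`datahub --debug ingest -c <recipe>`) should give more detail on which of the two applies.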