Troubleshooting dbt Models Not Combining with Snowflake Tables in DataHub Production Instance

Original Slack Thread

I’ve got an issue with some dbt models not combining with their respective Snowflake tables. The issue only affects some tables (e.g. it works for RAW_ANSWERS but not DIM_ANSWERS in the screenshot). It also only affects our production instance of DataHub; it works fine on our dev instance, which ingests the same data and runs the same server version (0.10.2).

I can’t see why certain tables wouldn’t combine and don’t have many ideas for debugging. Does anyone have ideas for how to correct this? I’ve tried rerunning the ingestions with dbt first and with Snowflake first but saw no change. I see no difference in naming or URNs between our prod and dev instances.

Hi! My first thought is it might be an issue with URN casing; can you share your recipes for both sources? Please omit any sensitive info 🙂

Are you able to upgrade to the newest version, 0.10.5? We had a bug with the creation of siblings (combining entities) that was fixed somewhat recently.

Have you tried using the same casing for both the dbt and Snowflake datasets? URNs are case-sensitive.

I’d already checked the URNs and they match

```
urn:li:dataset:(urn:li:dataPlatform:snowflake,mydb.myschema.dim_answers,PROD)
```
This is especially confusing as the URNs and, from what I can tell, all the other configs seem to be the same on our dev instance of DataHub, but the dim_answers table and model do combine there.
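
Because URNs are case-sensitive, a check by eye can miss a casing difference between the dbt entity and its Snowflake counterpart. The following is a minimal sketch of a mechanical comparison using only the Python standard library. The helper names `parse_dataset_urn` and `compare_urns` are hypothetical, and the dbt URN in the usage example is an assumed illustration, not one taken from the thread.

```python
import re

# Matches urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)
_DATASET_URN = re.compile(
    r"^urn:li:dataset:\(urn:li:dataPlatform:([^,]+),(.+),([^,)]+)\)$"
)


def parse_dataset_urn(urn):
    """Split a dataset URN into (platform, name, env)."""
    match = _DATASET_URN.match(urn.strip())
    if match is None:
        raise ValueError(f"not a dataset URN: {urn!r}")
    return match.group(1), match.group(2), match.group(3)


def compare_urns(dbt_urn, warehouse_urn):
    """Classify how the name and env of two dataset URNs relate.

    The platform part is ignored, since it is expected to differ
    (dbt vs. snowflake).
    """
    _, dbt_name, dbt_env = parse_dataset_urn(dbt_urn)
    _, wh_name, wh_env = parse_dataset_urn(warehouse_urn)
    if (dbt_name, dbt_env) == (wh_name, wh_env):
        return "exact match"
    if (dbt_name.lower(), dbt_env.lower()) == (wh_name.lower(), wh_env.lower()):
        return "case mismatch"
    return "different"


# Hypothetical dbt URN, assumed for illustration only.
dbt_urn = "urn:li:dataset:(urn:li:dataPlatform:dbt,mydb.myschema.dim_answers,PROD)"
snowflake_urn = "urn:li:dataset:(urn:li:dataPlatform:snowflake,mydb.myschema.dim_answers,PROD)"
print(compare_urns(dbt_urn, snowflake_urn))
```

Running this over the URN pairs copied from both the prod and dev instances would show whether any pair that fails to combine differs only in casing.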

Unfortunately I can’t easily upgrade the server version right now. The recipes are below, but I think the only thing changing between the runs is the REST endpoint.

dbt cloud
```
source:
  type: "dbt-cloud"
  config:
    token: x
    account_id: x
    project_id: x
    job_id: x
    metadata_endpoint: https://metadata.emea.dbt.com/graphql
    target_platform: snowflake
    stateful_ingestion:
      enabled: false
    env: PROD
```
snowflake
```
pipeline_name: snowflake__policy_domain

source:
  type: snowflake
  config:
    account_id: x
    warehouse: x
    username: x
    password: x
    role: x
    database_pattern:
      allow:
        - ^...$
        - ...
    schema_pattern:
      allow:
        - ...
    profiling:
      enabled: false
      turn_off_expensive_profiling_metrics: true
    ignore_start_time_lineage: true
    stateful_ingestion:
      enabled: false
      remove_stale_metadata: false
    env: PROD

transformers:
    - type: pattern_add_dataset_terms
      config:
        semantics: PATCH
        term_pattern:
          rules:
            ...
```
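
The thread states that the prod and dev runs should differ only in the REST endpoint. One way to confirm that, rather than compare recipes by eye, is to flatten each recipe into dotted paths and diff them. The sketch below assumes the recipes have already been loaded into Python dicts (for example with a YAML parser). The function names and the two server URLs are hypothetical placeholders.

```python
def flatten(node, prefix=""):
    """Flatten nested dicts and lists into {dotted.path: leaf_value}."""
    items = {}
    if isinstance(node, dict):
        for key, value in node.items():
            items.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            items.update(flatten(value, f"{prefix}{index}."))
    else:
        items[prefix.rstrip(".")] = node
    return items


def diff_recipes(recipe_a, recipe_b):
    """Return {path: (value_in_a, value_in_b)} for every differing leaf."""
    flat_a, flat_b = flatten(recipe_a), flatten(recipe_b)
    return {
        path: (flat_a.get(path), flat_b.get(path))
        for path in sorted(set(flat_a) | set(flat_b))
        if flat_a.get(path) != flat_b.get(path)
    }


# Hypothetical recipes: only the sink server differs between the two.
dev_recipe = {
    "source": {"type": "dbt-cloud", "config": {"target_platform": "snowflake", "env": "PROD"}},
    "sink": {"type": "datahub-rest", "config": {"server": "http://datahub-dev.example:8080"}},
}
prod_recipe = {
    "source": {"type": "dbt-cloud", "config": {"target_platform": "snowflake", "env": "PROD"}},
    "sink": {"type": "datahub-rest", "config": {"server": "http://datahub-prod.example:8080"}},
}

for path, (dev_value, prod_value) in diff_recipes(dev_recipe, prod_recipe).items():
    print(f"{path}: dev={dev_value!r} prod={prod_value!r}")
```

If anything other than the sink server shows up in the diff, that setting is a candidate explanation for the difference in behaviour between the two instances.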