Using Regex Patterns for Allowing and Denying Snowflake Tables and Schemas

Original Slack Thread

<@U06TM7M40RK> examples of regex patterns for allow & deny List of tables & schemas for Snowflake

To specify which tables and schemas to include or exclude during ingestion from Snowflake, you can use regex patterns in the allow and deny lists within your ingestion recipe. Here are some examples:

  1. Schema Pattern:

    • Allow Specific Schema: To include only a specific schema, use the allow pattern with the exact schema name.
      schema_pattern:
        allow:
          - '^my_schema$'  # Only include 'my_schema'
      
    • Deny Specific Schemas: To exclude schemas that end with _DEV, _TEST, or _DEMO, use the deny pattern.
      schema_pattern:
        deny:
          - '.*_DEV$'
          - '.*_TEST$'
          - '.*_DEMO$'
      
  2. Table Pattern:

    • Allow Specific Tables: To include only specific tables, use the allow pattern with the exact table names.
      table_pattern:
        allow:
          - '^table1$'
          - '^table2$'
      
    • Deny Specific Tables: To exclude tables that start with temp_, use the deny pattern.
      table_pattern:
        deny:
          - '^temp_.*'
      

These patterns are used in the ingestion recipe to filter which schemas and tables are processed. The allow patterns specify what to include, while the deny patterns specify what to exclude. Deny patterns take precedence over allow patterns.

For more detailed information on configuring these patterns, you can refer to the <https://datahubproject.io/docs/0.13.1/generated/ingestion/sources/snowflake#config-details|DataHub documentation on Snowflake ingestion>.

Would you like more detailed steps or additional examples?

Sources:

Hi <@U082MQQBL21>, how did my answer work out for you? Need any more assistance? You can also tag <@U07FV3UHE4D> to let your fellow community members know you’re still looking for guidance.

<@U06TM7M40RK> validate the include & exclude patterns

            allow:
                - ^DATA_CATALOGUE_TEMP$
                - ^data_catalogue_temp$
            deny:
                - '^PRODUCTION.*'
                - '^scratch.*'
                - '^snowflake_admin.*'
                - '^staging.*'
        schema_pattern:
            allow:
                - ^TEST$
            deny:
                - '^topic.*'
                - '^marketing.*'```

To validate the include and exclude patterns for your Snowflake ingestion configuration, let’s break down the regex patterns you’ve provided:

  1. Database Pattern:

    • Allow:
      • ^DATA_CATALOGUE_TEMP$: This pattern will include only the database named exactly DATA_CATALOGUE_TEMP.
      • ^data_catalogue_temp$: This pattern will include only the database named exactly data_catalogue_temp. Note that this is case-sensitive, so it will not match DATA_CATALOGUE_TEMP.
    • Deny:
      • ^PRODUCTION.*: This pattern will exclude any database whose name starts with PRODUCTION.
      • ^scratch.*: This pattern will exclude any database whose name starts with scratch.
      • ^snowflake_admin.*: This pattern will exclude any database whose name starts with snowflake_admin.
      • ^staging.*: This pattern will exclude any database whose name starts with staging.
  2. Schema Pattern:

    • Allow:
      • ^TEST$: This pattern will include only the schema named exactly TEST.
    • Deny:
      • ^topic.*: This pattern will exclude any schema whose name starts with topic.
      • ^marketing.*: This pattern will exclude any schema whose name starts with marketing.

Validation:

  • Ensure that the patterns are correctly specified for your use case. The ^ and $ anchors are used to match the start and end of the string, respectively, ensuring exact matches where specified.
  • Deny patterns take precedence over allow patterns, so if a database or schema matches both an allow and a deny pattern, it will be excluded.

For more information on configuring these patterns, you can refer to the <https://datahubproject.io/docs/0.13.1/generated/ingestion/sources/snowflake#config-details|DataHub documentation on Snowflake ingestion>.

Would you like more detailed steps or additional examples?

Sources:

Hi :slight_smile: ! Did this interaction resolve your issue or do you still require assistance ?

Please tag @ RunLLM if you still need further assistance !

Or tag @Datahub Community Support if you require a person to take a look :slight_smile:

Hope this helps ! ><