Support for Column-Level Lineage in DataHub Redshift Ingestor for Serverless Clusters

Original Slack Thread

<@U06TM7M40RK> I’m using the DataHub Redshift ingestor (version 0.14.0) for both provisioned and serverless clusters. Everything works fine for provisioned clusters, but lineage extraction, particularly column-level lineage, doesn’t seem to be working for serverless. Does DataHub currently support column-level lineage ingestion for Redshift Serverless?

DataHub does support Redshift Serverless, but lineage extraction, and column-level lineage in particular, has some limitations there. According to the <https://datahubproject.io/docs/0.14.0/generated/ingestion/sources/redshift#config-details|DataHub Redshift documentation>, column-level lineage is supported but requires specific configuration, and it may not work seamlessly with Redshift Serverless because access to certain system tables is restricted.

The <https://github.com/datahub-project/datahub/pull/9998|Redshift Serverless support> was introduced to accommodate the limitations of Amazon Redshift Serverless, such as restricted access to STL, STV, SVCS, and SVL system tables. This support includes a configuration flag is_serverless that adjusts the ingestion pipeline to work with these limitations.
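To make the is_serverless flag concrete, here is a minimal sketch of an ingestion recipe built as a Python dict. Only is_serverless is named in this thread; the other field names (host_port, database, username, password, include_table_lineage, extract_column_level_lineage) are my reading of the Redshift source config and should be checked against the config details page for your version. The helper name, the environment variable names, the host, and the sink URL are hypothetical placeholders.

```python
import json
import os
from typing import Any, Dict


def build_redshift_serverless_recipe(host_port: str, database: str) -> Dict[str, Any]:
    """Build a DataHub recipe dict for a Redshift Serverless workgroup.

    Field names other than is_serverless are assumptions; verify them
    against the Redshift source documentation before use.
    """
    return {
        "source": {
            "type": "redshift",
            "config": {
                "host_port": host_port,
                "database": database,
                # Hypothetical env var names; fall back to placeholders.
                "username": os.environ.get("REDSHIFT_USER", "datahub_user"),
                "password": os.environ.get("REDSHIFT_PASSWORD", "change-me"),
                # The flag discussed in this thread: adapts the pipeline to
                # the restricted system tables on Serverless.
                "is_serverless": True,
                # Assumed lineage switches; column-level lineage may still
                # come back empty on Serverless.
                "include_table_lineage": True,
                "extract_column_level_lineage": True,
            },
        },
        "sink": {
            "type": "datahub-rest",
            # Placeholder GMS endpoint.
            "config": {"server": "http://localhost:8080"},
        },
    }


if __name__ == "__main__":
    recipe = build_redshift_serverless_recipe(
        "example-workgroup.example.us-east-1.redshift-serverless.amazonaws.com:5439",
        "dev",
    )
    print(json.dumps(recipe, indent=2))
```

With acryl-datahub and its Redshift plugin installed, a dict like this can be handed to Pipeline.create(recipe) from datahub.ingestion.run.pipeline and executed with run(). Setting the flag only switches the ingestor to the Serverless code path; it does not by itself guarantee that column-level lineage will appear.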

However, as per the <Slack discussion>, column-level lineage is not fully supported for Redshift Serverless, and only table-level lineage is currently available. This is due to the absence of certain information schema tables in Redshift Serverless that are necessary for extracting detailed lineage information.

Would you like more detailed information on how to configure the Redshift ingestor for serverless environments or any other specific aspect?

Have you enabled serverless mode? [attachment: CleanShot 2025-03-25 at 19.01.24@2x.png]

Hi Tamás. Yes, I did set is_serverless=true
