Hello everyone, nice to be here. I am trying to understand profiling better. I know the Stats tab shows this information, but I don’t understand the difference between the Historical and Latest sub-tabs under the Stats tab. If a Snowflake table has 1 billion records, will the full-table stats be captured under Historical and the CDC stats under Latest?
Hey there! 
<@U06TM7M40RK> Can you help with the questions in this thread?
Please continue to tag RunLLM (e.g <@U06TM7M40RK>) to have it continue generating responses!
The “historical” and “latest” sub-tabs under the stats tab in DataHub appear to give two views of the same data profiling statistics. I’m not entirely certain of every detail, but the “latest” sub-tab most likely shows the statistics from the most recent profiling run, while the “historical” sub-tab gives access to the results of earlier runs.
Based on the <https://github.com/datahub-project/datahub/blob/master/datahub-web-react/src/app/entity/shared/tabs/Dataset/Stats/StatsTab.tsx|StatsTab.tsx file>, the “latest” stats are rendered using the most recent dataset profile, which includes the latest row count, column count, and other statistics. The “historical” stats, on the other hand, are likely to show data from previous profiling runs, allowing you to see how the dataset’s statistics have changed over time.
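To make the distinction concrete, here is a minimal sketch of the idea in Python. It is not DataHub code: `ProfileRun`, `latest_profile`, and `historical_profiles` are hypothetical names, and the sample numbers are made up for illustration. The only assumption is that every profiling run produces one timestamped record, and both views read from the same set of records.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProfileRun:
    """One profiling run (hypothetical shape, for illustration only)."""
    timestamp_millis: int
    row_count: int
    column_count: int


def latest_profile(runs: List[ProfileRun]) -> Optional[ProfileRun]:
    """The 'latest' view: the single most recent run, or None if never profiled."""
    return max(runs, key=lambda r: r.timestamp_millis) if runs else None


def historical_profiles(runs: List[ProfileRun]) -> List[ProfileRun]:
    """The 'historical' view: every run, oldest first, to show change over time."""
    return sorted(runs, key=lambda r: r.timestamp_millis)


# Made-up sample data: three daily runs, stored out of order.
runs = [
    ProfileRun(1_700_000_000_000, 990_000_000, 42),
    ProfileRun(1_700_086_400_000, 1_000_000_000, 42),
    ProfileRun(1_699_913_600_000, 985_000_000, 41),
]

newest = latest_profile(runs)
trend = [r.row_count for r in historical_profiles(runs)]
```

Under this reading, the two sub-tabs differ only in which runs they show, not in which rows of the table were profiled.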
Regarding your question about a Snowflake table with 1 billion records: the “historical” sub-tab would most likely hold the results of past profiling runs, while “latest” would show the most recent run, so the split appears to be by run time rather than by full load versus CDC. However, it’s important to note that if the number of rows exceeds certain limits (like 5 million rows), profiling might be limited or adjusted, as mentioned in a <Slack conversation>.
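As a rough illustration of the row-limit point, the sketch below writes the profiling part of an ingestion recipe as a Python dict. The key names (`enabled`, `profile_table_level_only`, `profile_table_row_limit`) are given from memory of the DataHub profiling options and should be treated as assumptions to verify against the Snowflake source documentation for your version. Connection settings are omitted, and `exceeds_row_limit` is a hypothetical helper, not part of DataHub.

```python
# Assumed limit, taken from the "5 million rows" figure mentioned above.
ROW_LIMIT = 5_000_000

# Profiling section of a recipe, expressed as a dict (connection details omitted).
# Key names are assumptions; check them against the docs before use.
recipe = {
    "source": {
        "type": "snowflake",
        "config": {
            "profiling": {
                "enabled": True,
                # Table-level stats only (row/column counts), no column-level scans.
                "profile_table_level_only": True,
                # Tables above this row count may be skipped or handled differently.
                "profile_table_row_limit": ROW_LIMIT,
            },
        },
    },
}


def exceeds_row_limit(row_count: int, limit: int = ROW_LIMIT) -> bool:
    """Hypothetical helper: would a table of this size be over the limit?"""
    return row_count > limit


# A 1-billion-row table is well above the assumed limit.
big_table_over_limit = exceeds_row_limit(1_000_000_000)
```

If a 1-billion-row table is over the limit, it is worth checking the ingestion logs to confirm whether it was profiled at all before interpreting either sub-tab.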
Would you like more detailed steps on how to adjust your ingestion recipe or any other specific aspect?
Sources:
- <https://github.com/datahub-project/datahub/blob/master/datahub-web-react/src/app/entity/shared/tabs/Dataset/Stats/StatsTab.tsx|StatsTab.tsx file>
Hi! Did this interaction resolve your issue, or do you still require assistance?
Please tag @RunLLM if you still need further assistance!
Or tag @Datahub Community Support if you require a person to take a look.
Hope this helps! ><