Friday, July 29, 2022

Snowflake Materialized Tables

 Snowflake's Materialized Tables will bring Real-time loading into Data Vault based Data Warehouses.

Equivalent of Matillion's Detect Changes in DBT

 I am a fan of Test Driven Development (TDD). Basically I define tests based on the requirements, and then write code till those tests are successful. Then I refactor and optimize the code, while making sure the Tests are still passing.

The same principles can be applied to Data Engineering. Start with defining a set of test, may be table that contains the PK and the metrics, values that should be associated with that PK. Then write your transformation job in SQL on the raw and compare the transformed data to the reference table. The idea, is that you have the reference table defined first, testing automated and then write the transformation SQL.

Test-driven development is a very powerful technique to ensure the data transformation is verified.

To achieve this, not only we need the correct reference tables, but also a way to automate the testing. In my previous job, I used Matillion's Detect Changes component to achieve this.  Matillion's Detect Changes component lets users scan two separate (but similar) tables, and insert a new column detailing if data has been inserted, deleted, changed, or even if the data is unchanged. This is super helpful in validating the output of transformation job that you are working on.

Unfortunately there doesn't seem to be a equivalent of Detect Changes in DBT. If you know of a DBT macro that does this, please share in the comments.

Snowflake Materialized Tables

 Snowflake's Materialized Tables will bring Real-time loading into Data Vault based Data Warehouses.