Connecting Products to Hierarchy in 3NF model

Publish date: 2019-08-27
Authors: Grzegorz Swierniak
Tags: dwh, design, 3nf, hierarchy, integrated layer, access layer, surrogate key

This article is the second part of the series of how to cope with hierarchies in 3NF models (Hierarchies in 3NF model). If you have not read the previous one yet, I strongly encourage you to read it because from time to time in this article I will be referring to the previous article in the series.

Today I will show you how to connect underlying products with their hierarchy. The hierarchy as a standalone set of data cannot serve many purposes if we do not connect it to the underlying object it relates to, for instance, the hierarchy of region would be meaningless if we would not combine countries (if that is the granularity of data in our case) with each continent or geographical region (EMEA, APAC, etc.). So the hierarchy can be fully functional if it combines all its nodes with the underlying objects it relates to.

After reading this article you will learn:

After reading this article you will not learn:

connecting_products_to_hierarchy

Source file

In the source file, we have the list of products with its attributes, such are the name of a product, model year, list price, node_id (that will act as a bridge to our hierarchy) and brand_id.

In a source system (ERP) the table is named as ‘products’ and for education purposes, we will keep the same in our stage layer (demo_sl). Every product has a node_id which represents a node in the product hierarchy table, also available in the source (the table was presented in details in the previous article).

Create your model

It is the time to extend our current ER model with the 3 additional entities - we need a key table for storing the underlying products, an entity to store product attributes (main table) and another relationship table to define the relationship between each product and the lowest hierarchy item in the given node.

prod_hier_rltd

Feel free to copy the model, just clone my project: Demo_il model in genmymodel.com

Let me quickly give you an overview of the tables that we have:

Important note In order to keep the article clear enough, some scripts were shortened, normally the pattern is presented in the first 2 and the last UNION and the rest of UNIONs were cut off. But all functional scripts can be found in the GitHub repository, so feel free to download them and test if on your own. Link to the the dwhacademy/demoproject repository can be found at the end of the article. repo

Design the transformation logic

The transformation logic

The transformation logic is wrapped in a view that will be used to update key table (k004_prod_key) and the main target table (m004_prod), therefore in the view, we should combine all columns from both tables. Fortunately, the logic to uniquely define each product is simple in our case because the product_id in the source is unique and it is the perfect candidate to act as a natural key. Here are a few key facts about the logic:

Load your model

Generate surrogate keys

It is time to start loading our tables in the integrated layer. At first, we have to make sure, all our products have their own surrogate keys, in order to do that - we update the key table at the beginning. In the script below you can find the common formula for surrogate key generation:

Load target table

Having all surrogate keys in place, we can easily insert data to our main target table - m004_prod.

Results:

Load relationship table

In order to finally connect products with the lowest levels of hierarchy items, we need to define logic to load m004_hier_item_prod_rltd. It requires combining products and products_tree_unq tables from the source and then joining our key tables from our Integrated Layer to get the surrogate keys for each of the object (hier_item_id and prod_id).

Views on the top of the Integrated Layer tables

Recursive view as a universal object to store all hierarchies

Once the tables in our integrated model had been loaded, we can modify the recursive view we created in the previous article. Just to remind, the recursive view aims to prepare data for denormalization in the Access Layer through combining each underlying object (product or hierarchy item) with its all upstream levels of hierarchy. The recursive view contains two unions: in the first union, we need to define the lowest level of the hierarchy (product level in our case) together with its direct parent hierarchy item. In the second union, we build all the upstream levels up to the top.

Hierarchy - final obejct for Access Layer

The second and the last step in denormalization exercise is to pivot the data from the recursive view. I named the new view as dim_product_tree_hier and placed it in the Access layer (demo_al) because it is the proper layer to create such denormalize tables(views). The modification I applied here is that I added prod_id and prod_nm to the pivoting PK, so now we can see all products and what hierarchy node they fall under.

Summary

I hope that you found this article interesting, in the next article I will try to show you how to add the second hierarchy for products, so each product can be seen in the context of 2 dimensions (we will use Grafana tool to visualize this).

All the scripts presented in this article can be found also on GitHub, feel free to use them and try to improve!

comments powered by Disqus