Using the data vault modeling method allows us to implement changes quickly, without changing the existing database structure. This fits our agile mindset here at RatePAY, where changes are welcome and part of our daily business, which is why Data Vault is our preferred modeling method.
What are the key facts about data vault?
Data vault modeling is a database modeling method that is designed to provide long-term historical storage of data coming in from multiple operational systems. It is also a method of looking at historical data that deals with issues such as auditing, loading speed and resilience to change.
A data vault stores "a single version of the facts". In contrast to other data warehouse methods, which store "a single version of the truth", data that does not conform to the definitions is not removed or "cleansed" in the data vault.
The data vault structure is well suited to storing and historising large amounts of data while still allowing quick changes, which should be a key property of any data warehouse. The structure is not optimized for reporting, however. For reporting, RatePAY uses multidimensional databases (cubes) and relational reports built with Tableau. The descriptive information for both kinds of reporting is based on the data stored in the data vault environment.
The data vault structure:
- reduces complexity
- allows faster loading processes due to parallelism
- is scalable for future growth
- is optimized for automated loading processes
- simplifies traceability and historisation
Compared to other modeling methods, Data Vault is relatively new. It was originally conceived by Dan Linstedt in the 1990s and was released in 2000 as a public domain modeling method. Data Vault 2.0 arrived in 2013 and adds seamless integration of Big Data, NoSQL, and unstructured and semi-structured data.
Data vault aims to solve the problem of dealing with change in the business environment by separating the business keys (which do not mutate as often) and the associations between those business keys from the descriptive attributes of those keys.
Every row in a data vault is accompanied by record source and load date attributes, enabling an auditor to trace values back to the source.
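As a minimal sketch of the two ideas above, the following Python function splits one incoming source record into a hub row (business key only) and a satellite row (descriptive attributes), and stamps both with a record source and a load date. The function name `split_record`, the dictionary layout and the sample field names are assumptions made for illustration, not part of any Data Vault tooling.

```python
from datetime import datetime, timezone


def split_record(record, key_field, record_source, load_date=None):
    """Split a source record into a hub row and a satellite row.

    Both rows carry the same audit attributes, so every value can be
    traced back to the system and load run it came from.
    """
    load_date = load_date or datetime.now(timezone.utc)
    audit = {"record_source": record_source, "load_date": load_date}
    business_key = record[key_field]

    # The hub row holds only the (rarely changing) business key.
    hub_row = {"business_key": business_key, **audit}

    # The satellite row holds everything descriptive about that key.
    descriptive = {k: v for k, v in record.items() if k != key_field}
    sat_row = {"business_key": business_key, **descriptive, **audit}
    return hub_row, sat_row
```

Calling `split_record({"shop_number": "SHOP-001", "name": "Example Shop"}, "shop_number", "shop_system")` would return a hub row containing only the key and the audit attributes, and a satellite row that additionally contains `name`.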
Data Vault's key elements are hub, satellite and link tables. The "hub" tables contain the business key, which is the driver for the hub. "Link" tables are used for associations or translations between different business keys (hub tables). The "satellite" tables are always linked to hub or link tables and contain the descriptive attributes.
Take a shop and its customers as an example. The descriptive attributes of the shop are stored in the shop satellite table. The shop hub table contains an ID (surrogate key), which is used to connect the shop via the link table to the customer hub table. The descriptive information about the customer is stored in the customer satellite table.
If, for example, new attributes need to be added, a new satellite table is introduced. The hub structure and the existing satellite and link tables are not affected by the change. This makes it easier and faster to react to changes, and historisation only needs to be implemented in the satellite tables.
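The shop and customer example can be sketched as a small schema. The snippet below uses Python's built-in `sqlite3` module with an in-memory database purely to keep the example runnable. All table and column names (`hub_shop`, `sat_customer`, `shop_number` and so on) and the sample rows are hypothetical, and a production data vault would likely differ in details such as key hashing.

```python
import sqlite3

# Hypothetical names; every table carries load_date and record_source.
SCHEMA = """
CREATE TABLE hub_shop (
    shop_id       INTEGER PRIMARY KEY,   -- surrogate key
    shop_number   TEXT NOT NULL UNIQUE,  -- business key
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE hub_customer (
    customer_id     INTEGER PRIMARY KEY,   -- surrogate key
    customer_number TEXT NOT NULL UNIQUE,  -- business key
    load_date       TEXT NOT NULL,
    record_source   TEXT NOT NULL
);
CREATE TABLE link_shop_customer (
    link_id       INTEGER PRIMARY KEY,
    shop_id       INTEGER NOT NULL REFERENCES hub_shop(shop_id),
    customer_id   INTEGER NOT NULL REFERENCES hub_customer(customer_id),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL,
    UNIQUE (shop_id, customer_id)
);
CREATE TABLE sat_shop (
    shop_id       INTEGER NOT NULL REFERENCES hub_shop(shop_id),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL,
    name          TEXT,
    PRIMARY KEY (shop_id, load_date)
);
CREATE TABLE sat_customer (
    customer_id   INTEGER NOT NULL REFERENCES hub_customer(customer_id),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL,
    name          TEXT,
    PRIMARY KEY (customer_id, load_date)
);
"""


def build_vault():
    """Create the hub, link and satellite tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def load_example(conn):
    """Insert one shop, one customer and the link between them."""
    audit = ("2020-01-01T00:00:00", "shop_system")
    conn.execute("INSERT INTO hub_shop VALUES (1, 'SHOP-001', ?, ?)", audit)
    conn.execute("INSERT INTO hub_customer VALUES (10, 'CUST-001', ?, ?)", audit)
    conn.execute("INSERT INTO link_shop_customer VALUES (100, 1, 10, ?, ?)", audit)
    conn.execute("INSERT INTO sat_shop VALUES (1, ?, ?, 'Example Shop')", audit)
    conn.execute("INSERT INTO sat_customer VALUES (10, ?, ?, 'Alice Example')", audit)


def customers_of_shop(conn, shop_number):
    """Walk hub -> link -> satellite to list customer names for a shop."""
    rows = conn.execute(
        """
        SELECT sc.name
        FROM hub_shop hs
        JOIN link_shop_customer l ON l.shop_id = hs.shop_id
        JOIN sat_customer sc ON sc.customer_id = l.customer_id
        WHERE hs.shop_number = ?
        ORDER BY sc.name
        """,
        (shop_number,),
    ).fetchall()
    return [r[0] for r in rows]
```

The query in `customers_of_shop` illustrates why the structure is not optimized for reporting: even a simple question requires joining through the hub, the link and the satellite.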
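A rough illustration of this point, again using Python's `sqlite3` module: new contact attributes for a shop are added as a new satellite table, while the existing hub definition stays untouched, and history is kept by inserting a new satellite row per change instead of updating in place. The table `sat_shop_contact`, the `email` column and all helper function names are assumptions invented for this sketch.

```python
import sqlite3


def create_base(conn):
    """Existing model: one hub and one satellite (hypothetical names)."""
    conn.executescript("""
    CREATE TABLE hub_shop (
        shop_id INTEGER PRIMARY KEY, shop_number TEXT NOT NULL UNIQUE,
        load_date TEXT NOT NULL, record_source TEXT NOT NULL);
    CREATE TABLE sat_shop (
        shop_id INTEGER NOT NULL REFERENCES hub_shop(shop_id),
        load_date TEXT NOT NULL, record_source TEXT NOT NULL, name TEXT,
        PRIMARY KEY (shop_id, load_date));
    INSERT INTO hub_shop VALUES (1, 'SHOP-001', '2020-01-01', 'shop_system');
    INSERT INTO sat_shop VALUES (1, '2020-01-01', 'shop_system', 'Example Shop');
    """)


def add_contact_satellite(conn):
    """The change: a new satellite, with no ALTER on existing tables."""
    conn.execute("""
    CREATE TABLE sat_shop_contact (
        shop_id INTEGER NOT NULL REFERENCES hub_shop(shop_id),
        load_date TEXT NOT NULL, record_source TEXT NOT NULL, email TEXT,
        PRIMARY KEY (shop_id, load_date))""")


def record_contact(conn, shop_id, load_date, email, record_source="crm"):
    """Historise by inserting a new row; earlier rows are never updated."""
    conn.execute(
        "INSERT INTO sat_shop_contact VALUES (?, ?, ?, ?)",
        (shop_id, load_date, record_source, email),
    )


def current_contact(conn, shop_id):
    """Return the most recently loaded email, or None if there is none."""
    row = conn.execute(
        "SELECT email FROM sat_shop_contact WHERE shop_id = ? "
        "ORDER BY load_date DESC LIMIT 1",
        (shop_id,),
    ).fetchone()
    return row[0] if row else None


def table_sql(conn, name):
    """Return the stored CREATE statement of a table, to compare definitions."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()[0]
```

Comparing `table_sql(conn, "hub_shop")` before and after `add_contact_satellite(conn)` shows that the hub definition is identical, and calling `record_contact` twice for the same shop leaves both versions in the satellite, with `current_contact` returning the later one. The load dates are stored as ISO 8601 strings here so that sorting them as text matches chronological order.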