Corporate wide provisioning of data is the responsibility of our Data engineering and Analytics (DEA) team.
Source data is imported from different interfaces and prepared for analysis. From time to time, we get structural source data changes, i.e. additional fields provided or data architectural changes. To react as fast as possible and to support our agile development methods, we decided to implement data vault modelling techniques. Within data vault, there are different table types that are loaded with data; however, the logic for loading the data remained the same. Therefore, an automated ETL solution was recommended for our solution.
As a tool for data modelling, we utilised SAP Power Designer. This is because all our tables and views fitted with our data warehouse. A distinction is drawn between data models for staging, archive, data vault (core vault / rule vault) and data mart.
In the past, we used SSIS templates for the creation of different data vault table types. This was a very time-consuming process, for the customisation of individual SSIS packages for specific ETL jobs. It to us a lot of time to adapt all SSIS packages for each specific ETL task, thus resulting in the creation of non-standard packages. Due to the complexity of individually tailored packages, this increased the likelihood of bugs in our ETL packages.
In order to resolve the potential issues, we decided to automate the creation of all new SSIS packages. The following tasks were required in order to achieve our objective:
1. Dataflow mapping between source and target tables in SAP Power Designer
2. Export function for all data model descriptive information (meta data) within SAP Power Designer including the dataflow mappings
3. A SSIS package generator that make use of metadata to create customised ETL processes
1. Creation of dataflow mapping between source and target tables
In SAP Power Designer, we utilise a staging and an archiving concept for each source data interface. Additionally, we then use a model for the complete core vault (same with our rule vault and data mart), thus all our models are physical data models. Within the integrated mapping editor of SAP Power Designer, it is possible to create mappings between tables of different data models. One target table can have more than one mapping to different source tables. One source table can also have different target tables (which is common because of integrity loads).
One mapping equates to one data transfer. A meaningful and unique mapping name has to be defined because this name is used later for SSIS package names.
2. Exporting of Meta data
We used SAP Power Designer Extension (TeDaMoh ExMeX Framework) to extract all metadata from each physical data model. Files consist of delete and insert statements for meta data tables. Meta data tables include all entities, entity attributes and relationships and mappings between entities.
3. SSIS Package Generation
Data Access Programming Interface views (DAPI) compose all the necessary meta data for generating one specific target table type. DAPI views exists for hubs, links, standard satellites, link satellites, bitemporal satellites, end-dating satellites and insert-only satellites as an example.
SSIS packages are created using “Business Intelligence Markup Language” (BIML) scripts by using DAPI views. BIML provides all possible SSIS objects required, such as connections, tasks, parameters and variables. We currently use free BimlExpress add-in for Visual Studio. BimlExpress enables syntax highlighting and shows a preview of generated structures. For the DEA team, it significantly simplifies development.
We use one BIML script for each specific target table type. We decided to group the resulting ETL packages into separate solutions for each source data interface. The reason why we decided to use this method was due to the frequency of the data, which was updated in our data vault tables.
It is a big advantage for us to be able to make central changes to all our existing and future ETL processes. That means that error-prone manual work on SSIS packages is no longer necessary.
Join our newsletter
Find articles like this in your inbox every week