15/08/2017 Edgar Kautzner

What Is Data Warehouse Automation And How Does It Deliver Value?

This is the second video in our Data Warehouse Automation series, Introduction to Data Warehouse Automation. This video is ideal for people from a business background who are new to data warehouse automation and want to find out more. We’ll be covering the following topics:

If you haven’t already seen our first video in the series, Introduction to Data Warehousing, you may choose to watch this video first.

We hope you find the video below informative.

Notes from the video are provided below.  The slides can also be found on SlideShare.

Our first video, Introduction to Data Warehousing covered some of the basics of data warehouses and how they deliver business value.

In this video, we’re going to talk more about the development of data warehouses, and in particular, give you an introduction to data warehouse automation, and how it delivers significant value to organisations. Let’s first cover what is involved in developing a data warehouse.

Overview of traditional data warehouse development

Diagram showing the ETL / ELT process of a data warehouse

Traditional data warehouse organisations often have data sitting in various locations. This includes operational systems (where the data isn’t structured for reporting), in Excel or CSV, or in the cloud in some form. We need to get the data from these sources into our data warehouse where it is structured for reporting.

This is done by:

  • Writing code to extract the data from the sources and load it into a temporary location.
  • Transforming the data into a format suitable for reporting. This process also includes processes to improve the quality control (called cleansing), and data integration.
  • Making the data available in the data warehouse where it is structured in an optimum way for reporting purposes

This whole process is known in the industry as “ELT” or “ETL”. Traditionally, all of the code for the ELT/ETL process was written from scratch.

The problems with traditional data warehouse development methods

There are four main problems with traditional data warehouse development.

  • It’s time consuming as it requires significant manual coding – approximately 70-75% of the total project time. As a result, projects take far too long and users have to wait to get the information they need.
  • It’s error-prone. Due to the amount of manual coding required, it is easy to make mistakes and difficult to find and fix errors.
  • It’s costly due to the development resources and the effort required. Coding accountes for 70-75% of project costs. As a result, projects often result in failure – due to budget overruns, slow delivery, and low perceived value with stakeholders.
  • Users get frustrated as it takes way too long for them to get the information they need.

What’s the solution to these problems?

You might say that data warehouses are bad – let’s stop using data warehouses. However this isn’t the right solution because data warehouses provide a tremendous amount of business value. The problem was that we were using the wrong approach. There was a mismatch between traditional data warehouse development and the evolutionary nature of analytics development.

There were three main problems that are specific to analytics development.

  • Users don’t know whether they want something until they’ve actually seen it. We needed a different approach where we could release functionality rapidly in short, iterative cycles and make changes based upon user feedback. The reason is that if we build most functionality in one go, we would run a high risk that the output would have little value to the business.
  • Users requirements change once they start using the analytics system. If you don’t respond to those changes, people will stop using it.
  • Manual coding takes far too long and costs too much money.

The right solution to these problems is to use a combination of:

  • Evolutionary analytics development methods, and
  • Data warehouse automation

Evolutionary Development

Evolutionary Analytics Development: An iterative cycle of prioritisation, prototyping and feedback.

Evolutionary development involves a constant cycle of requirements gathering, rapid prototyping and review. It aims to deliver functionality incrementally in short release cycles. The evolutionary approach allows development of analytics systems that deliver value in a shorter time frame and have a tighter fit with organisational decision-making requirements.

The problem is that the evolutionary approach is extremely difficult with traditional data warehouse development methods – because manual coding takes too long. So we need to use data warehouse automation to solve these problems.

How Data Warehouse Automation delivers value

Rapid data warehouse delivery and agile changes
By automating much of the design, build and maintenance of your data warehouse. You can pull data from virtually any data source or cloud-based platform and deploy new functionality rapidly – in days and weeks, not months and years.

Dramatically reduced development costs
As all development is metadata-driven, minimal coding is required. As such, fewer ELT developers are needed and a high-quality solution can be delivered in a very short time.

Consistent high quality code and automated documentation
Data warehouse automation builds the entire extract, load and transform process by combining metadata that describes your data warehouse with industry best practice design patterns. This results in high-quality, consistent and error-free code with auto-generated documentation.

Choice of data warehouse platform
One of the problems with traditional data warehouse development was that after you’ve spent time writing the code to extract data from your source systems to your destination data warehouse, you were more or less locked into your data warehouse platform.

For example, if you were using SQL server to house your data warehouse, and you wanted to move to a cloud-based platform such as Amazon Redshift, it would take you months or perhaps years to rewrite all the code from scratch.

With data warehouse automation, all the transform logic is built in using metadata, and then the data warehouse automation platform will use this metadata and logic to write the specific code required for your target data warehouse platform. If you want to switch to a different platform, the data warehouse automation platform will re-write all the ELT code for you, rewrite it specifically so that it is optimised for the new platform.

Metadata export
Data warehouse automation tools allow you to export metadata into your data visualisation tool such as Yellowfin, Power BI, Qliq and Tableau, allowing you to analyse your data immediately and save time in report development.

And the output?

  • Functionality can be delivered rapidly, incrementally and fully support evolutionary development of analytics functionality
  • Development costs are a fraction of those using traditional methods, with less financial risk around data warehouse initiatives and a significantly higher return on investment
  • Dramatically reduced risk
  • Greater stakeholder engagement as the output is tightly aligned with their requirements
  • Far more business value and a significantly higher return on investment

Stay tuned for the next video in our data warehouse automation series where we’ll use the Ajilus data warehouse automation platform to build a data warehouse from scratch!

Do decision makers in your organisation have to wait weeks and months for changes and additions to be made to the data warehouse? Has your organisation avoided building a data warehouse because you have heard it can take too long or it’s too expensive? Minerra’s experienced consultants can help you assess your organisations needs and show you how using data warehouse automation and evolutionary development methods can deliver a cost effective data warehouse in much less time than you thought. Contact us for a casual chat to see how we can help.

Tagged: , , , , , ,