This is the third video in our Data Warehouse Automation series, Data to Dashboard in 90 Minutes with Ajilius Data Warehouse Automation. In this video, we demonstrate a full end-to-end workflow for the creation of a data warehouse from the Sakila movie rental database using Ajilius.
The following steps are covered during the demonstration:
Connecting to data sources (MySQL and CSV)
Extracting data from data sources to loading tables
Transforming data from load tables to staging tables
In this video, I use the Ajilius data warehouse automation platform to demonstrate a full end-to-end workflow to create a data warehouse from scratch. I speed the video up along the way to compress about 90 minutes of work into about 25 minutes. Future videos will show each part of the process at normal speed.
The Sakila Database
First, we’ll talk about the database that we’ll be using for this exercise. We’re going to use the ‘Sakila’ database which is a sample data set for the MySQL database platform.
The story behind the sample database is all about a DVD rental company. The company uses a system to record all the rental activity for the business, and all data entered into the system is saved a transactional database. This transactional database is the dataset we’ll be working with.
The dataset contains a variety of information such as:
Information about customers and staff (personal details, etc.)
Information about each store (location, manager, etc.)
Information about each film (e.g. title, language, genre)
Information about when when each film was rented and returned
Metrics such as how much revenue the rental generated
The data has been collected in a transactional database which isn’t structured to be optimised for reporting and analytics. In the first video in our data warehouse automation series, we mentioned that a star schema is the best data structure to support analytics.
Building a Data Warehouse with Ajilius
In this exercise, we’re going to transform the transactional data in the DVD rental database to a star schema. This will allow quick and easy analysis of the rental transactions at the DVD rental company.
We will build the following:
Fact table containing metrics such as how long each film was rented for and how much revenue was generated
A series of dimensions allowing this data to be sliced and diced by staff member, customer, store, film and date
The process we’ll go through is as follows:
Create a destination database for the warehouse in Microsoft SQL server and connect the destination data warehouse to Ajilius
Create a connection to the source database, which is the Sakila DVD rental database in MySQL
Extract data from the source database to the loading tables
Transform the data from the loading tables into the staging tables
Create dimension and fact tables
Create relationships between the dimensions and fact table to create a star schema and then export the metadata to Yellowfin to allow immediate analysis of the data
Summary of Timings
The table below summarises the time it took for each of the components of the data warehouse to be completed. All in all, the data warehouse was created in less than 90 minutes!
Stay tuned for future videos in our data warehouse automation series where we’ll use the Ajilius Data Warehouse Automation platform to further build upon the data warehouse we have just created.
Data to Dashboard in 90 Minutes with Ajilius Data Warehouse Automation August 22nd, 2017Edgar Kautzner
This is the second video in our Data Warehouse Automation series, Introduction to Data Warehouse Automation. This video is ideal for people from a business background who are new to data warehouse automation and want to find out more. We’ll be covering the following topics:
In this video, we’re going to talk more about the development of data warehouses, and in particular, give you an introduction to data warehouse automation, and how it delivers significant value to organisations. Let’s first cover what is involved in developing a data warehouse.
Overview of traditional data warehouse development
Traditional data warehouse organisations often have data sitting in various locations. This includes operational systems (where the data isn’t structured for reporting), in Excel or CSV, or in the cloud in some form. We need to get the data from these sources into our data warehouse where it is structured for reporting.
This is done by:
Writing code to extract the data from the sources and load it into a temporary location.
Transforming the data into a format suitable for reporting. This process also includes processes to improve the quality control (called cleansing), and data integration.
Making the data available in the data warehouse where it is structured in an optimum way for reporting purposes
This whole process is known in the industry as “ELT” or “ETL”. Traditionally, all of the code for the ELT/ETL process was written from scratch.
The problems with traditional data warehouse development methods
There are four main problems with traditional data warehouse development.
It’s time consuming as it requires significant manual coding – approximately 70-75% of the total project time. As a result, projects take far too long and users have to wait to get the information they need.
It’s error-prone. Due to the amount of manual coding required, it is easy to make mistakes and difficult to find and fix errors.
It’s costly due to the development resources and the effort required. Coding accountes for 70-75% of project costs. As a result, projects often result in failure – due to budget overruns, slow delivery, and low perceived value with stakeholders.
Users get frustrated as it takes way too long for them to get the information they need.
What’s the solution to these problems?
You might say that data warehouses are bad – let’s stop using data warehouses. However this isn’t the right solution because data warehouses provide a tremendous amount of business value. The problem was that we were using the wrong approach. There was a mismatch between traditional data warehouse development and the evolutionary nature of analytics development.
There were three main problems that are specific to analytics development.
Users don’t know whether they want something until they’ve actually seen it. We needed a different approach where we could release functionality rapidly in short, iterative cycles and make changes based upon user feedback. The reason is that if we build most functionality in one go, we would run a high risk that the output would have little value to the business.
Users requirements change once they start using the analytics system. If you don’t respond to those changes, people will stop using it.
Manual coding takes far too long and costs too much money.
The right solution to these problems is to use a combination of:
Evolutionary analytics development methods, and
Data warehouse automation
Evolutionary development involves a constant cycle of requirements gathering, rapid prototyping and review. It aims to deliver functionality incrementally in short release cycles. The evolutionary approach allows development of analytics systems that deliver value in a shorter time frame and have a tighter fit with organisational decision-making requirements.
The problem is that the evolutionary approach is extremely difficult with traditional data warehouse development methods – because manual coding takes too long. So we need to use data warehouse automation to solve these problems.
How Data Warehouse Automation delivers value
Rapid data warehouse delivery and agile changes By automating much of the design, build and maintenance of your data warehouse. You can pull data from virtually any data source or cloud-based platform and deploy new functionality rapidly – in days and weeks, not months and years.
Dramatically reduced development costs As all development is metadata-driven, minimal coding is required. As such, fewer ELT developers are needed and a high-quality solution can be delivered in a very short time.
Consistent high quality code and automated documentation Data warehouse automation builds the entire extract, load and transform process by combining metadata that describes your data warehouse with industry best practice design patterns. This results in high-quality, consistent and error-free code with auto-generated documentation.
Choice of data warehouse platform One of the problems with traditional data warehouse development was that after you’ve spent time writing the code to extract data from your source systems to your destination data warehouse, you were more or less locked into your data warehouse platform.
For example, if you were using SQL server to house your data warehouse, and you wanted to move to a cloud-based platform such as Amazon Redshift, it would take you months or perhaps years to rewrite all the code from scratch.
With data warehouse automation, all the transform logic is built in using metadata, and then the data warehouse automation platform will use this metadata and logic to write the specific code required for your target data warehouse platform. If you want to switch to a different platform, the data warehouse automation platform will re-write all the ELT code for you, rewrite it specifically so that it is optimised for the new platform.
Metadata export Data warehouse automation tools allow you to export metadata into your data visualisation tool such as Yellowfin, Power BI, Qliq and Tableau, allowing you to analyse your data immediately and save time in report development.
And the output?
Functionality can be delivered rapidly, incrementally and fully support evolutionary development of analytics functionality
Development costs are a fraction of those using traditional methods, with less financial risk around data warehouse initiatives and a significantly higher return on investment
Dramatically reduced risk
Greater stakeholder engagement as the output is tightly aligned with their requirements
Far more business value and a significantly higher return on investment
Stay tuned for the next video in our data warehouse automation series where we’ll use the Ajilus data warehouse automation platform to build a data warehouse from scratch!
Do decision makers in your organisation have to wait weeks and months for changes and additions to be made to the data warehouse? Has your organisation avoided building a data warehouse because you have heard it can take too long or it’s too expensive? Minerra’s experienced consultants can help you assess your organisations needs and show you how using data warehouse automation and evolutionary development methods can deliver a cost effective data warehouse in much less time than you thought.Contact us for a casual chat to see how we can help.
What Is Data Warehouse Automation And How Does It Deliver Value? May 29th, 2018Edgar Kautzner
We’re kicking off a series of videos to answer “What is a data warehouse?” and “Data Warehouse Automation” with our first video: Introduction to Data Warehousing. This video is ideal for people with a business background who would like to learn more about data warehousing, specifically:
Notes from the video are provided below. The slides can also be found on SlideShare.
What is a data warehousing?
In order to correctly understand and make use of data warehousing, you need to understand the problem you are solving.
It may be a decision that needs to be made, or a question that needs answering. For example:
A marketing manager might want to understand where to invest their online advertising dollars
A call centre manager might want to know the optimum number of staff to hire for their call centre, or
A sales manager might want to identify the customers that deliver the most profit – so they can find more of these customers.
We want to make an informed, and objective, data driven decision, and we need some data to answer these questions.
At an organisation, you may have data being collected in a variety of places. This may be:
From internal systems, such as a finance system or ERP, customer systems such as your CRM, HR / payroll, or operations systems such as a manufacturing system
From web apps, such as Salesforce and Xero, Google Analytics, or social media such as Twitter and Facebook
From spreadsheets and flat files
From other cloud sources
Or from any other data sources within the organisation
However this data is likely to be:
Not structured for reporting
Hard to access
Captured in a silo and not integrated with all the other data to give a complete picture
This is where data warehousing comes in. Data warehousing allows you to:
Extract data from your organisational systems
Load it into a centralised location
Transform and integrate the data into a format optimised for analytics
The data warehouse can be used as a source for your data visualisation tool to provide reports & dashboards, for advanced analytics, and for a variety of other purposes.
So how does this create value for an organisation?
A data warehouse creates a single source of data that is consistent in format, structured in a uniform way, contains complete and accurate data that can be relied upon, and is up to date
It is structured and designed specifically to allow data to be accessed quickly
It provides a single integrated view of an organisation by combining data from multiple sources
Provides a complete data set, allowing you to analyse data from the past to predict the future
All of this delivers value to the business by:
Giving managers access to the information they need more quickly and easily, with significantly less ongoing effort to prepare the data
Providing information about the business environment more quickly which means that managers can respond to changes rapidly
Providing information more frequently
And leads to more informed managers, making data driven, objective decisions – for which you’ll see the results in your organisation’s bottom line.
How is a data warehouse structured?
At its most basic, a data warehouse is a collection of tables containing data structured in a way that is optimised for reporting and analytics.
There are different ways that the data can be modeled. The best way to model data to support decision making is a dimensional model, sometimes known as a ‘star schema’ or a Kimball model.
At a high level, the tables in a dimensional model can be categorised into two different types:
Fact tables (also called event tables) contain individual business events (e.g. sales transactions), or aggregated, summarised business events (e.g. sales by month). They consist of measures and calculations such as sales amount and discount rate.
Dimensions contain descriptive attributes – fields that describe the measures in the fact table. They allow you to slice and dice your data. For example, a given sales transaction relates to a customer, a product, a salesperson, and a date.
So you could look at, for example
The transactions by salesperson for a given product
The top customers by revenue
The overall trend on revenue over time
If there is a relationship between discounts given and sales
Data warehouses usually contain multiple star schemas for analysing organisational data.
How are data warehouses used?
A data warehouse contains data that is used to support business decisions. It can be used in a variety of ways:
With a Data visualisation tool such as Yellowfin, Power BI, Tableau or Qlik, to provide reports and dashboards to managers to allow them to monitor and manage organisational performance, share information around the business, drive processes or send out tasks.
With Advanced Analytics such as network and cluster analysis, forecasting, data mining, sentiment analysis, simulation modeling, etc. There could be a two-way feed between the data warehouse and advanced analytics.
As a source of data to feed into other organisational systems, for example, into CRM for future marketing after customer segmentation analysis.
Do you have people in your organisation who spend most of their time using Excel to manually “glue” together data from for different systems? Do you only get monthly reports because weekly reports take too long? Do you have to wait too long for ad hoc data questions to be answered? If you answered yes to one or more of these questions then your organisation may gain a lot of value from a data warehouse. Minerra’s consultants have a lot of experience working with many type of organisation to help them quickly build a cost effective data warehouse that significantly reduces the manual effort required to prepare the data for regular reports and ad hos analysis.Contact us for a casual chat to see how we can help.
What Is A Data Warehouse And How Does It Deliver Value? May 29th, 2018Edgar Kautzner
On a chilly mid-July Thursday, Minerra participated in the Melbourne Business Analytics Conference 2017. Minerra had a booth at the one-day conference where we met analytics and data experts and business leaders from across the country.
I also gave a presentation titled “Delivering Decision Support Through User-Centric Design” where I shared a successful approach we took with a large American organisation. The slides of my presentation are available on Slideshare and a recording of the presentation is below.
Keynote Presentation by Tom Davenport
To kick off the conference, keynote speaker Tom Davenport appeared via video conference from the US. He presented his view about analytics ‘eras’; in particular a new era in analytics named augmented analytics, which is what he called “analytics 4.0”. It was an interesting presentation however the ideas may somewhat align more with large multi-national organisations with a very mature analytics capability, than with the priorities and needs of many organisations, particularly small to medium enterprises that are still building their analytics capability from a comparatively low base.
Good Decisions versus Good Outcomes
Professor Zeger Degraeve, Dean of Melbourne Business School, conducted an engaging session about the differences between Good Decisions, and Good Outcomes. He talked about the role of analytics on informing good decisions, and that outcomes are generally uncertain. Based on probability, outcomes can be negative. Good managers should therefore not measure decision and decision support on outcomes, but in the way they support good decisions.
Analytics in Sport
It was a pleasure to hear Michael Cheika, head coach of the Wallabies, introduce a down-to- earth view of analytics use. He explained how the Wallabies use analytics in combination with their own expertise in areas such as injury prevention, performance improvement, and competition analysis. Although he mentioned big data and analytics, it seemed more like he was using spreadsheets (integrating data from different sources) and then trying to push those results to players directly.
Creating Business Value with Analytics
I also attended a fantastic panel session later in the afternoon about value creation from analytics. Senior analytics users and managers participated in the session, including Jane Eastgate (Head of Flybuys Analytics, Coles), Dr Amy Shi-Nash (Head of Data Science, Commonwealth Bank of Australia, Sheetal Patole (Chief Data and Analytics Officer, Macquarie Bank), Scott Jendra (CIO, Australian Football League), and Enrico Rizzon (Partner, Procurement Analytics, AT Kearney).
The discussion was focused on two important parts of analytics in organizations: value creation, and gender diversity in analytics teams. Interesting statements were made about the analytics teams’ closeness with the business areas of organizations (participation in meetings, roadshows, etc). The conversation also focused on the lack of female analytics partition in the market. The panel recognised that improvements had been made, but that much more work on this issue is needed in the next decade.
Overall, it was a productive and insightful day, and we’re looking forward to attending the event again next year.
If you have any questions about my presentation, please don’t hesitate to contact me at [email protected]. Wherever your organisation is in your analytics journey, we are happy to have a no-obligations chat to see what we can do to help.
Minerra at Melbourne Business Analytics Conference August 14th, 2017Edgar Kautzner
In every business, in every department, managers are accountable for the money they spend, and how effective that spending has been. Marketing analytics is no different. Unfortunately for years, marketers were executing plans based on gut feel, or unreliable numbers.
Digital Marketing Analytics For A Changing Era
In this digital age, marketers can no longer flounder around in the dark to figure out if their tactics are effective. With almost all aspects of business and personal consumption conducted online, it is no longer difficult to track return on investment. Aided by the right tools, the ability to make informed, data-driven decisions by looking at digital marketing analytics, marketers can answer key questions such as:
What are your most successful marketing channels?
Where are you getting the most bang for your buck?
How are your customers engaging with your company?
So where do you start? What do you need to measure? How do you measure what you require?
Minerra’s latest whitepaper answers these questions and more. We take a look at the evolving role of the marketer and what’s important now and in the future. We explore the role of analytics in marketing, and take you through some of the ways you can start leveraging the data you are no doubt already accumulating in the background.
Wherever your marketing department is in your analytics journey, Minerra is happy to help. Download our whitepaper to learn more about marketing, analytics, and how you can optimise your marketing approach. If you have any questions, or would like to chat about how we can help, contact us.
New whitepaper by Minerra – Marketing Analytics: The only way forward May 29th, 2018Minerra Editor
In our conversations, one theme quickly emerged. Many people visiting the Yellowfin booth had a similar conundrum: they had many different analytics tools (usually three or more) in their organisation and wanted to consolidate down to just one tool.
Over the two days, the number of people coming up to us with this issue was so numerous that we began calling it the “Pokemon Problem” – it appeared that many organisations treat analytics tools like Pokemon and feel that they have to “catch ‘em all”!
Why so many analytics tools?
The reasons why organisations “suddenly” had so many different analytics tools varied but over the course of the summit, some similar explanations emerged:
Different tools were acquired as a result of mergers and acquisitions but nothing had been done to standardise one analytics tool across the merged organisation.
Organisations had a loose or non-existent policy about the purchase of analytics tools within different departments, so each department purchased the tool they wanted despite different tools already existing within the organisation.
This is also a consequence of poorly governed self-service business intelligence.
Organisations had an existing analytics tool, but the existing tool did not have the required functionality. As a consequence of this, it decided to buy another analytics tool and use both tools in parallel rather than creating a plan to migrate from the old system to the new.
One organisation even said they purchased a new analytics tool because a newly employed data scientist did not like the existing analytics tools.
Why is this a problem?
The problems cause by collecting many different analytics tools were:
Confusion and inefficiencies among consumers of analytics content. There was a huge burden on staff to remember how to use many different tools to access the information they need to monitor performance and make decisions. This also led many users to use the analytics tools just to download raw data so they could later analyse the data using Excel – the one tool they know well. This also increased the risks associated with ungoverned analytics content being used in the organisation.
Low productivity for analytics developers, either because additional developers are needed to ensure that the organisation has expert skills for all the analytics tools, or the existing developers have to know all tools, which leads to them having only a moderate level of skill with all tools rather than being expert in one tool.
Increased licence and maintenance costs for the organisation because they have small licence holdings with many software vendors rather a large licence holding with one vendor, which may lead to overall lower licence costs.
Increased operational costs because IT departments have to provide and maintain multiple sets of infrastructure for each analytics tool, particularly if the tool requires a server to distribute the content.
In my next post on this topic, I will provide the responses we gave to attendees – how you can solve the issue, and where you can start.
Is your organisation using multiple analytics tools? Are your employees making the most of these tools? Are some of these tools perhaps redundant, or have too many overlapping features? Minerra can help assess your needs and weight them against what you have to provide you with a plan to streamline your analytics tools. Contact us for a casual chat to see how we can help.
Marketers were among the first departments to start utilising data and analytics to inform strategy and tactics. They had to go where the audience was, and when the audience moved online, so did marketers. The internet, social media and mobile technology have all influenced the way data driven marketing get their messages out to their target audiences, and the way they measure success. The digital marketing mix now has to include a web presence, social media accounts, and content generation and distribution as basic tactics, and all these have to be analysed and measured.
Marketers constantly have to answer these questions; What is your ROI? What are your most successful marketing channels? What is your cost per lead? How are your best leads scored? Is the sales team focusing on these top leads? Are the leads being generated aligned to the type of leads the business needs for continued growth and success? These are all questions that can only be answered with data and analysis.
The Future Of Data Driven Marketing
According to Gartner, more than two-third of marketers plan to base most of their decisions on analytics within two years. It is no longer acceptable for a sophisticated marketing strategy to be formed based on ‘gut feel’. Plans and tactics should not only align with overall business strategy, they must be informed and optimised by data driven marketing.
Their analysts have also identified four key traits marketers should have for success in data-driven marketing: empathy, agility, accountability, and a focus on data hygiene and integrity. High-quality data will aid marketers build dynamic programs that meet business goals.
Forrester’s analysts have identified the need to balance analytics with engagement. Data must be evaluated and then engaged with in the most effective way to create an impact on revenue and the bottom line. While many marketers are utilising analytics, they need to be able to tweak, adapt and change tactics to really ensure that they’re using the data to improve the marketing function.
Forrester also emphasizes the need to create deeper relationships with customers through technology – “business intelligence (BI) solutions; cloud infrastructure to reduce cost and be more agile; marketing tech that offers a real-time, single view of the customer; customer experience processes; and specifically useful to the next wave of relationships, artificial intelligence that will drive a conversational relationship with your customer”. Traditional advertising fails to create deep, conversational relationships – Forrester predicts the end these marketing channels.
Analytics and data visualisation platforms can certainly help marketers understand and analyse data in order to make data-driven customer-centric decisions. Real-time data allows for the marketer to be agile and test and tweak plans to ensure success. Engagement data allows the marketer to build deeper and more meaningful relationships with customers – whether engagement occurs via social media, sales teams, or through product usage.
Wherever your marketing department is in your analytics journey we’d be happy to have a no-obligation consultation with you to see how we can help you meet your business goals. Contact us at our details here.
Analytics and Data Driven Marketing: What are the analysts saying? May 29th, 2018Minerra Editor
These buzzwords entered conversations in organisations a few years ago now, but not many people know what they mean.
As information technology has pervaded every department and every business process, companies have generated a tremendous amount of data. From small businesses to international conglomerates, every organisation generates data that, until fairly recently, has been simply accumulated and stored. And until you can access, interpret and understand this data, your organisation is missing out on the opportunity to use this information to augment and make informed business decisions.
Analytics is a vital tool to support the decision-making process. This business.com article provides a few succinct examples of how businesses of any size can benefit from analytics:
It’s much easier to make informed decisions.
It’s a structured way for growing revenue.
It increases the competitive advantage over other players in the industry, including larger businesses.
It improves the productivity of their business operations.
It enhances the quality of their customer service.
So why wouldn’t you leverage your data to gain these benefits? There are many perceived barriers companies face when thinking about approaching analytics:
I don’t think my company wants to spend money on this yet.
We don’t have the budget for this.
I don’t know where my data is.
I don’t know how to get to my data.
I don’t even know what is being stored.
I wouldn’t know the first thing about sorting out the data.
I cannot interpret this data.
I cannot communicate this effectively to my manager/head of department/CEO.
With the right tools and the right approach, companies can easily overcome these barriers so they can leverage their data for competitive advantage.
We’ve written a short guide to help you think about whether your organisation is ready to take the leap. Click on the button below to fill in a quick form and get the guide delivered to your inbox.
If you think your organisation is ready to have a conversation about analytics, give us a call. We’d be happy to have a no-obligation consultation with you to see where your organisation stands, and how we can help.
Analytics: Are you ready to take the leap? May 31st, 2017Minerra Editor