Introduction to the data strategy
Business intelligence solutions process and present existing data to provide fresh insights, reducing the time and resources required to reach a point of action. However, acquiring data can be difficult and expensive. In fact, when we first start implementing a new business intelligence strategy, the cost of data acquisition may be a barrier to many of our candidate projects. For this reason, the data strategy is an essential pillar of the overarching business intelligence strategy.
Throughout this series of articles I use the term data availability. This umbrella term covers the factors that determine the effort and cost of using the data. If data availability is high, it should be relatively inexpensive to use the data in our BI solutions. Conversely, if data availability is low then acquiring the data for the BI solution will be expensive and risky. The data pillar of the BI strategy has one simple aim:
Increase the availability of data to everyone in the organisation
A data strategy does not have a discrete beginning and end. Like the process improvement strategy, it is a continuous commitment. The goal is to improve the availability of useful data, thus lowering the barrier to subsequent projects that also require it.
What do we cover in the data strategy?
We start by discussing common factors that reduce data availability in every organisation. We then consider practical strategies to increase data availability by tackling each of these factors in turn. Data warehousing is currently the most widely accepted method of increasing data availability, so we discuss the characteristics of a data warehouse that make it an ideal data source for business intelligence solutions. Further articles on the data strategy discuss some of the challenges of implementing it, including extract, transform, load (ETL) processes, data ownership, and defining business rules.
Although this may sound like a technical area, I keep the discussions practical and business oriented. The data strategy is predominantly about raising awareness of the value of data as a business asset. We need to appreciate the business issues before we move to technical solutions.
Data strategy or data warehouse strategy?
Some readers may question the title of this chapter. Is a data strategy just a data warehouse strategy by a different name? A data warehouse will be a central aspect of a data strategy in most organisations, but the two concepts are not identical. The data strategy starts from the premise that a BI solution may use any data stored inside or outside of the organisation. Whether we need to put that data in a data warehouse as an intermediate step before making it available to BI tools is a secondary question. The goal of the data strategy is to increase the availability of the data irrespective of its location. To illustrate this point, we consider some examples of where the BI solution may extract the data directly from the source.
Data outside of the data warehouse
The following scenarios are examples of where we may choose not to use a data warehouse in a business intelligence solution. A data warehouse in a large corporate environment may be difficult and expensive to change, but we should not automatically discard BI projects just because the data warehouse does not currently hold our target data.
Analytical databases
The increasing power and capacity of end-user tools and analytical databases make it practical to consume large volumes of data outside of a formal data warehouse environment. For tactical projects, proofs of concept, or speculative analysis, it is often viable to move the data straight from the source into the BI tool.
External data
The variety and volume of data available outside of the organisation is another reason to consider data outside of the data warehouse (DW). It may be expensive, impractical, or unnecessary to duplicate all these potential sources in the corporate data warehouse. BI tools can combine data from the data warehouse and external sources without the overhead of a formal data warehouse project.
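To make the idea of combining warehouse and external data concrete, here is a minimal sketch in Python. Everything in it is hypothetical and only for illustration: the region names, the sales and population figures, and the helper functions `load_external` and `combine` are assumptions, not part of any particular BI tool. It joins a small extract from the warehouse to an external file and derives a new measure without loading the external file into the warehouse first.

```python
import csv
import io

# Hypothetical extract from the data warehouse: sales totals by region.
warehouse_sales = [
    {"region": "North", "sales": 120000.0},
    {"region": "South", "sales": 90000.0},
    {"region": "West", "sales": 45000.0},
]

# Hypothetical external source, e.g. a published population file in CSV form.
EXTERNAL_CSV = """region,population
North,400000
South,300000
East,250000
"""


def load_external(csv_text):
    """Parse the external CSV text into a region -> population lookup."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["region"]: int(row["population"]) for row in reader}


def combine(sales_rows, population_by_region):
    """Left join warehouse rows to the external data and derive sales per head.

    Regions missing from the external source are kept with None in the
    external fields, so gaps in data availability stay visible.
    """
    combined = []
    for row in sales_rows:
        population = population_by_region.get(row["region"])
        per_head = row["sales"] / population if population else None
        combined.append(
            {**row, "population": population, "sales_per_head": per_head}
        )
    return combined


report = combine(warehouse_sales, load_external(EXTERNAL_CSV))
for line in report:
    print(line)
```

The join keeps every warehouse row, even where the external source has no match, because a silent drop would hide exactly the kind of availability gap the data strategy is meant to expose.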
DW update schedule
We may need to go straight to the source system if the BI solution needs the latest operational data and that data is not yet available in the data warehouse. Changing update schedules in the data warehouse itself may not be straightforward, and the cost may be prohibitive for a short BI project.