Business Intelligence Infrastructure

Key Points

  • A single powerful off-the-shelf server is preferable to several budget servers.
  • BI infrastructure and tools should be available directly to power users. Avoid stifling innovation with heavy-handed security.
  • Relax traditional requirements for separate Development, UAT, OAT, and Production environments in the early stages of a BI project.



Business intelligence infrastructure

When you first start to implement the business intelligence technology strategy, it may be sufficient to host all your BI components on a single server. In my experience, a single powerful off-the-shelf server is preferable to several budget servers. I make this point because hardware virtualisation enables server administrators to deploy server instances with a lower specification than a modern laptop. These small servers are not appropriate for BI development.
BI tools are resource hungry; we measure their efficiency in person-hours and business value, not CPU cycles. The resource utilisation pattern of BI is different from that of operational systems. The server will typically be either more than 80% utilised or idle. It is common to schedule ETL jobs overnight and regular report processing in the early morning, with ad hoc querying and analytics running during the day. With server utilisation staggered over the course of the day, it makes sense for each discrete process to have access to all the available hardware resources.
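The staggered pattern described above can be written down as a set of daily windows and checked for clashes. The following is a minimal sketch, not a recommended schedule: the job names, the window times, and the helper functions `find_overlaps` and `job_at` are all illustrative assumptions rather than anything prescribed here.

```python
from datetime import time

# Hypothetical daily windows. The times are illustrative assumptions only.
SCHEDULE = [
    ("etl_load", time(0, 0), time(5, 0)),           # overnight ETL jobs
    ("report_processing", time(5, 0), time(8, 0)),  # early-morning reports
    ("ad_hoc_analytics", time(8, 0), time(18, 0)),  # daytime querying
]


def find_overlaps(schedule):
    """Return pairs of job names whose windows overlap.

    Windows are half-open [start, end) and are assumed not to cross midnight.
    """
    overlaps = []
    for i, (name_a, start_a, end_a) in enumerate(schedule):
        for name_b, start_b, end_b in schedule[i + 1:]:
            if start_a < end_b and start_b < end_a:
                overlaps.append((name_a, name_b))
    return overlaps


def job_at(schedule, moment):
    """Return the job that has the server at `moment`, or None if it is idle."""
    for name, start, end in schedule:
        if start <= moment < end:
            return name
    return None
```

If `find_overlaps(SCHEDULE)` returns an empty list, each process has the whole server to itself during its window, which is the point of staggering the workload.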

Environments and deployment strategy

The traditional approach to software development is to have different environments for development, user acceptance testing, and production systems. This provides distinct areas to build, test, and integrate new development work with the existing solution. There are several reasons why you may wish to modify and relax this traditional approach for your BI strategy. We will consider these in the subsections below.

Data volumes

BI tends to deal with large data volumes. It is one thing to have several environments when you are managing a collection of code modules and a small volume of test data. It is quite another to replicate a multi-terabyte database across three environments to give realistic indications of performance and scalability. The peculiarities of BI development and testing mean users often find it hard to give feedback unless you are showing them up-to-date, production-quality data. Poor performance (often slow query response time) is one of the major reasons why BI solutions are under-utilised. If you are developing and testing with a subset of data, it is much more difficult to predict production performance.

Agile development

Power users may be discussing a business problem with a colleague one minute, and designing the solution the next. This extremely agile form of development is possible because power users understand the business domain and the capabilities and limitations of the data and technology. This pattern should be encouraged because power users will be a key source of innovation in any business. They can informally design and test new solutions without the resource-sapping demands of a formal project process. If the solution proves useful, that is the time to apply more rigour to its design and ongoing maintenance.
In practice, this means that the BI infrastructure and tools should be available directly to the power user. It is a crushing disincentive to using the BI environment if the power user must go through a highly bureaucratic process of authorisation and approval every time they want to introduce a new capability. The most likely outcome of restricting access will be that the development once again goes underground using personal databases and spreadsheets, instead of utilising more efficient specialist tools. The worst outcome is that this type of development is completely stifled.

Requirements volatility

Refer to the discussion of the path to structured and measurable decision support, as shown in the figure below.

Figure: The stepping stone path to decision support

Moving the BI solution from supporting the business process to driving it requires a continual cycle of development and feedback. Dashboards and reports will typically undergo a number of revisions before they arrive at the final design. Each revision and innovation opens the minds of users to further possibilities and brings the solution ever closer to supporting a point of action. Meanwhile, by publishing the imperfect product in early iterations, we still provide better support than if no solution existed at all. We should welcome and support this feedback loop. It is clear evidence that the users are actively engaged in the process. Some change requests may seem superficial, but as a rule it is better to accept them and maintain an open dialogue than to introduce a formal change control process.
A lengthy deployment process can manage risk but at a high price. Users who accept some responsibility for the outputs in early iterations benefit tremendously from early exposure to the solution. My experience is overwhelmingly that users will be realistic and responsible in their use of the data and accepting of the limitations inherent in more advanced data manipulation and presentation.

Server resources and utilisation patterns

Given a fixed technology budget, it follows that as we increase the number of environments we reduce the budget allocated to each.
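A rough illustration of this trade-off, assuming the budget is split evenly across environments (a simplification; real allocations are rarely equal). The budget figure and the function name `budget_per_environment` are hypothetical.

```python
def budget_per_environment(total_budget, environments):
    """Split a fixed budget evenly across environments.

    The even split is a simplifying assumption for illustration only.
    """
    if environments < 1:
        raise ValueError("at least one environment is required")
    return total_budget / environments


TOTAL_BUDGET = 60_000  # hypothetical figure, in any currency

for n in (1, 2, 3, 4):
    share = budget_per_environment(TOTAL_BUDGET, n)
    print(f"{n} environment(s): {share:,.0f} each")
```

Under these assumptions, moving from one environment to four leaves each server with a quarter of the original hardware budget.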
The pattern of utilisation for BI solutions is very different from that of operational systems. Operational systems support many concurrent users across a fixed number of constrained processes. Some BI processes may resemble this pattern, but technologies like ETL, OLAP queries, and data mining model training will fully utilise a powerful server until the algorithm, query, or batch process completes. Because server utilisation is relatively infrequent but highly resource-intensive, scheduling batch loads and complex data manipulation will make best use of server resources.
In the early iterations of BI projects, it is preferable to have one powerful server to support fast iterations of development, batch processing, and query response. The user community is likely to be quite small at this stage in the program. Shared access to resources is best managed through informal communication and by scheduling regular resource-intensive processes outside working hours. As the user community and the number of active projects grow, the increased sponsorship and utilisation of the BI platform will justify adding other environments.

