Power Pivot sample workbook - Australian Census Data

Key Points

  • Power Pivot is part of Excel and Power BI Desktop
  • Download this sample Power Pivot model and have some fun analysing Australian Census Data
  • The Power Pivot model demonstrates how a simple data model can provide insight and usability
  • The model demonstrates the basics of Power Pivot, Power Query, and DAX

Kick-start your understanding of Power Pivot, Power Query and Power BI with this sample model from CMBI. The model integrates two Australian Census Data Packs from 2011 and 2016.

Download the sample Power Pivot model

CMBI have developed this Australian Census Power Pivot model using Excel 2016 and Power BI Desktop - a FREE self service analytics tool from Microsoft.

These files are quite large because they contain the fully processed Census Data set. A much smaller unprocessed version of the Power Pivot file is also available in the Other Support Materials below

Other Support Materials

The downloads above are fully processed Power BI models with everything you need to start analysing data and building reports. However, if you want to extend or reprocess the data you will also need some additional files

Data Pack files

We used two suburb level data packs to build this model. You can download the data packs from https://datapacks.censusdata.abs.gov.au/datapacks/

  • 2016_GCP_SSC_for_AUS_short-header.zip
  • 2011_BCP_SSC_for_AUST_short-header.zip

Term of use

We designed the model as an engaging Power BI training aid. We have made all the sample files available for download. Please feel free to download, extend, and distribute them.

When using the samples we kindly make the following requests

  • Continue to credit the ABS in line with their terms of use for census data
  • Credit CMBI and/or continue to include the About page as a visible worksheet
  • CMBI provides the sample AS IS for the purposes of demonstrating Power BI capabilities and provides no warranties explicit or implied for use of the resulting data

Introduction to Australian Census Power Pivot Model

This CMBI Census Power Pivot sample model uses Power Query and Power Pivot to integrate two complete ABS Census data packs from the 2011 and 2016. The integration makes it much easier to analyse a wide range of census data across and within different suburbs.


Australian Census Excel Pivot Table with conditional formatting

We can use the Power Pivot model to analyse 1000s of statistical data points across all the suburbs in Australia. For each data point and suburb, we can find the movement between the two census dates, calculate the data point percentage of the suburb’s population, and find the suburb rank and rank movement. In total, the Power Pivot model contains over 40 million data points.

Power Pivot Sample Data Mode

The Power Pivot model integrates data from over 100 ABS Census files into three tables


Census Power Pivot Model Tables

  • 1. Suburb Location contains the list of Australian suburbs, their State, and their latitude and longitude so that we can visualise statistics on the map.
  • 2. Census Metric contains a list of all the statistical census metrics in the census Data Packs. For instance, Persons over 100 years old is a census metric.
  • 3. Census Pivot contains the 40+ million data points each of which references a Census Metric and a Suburb

About the ABS Data Packs

The ABS data packs are a great resource but are quite complex to understand and integrate. Each data packs comes as a collection of around 60 flattened CSV files. Each of these files contains codes and information about these codes is in various metadata and location data files, also included in the ABS data pack.

We used two suburb level data packs to build this model. You can download the data packs from https://datapacks.censusdata.abs.gov.au/datapacks/

  • 2016_GCP_SSC_for_AUS_short-header.zip
  • 2011_BCP_SSC_for_AUST_short-header.zip

We chose suburb level data because it is the most familiar geographic sub division for most people. Postcodes level statistics are also available from the ABS but these are less granular. Other statistical divisions like LGA (Local Government Area) can be volatile and are less familiar to most people.

Integrating the data pack files with Power Query

The Power Pivot model integrates data from over 130 separate files. These files are a combination of metadata, reference data, and the statistical data pack files themselves. We have categorised the Power Queries in the workbook using folders.


Workbook Queries Pane

  • Parameters folder contains the Power Query parameters that hold the location of the census files we integrate. (if you want to rerun the queries you will need to adjust these parameters to the location where you have placed the census files)
  • Metadata folder contains the queries that extract the Census metadata for the two Census years. This metadata is the list of statistical data points and categories
  • Census Transform contains the queries that integrate the 128 Census Data Pack csv files
  • Functions contains two similar Power Query functions that take a file name as a parameter and perform a number of transformations on the files to make it ready for integration into the Power Pivot model

Power Query dependencies

A useful feature of Power Query is the ability to see Query Dependencies.


Power Query Dependencies window

The figure above illustrates the dependencies between the final Power Pivot Table: Census Pivot, and the underlying Power Queries and data sources.

  • 1. Census Pivot is the final table loaded to Power Pivot
  • 2. Census Pivot is the combination of data from the Census_2011_Base and the Census_2016_Base queries.
  • 3. The Census_2011_Base relies on the parameter pCensusSurburbFolder2011 to locate the data pack files.
  • 4. The pCensusSurburbFolder2011 points to this file location
  • 5. The Census_2011_Base also relies on the fuPivotCensus2011 function which applies a set of transformations to each file in the data pack directory

Power Query functions

One very useful feature of Power Query is the ability to convert a set of transformation into a function and reuse them many times. In our Census model, we have many census data pack files and we want to apply the same set of transformations to each of these files.

The easiest way to create a Power Query function is to go through the normal process of connecting to some data and doing transformations using the regular Power Query window. Having done this, we can then parameterise the query and make it generic. It is a very similar process to recording a macro in Excel and then editing it using VBA Editor.


Power Query Function in Advanced Editor

To create a Power Query function from an existing query

  • 1. Duplicate the original query and name it appropriately then go to Advanced Editor
  • 2. Type an additional couple of lines at the top of the query: let Source = (parameter as text) =>
  • 3. After this additional code follows the original Power Query and substitute areas we want to parameterise with the parameter name we created at the top of the query

There are plenty of blogs on creating Power Query functions. They are definitely worth exploring.

Sample Census Reports

We have created two sample reports to illustrate the breadth and depth of the data and to demonstrate some useful Power Pivot and Pivot Table functionality

Suburb Compare Report

The Suburb Compare Excel worksheet contains a list of suburbs. We have chosen to filter this Pivot Table for NSW suburbs. The Census Metric in the Filter section of the Pivot Table contains a long list of Census Metrics. You can search using the text search box or simply browse using the scrollbars.


Compare Suburbs for a Census Metric

We have applied various settings to this Pivot Table to enhance usability

  • Suburbs rows filtered to suburbs with two # Data Points. This means they have data for 2011 and 2016 and which makes them more interesting
  • Sorted by # Census 2016 to see the most significant suburb contributors at the top of the list
  • Conditional formatting on % of Population Change 2011 to 2016 (see metric explanation below)
  • Ranks of each Suburb and change in rank between the two census dates

Single Suburb Report

Whereas the Suburb Compare report compares many suburbs against a single metric, the Single Suburb report compares many metrics for one suburb.


Compare Metrics for a Census Suburb

The settings we have applied to this Pivot Table are

  • Census Metrics rows with two # Data Points. Again, this means they have data for 2011 and 2016 which makes them more interesting
  • Sorted by # Census 2016 to see the most significant metrics as a % of the suburb population at the top of the list
  • Conditional formatting on % of Population Change 2011 to 2016
  • Ranks of each Suburb and change in rank between the two census dates

Suburb Name Uniqueness

A suburb name is not necessarily unique within Australia or even within a State. The most unambiguous column in the model is Suburb Description. However, sometimes the same suburb has slightly different labels between the two census periods. The # Data Points is the best indicator that the rows contains multiple physical suburbs. If it is greater than 2 then analyse the data using the Suburb Description.

Conditional Formatting in Pivot Tables

One of the nice features of Pivot Tables is that we can associate conditional formatting with the Pivot Table value rather than a fixed range. This means that if the Pivot Table number of rows change – when we change a filter value for instance – the conditional formatting remains intact for all the Pivot Table rows.


Associate Conditional Formatting with Pivot Table value

  • 1. Associate the rule with all the cells in the Pivot Table for a specific Pivot Table value
  • 2. You can edit the number range for each of the arrows. In our two sample reports we have used slightly different ranges for the conditional formatting to highlight significant variances

DAX Calculations

We have added a number of DAX calculations to the model which highlight the significance of and change in census metrics. These calculations also highlight some of the key features of the DAX calculation language.

In the sections below, we have included a selection of the calculations in the model that broadly represent the features used.

# Census:=SUM('Census Pivot'[Census Value])

# Census measure aggregates the Census Values for a given selection in the Pivot Table. Not all aggregations of Census Value will be useful calculations. For instance, the addition of Total Females metric and Total Persons metric would not give a sensible value. However, the additional of Total Females 0 – 4 and Total Females 10 – 14 would.

In short, we should use this measure with caution and ensure we know what underlying values we are aggregating. That is one reason why we include the # Data Points calculation in the Pivot Tables, so we can see how many discrete Census Values we are aggregating.

# Census 2016:=CALCULATE([# Census],'Census Pivot'[Census Year]=2016)

The # Census 2016 builds on the # Census measure described in the previous section. The Census Pivot table contains some rows for Census 2011 and some rows for Census 2016. The # Census 2016 aggregates only rows relating to the 2016 Census. There is a similar measure for 2011. We use these two measures in further model calculations that find the difference in value between the two census dates.

# Suburb Rank 2016:=RANKX(ALL('Suburb Location'[Suburb Name]),[# Census 2016])

The # Suburb Rank 2016 calculates the rank of the suburb for 2016 Census Values. So for instance, if we selected the Total Persons metric and then selected State=NSW, and Suburb Name=Castle Hill then we would get # Suburb Rank 2016 of 3. This is indicating that Castle Hill has the third largest Total Persons in NSW. Power Pivot calculates the RANK value in the context of other filters. For example, if we removed the NSW filter then we would get a Rank of 11 for Castle, showing that Castle Hill is the 11th largest suburb, Australia wide.

# Rank Change 2011 to 2016:=[# Suburb Rank 2016] - [# Suburb Rank 2011]

The model contains a number of calculations that compare 2011 and 2016 census values. The # Rank Change 2011 and 2016 shows the difference in Rank between 2011 and 2016. This gives an indication of whether the suburb has changed relative to the change in other suburbs.

# Population 2016:=IF(ISBLANK( [# Data Points]),BLANK(), CALCULATE([# Census 2016], ALL('Census Metric'), 'Census Metric'[Census Metric]= "Total Persons Persons"))

The # Population 2016 calculates the Total Persons metric even if we have selected another metric. This is useful because it allows us to compare a metric to the overall population of the suburb or selection (see below).

% Population 2016:=DIVIDE([# Census 2016],[# Population 2016])

The % Population 2016 calculates the current metric/selection as a % of the Total Population. For instance, for Castle Hill it is useful to know that 13,799 persons speak Other Language at home but it provides further context to know that this accounts for 34.85% of Castle Hill’s population.

# Suburb Rank % Population 2016:=RANKX(ALL('Suburb Location'[Suburb Name]), [% Population 2016])

The # Suburb Rank % Population 2016 calculates the suburb rank for % of population of the metric. Following the Castle Hill example above, we see that Castle Hill is ranked 261 as a % of Person speaking another language at home within NSW.

Importing the Excel Power Pivot model into Power BI

One of the advantages of developing Power Pivot models in Excel is that you can import them into Power BI using Power BI Desktop. You cannot import a Power BI Desktop model into Excel so if you want to have a model that you can use in both tools then we recommend you develop the initial model in Excel


Import Excel Power Pivot model into Power BI Desktop

Having developed the Excel Power Pivot model it is just a few clicks to create a Power BI Desktop version of the model. After opening Power BI Desktop

  • 1. Select the File menu
  • 2. Select Import
  • 3. Select Excel workbook contents

You then select the path of the existing Excel Power Pivot model and Power BI Desktop imports the Data Model, DAX calculations, and Power Queries. You can now take advantage of the all the Power BI features including

  • Interactive visualisations and cross filtering
  • Publishing to mobile devices and the web
  • Performing Q & A natural language queries
  • Extending and enhancing the model with the latest DAX functions

Power BI Census Model on the web

We converted the Power Pivot model to a Power BI model using the process described above. Having done this we created some visualisations and published the model to the Power BI web service and then to the web where any web user worldwide can view and interact with the reports.

http://www.cmbi.com.au/Power_BI_ABS_Census_Model_Sample.html

For ease of use, we have published the report pages as separate displays down the page but you can navigate between the different reports from any of the visualisations using the Power BI toolbar.


Navigation toolbar on Power BI Web

  • 1. You can navigate between the report pages using the > controls
  • 2. You can share this report with a friend or colleague using social media (please do!)

Understanding the Power Pivot SHELL File

The Excel file CMBI_ABS_Australian_Census_Power_BI_Sample_V1.0 SHELL.xlsx is a much smaller file than the full model because it contains an unprocessed Census Pivot table. If you want to process the data using this file then there are four steps:

  • 1. Download the relevant data packs from the ABS website
  • 2. Set the Power Query parameter pLoadFiles to the second option: .csv
  • 3. Set the other parameters to points to the location where you placed the data folders and files
  • 4. Process the Census Pivot Power Query query

Set the pLoadFiles parameter

The Power Query parameter accepts two values. You can change the parameter by following the steps in the screenshot below:

  • 1. Right click edit the pLoadFiles query
  • 2. Select .csv from the Current Value dropdown
  • 3. Close and Load


Changing the value of a Power Query parameter

Load data into the model

Repeat the process above for each of the other parameters which contains the folder and file paths of each of the source data elements

Finally, right click the Census Pivot query in the Census Transforms group and click Refresh… The query is very resource intensive and may take 10 minutes or more to complete.


Other Power Pivot and Power Query samples to download

CMBI's Sydney house price Power Pivot sample includes an Excel file to download and detailed notes.

CMBI's World Cup 2018 Power Query sample includes an Excel file to download, notes, and instructional video.

CMBI's Australian Census Sample includes an Excel and Power BI Desktop file to download, notes, and online interactive dashboard.


Want to learn more about Power Pivot?

CMBI provides onsite training for all levels of Power Pivot users and developers.

See our Training page to learn more about CMBI's training and mentoring programs.


About CMBI

CMBI designs and develops business intelligence and analytics solutions for clients in Sydney and regional NSW. CMBI technology expertise includes Power BI, Power Pivot, Excel, SQL Server, Analysis Services, SharePoint BI, and Tableau. Contact us today to see how we can help you make your BI goals a reality.