Why Visual Analytics?
In Part 1, I defined what I mean by “visual analytics”, explored its current usage (or lack thereof) in this sector, shared some of the reasons why it is underutilised and why visual analytics is such a great fit for the retail sector. In Part 2, I discussed how we can (by overcoming previous obstacles) make data visualisation the “standard” method for analysing information (and making fact-based decisions).
In Part 3, I want to show how this can work in practice, showing a snippet from a recent project that demonstrates the advantages of visual analytics (over traditional methods) to answer complicated business questions, and derive the “holy grail” of actionable insights.
5 minute sample from a workshop
The following covers a 5-minute section from a series of half-day workshops we’ve been running. Although I will detail the analysis and insights, my intention is not to run through the analytics themselves, but to demonstrate how the knowledge and expertise of both the retail-specialist visual data analyst and the retail buyer come together to augment the “intelligence” of both, through the power of data visualisation.
We’ve been running a project with a retailer, and one of the things we’ve been looking at is the differences in customer preferences across regions of the UK. How big are these differences? How important are certain products and local brands (or categories) to different parts of the UK? Are the differences so great that the space allocated in a region’s stores should differ from the norm? Should local brands be pushed in their “home” area? How far does this “home” area stretch? And so on. We have been running workshops with the buying teams to investigate and answer these questions.
Problem 1 - The Data
To look at the data over geography and by product, we needed the data at store and product level. With 500+ stores, 40k+ products, and 52 weeks, a quick bit of maths will tell you this isn’t something that can be done in Excel. Even if the data would “fit”, humans just can’t process this much tabular data, even for a single category.
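The “quick bit of maths” can be written out as a back-of-envelope sketch. It uses the rounded figures quoted above and assumes, as an upper bound, that every product is ranged in every store every week (in practice the real row count will be lower). The worksheet limit is Excel’s documented maximum of 1,048,576 rows.

```python
# Back-of-envelope sizing using the rounded figures quoted in the text.
stores = 500
products = 40_000
weeks = 52

# Upper bound: assumes every product is sold in every store, every week.
max_rows = stores * products * weeks

# Documented per-worksheet row limit in current Excel versions (2**20).
excel_row_limit = 1_048_576

# Ceiling division: how many completely full worksheets would be needed.
sheets_needed = -(-max_rows // excel_row_limit)

print(f"{max_rows:,} potential store/product/week rows")
print(f"{sheets_needed:,} worksheets needed to hold them")
```

Even at half that upper bound, the data is hundreds of times larger than a single worksheet can hold.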
We also wanted all the possible detail we’d need, in one place, to further reduce the barrier to asking the next question. In doing this, we keep the analytics flow going (having to go back to the database and re-run some reports would break this flow). We took in and organized one year of sales data by store/product, and added some geographic aggregations (TV region, county, etc.) to the data (as well as some further pre-processing and reshaping). This was around ½ billion rows of data (a non-trivial amount), and using our data domain knowledge of data-modelling, spatial data, Alteryx (data preparation*), and Tableau (my data viz “weapon of choice”!), we were able to create an interactive model that was both complete, and fast to use (again we want to reduce any friction to insight).
*The choice of which data technology to use requires experience of both the data and retail worlds. As this was a “live” analytics model, I wanted to be able to quickly change or add elements to the data flow. For example, experience tells me that a retailer's product master data, as stored in their data warehouse, isn’t always how the buyers look at their data; they often have “spreadsheets” of the product hierarchy they use on their desktops. By choosing Alteryx, I was able to quickly amend the ETL to include these files (or any other information they had, such as store clustering) in the session. Alteryx is also really good at working with spatial data.
Problem 2 - Complexity Reduction
Using our retail domain knowledge we were able to also create some “composite” metrics. The problem is a complicated one, where we need to consider many factors, so reducing the dimensionality (as long as the method is valid) simplifies the process of getting to insights. For example, we want to compare all stores on a comparative basis, but stores are of different sizes, have different footfall, and have different customer missions. From our knowledge (and experience in the sector) we were able to combine multiple measures into a few metrics. It was also important that the buyer understood these metrics, and had confidence in them. So we demonstrated how the measure worked in their language and used familiar examples (e.g. Newcastle Brown Ale in the North East, which we’d expect to over-trade there, and furniture polish, which we’d expect to have no regionality) and were able to demonstrate that this measure picked up regionality. Once they had confidence in these metrics, we could forget the complexity of the metrics and just look for blue and red things! All this comes from a deep knowledge of the retail (and specifically grocery) sector.
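To make the idea of a size-neutral “over-trade” measure concrete, here is a minimal sketch of one common share-based index. It is not the composite metric used in the project (that is not disclosed here); the function name, the data and the formula are all illustrative assumptions. The index compares a product’s share of its product group in a store with the same share across the whole estate, so 100 means “trades in line with the estate”.

```python
from collections import defaultdict

# Hypothetical weekly sales in GBP: (store, region, product, product_group, sales).
SALES = [
    ("S1", "North East", "brown_ale", "beer", 300.0),
    ("S1", "North East", "lager", "beer", 700.0),
    ("S2", "South West", "brown_ale", "beer", 50.0),
    ("S2", "South West", "lager", "beer", 950.0),
    ("S3", "London", "brown_ale", "beer", 250.0),
    ("S3", "London", "lager", "beer", 4750.0),
]


def over_trade_index(rows, product):
    """Return {store: index}, where 100 means the product's share of its
    group in that store equals its share across the whole estate.

    Illustrative assumption only, not the project's actual composite metric.
    """
    group = next(r[3] for r in rows if r[2] == product)
    store_product = defaultdict(float)
    store_group = defaultdict(float)
    for store, _region, prod, grp, sales in rows:
        if grp != group:
            continue
        store_group[store] += sales
        if prod == product:
            store_product[store] += sales
    estate_share = sum(store_product.values()) / sum(store_group.values())
    return {
        store: 100 * (store_product[store] / store_group[store]) / estate_share
        for store in store_group
    }


index = over_trade_index(SALES, "brown_ale")
for store, value in sorted(index.items()):
    print(store, round(value))
```

In this made-up data the large London store sells five times as much of the product as the South West store in cash terms, yet both get the same index, because the measure looks at share rather than raw sales. Only the North East store stands out as over-trading.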
Problem 3 - Visual Exploration Model
Using our data visualisation domain expertise and Tableau expertise, we built a series of interactive dashboards. These allowed us to start off at a high level, asking questions like “Which sets of products over-trade in certain regions?” and “How many products/brands over-trade in each region?”, and then gradually focus and drill down to ever finer grains of detail as the questions were asked. Much of this came from the buyer and their specific category knowledge, allowing us to quickly focus on the products, product groups and regions to dig into. The dashboard below** is just one of the dashboards we used, and is designed to answer a specific question efficiently (rather than the mega-dashboard, designed to show everything...which ends up showing nothing!). Again, knowledge and experience of visualisation and analytics lead to this design choice.
** In the example below, the scale of the measures has been altered and the stores anonymised, but the insights and findings are real (although I won’t mention the actual product/product group).
1. For a product (or selection of products), this shows the composite metric by store, ordered by that composite measure (and coloured by region).
2. The actual rate of sale (£ per store per week), which gives true scale and is familiar to the buyer.
3. A map, where the stores are coloured and sized on a composite metric.
4. A Pareto curve for the product(s), by store (the shape/steepness of this tells us about the localisation of the product/product group).
5. A composite metric over the entire group of products to which the selected products belong. This indicates the “health” of the entire group of products in the stores (compared to both the estate and a store’s region).
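For readers unfamiliar with the Pareto curve in the list above, the following is a minimal sketch of how its shape can signal localisation. The store names and sales figures are hypothetical, and `stores_for_share` is an illustrative helper rather than anything from the project’s model. The curve is simply the cumulative share of sales as stores are added from best to worst; the fewer stores it takes to reach a given share, the more localised the product.

```python
def pareto_curve(sales_by_store):
    """Cumulative share of total sales, stores ordered from best to worst."""
    ordered = sorted(sales_by_store.values(), reverse=True)
    total = sum(ordered)
    curve = []
    running = 0.0
    for value in ordered:
        running += value
        curve.append(running / total)
    return curve


def stores_for_share(curve, share=0.8):
    """Number of top stores needed to reach the given share of sales."""
    for count, cumulative in enumerate(curve, start=1):
        if cumulative >= share:
            return count
    return len(curve)


# Hypothetical weekly sales (GBP) by store for two products.
local_product = {"A": 150.0, "B": 120.0, "C": 10.0, "D": 10.0, "E": 10.0}
national_product = {"A": 50.0, "B": 55.0, "C": 60.0, "D": 65.0, "E": 70.0}

print(stores_for_share(pareto_curve(local_product)))     # steep curve
print(stores_for_share(pareto_curve(national_product)))  # shallow curve
```

A steep curve (most sales from a handful of stores) suggests a strongly local product; a shallow one suggests demand spread evenly across the estate.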
In this example we are looking at a product from a supplier based on the South West coast, and we can effortlessly see the local importance of this product and how far its “influence” stretches. The buyer was amazed at the rate of sale in the product’s “home” area. They usually only see the average-of-averages figure, which is the gold reference line (15 times lower than the top store). Next, they wanted to see how “important” this product was in these South West stores. OK, let’s click on some stores...
This brings up these stores (6), and shows the ranking of this product among its product group (peers). It's also worth noting that the data behind the above view is around 20k rows...much easier to digest than a 20k-row spreadsheet of the same information.
It is the number 1 or 2 product in these stores. As the buyer usually sees the estate average figure, this product usually appears around the bottom of the product ranking. What we found had actually been happening was this: a local product was launched (generally in its home region, on a trial basis) and the rate of sale was really high, so its distribution was increased to more stores. The rate of sale was then diluted by being in the "wrong stores", and it now looked like a poor performer (and if it's a fresh product, it also had high wastage). The product was then de-listed (everywhere), and a category myth "that local products have been tried and don’t work" was born.
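The dilution effect is easy to see with some simple arithmetic. The sketch below uses entirely hypothetical store counts and rates of sale, chosen only so that the estate average ends up 15 times lower than the home-store rate, in line with the gap described above.

```python
# Hypothetical figures, chosen only to illustrate the dilution effect.
HOME_STORES = 10
HOME_RATE = 150.0    # GBP per store per week in the "home" stores
OTHER_STORES = 280
OTHER_RATE = 5.0     # GBP per store per week everywhere else


def estate_average(home_n, home_rate, other_n, other_rate):
    """Average rate of sale across every store that stocks the product."""
    total_sales = home_n * home_rate + other_n * other_rate
    return total_sales / (home_n + other_n)


# During the trial the product is only in its home stores.
trial_average = estate_average(HOME_STORES, HOME_RATE, 0, 0.0)

# After the roll-out the same product is judged on the whole estate.
rollout_average = estate_average(HOME_STORES, HOME_RATE, OTHER_STORES, OTHER_RATE)

print(f"Trial average:    £{trial_average:.2f} per store per week")
print(f"Roll-out average: £{rollout_average:.2f} per store per week")
print(f"Home stores sell {HOME_RATE / rollout_average:.0f}x the estate average")
```

Nothing has changed in the home stores, yet the single headline number the buyer sees has collapsed, which is exactly how a strong local product ends up looking like a poor performer.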
Here are some of the actions taken
The product is “ring fenced” in the “home” stores so it cannot be de-listed. The “home” range is defined by the buyer based on sales, wastage (if it is a fresh product) and the composite metric. We can export the list of stores, and some key measures needed by supply chain and range planning, in 2 clicks.
“Home” stores are given extra space for these products, and supply and visibility are prioritised. As a local brand will generally be of higher margin than a large national one, every sale you can direct through the local product will be margin enhancing to the category.
Recognition that in the “home” stores these local products are the major brand, so deserve the same support (Space, Point of Sale support...etc.)
Space is reduced (or removed) for this product, outside its “home” stores (freeing this space up and saving wastage, in the case of fresh products). Wastage is a big focus for retailers at the moment, for both financial and PR reasons.
On Figure 1 (5), it can be seen that the ranking of stores for this product (broadly) correlates with the category doing better. So even outside the “home” area, this product sells better where the category performs better. This suggests that, outside the South West, people are buying the product despite it not being from their region (albeit at lower levels); in these stores the product ranks 6 or 7. This helped direct the local sourcing team on where (and for what products) to look, as it is plausible that if these areas had their own local version, it could be the number 1 product there...driving the category even harder (along with the halo effect, and many other good things that come from selling local products and supporting local suppliers).
...and this is all from a five-minute snippet from our half-day workshop (and just one category...we looked at 70).
Hopefully this has demonstrated that, with a well-built visual model (utilising the visual analyst’s retail and data/visualisation knowledge) and by working directly with the buyer (and the specific category knowledge they bring), we were able to ask and answer very deep questions in a fact-driven and efficient way.
...I’d also add that the buyer found this really engaging and fun (yes, that’s the word they used!), which isn’t something I’ve often heard from a buyer regarding the data analysis side of their role. This also increased (IMHO) their analytical thinking, and how they viewed their category. Following the session the buyer was able to take the model (having worked live with me for half a day) and explore questions on their own...Actually using the model (once you are comfortable with what you are looking at) is just clicking on maps/charts, drop-down selectors and sliders...so no more complicated than booking a flight!