Organizations often claim to be data-driven, basing their product and engineering decisions on data. But not all data is equal. So-called vanity metrics, which are misleading and unrepresentative, often masquerade as forcing factors for decisions.
Vanity metrics are metrics that make you look good to others but do not help you understand your own performance in a way that informs future strategies. These metrics are exciting to point to if you want to appear to be improving, but they often aren’t actionable and aren’t related to anything you can control or repeat in a meaningful way. Vanity metrics are most often contrasted against actionable metrics, which is data that helps you make decisions and helps your business reach its goals or grow.
Tableau
At Amazon, vanity metrics are often referred to as output metrics. A common problem is that teams tend to focus on output metrics rather than input metrics. Consider the following passage from the book Working Backwards, by Colin Bryar and Bill Carr:
Before you can improve any system . . . you must understand how the inputs affect the outputs of the system. You must be able to change the inputs (and possibly the system) in order to achieve the desired results. This will require a sustained effort, constancy of purpose, and an environment where continual improvement is the operating philosophy.
Working Backwards (2021, Bryar, Carr)
Input metrics are controllable metrics, known in the industry as leading indicators, whereas output metrics are known as lagging indicators.
“The right input metrics get the entire organization focused on the things that matter most. Finding exactly the right one is an iterative process that needs to happen with every input metric.”
Once teams define and refine their input metrics, those metrics are presented at a recurring Amazon meeting called the Weekly Business Review (WBR).
Amazon’s Metrics Lifecycle
For its continuous improvement cycle, Amazon uses DMAIC, short for Define, Measure, Analyze, Improve, and Control. The mechanism is borrowed from the well-known Six Sigma framework.
1. Identify Input Metrics
Which controllable metrics frame and influence the output metrics? Bryar and Carr recall the following anecdote:
One mistake we made at Amazon as we started expanding from books into other categories was choosing input metrics focused around selection, that is, how many items Amazon offered for sale. Each item is described on a “detail page” that includes a description of the item, images, customer reviews, availability (e.g., ships in 24 hours), price, and the “buy” box or button. One of the metrics we initially chose for selection was the number of new detail pages created, on the assumption that more pages meant better selection.
Working Backwards (2021, Bryar, Carr), as cited in holistics.io
Once we identified this metric, it had an immediate effect on the actions of the retail teams. They became excessively focused on adding new detail pages—each team added tens, hundreds, even thousands of items to their categories that had not previously been available on Amazon.
(…) We soon saw that an increase in the number of detail pages, while seeming to improve selection, did not produce a rise in sales, the output metric. Analysis showed that the teams, while chasing an increase in the number of items, had sometimes purchased products that were not in high demand.
When we realized that the teams had chosen the wrong input metric—which was revealed via the WBR process—we changed the metric to reflect consumer demand instead. Over multiple WBR meetings, we asked ourselves, “If we work to change this selection metric, as currently defined, will it result in the desired output?” As we gathered more data and observed the business, this particular selection metric evolved over time from
– number of detail pages, which we refined to
– number of detail page views (you don’t get credit for a new detail page if customers don’t view it), which then became
– the percentage of detail page views where the products were in stock (you don’t get credit if you add items but can’t keep them in stock), which was ultimately finalized as
– the percentage of detail page views where the products were in stock and immediately ready for two-day shipping, which ended up being called ‘Fast Track In Stock’.
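To make the evolution of this metric concrete, here is a minimal sketch that computes each successive version from the same data. The `PageView` record, its field names, and the sample values are all hypothetical, invented for illustration; the book does not describe how Amazon actually computes these numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageView:
    """One customer view of a detail page (hypothetical record shape)."""
    page_id: str
    in_stock: bool        # was the item in stock when viewed?
    two_day_ready: bool   # was it ready for two-day shipping?


def selection_metrics(all_page_ids: set[str], views: list[PageView]) -> dict[str, float]:
    """Compute the four successive versions of the selection metric."""
    total_views = len(views)
    in_stock_views = sum(1 for v in views if v.in_stock)
    fast_track_views = sum(1 for v in views if v.in_stock and v.two_day_ready)

    def pct(part: int) -> float:
        # Guard against division by zero when nothing was viewed.
        return 100.0 * part / total_views if total_views else 0.0

    return {
        "detail_pages": float(len(all_page_ids)),            # version 1
        "detail_page_views": float(total_views),             # version 2
        "pct_views_in_stock": pct(in_stock_views),           # version 3
        "pct_views_fast_track_in_stock": pct(fast_track_views),  # version 4
    }


# Made-up sample data: four pages exist, but one is never viewed.
pages = {"book-1", "toy-7", "lamp-3", "vase-9"}
views = [
    PageView("book-1", in_stock=True, two_day_ready=True),
    PageView("book-1", in_stock=True, two_day_ready=True),
    PageView("toy-7", in_stock=True, two_day_ready=False),
    PageView("lamp-3", in_stock=False, two_day_ready=False),
]
metrics = selection_metrics(pages, views)
```

With this sample, adding the unviewed `vase-9` page raises the first metric but leaves the other three untouched, which is the gap the retail teams fell into when they chased page counts.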
Instrumentation comes next, to help validate your input metrics. As you move from hypothesis to measurement, you make sure your tooling does not introduce bias into the measurement, and you establish a mechanism for auditing your metrics.
As your measurement matures in the lifecycle, “you develop a comprehensive understanding of the underlying drivers behind the metrics” (2021, Bryar, Carr). The authors also describe this as reducing the variance, so that the process becomes predictable and controllable.
Charlie Bell, an SVP in AWS, has a saying: “when you encounter a problem, the probability you’re actually looking at the actual root cause of the problem in the initial 24 hours is pretty close to zero, because it turns out that behind every issue there’s a very interesting story.”
Working Backwards (2021, Bryar, Carr), as cited in holistics.io
Next comes the iteration part, where you look at the output metrics and use them as a guide to improve your product feature. You make changes that should lead to improvements in the output metrics. The authors note that as you improve your features, some input metrics become less useful, in which case it is fine to deprecate them in favor of more useful ones.
The final phase is control, which ensures that your mechanism keeps performing well and does not regress. This may eventually lead to complete automation and other improvements.
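One way to picture a control check is a simple control-limit test in the spirit of Six Sigma control charts. This is a sketch under stated assumptions, not something the book prescribes: the function names, the three-sigma threshold, and the sample values are all illustrative.

```python
from statistics import mean, stdev


def control_limits(baseline: list[float], sigmas: float = 3.0) -> tuple[float, float]:
    """Return (lower, upper) limits: baseline mean +/- `sigmas` sample std devs."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation; needs >= 2 points
    return center - sigmas * spread, center + sigmas * spread


def out_of_control(baseline: list[float], recent: list[float],
                   sigmas: float = 3.0) -> list[tuple[int, float]]:
    """Return (index, value) for each recent point outside the control limits."""
    lower, upper = control_limits(baseline, sigmas)
    return [(i, v) for i, v in enumerate(recent) if v < lower or v > upper]


# Made-up weekly values of an input metric, e.g. a percentage.
baseline_weeks = [95.0, 96.0, 94.0, 95.0, 96.0, 94.0]
recent_weeks = [95.5, 94.2, 88.0]
flagged = out_of_control(baseline_weeks, recent_weeks)
```

Here only the third recent week (88.0) falls outside the limits, so it is the one that would warrant a closer look. A check like this only flags a candidate regression; finding the cause still requires the kind of analysis described earlier.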
The Amazon Deck
Within the Weekly Business Reviews (WBRs) at Amazon, a deck contains the most important metrics in an organization. holistics.io highlights a few notable properties of a deck:
- The deck represents an end-to-end view of the business. This is deliberate: the authors write that “while departments shown on org charts are simple and separate, business activities usually are not. The deck presents a consistent, end-to-end review of the business each week that is designed to follow the customer experience with Amazon. This flow from topic to topic can reveal the interconnectedness of seemingly independent activities.”
- The deck is primarily charts, graphs and data tables. Since there are hundreds of visualizations to review, written notes will bog the meeting down too much. Two notable exceptions to this rule are ‘exception reporting’, as well as the ‘voice of the customer’ anecdotes that customer service is allowed to insert into the metrics deck.
- There is no ideal number of metrics to review. Amazon itself constantly adds, modifies and removes metrics from the WBR deck as business needs evolve.
- Emerging patterns are a key focus. You want trend lines, and you want to know them long before they show up in a quarterly or yearly result.
- Graphs are usually plotted against a comparable prior period. Metrics make sense when compared against prior periods, so that you have a proper apples-to-apples comparison (for instance, you’ll want to compare holiday periods to a prior holiday period, not to a slow period).
- Graphs show two or more timelines, for example, trailing 6-week and trailing 12-months. Small but important issues tend to only show up in shorter trend lines; they tend to be smoothed out in longer ones.
- Anecdotes and exception reporting are woven into the deck. The only exception to the ‘charts, graphs and data tables’ rule are anecdotes and exception reporting. About which, more later.