“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.
Why was person-level measurement important?
Much of the value of digital advertising historically has been associated with its ability to be more finely measurable than traditional media such as print or linear TV. Device-level tracking data enabled deterministic, bottom-up measurement for the first time. Bottom-up measurement offers marketers more detail than traditional top-down aggregate approaches such as media mix modeling. The focus of innovation, as a result, has been on the data set – associating devices with people through device graphs to deliver increasingly accurate deterministic measurement at the person level. This focus on the data set has come at the expense of measurement methodology, which is often ignored or left at “default settings” in a platform or media plan.
But has bottom-up measurement ever been purely deterministic? Of course not. All advertising measurement is probabilistic. The differences between statistical models such as Shapley Regression or Random Forest used by attribution or brand lift platforms are far more important – in terms of how uncertainty is measured – than whether the bottom-up data set is actually at an individual person level or, for example, at the level of cohorts composed of multiple individuals.
It’s important to recall the specific advantage of bottom-up digital advertising measurement: With the granular data sets yielded by digital, it can be easier to identify the causal drivers of ad effectiveness. However, the data set doesn’t need to be person-level to achieve this. A large set of groups of individuals (e.g., thousands of ZIP+4 areas) can still yield enough variance in combinations of drivers for a measurement model to parse the incremental impact of each driver, thus overcoming the problem known as multicollinearity, in which drivers move together too closely for their effects to be separated.
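To make the multicollinearity point concrete, here is a minimal sketch in Python. It assumes a toy model with only two drivers and no intercept, and the function name and sample numbers are hypothetical, not taken from any attribution platform. When cohorts see different combinations of the two drivers, a least-squares fit can recover each driver's effect; when the two drivers always move together, it cannot.

```python
def fit_two_drivers(x1, x2, y):
    """Least-squares fit of y = b1*x1 + b2*x2 (no intercept), via the normal equations.

    x1, x2: per-cohort exposure to each driver (hypothetical inputs)
    y:      per-cohort outcome, e.g. conversions
    """
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))

    # The determinant is zero when one driver is an exact multiple of the other.
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("drivers are perfectly collinear; effects cannot be separated")

    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2


# Four toy cohorts covering every on/off combination of the two drivers.
x1 = [0, 1, 0, 1]
x2 = [0, 0, 1, 1]
y = [0, 2, 3, 5]  # constructed as 2*x1 + 3*x2
b1, b2 = fit_two_drivers(x1, x2, y)
```

In this toy data the fit returns the two effects used to construct `y`. Passing the same list for both drivers raises the error instead, which is the cohort-level version of the problem the paragraph above describes.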
What’s next? Micro cohorts.
Enter micro cohorts. Micro cohorts are small groups of people, such as a household, a small geographical area or any collection of up to a few hundred people, that are measured as a privacy-safe surrogate for the underlying individuals. In cross-channel measurement with no unifying 1:1 identifier, you don’t know which individuals were exposed to a campaign, but you can determine the exposure probability of each micro cohort, along with other demographic and behavioral data at the micro cohort level.
Since there can be thousands or tens of thousands of micro cohorts in a data set, bottom-up approaches to measurement can be ported over to a micro cohort-level data set, with a small post hoc adjustment to results to account for the probability of exposure.
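One plausible form of that adjustment, sketched in Python under a stated assumption: if only a fraction of a cohort was actually exposed, the lift measured across the whole cohort is diluted, and dividing by the exposure probability scales it back to an estimate for exposed members. The function name and the numbers are hypothetical, and real platforms may adjust differently.

```python
def adjust_for_exposure(cohort_lift: float, exposure_prob: float) -> float:
    """Scale a cohort-level lift to an estimated lift among exposed members.

    Assumes unexposed members of the cohort contribute no lift, so the
    cohort-level figure equals exposure_prob * lift_among_exposed.
    """
    if not 0.0 < exposure_prob <= 1.0:
        raise ValueError("exposure_prob must be in (0, 1]")
    return cohort_lift / exposure_prob


# A cohort showing a 2-point lift, where half the cohort was likely exposed.
lift_among_exposed = adjust_for_exposure(cohort_lift=0.02, exposure_prob=0.5)
```

Under that assumption, a 2-point lift across a half-exposed cohort corresponds to roughly a 4-point lift among the people who saw the campaign.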
Concerned about incrementality and ensuring that your driver analysis shows causal impact and not just correlations? Measuring incrementality is no different with micro cohorts than with person-level tracking. Media buying platforms can create hold-out groups at the micro-cohort level to conduct randomized controlled experiments. If hold-outs aren’t possible, a sufficiently large set of micro cohorts within a campaign can be analyzed for causal drivers using recent econometric innovations known as causal machine learning.
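A cohort-level hold-out test can be sketched in a few lines of Python. This is an illustration only: the function names, the 20% hold-out share and the conversion rates are hypothetical, and a real experiment would also need a significance test.

```python
import random
from statistics import mean


def assign_holdout(cohort_ids, holdout_share=0.2, seed=0):
    """Randomly split micro cohorts into (treated, holdout) sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(cohort_ids)
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_share)
    return set(shuffled[n_holdout:]), set(shuffled[:n_holdout])


def incremental_lift(conversion_rate, treated, holdout):
    """Difference in mean conversion rate between treated and hold-out cohorts."""
    treated_mean = mean(conversion_rate[c] for c in treated)
    holdout_mean = mean(conversion_rate[c] for c in holdout)
    return treated_mean - holdout_mean


# 100 toy cohorts; treated cohorts convert at 5%, hold-outs at 3%.
treated, holdout = assign_holdout(range(100), holdout_share=0.2)
rates = {c: (0.05 if c in treated else 0.03) for c in range(100)}
lift = incremental_lift(rates, treated, holdout)
```

Because whole cohorts are randomized rather than individuals, no 1:1 identifier is needed to keep the two groups apart.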
Does the need for an exposure probability adjustment render the results less accurate? No. In fact, the gold standard for incrementality measurement – the Ghost Ads methodology introduced by Google – relies upon adjustment of results to account for exposure probability when used in programmatic platforms. This variation of Ghost Ads, known as Predicted Ghost Ads, accounts for the fact that users in a treatment group may win a DSP auction, but lose the exchange or header auction and not be exposed to an ad.
Micro cohort-level measurement is attractive in several other ways as well. Non-reliance on 1:1 identifiers such as hashed emails and device IDs mitigates the user-level contamination of test and control groups that comes as more and more users become unidentifiable and fewer impressions can be tracked. Also, with only about 10% of internet traffic logged in, micro cohorts offer an approach to cover an entire digital media plan. Lastly, brands can still use person-level first-party data (or opted-in second- and third-party data) to validate and calibrate micro cohort-level models.
Innovation is already happening
So, which companies are innovating in micro cohort measurement?
Clean rooms from Google, LiveRamp, InfoSum and Habu: Marketers cannot access user-level data via clean rooms, which protect user privacy by aggregating output into groupings. Google Ads Data Hub (ADH), for example, enforces a query threshold of 50 users. Therefore, exporting data from clean rooms into a custom or cross-platform measurement model will by definition require the inclusion of micro cohorts with associated exposure probabilities.
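The aggregation rule can be illustrated with a short Python sketch. This is not the Ads Data Hub API or any vendor's implementation; the function name, the row layout and the group labels are hypothetical. It shows only the general idea that groups below a minimum user count are withheld from the output.

```python
from collections import defaultdict

MIN_USERS = 50  # the threshold cited above; other clean rooms may differ


def aggregate_with_threshold(rows, min_users=MIN_USERS):
    """Aggregate (user_id, group, converted) rows and suppress small groups.

    Returns {group: {"users": distinct user count, "conversions": converted rows}}
    for groups with at least min_users distinct users.
    """
    users = defaultdict(set)
    conversions = defaultdict(int)
    for user_id, group, converted in rows:
        users[group].add(user_id)
        conversions[group] += int(converted)
    return {
        group: {"users": len(ids), "conversions": conversions[group]}
        for group, ids in users.items()
        if len(ids) >= min_users
    }


# Toy input: group "A" has 60 users, group "B" only 10.
rows = [(f"a{i}", "A", i % 10 == 0) for i in range(60)]
rows += [(f"b{i}", "B", True) for i in range(10)]
report = aggregate_with_threshold(rows)
```

Group "B" never appears in the report, so what reaches the marketer is already a set of cohorts rather than individuals.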
Crossix: In order to meet the strict privacy requirements of healthcare ad measurement, Crossix ties media exposure to individual health behavior and aggregates the data to a micro-cohort level based on campaign metadata. This allows Crossix DIFA customers to see the impact of healthcare advertising on groups of people, including metrics such as doctor visits and new/continuing prescriptions.
WebKit Privacy Preserving Click Attribution: One of several proposed browser-based measurement approaches, this scheme limits measurement of conversions to just 64 possible campaign IDs and prevents user-level identification. It is probably not granular enough to qualify as “micro cohorts,” but it is consistent with the idea that protecting privacy requires anonymizing users inside aggregate groups.
Further innovation in privacy-friendly measurement will depend on whether media buyers are ready to let go of yesterday’s dream of cross-domain, person-level tracking, and embrace a future of innovative micro cohort-based data. As data sets lose granularity, marketers will also have to start evaluating measurement methods, and cease relying on the “brute force” of ever larger and more detailed user-level graphs. The future will belong to those with the most meaningfully segmented audience cohorts and measurement models that take account of a changing privacy landscape.