Opponents of a browser-based Do-Not-Track choice often point to the question of definition: What is “tracking” and will web users really understand what they are choosing if they turn it off? Will “tracking” be defined in a way that destroys web-site analytics and other useful (and less-concerning) data collection practices?
While these questions generally are not hard to answer, there is one important obstacle to a consistent and workable definition of “tracking.” This has to do with the collection and use of referring URL information.
Whenever you click on a link that delivers you from one site to another, browsers automatically transmit to the destination website the URL of the page where you came from. This communicates to the new site something about your behavior on the previous website, including the website itself and potentially the nature of the content you were viewing. When your click comes from a search engine, the referring URL also includes the query you entered into the search engine. It’s not obvious that most users understand this flow of information about their behavior, even though the web is wired to work this way.
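To make this flow concrete, here is a minimal sketch of what a destination site could read out of a referring URL. It assumes the search engine carries the query in a `q` parameter, and both the URL and the `describe_referrer` function are hypothetical, made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

def describe_referrer(referrer: str) -> dict:
    """Return what a destination site can learn from a referring URL."""
    parts = urlsplit(referrer)
    params = parse_qs(parts.query)
    return {
        "site": parts.netloc,   # which website the visitor came from
        "path": parts.path,     # hints at the content being viewed there
        # Assumption: the search engine puts the query in a "q" parameter.
        "search_query": params.get("q", [None])[0],
    }

# Hypothetical referring URL produced by a click on a search result.
info = describe_referrer("https://search.example/search?q=knee+pain+treatment")
print(info)
```

Nothing here requires cooperation from the user or from the referring site: the destination only parses a string the browser has already sent.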
This collection of data seems squarely within common conceptions of “tracking” when defined as the collection or use of an individual’s actions across different websites (see the Center for Democracy and Technology draft approach and our own working definition). When referring URL data from the first site is added to data from the second site, the result is a profile based on cross-site behavior. This profile may include the most intimate artifact of intent, a search query, which the user may not even realize is being shared.
For the purposes of Do-Not-Track policy, there are at least three ways to handle referring URL data:
1. Require strict compliance.
An uncompromising approach would make no exception for referring URL data. Websites would need to refrain from collecting it when a Do-Not-Track election is in place.
Such an approach would create tremendous practical problems. The burden of Do-Not-Track compliance would fall not only on a few hundred companies dedicated to tracking, but also on millions of websites that routinely use such cross-site data for any number of purposes: to understand where traffic comes from, customize landing pages, track advertising attribution and determine which search terms to buy in search ads. While not all of these applications feed a cross-site profile associated with an individual computer or device, systems typically are not engineered to avoid that result.
2. Define “tracking” to exclude referring-URL information.
This approach avoids the practical issues, at the cost of less consistency and clarity for users. It would no longer be accurate to present a simple Do-Not-Track choice, such as, “Should companies ask before using or sharing what you do across websites?”
On the other hand, there is a qualitative difference between the continuous collection of profile data across multiple websites and a profile gathered primarily on a single website, supplemented by more incidental behavioral data that relates only to how users navigate to that website. Excluding referring-URL data from the Do-Not-Track framework does not necessarily undermine its core purpose.
3. Exempt direct first-party collection, but encourage or require intermediaries to comply.
It is technically feasible for search engines to strip the search query from referring URLs if the user has a Do-Not-Track election in place. Providers of website analytics and tools (the means by which the vast majority of websites collect and use referring URLs) could likewise filter out that data for users who have Do-Not-Track elections in place. As a practical matter, a handful of search engines and a few hundred analytics providers could implement Do-Not-Track in a way that preserves the consistency and clarity of the “tracking” definition. Aggregate information, not associated with a specific user, computer or device, could still be collected and deployed.
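As a rough illustration of this third option, the sketch below shows both intermediaries. It assumes the election reaches the server as a `DNT: 1` request header and that headers arrive as a plain dictionary with exactly these key names. The function and class names are hypothetical and do not describe any real search engine or analytics product.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit

def has_dnt_election(headers: dict) -> bool:
    # Assumption: the election is signaled as a "DNT: 1" request header.
    return headers.get("DNT", "").strip() == "1"

def outbound_referrer(search_url: str, headers: dict) -> str:
    """Search-engine side: the referring URL to expose to the destination."""
    if not has_dnt_election(headers):
        return search_url
    parts = urlsplit(search_url)
    # Keep scheme, host and path; drop the query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

class Analytics:
    """Analytics side: aggregate counts always, per-visitor data only without DNT."""

    def __init__(self):
        self.referring_sites = Counter()  # aggregate, not tied to any visitor
        self.profiles = {}                # visitor_id -> list of referring URLs

    def record_visit(self, visitor_id: str, headers: dict) -> None:
        referrer = headers.get("Referer")
        if not referrer:
            return
        # Aggregate traffic-source counts are kept regardless of the election.
        self.referring_sites[urlsplit(referrer).netloc] += 1
        # The per-visitor record is written only when no election is in place.
        if not has_dnt_election(headers):
            self.profiles.setdefault(visitor_id, []).append(referrer)
```

With this split, a site still learns how much traffic each referring site sends, while the individual search query never reaches a per-visitor profile for users who have made the election.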
Given its dominant position in both web search and website analytics, Google will ultimately decide how a Do-Not-Track election bears on the use of referring URL information. Something tells me that, even if Google otherwise accepts Do-Not-Track (which it hasn’t), Google won’t read it to apply to referring URL data. The free movement of referring URL data is simply too critical to the search advertising ecosystem (and therefore to Google’s revenue) to expect any other outcome.