Even with “Do Not Track” features now built into new versions of the most popular web browsers, there is no official definition of what constitutes “tracking.” This fundamental rule determines what data may or may not be collected from users making a global privacy election. At PrivacyChoice it defines which companies are in or out of the Tracker Index, which in turn powers our tracking control services, including TrackerBlock for Firefox and Internet Explorer.
This post outlines the PrivacyChoice working definition of “tracking.” In my view, a workable definition should meet three requirements:
- Scope – is it broad enough to include core user concerns, but not so broad that it interferes with user expectations?
- Comprehension – can it be communicated briefly and unambiguously to users?
- Verification – can compliance with the election be verified both by tracking companies and web users?
In a thoughtful and comprehensive analysis, the Center for Democracy and Technology provided a draft definition of “tracking” toward these goals. Our definition restates and refines the CDT’s approach, reflecting what we have learned in classifying hundreds of companies in our index. We expect that this definition will evolve in line with those used by lawmakers, regulators and industry organizations, and we welcome comments.
Tracking defined as a choice for the user
The most important version of the definition is the one presented to web users, which must be stated in brief, plain terms. Here’s how it can be presented in the context of a global election, such as a check box choice in web browser settings:
“Companies should ask before using or sharing what I do across websites.”
Tracking defined in detail
This is our more detailed working definition of “tracking,” which conforms in large part with the CDT definition:
The non-consensual use or transfer
of behavioral data collected
across websites or applications
as to an individual, computer or device.
Taking apart the definition phrase by phrase shows how it may or may not include certain kinds of activities:
“non-consensual”
- First-party use of behavioral data is deemed to be consensual insofar as such use is obvious to the user. Individual websites do not engage in “tracking” when they customize content and ads within the bounds of their own site.
- By explicitly indicating a different intent, the user may negate the generic do-not-track election on any site or with any tracking company.
- The aggregation of data across differently branded sites, even if commonly owned, may not be considered “consensual” if the user would not be expected to understand that the sites are related.
“use or transfer”
- The relevant activities are defined as the “use or transfer” of data, not the act of data collection itself. (Behavioral data may inevitably still be collected, such as when the election is made through a do-not-track header rather than in-browser blocking.)
- Even though data may be collected, companies can demonstrate compliance with the election either by not retaining the data or by adhering to internal controls governing whether and how it can be accessed.
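To make the header-based election concrete, here is a minimal sketch of how a collecting server might honor it by not retaining behavioral data. It assumes the election arrives as the `DNT: 1` request header; the function names, the event dictionary and the in-memory `store` are hypothetical and only illustrate the "do not retain" route to compliance, not any particular company's practice.

```python
from typing import List, Mapping


def do_not_track_elected(headers: Mapping[str, str]) -> bool:
    """Return True when the request carries the do-not-track header 'DNT: 1'."""
    # HTTP header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"


def record_event(headers: Mapping[str, str], event: dict, store: List[dict]) -> bool:
    """Retain a behavioral event only if the user has not elected do-not-track.

    Returns True when the event was retained. The request still reaches the
    server (collection is unavoidable here), but nothing is kept for later use.
    """
    if do_not_track_elected(headers):
        return False
    store.append(event)
    return True
```

A company relying on internal access controls instead of non-retention would need a different mechanism, which this sketch does not attempt to cover.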
“behavioral data collected”
- This could include almost any interaction initiated by the user, such as pages viewed or search queries entered.
- This may include demographic data (like gender or age), when inferred from behavioral data or shared across sites. When gathered directly from users, the use or sharing may be excluded as consensual if disclosure was provided and consent obtained.
- It would not include data that does not involve a user interaction. For example, merely counting the number of times a user has seen an ad (such as for the purpose of avoiding duplication) is not behavioral data.
- Companies can specifically affirm particular interactions as not involving “tracking,” by including an unambiguous string within the URL or cookie used for this purpose. In the PrivacyChoice tracking protection lists for IE9, interactions are not blocked when they include a “not_tracking” string. This undertaking is a promise to the user, and if inappropriately used, would constitute a deceptive practice subject to industry or regulatory enforcement.
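The “not_tracking” affirmation described above can be sketched as a simple check. This is an illustration only: the actual syntax of the IE9 tracking protection lists is not shown here, and the case-sensitive substring match, the function names, the `is_listed_tracker` flag and the example URLs are all assumptions.

```python
from typing import Mapping, Optional

# The marker string named in the text; matching it as a plain,
# case-sensitive substring is an assumption of this sketch.
NOT_TRACKING_MARKER = "not_tracking"


def affirms_not_tracking(url: str, cookies: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if the URL or any cookie name or value carries the marker."""
    if NOT_TRACKING_MARKER in url:
        return True
    for name, value in (cookies or {}).items():
        if NOT_TRACKING_MARKER in name or NOT_TRACKING_MARKER in value:
            return True
    return False


def should_block(url: str, cookies: Optional[Mapping[str, str]] = None,
                 is_listed_tracker: bool = True) -> bool:
    """Block a request to a listed tracker unless it affirms 'not_tracking'."""
    return is_listed_tracker and not affirms_not_tracking(url, cookies)
```

For example, a request to the hypothetical URL `https://ads.example.com/pixel?not_tracking=1` would pass through, while the same request without the marker would be blocked. The check only detects the promise; whether the promise is kept is a matter for enforcement.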
“across websites or applications”
- This requirement supports the objective to control the use or transfer of cross-site or cross-application profile information.
- Providing data access to third-party vendors, like website analytics providers, does not typically result in aggregation of individual profiles across websites, so it would not be deemed to be “tracking.” However, the use of cross-site analytics data (such as the benchmarking option in Google Analytics) could be considered tracking, depending on whether an individual’s cross-site activities are available to be queried by any party other than the site where they were collected. (Unfortunately, I’m not aware of a means to determine whether a particular website has enabled the pooling of data for benchmarking.)
- Gleaning navigational information and search queries from referring URLs, a widespread and immensely valuable technique for websites, involves the collection of behavior across websites; the user’s behavior on Google (entering a query) is used together with the user’s behavior on the website reached by clicking on a search result. This issue will be explored in a follow-on post.
- A website’s use of data collected within a single site for the purpose of retargeting on other sites could potentially fall outside the definition of “tracking,” depending on whether retargeting requires information transfer to third parties; it might not constitute tracking if third-party recipients are strictly vendors with no further ability to use, share or combine the information with other data. This issue will also be explored in another post.
“individual, computer or device”
- The use or transfer of aggregated data is not covered. Data is “aggregated” when it cannot be associated with any individual computer or device.
Not included: Purpose or timing
- This definition of tracking does not depend on the purpose of the activity. The consumer concern is the aggregation of cross-site activities for any purpose, or however transferred. This avoids the inherent ambiguity and uncertainty of a purpose-based definition. It also means that a do-not-track election could apply outside of advertising and marketing; it may include market research to the extent that cross-site user actions are aggregated.
- Unlike the CDT initial definition, there is no requirement that tracking occur “over time,” since it doesn’t necessarily add meaning to “across websites or applications.”
- The definition does not directly address the use of offline data. However, to the extent “joining” offline data to an online profile requires a transfer from one party to another, it would be deemed to be “tracking.”
- External classifications of “tracking” activity are necessarily based on incomplete information; there is no foolproof way for a third party to determine whether or how behavioral information is actually collected or used through server processes.
- When PrivacyChoice systems see a company or domain engaged in the potential collection of data through unique identifiers (such as cookies or IP addresses) across different websites, we adopt a presumption that “tracking” is occurring. This presumption can be overcome by explicit statements that describe the company’s business activities or data and privacy practices.
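The presumption in the last point can be sketched as a simple rule over crawl observations. This is a hedged illustration, not a description of the actual PrivacyChoice systems: the `Observation` record, the `cleared_domains` set (standing in for companies whose explicit statements overcame the presumption) and the two-site threshold are all assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set


class Observation(NamedTuple):
    """One sighting of a third-party domain on a website (hypothetical record)."""
    third_party_domain: str
    website: str
    uses_unique_identifier: bool  # e.g. a cookie or IP address was involved


def presumed_trackers(observations: Iterable[Observation],
                      cleared_domains: Iterable[str] = (),
                      min_sites: int = 2) -> Set[str]:
    """Return domains presumed to be tracking.

    A domain is presumed to track when it is seen with a unique identifier
    on at least `min_sites` different websites, unless it appears in
    `cleared_domains`.
    """
    cleared = set(cleared_domains)
    sites_by_domain: Dict[str, Set[str]] = defaultdict(set)
    for obs in observations:
        if obs.uses_unique_identifier:
            sites_by_domain[obs.third_party_domain].add(obs.website)
    return {
        domain
        for domain, sites in sites_by_domain.items()
        if len(sites) >= min_sites and domain not in cleared
    }
```

Because the rule sees only what is observable from outside, it inherits the limitation noted above: it flags the potential for cross-site collection, not proof of how data is used on the server.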