The Cookie directive and HTML5

In 2002, the European Community introduced Directive 2002/58/EC, commonly known as the Directive on privacy and electronic communications. Among other provisions, it contains subarticle 5(3), which has earned it the nickname “the cookie directive”. The subarticle states that information may be stored in or retrieved from end user computers only if the user is made aware of this and is given the opportunity to refuse the storage or retrieval. In 2009 the directive was amended (by 2009/136/EC) so that storage or retrieval is only permitted if the user has given his or her consent.

The full text of the amended subarticle is as follows:

3. Member States shall ensure that the storing of information, or the gaining of access to information already stored, in the terminal equipment of a subscriber or user is only allowed on condition that the subscriber or user concerned has given his or her consent, having been provided with clear and comprehensive information, in accordance with Directive 95/46/EC, inter alia, about the purposes of the processing. This shall not prevent any technical storage or access for the sole purpose of carrying out the transmission of a communication over an electronic communications network, or as strictly necessary in order for the provider of an information society service explicitly requested by the subscriber or user to provide the service.

This provision is generally understood to apply to the HTTP State Management Mechanism (RFC 6265, earlier RFC 2965 and RFC 2109), most commonly known as “cookies”. In fact, recital 25 of 2002/58/EC and recital 66 of 2009/136/EC explicitly mention cookies as one example of such mechanisms. National regulations, and guidelines in particular, have focused on this mechanism for storing and accessing information on end user computers over a network.

But the directive text can clearly apply to mechanisms other than HTTP cookies. Mechanisms that permit similar storage and retrieval of information include the Local Shared Object mechanism found in Flash, the userData functionality in Internet Explorer and, more recently, a variety of mechanisms being defined and implemented under the HTML5 umbrella.

Two questions are therefore interesting:

  1. Are there mechanisms in HTML5 that allow user tracking (including by third parties) in a way that is not subject to the consent requirement?
  2. Are there mechanisms in HTML5 that raise no privacy concerns, yet are subject to the consent requirement?

The first question is the most sensitive, and the hardest to answer. But consider a JavaScript file that is served by a third-party ad network and is included by a number of unrelated content sites. Suppose such a script:

  1. Generates a local GUID on the client (i.e. an identifier that the ad network did not choose).
  2. Stores this GUID in local storage.
  3. Sends this GUID back to the ad network using a background XMLHttpRequest (presumably together with some other information, such as the URL of the page embedding the script).

(Steps 1 and 2 are skipped if the GUID is already present in local storage.)
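
The three steps can be sketched in TypeScript roughly as follows. This is an illustration under stated assumptions, not code from any actual ad network: the storage key, the function names and the MemoryStore class are all hypothetical, and the network call is passed in as a callback (in a browser it could wrap an XMLHttpRequest) so that the sketch also runs outside a browser. Note too that browsers scope local storage by origin, so for the same GUID to turn up on unrelated sites the code would presumably have to run in a frame served from the ad network's own origin; the sketch ignores that detail.

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical key name, chosen for illustration only.
const STORAGE_KEY = "adnet-visitor-id";

// What gets reported in step 3 (the field names are assumptions).
type Report = { guid: string; page: string };

// Step 1: build a 32-character hex identifier on the client.
// The ad network never chooses this value. Math.random is good enough
// for a sketch, but it is not a cryptographically strong source.
function generateGuid(): string {
  let hex = "";
  for (let i = 0; i < 32; i++) {
    hex += Math.floor(Math.random() * 16).toString(16);
  }
  return hex;
}

function trackVisit(
  store: KeyValueStore,
  pageUrl: string,
  send: (report: Report) => void
): string {
  let guid = store.getItem(STORAGE_KEY);
  if (guid === null) {
    // Steps 1 and 2 only run when no identifier has been stored yet.
    guid = generateGuid();
    store.setItem(STORAGE_KEY, guid);
  }
  // Step 3: report the identifier together with the embedding page's URL.
  send({ guid: guid, page: pageUrl });
  return guid;
}

// In-memory stand-in for localStorage, so the sketch runs anywhere.
class MemoryStore implements KeyValueStore {
  private data: Record<string, string> = {};
  getItem(key: string): string | null {
    const value = this.data[key];
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.data[key] = value;
  }
}
```

Calling trackVisit twice against the same store, with two different page URLs, reports the same identifier both times. That is all the receiving side needs in order to link the two page views to one browser.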

Such a script has the same ability as a cookie to track a user’s movement across sites, and to assign a user (or rather his or her computer) a permanent identifier. But does it require consent according to article 5(3)? One way to argue that it does not is to take note of recital 66: “Third parties may wish to store information on the equipment of a user, or gain access to information already stored, for a number of purposes”. It may be argued that in steps 1 and 2 it is not the third party (the ad network) that stores the information (indeed, the ad network does not know what information is stored). If the third party has not stored the information, then the gaining of information in step 3 might not be subject to the rule either, since the wording seems to require that information gained by a third party must have been previously stored by the same party. (If there is no such requirement, one must note that every third party whose resources are included by a web page automatically gains access to a lot of information, such as the User-Agent string, and ask whether that gaining of information is subject to the directive as well.)

I will concede that this argument is not strong, as it rests on the assumption that steps 1 and 2 do not constitute information storage by the third party, even though the third party is responsible for sending the JavaScript code that ultimately results in information being stored. The result seems functionally equivalent to traditional storage of information using HTTP cookies. But the difference is that with this method, the third party does not specify what information should be stored. Could this not be significant?

The second question seems easier to answer. Consider offline web applications. These are web pages that contain a reference to all the resources (HTML, JavaScript, CSS) they require in order to work. A browser supporting offline applications will download all these resources so that the application works even if there’s no internet connection. Note that if the browser does not support offline apps, they still work; they just require you to be online. A simple example containing a version of the Halma game is described by Mark Pilgrim.

This mechanism causes information to be stored on the end user's computer. This storage is not strictly necessary in order to provide the service (remember, the app works without the mechanism if the user is online; offline support is just a nice-to-have). No information is ever accessed by the provider of the game, but access is not a requirement of the directive: the storing of information is enough. Thus, consent is needed. And yet there are no privacy concerns, as no personally identifiable information is ever retrieved.

The aim of article 5(3) was to regulate certain uses of cookies perceived to be illegitimate. But it was written to be technology neutral, as new techniques similar to HTTP cookies were sure to be created after the directive. (The diabolical evercookie uses 12 additional mechanisms, including a brilliantly twisted way of storing information in the user’s browser history of visited URLs.) The problem is that such mechanisms are only similar, not identical. This makes writing technology-neutral legislation really difficult.