
Consider privacy issues in tracking Web server activity

Do you have clients that track user data? Just how well have they spelled out their tracking methods? While most clients are aware of the P3P standard, some need help. This article will guide you through the thorny thicket of privacy standards.


Many of us have spent a good deal of time at Amazon.com, and as regular visitors we're used to seeing recommendations offered to us on the home page, or reminders of what we last purchased. The site bases its recommendations, of course, on those past purchases. Amazon's server follows us around the store, scribbling down every stop we make and noting every item we look at. It's intended as a convenience, and to prompt us to buy, buy, buy. Amazon wins through suggestive selling; we, the users, are pleased because we have the illusion of being a "favorite" customer, as if the clerk remembers what we like.

This kind of user profiling is increasingly common in the Web universe, enabled by more sophisticated user tracking analysis software and advanced server logging capabilities. The rationale is obvious: a Web site's efficiency can be greatly enhanced if user activity is tracked, and why not? It may not be my next door neighbor's business where I browse the latest science fiction paperbacks, but I fully expect Amazon to take an interest.

We participate in this process with Amazon willingly and find it a pleasure. But other Web sites quietly compile profiles on users that become data for marketers. From those profiles, purchase patterns can be discerned, or a company's purchasing power assessed. Ethical questions arise: Can a company gather this data via its Web site without the customer's consent? Can a company pass this information along to other businesses with impunity?

The customer's trail of bread crumbs
User activity can be tracked on Web sites by way of cookies, the small text records a server stores in the browser to capture the state of exchange between client and server. How is cross-site tracking done? By means of a third-party cookie: a unique identifier passed to a browsing user along with the tracking company's advertisement graphic. Each time the user's browser loads that graphic, whether on the tracking company's own site or on any other site where its advertising appears, the browser sends the identifier back to the tracking company's server. The user's browsing and usage patterns across all of those sites are now known to the tracking company.
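The mechanism can be illustrated with a small simulation. This is a minimal sketch, not a real ad server: the `Tracker` and `Browser` classes, the page names, and the way the identifier is generated are all hypothetical, chosen only to show how a single cookie lets one company link visits made on unrelated sites.

```python
import uuid
from collections import defaultdict


class Tracker:
    """Hypothetical ad server whose graphic is embedded on many sites."""

    def __init__(self):
        # Maps each cookie identifier to the pages where the ad was loaded.
        self.profiles = defaultdict(list)

    def serve_ad(self, cookie_id, embedding_page):
        # No cookie yet: this browser is new to the tracker, so issue an ID.
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        # The page that embedded the ad is recorded against the ID.
        self.profiles[cookie_id].append(embedding_page)
        return cookie_id


class Browser:
    """Hypothetical browser that stores the tracker's cookie between visits."""

    def __init__(self):
        self.tracker_cookie = None

    def visit(self, page, tracker):
        # Loading the page loads the ad, which sends the stored cookie along.
        self.tracker_cookie = tracker.serve_ad(self.tracker_cookie, page)


tracker = Tracker()
browser = Browser()
pages = ["news.example/sports", "shop.example/scifi", "travel.example/flights"]
for page in pages:
    browser.visit(page, tracker)

# One identifier now ties together visits to three unrelated sites.
profile = tracker.profiles[browser.tracker_cookie]
```

None of the three sites shares data with the others; the link exists only because all three carry the same company's graphic.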

If your client is using this technique, they may be well within their rights if there's a clear understanding of what's occurring (as in the Amazon example). But you should be aware of the W3C privacy standard covering these issues, and of what you can do to make users aware of their right to choose whether or not to let their activities be tracked.

P3P
The Platform for Privacy Preferences (P3P) is the W3C standard for privacy tools for both buyer and seller in Web commerce. It is a reporting system through which a Web site describes its end-user data collection: how it is done, what data is stored, how long it is retained, who has access to it, what it is used for, whether the data is shared, and whether or not the end user is offered the option of declining to have their data recorded. In addition, P3P allows for presentation of a site's privacy policy to its users.

Most browsers can now be configured to request a Web site's privacy policy upon connection. They can also store the user's privacy preferences, and check these against the site policy before proceeding. The user is thus warned if the site does not conform to privacy practices acceptable to the user. P3P also handles a number of important details, including the attachment of "compact" policies to cookies and the mapping of policies to their source URLs. Finally, P3P can inspire responsible policy on the server administration side.
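The preference check can be sketched roughly as follows. This is a simplified illustration, not how any particular browser implements it: the function names and the set of blocked tokens are assumptions, and the sketch ignores the opt-in and opt-out qualifiers that compact-policy tokens may carry. Token meanings should be confirmed against the P3P specification.

```python
import re


def parse_compact_policy(header_value):
    """Return the set of compact-policy tokens found in a P3P header value."""
    match = re.search(r'CP="([^"]*)"', header_value)
    if not match:
        return set()
    # Keep only the three-letter token; a trailing lowercase qualifier
    # (if present) is dropped in this simplified sketch.
    return {token[:3] for token in match.group(1).split()}


def conflicts(header_value, blocked_tokens):
    """List the tokens in the site's policy that the user has chosen to block."""
    return sorted(parse_compact_policy(header_value) & set(blocked_tokens))


# Hypothetical user preference: warn when data may go to unrelated third
# parties (UNR), be made public (PUB), or be used for telemarketing (TEL).
user_blocked = {"UNR", "PUB", "TEL"}

site_header = 'policyref="/w3c/p3p.xml", CP="NOI DSP COR NID PSA OUR UNR"'
warnings = conflicts(site_header, user_blocked)
```

An empty result means the site's declared practices fall within the user's preferences; any token returned is a reason to warn the user before proceeding.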

Configuring a client's server for fair play in user tracking
Any company that services customers via the Web should consider P3P a professional standard to be implemented and made very public as a matter of course. P3P should be deployed on your client's Web servers and configured to communicate with user browsers.

How is this done? There are many ways, all using free downloadable P3P policy generators. You can find these generators at the W3C Web site. A text version of your client's policy should be easily accessible from their site's home page. Put links back to this text policy on the site's other pages. In addition, you can inform user browsers of the location of a client's policy reference file on their server with an HTTP header (how you do this depends on the server software you use).
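As a rough illustration of the header approach, the sketch below builds a `P3P` response header that points browsers at the policy reference file. The helper name `p3p_header` is hypothetical, the path `/w3c/p3p.xml` is the conventional well-known location and may differ on your client's server, and the compact-policy tokens are placeholders that must come from the client's actual policy.

```python
def p3p_header(policy_ref_path="/w3c/p3p.xml", compact_policy=None):
    """Build a (name, value) pair for a P3P HTTP response header.

    policy_ref_path: where the policy reference file lives on the server.
    compact_policy:  optional list of compact-policy tokens to attach.
    """
    value = f'policyref="{policy_ref_path}"'
    if compact_policy:
        tokens = " ".join(compact_policy)
        value += f', CP="{tokens}"'
    return ("P3P", value)


# Placeholder tokens for illustration only; use the ones your generator emits.
name, value = p3p_header(compact_policy=["NOI", "DSP", "COR"])
header_line = f"{name}: {value}"
```

The resulting pair can be passed to whatever mechanism the server software offers for adding response headers, such as a configuration directive or a call like `send_header(name, value)` in Python's `http.server` request handlers.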

Individual data vs. group data
It's important to realize in any data collection done for research purposes that there are two classes of subject data: individual and group. Individual data represents information that remains tied to a particular visitor; group data consists of aggregate totals, in which the identity of individual participants isn't retained.
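The difference is easy to see in code. In this minimal sketch the `hits` records, the `visitor_id` field, and both function names are invented for illustration; a real server log would carry more fields.

```python
from collections import Counter

# Hypothetical page-view records as a tracking script might log them.
hits = [
    {"visitor_id": "a1", "page": "/home"},
    {"visitor_id": "a1", "page": "/scifi"},
    {"visitor_id": "b2", "page": "/home"},
    {"visitor_id": "b2", "page": "/checkout"},
    {"visitor_id": "c3", "page": "/home"},
]


def individual_view(records):
    """Individual data: each visitor's trail stays tied to that visitor."""
    trails = {}
    for record in records:
        trails.setdefault(record["visitor_id"], []).append(record["page"])
    return trails


def group_view(records):
    """Group data: totals per page, with visitor identity discarded."""
    return Counter(record["page"] for record in records)


trails = individual_view(hits)
totals = group_view(hits)
```

Once only `totals` is retained, there is no way to recover which visitor viewed which page, which is exactly the property a client can advertise to its customers.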

Honesty is the best policy
Are your clients gathering site activity data for marketing purposes, compiling average page usage, and so on? Are they counting clicks to see which features on their site are most popular? If the identity of individual users is not important to their analysis, they should find some way to let their customers know this; customers who know they are anonymous will be more likely to participate. And if your client is tracking an individual's usage patterns strictly to profile them in order to offer customized product offerings, let the customer know that it's about them. Finally, if your client is playing the third-party cookie game and monitoring users across multiple sites, why not let those users know? Honesty is an important component in trust, and trust is always a good thing in business relations. Help your client gather their data, but keep everything in the open.

About Scott Robinson

Scott Robinson is a 20-year IT veteran with extensive experience in business intelligence and systems integration. An enterprise architect with a background in social psychology, he frequently consults and lectures on analytics, business intelligence...
