Security

Implementing privacy/preference policies with P3P

Do you have a privacy policy for your Web site? P3P is an XML standard for describing privacy and/or user preference policies for a Web site. Read this article to learn how to implement P3P now.


By Roy Hoobler

Platform for Privacy Preferences (P3P) is an XML standard that describes the privacy and/or user preference policies for a Web site. The P3P vocabulary allows Web site owners to describe what information is collected on their site and how it is used. When a site has not implemented P3P, some users must lower their browser security settings before they can submit forms or browse pages that use cookies. P3P also makes it possible to build user agents that act on the user's behalf. This article outlines the process of using P3P.

The W3C states five goals for implementing P3P:
  • A standard schema for collected Web site data, known as the P3P Base Data Schema
  • A standard set of uses, recipients, data categories, and other privacy disclosures
  • An XML format for expressing a privacy policy
  • A means of associating privacy policies with Web pages or sites, and cookies
  • A mechanism for transporting P3P policies over HTTP


Implementing P3P
The functionality of P3P is broken into three parts: one or more policy files, a policy reference file, and HTTP headers passed from your server. The policy files should be placed in a w3c directory (/w3c) on the Web server.

Plan and review your privacy policy page on your Web site. If you do not have one yet, you will need one before implementing P3P. A sample policy page is available at the Sun Web site. Now, let’s take a closer look at the steps involved with implementing P3P.

1. Creating a policy file (Policy.p3p)
First, you must create a policy file. The XML policy file describes precisely what information is collected and how it is used. Keep in mind that P3P is a positive language. That is, only the data collected needs to be described in the XML file; you do not list the data or processes that are not involved. This policy file contains a lot of information, but IBM's P3P Policy Editor does a good job creating these files.

The policy file contains at least one statement about the Web site. Within this statement is information about what data is collected and how it is going to be used. Listing A is a good start to implementing your own policy.

The data collected follows the P3P Base Data Schema for data structures. If multiple statements are implemented, each statement can have a different purpose and list of data. (Perhaps a site needs separate statements about cookies, registration, and purchasing.) IBM's P3P Policy Editor does not include the specific data fields that are being collected; however, following the W3C proposal, I recommend including the specific data collected by a site (such as “user home address”). Also worth mentioning is having a Safe Zone (statement). A Safe Zone is part of the Web site that does not gather any user-identifiable information.

2. Creating a policy reference file (Policy.xml)
After you write the policy file, you must create a policy reference file. It is possible for different Web site directories to use different policies. However, most Web sites will use a single policy for the entire site. Creating the policy reference file is the simplest step in the process, but be certain the P3P policy file URL and the policy name (#generalPolicy) are correct. The Include element is a simple path to the directory covered by the policy. The example policy reference in Listing B includes everything under the root directory.
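
As a rough illustration of the two things to double-check here, the sketch below parses a small policy reference file and reports the policy file path, the policy name, and the included path. The sample XML is an assumption written from memory of the P3P 1.0 element names (META, POLICY-REFERENCES, POLICY-REF, INCLUDE); the class name PolicyReferenceCheck and the file path /w3c/Policy.p3p are hypothetical, so compare against your own file and the W3C specification.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class PolicyReferenceCheck {
    // Hypothetical minimal policy reference file covering the whole site.
    static final String SAMPLE = String.join("\n",
        "<META xmlns=\"http://www.w3.org/2002/01/P3Pv1\">",
        "  <POLICY-REFERENCES>",
        "    <POLICY-REF about=\"/w3c/Policy.p3p#generalPolicy\">",
        "      <INCLUDE>/*</INCLUDE>",
        "    </POLICY-REF>",
        "  </POLICY-REFERENCES>",
        "</META>");

    /** Summarizes the first POLICY-REF: policy file, policy name, included path. */
    static String describe(String referenceXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(referenceXml.getBytes(StandardCharsets.UTF_8)));
            Element ref = (Element) doc.getElementsByTagName("POLICY-REF").item(0);
            String about = ref.getAttribute("about");
            // The part after '#' must match the name given in the policy file.
            int hash = about.indexOf('#');
            String file = hash < 0 ? about : about.substring(0, hash);
            String name = hash < 0 ? "" : about.substring(hash + 1);
            String include = ref.getElementsByTagName("INCLUDE").item(0)
                .getTextContent().trim();
            return "file=" + file + " name=" + name + " include=" + include;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable policy reference file", e);
        }
    }
}
```

Calling PolicyReferenceCheck.describe(PolicyReferenceCheck.SAMPLE) gives a one-line summary you can compare by eye with the actual location and name of your policy.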

3. Configuring the server
To put P3P into use, configure your server to send an HTTP header that points to the policy reference file. Setup differs for each Web server on the market. If you do not have access to the server configuration, you can use the <link> tag or set the header in code.

In the following example, P3P is the name of the HTTP header. Everything after the colon (:) is the header value, which is in two parts: the URL to the policy reference file and the Compact Policy (CP).
P3P: policyref="http://www.mysite.com/w3c/p3p.xml", CP="ALL DSP COR NID CUR OUR IND PUR"

The CP is a list of three-letter tokens. (These are well documented on the W3C site.) Reading the example, you can quickly see that the site gives users access to ALL of their identified data, that the policy covers DiSPutes, and that errors will be CORrected. Data is collected to complete the CURrent activity, is kept by OUR company for an INDefinite period, and includes PURchase information. NID declares that the site collects no identifiable data. Because our site stores the customer number as a cookie, NID should be taken out.
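
To make the tokens easier to read, a small lookup table can expand a CP string into plain descriptions. The sketch below covers only the eight tokens used in this article; the class name CompactPolicyDecoder and the wording of each description are my own paraphrases, not the official W3C text, so treat the table as a reading aid rather than a reference.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CompactPolicyDecoder {
    // Paraphrased meanings for the tokens used in this article only.
    private static final Map<String, String> MEANINGS = new LinkedHashMap<String, String>();
    static {
        MEANINGS.put("ALL", "access: users can view all identified data");
        MEANINGS.put("DSP", "disputes: the policy lists dispute procedures");
        MEANINGS.put("COR", "remedies: errors will be corrected");
        MEANINGS.put("NID", "non-identifiable: no identifiable data is collected");
        MEANINGS.put("CUR", "purpose: completing the current activity");
        MEANINGS.put("OUR", "recipient: ourselves and our agents");
        MEANINGS.put("IND", "retention: indefinitely");
        MEANINGS.put("PUR", "category: purchase information");
    }

    /** Returns one "TOKEN = meaning" line per token in the compact policy. */
    static String describe(String compactPolicy) {
        StringBuilder out = new StringBuilder();
        for (String token : compactPolicy.trim().split("\\s+")) {
            String meaning = MEANINGS.get(token);
            out.append(token).append(" = ")
               .append(meaning != null ? meaning : "(not in this sketch's table)")
               .append('\n');
        }
        return out.toString();
    }
}
```

For example, CompactPolicyDecoder.describe("ALL DSP COR CUR OUR IND PUR") lists the seven tokens that remain once NID is removed.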

The CP should match the full policy; so, if you use NID in the CP, you should include the <NON-IDENTIFIABLE/> element in your policy file. The validation tool mentioned in the next section checks for some of these matches, and Internet Explorer 6.0 will check and invalidate the policy file if there are discrepancies.
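
One of these matches can be checked locally before you reach for the validator. The sketch below tests a single rule: the NID token should appear in the CP only when the non-identifiable element is present in every statement of the policy. Both that rule and the element name NON-IDENTIFIABLE reflect my reading of the P3P 1.0 Recommendation and should be verified against the specification; the class name CompactPolicyConsistency is hypothetical, and the check says nothing about the other tokens.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class CompactPolicyConsistency {
    // Assumed element name; change it if your policy file uses another one.
    static final String NON_IDENT_ELEMENT = "NON-IDENTIFIABLE";

    /** True when NID is in the CP exactly when every STATEMENT is non-identifiable. */
    static boolean nidIsConsistent(String compactPolicy, String policyXml) {
        boolean cpHasNid =
            Arrays.asList(compactPolicy.trim().split("\\s+")).contains("NID");
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(policyXml.getBytes(StandardCharsets.UTF_8)));
            NodeList statements = doc.getElementsByTagName("STATEMENT");
            // A policy with no statements cannot claim to be non-identifiable.
            boolean allNonIdent = statements.getLength() > 0;
            for (int i = 0; i < statements.getLength(); i++) {
                Element statement = (Element) statements.item(i);
                if (statement.getElementsByTagName(NON_IDENT_ELEMENT).getLength() == 0) {
                    allNonIdent = false;
                }
            }
            return cpHasNid == allNonIdent;
        } catch (Exception e) {
            throw new IllegalArgumentException("policy file could not be parsed", e);
        }
    }
}
```

With the example site, which stores a customer number in a cookie, a CP that still contains NID would be reported as inconsistent, which is the same conclusion reached above by hand.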

If you use a shared server or use a Web server that makes it difficult to configure HTTP headers, you can add them to your HTML or Java code. I added the following code to my JSP files:
response.setHeader("P3P", "policyref=\"http://www.mysite.com/w3c/p3p.xml\", CP=\"ALL DSP COR CUR OUR IND PUR\"");
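
If several pages need the header, it may be easier to build the value in one place than to repeat the escaped string. The helper below is a minimal sketch: the class name P3PHeader and its method are hypothetical, and it separates the two fields with a comma, which is how I understand the header grammar in the W3C specification.

```java
final class P3PHeader {
    /** Name of the HTTP response header. */
    static final String NAME = "P3P";

    /** Builds the header value from the policy reference URL and the CP tokens. */
    static String value(String policyRefUrl, String... cpTokens) {
        return "policyref=\"" + policyRefUrl + "\", CP=\""
            + String.join(" ", cpTokens) + "\"";
    }
}
```

In a JSP or servlet, the call would then look like response.setHeader(P3PHeader.NAME, P3PHeader.value("http://www.mysite.com/w3c/p3p.xml", "ALL", "DSP", "COR", "CUR", "OUR", "IND", "PUR")).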

If your site is HTML-based, you can use the <link> tag:
<link rel="P3Pv1" href="/w3c/p3p.xml"></link>

Validating and testing
The IBM P3P Editor can validate your P3P file but not your implementation. Fortunately, the W3C has a validation tool online. You enter the URL of your home page and watch it work. It does a good job of describing syntax or configuration errors in your P3P implementation. You can also view your policy in IE 6.0: select View > Privacy Report from the menu to list all sites, then select your site and click the Summary button.
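
For a quick local sanity check before running the online validator, you can split a captured P3P header value into its two parts and look at them separately. This is only a sketch under simple assumptions (double-quoted values, only the policyref and CP fields); the class name P3PHeaderParser is hypothetical, and it complements the W3C tool rather than replacing it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class P3PHeaderParser {
    // Matches policyref="..." and CP="..." anywhere in the header value.
    private static final Pattern FIELD =
        Pattern.compile("(policyref|CP)\\s*=\\s*\"([^\"]*)\"");

    /** Returns the fields found in the header value, keyed by field name. */
    static Map<String, String> parse(String headerValue) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        Matcher m = FIELD.matcher(headerValue);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }
}
```

If either key is missing from the returned map, the header your server sends is probably malformed or incomplete.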

Summary
In the end, you create two files: the policy reference file (P3P.xml) and the policy file (Policy.p3p). The Policy.p3p file is the full policy; it is referenced by P3P.xml (or Policy.xml), which can be found via the HTTP headers. The W3C strongly recommends that you place these files in a /w3c directory on the Web server. If everyone is using the same directory, user agents will be able to find these files even if the HTTP headers are not received.
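
The fallback lookup can be pictured as a simple URL calculation. The sketch below assumes the conventional location is /w3c/p3p.xml at the root of the site, which is my understanding of the W3C recommendation; the class name WellKnownLocation is hypothetical.

```java
import java.net.URI;

final class WellKnownLocation {
    /** Returns where a user agent would look for the policy reference file. */
    static String policyReferenceUrl(String pageUrl) {
        // Resolving an absolute path keeps the scheme, host, and port of the
        // page and discards its own path and query string.
        return URI.create(pageUrl).resolve("/w3c/p3p.xml").toString();
    }
}
```

For a page such as http://www.mysite.com/shop/cart.jsp?id=1, the method returns http://www.mysite.com/w3c/p3p.xml, the same URL used in the header examples earlier.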
