The use of HTML meta tags has always been a contentious subject. In the past, search engines used them to rank pages as well as to display page information. While some search engines still use the meta description tag, most other uses are nearly obsolete. Meta tags have instead evolved into a more formal method for describing a page's content and purpose. Let's take a closer look at using meta tags today, as well as how the Dublin Core Metadata Initiative might influence the way they're used in the future.
What is a meta tag?
Meta tags are pieces of information inserted into the head element of a Web page. This information is not displayed to the user; it is meant for browsers, search engines, and other software that processes the page. There are two types of meta tags: HTTP-EQUIV and NAME.
HTTP-EQUIV tags are equivalent to HTTP headers and can be used to control browser behavior. Many are supported; the following list is a sample of the most popular:
- Expires—The date and time after which the document should be considered expired.
- Pragma—Controls caching in HTTP/1.0. Value must be no-cache to disable page caching.
- Content-Language—May be used to declare the natural language of the document.
- PICS-Label—Platform for Internet Content Selection. Typically used to declare a document's rating in terms of adult content (sex, violence, etc.), although the scheme is very flexible and may be used for other purposes.
The following HTML example uses these tags to set the page expiration date, tell the browser not to cache the page, and declare the language.
<meta http-equiv="expires" content="Tue, 09 Mar 2004 01:00:00 GMT" />
<meta http-equiv="pragma" content="no-cache" />
<meta http-equiv="Content-Language" content="en-US" />
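Two other HTTP-EQUIV values are commonly seen alongside the ones above, although they are not in the list: Content-Type, which declares the media type and character set, and Refresh, which asks the browser to load a page after a delay. The page below is a minimal sketch only; the title, body text, and example.com address are placeholders rather than values from a real site.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>HTTP-EQUIV sketch</title>
  <!-- Mirrors the Content-Type header: media type plus character set -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <!-- Asks the browser to load the placeholder URL after 5 seconds -->
  <meta http-equiv="refresh" content="5; url=http://www.example.com/" />
</head>
<body>
  <p>This page redirects after five seconds.</p>
</body>
</html>
```

When the server sends a real header of the same name, browsers generally favor it, so the meta form is best treated as a fallback.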
Meta tags with a NAME attribute serve other purposes that do not correspond to HTTP headers, such as giving the name of the page author, a description of the page, or the tool used to create it. Here is a partial list of meta NAME elements:
- Description—A short, plain-language description of the document.
- Keywords—Keywords used by search engines to index your document in addition to words from the title and document body. Typically used for synonyms and alternates of title words.
- Author—The Web page author's name.
- Generator—Typically the name and version number of a publishing tool used to create the page. Tools such as Microsoft FrontPage insert their product names in this tag.
- Robots—Used to control Web robot processing of the page. A NOINDEX value signals that the page should not be indexed.
The following HTML snippet shows the syntax for these elements. It includes meta tags that keep the page from being indexed and that supply a description, keywords, and the author's name.
<meta name="robots" content="noindex" />
<meta name="description" content="Builder.com meta tag article" />
<meta name="keywords" content="html, meta, web, CNet, Builder.com" />
<meta name="author" content="Tony Patton" />
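The Robots tag accepts more than one directive in a comma-separated list. The sketch below pairs NOINDEX with NOFOLLOW, a widely recognized value that is not covered in the list above and that asks robots not to follow the links on the page. The title and body text are placeholders.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Robots sketch</title>
  <!-- noindex: keep this page out of the index -->
  <!-- nofollow: do not follow the links found on this page -->
  <meta name="robots" content="noindex, nofollow" />
</head>
<body>
  <p>Internal page that should stay out of search results.</p>
</body>
</html>
```

Robots treat these values as requests; whether they are honored is up to each crawler.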
These tags can also be useful for internal documentation. In addition, you can easily add your own meta tags, since they are not seen by the user and will be ignored by the browser. For instance, you could add meta tags to track the date the page was last modified and who made the change:
<meta name="last-modified" content="Thu, 11 Mar 2004 06:00:00 GMT" />
<meta name="modified-by" content="Jane Doe" />
Tags you create yourself are meaningful only within your own organization, since outside tools will not recognize them. On the other hand, several current initiatives are pushing for a uniform meta tag vocabulary. One such endeavor is the Dublin Core Metadata Initiative (DCMI).
Dublin Core (DC)
While the name Dublin Core often conjures images of Ireland, the initiative was actually developed in Dublin, Ohio, USA. It was created to address an apparent crisis in Web search and retrieval: current Web search engines cover only a fraction of the Internet. To address the problem, it was decided to develop a standardized vocabulary for describing resources (Web pages). A group of SGML veterans developed this controlled vocabulary under the name of Dublin Core (DC). The DC includes:
- Metadata Element Set—Defines the basic semantics for Web resources that can be mixed through namespaces with problem-specific metadata.
- Core Qualifiers—Used for element refinement. They make the meaning of an element more specific. These are attributes that may be included with an element.
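To make the idea of qualifiers concrete, the sketch below refines the Date element into "created" and "modified" and names the date format through the scheme attribute. The DCTERMS prefix, the schema link elements, and the W3CDTF scheme name follow DCMI's guidelines for encoding Dublin Core in HTML as best understood here; they are assumptions that do not come from this article, and the dates are placeholders.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Dublin Core qualifier sketch</title>
  <!-- Tie the DC and DCTERMS prefixes to their namespaces -->
  <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
  <link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
  <!-- Unqualified element: some date in the resource life cycle -->
  <meta name="DC.Date" content="2004-03-09" />
  <!-- Refinements: say which date is meant; scheme names the date format -->
  <meta name="DCTERMS.created" scheme="DCTERMS.W3CDTF" content="2004-03-09" />
  <meta name="DCTERMS.modified" scheme="DCTERMS.W3CDTF" content="2004-03-11" />
</head>
<body>
  <p>Page described with qualified Dublin Core dates.</p>
</body>
</html>
```

A tool that does not understand the refinements can still fall back on the plain DC.Date value, which is the reason for keeping the unqualified element alongside them.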
Let's examine a brief list of items from the Metadata Element Set:
- Title—Name given to a resource.
- Creator—Entity that created the resource.
- Subject—Topic of the content or resource.
- Description—Describes the content.
- Publisher—Entity that published the content.
- Date—Date associated with the resource life cycle.
- Format—The format of the resource.
- Language—Language used in the resource.
The following example shows how these elements may be used in your Web content.
<title>Builder.com Dublin Core example</title>
<meta name="Generator" content="HTMLed Pro 3.0" />
<meta name="DC.Creator" content="Tony Patton" />
<meta name="DC.Title" content="Using Dublin Core on your Website" />
<meta name="DC.Date" content="2004-03-09" />
<meta name="DC.Format" content="text/html" />
<meta name="DC.Language" content="en" />
Using the Dublin Core syntax for your meta elements instead of developing a custom vocabulary provides a consistent, standard approach to describing the content within your site. Dublin Core elements may be freely combined with standard meta tags. Dublin Core is also supported by various search engines, such as Ultraseek, that may be used to add search features to your own site.
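As a sketch of that combination, the page below places standard meta tags next to Dublin Core elements. The link element that ties the DC prefix to the element set's namespace follows DCMI's recommended HTML encoding and is an addition not shown in the earlier example; the description and keyword values are placeholders.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Using Dublin Core on your Website</title>
  <!-- Associates the "DC." prefix with the Dublin Core element set -->
  <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
  <!-- Standard meta tags, still read by some search engines -->
  <meta name="description" content="Placeholder description of the page" />
  <meta name="keywords" content="html, meta, dublin core" />
  <!-- Dublin Core elements describing the same page -->
  <meta name="DC.Title" content="Using Dublin Core on your Website" />
  <meta name="DC.Creator" content="Tony Patton" />
  <meta name="DC.Description" content="Placeholder description of the page" />
  <meta name="DC.Language" content="en" />
</head>
<body>
  <p>Page described with both standard and Dublin Core meta tags.</p>
</body>
</html>
```

Repeating the description in both forms is deliberate in this sketch: tools that know only the standard tags and tools that know only Dublin Core each find a value they can use.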
Meta tags have evolved from a means of gaining search engine visibility into a way of actually describing content. Dublin Core is one approach to standard metadata, and although it has received much support, it is not the only initiative. Another approach is the Warwick Framework, and others may follow.