An ontology defines the terms used to describe and represent an area of knowledge. Ontologies are critical for applications that need to search across or merge information from diverse communities. Although XML DTDs and XML Schemas are sufficient for exchanging data between parties who have agreed to the definitions beforehand, their lack of semantics prevents machines from reliably exchanging data that uses new XML vocabularies.
The Semantic Web is a vision for the future of the Web in which information is given explicit meaning, making it easier for machines to automatically process and integrate information available on the Web. The Semantic Web will build on XML's ability to define customized tagging schemes and RDF's flexible approach to representing data.
The next element required for the Semantic Web is the Web Ontology Language (OWL), which can formally describe the semantics of classes and properties used in Web documents. For machines to perform useful reasoning tasks on these documents, the language must go beyond the basic semantics of RDF Schema. In this article, I'll briefly review several use cases that show the need for OWL.
Ontologies are usually expressed in a logic-based language, so that detailed, accurate, consistent, sound, and meaningful distinctions can be made among the classes, properties, and relations. Some ontology tools can perform automated reasoning using the ontologies, and thus provide advanced services to intelligent applications such as conceptual/semantic search and retrieval, software agents, decision support, speech and natural language understanding, knowledge management, intelligent databases, and electronic commerce.
The OWL language provides three increasingly expressive sublanguages designed for use by specific communities of implementers and users:
- OWL Lite supports those users primarily needing a classification hierarchy and simple constraint features.
- OWL DL supports those users who want the maximum expressiveness without losing computational completeness (all entailments are guaranteed to be computed) and decidability (all computations will finish in finite time) of reasoning systems. OWL DL was designed to support the existing Description Logic business segment and has desirable computational properties for reasoning systems.
- OWL Full is meant for users who want maximum expressiveness and the syntactic freedom of RDF with no computational guarantees. OWL Full allows an ontology to augment the meaning of the predefined RDF or OWL vocabulary.
Before you can use a set of terms, you need a precise indication of which vocabularies are being used. A standard initial component of an ontology is a set of XML namespace declarations enclosed in an opening rdf:RDF tag:
xmlns:drinks="http://www.site.org/2003/ontology/drinks#"
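As a rough illustration of how such a header is read, the sketch below places the drinks declaration inside a complete rdf:RDF element and lists the prefixes a parser sees. The owl, rdf, and rdfs URIs are the standard W3C namespaces. The Python harness and the name declared_namespaces are assumptions made for illustration only and are not part of the ontology.

```python
import io
import xml.etree.ElementTree as ET

# A minimal rdf:RDF wrapper. Only the drinks namespace comes from the article;
# the other three are the standard W3C namespaces an OWL document normally declares.
HEADER = """<rdf:RDF
    xmlns:drinks="http://www.site.org/2003/ontology/drinks#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
</rdf:RDF>"""


def declared_namespaces(xml_text):
    """Return {prefix: namespace URI} for every xmlns declaration in the text."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    # The "start-ns" event yields a (prefix, uri) pair for each declaration.
    events = ET.iterparse(source, events=("start-ns",))
    return {prefix: uri for _, (prefix, uri) in events}


namespaces = declared_namespaces(HEADER)
for prefix, uri in sorted(namespaces.items()):
    print(f"{prefix:6} -> {uri}")
```

Every prefixed name that appears later in the document, such as owl:Ontology or rdfs:comment, is resolved against this table.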
Once namespaces are established, we normally include a collection of assertions about the ontology grouped under an owl:Ontology tag. These tags, shown below, support such critical housekeeping tasks as comments, version control, and inclusion of other ontologies:
<owl:Ontology rdf:about="">
<rdfs:comment>OWL ontology of drinks at site.org</rdfs:comment>
<owl:imports rdf:resource="http://www.site.org/2003/ontology/wines.owl"/>
</owl:Ontology>
The owl:Ontology element is the place to collect much of the OWL metadata for the document. The rdf:about attribute provides a name or reference for the ontology. Where the value of the attribute is empty (i.e., the standard case), the name of the ontology is the base URI of the owl:Ontology element.
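To make the role of the header concrete, here is a minimal sketch that reads the comment and import list from an owl:Ontology element and resolves an empty rdf:about against a base URI. The base URI passed in below and the helper name read_header are hypothetical. A real OWL tool would take the base from the document's location or an xml:base attribute.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

DOCUMENT = f"""<rdf:RDF xmlns:owl="{OWL}" xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <owl:Ontology rdf:about="">
    <rdfs:comment>OWL ontology of drinks at site.org</rdfs:comment>
    <owl:imports rdf:resource="http://www.site.org/2003/ontology/wines.owl"/>
  </owl:Ontology>
</rdf:RDF>"""


def read_header(xml_text, base_uri):
    """Collect the housekeeping metadata stored under owl:Ontology."""
    root = ET.fromstring(xml_text)
    ontology = root.find(f"{{{OWL}}}Ontology")
    about = ontology.get(f"{{{RDF}}}about", "")
    return {
        # An empty rdf:about resolves to the base URI itself.
        "name": urljoin(base_uri, about),
        "comment": ontology.findtext(f"{{{RDFS}}}comment"),
        "imports": [
            element.get(f"{{{RDF}}}resource")
            for element in ontology.findall(f"{{{OWL}}}imports")
        ],
    }


# Hypothetical base URI, assumed here to be where the document was retrieved from.
header = read_header(DOCUMENT, "http://www.site.org/2003/ontology/drinks")
print(header["name"])
print(header["comment"])
print(header["imports"])
```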
Web portals
A Web portal is a Web site that provides information content on a common topic (for example, a specific city or domain of interest). A Web portal allows individuals who are interested in the topic to receive news, find and talk to one another, build a community, and find links to other Web resources of common interest. A simple index of subject areas may not provide the community with sufficient ability to search for the content that its members require. To allow more intelligent syndication, Web portals can define an ontology for the community.
This ontology can provide both a terminology for describing content and axioms that define terms using other terms from the ontology. When combined with facts, these definitions allow other facts that are necessarily true to be inferred. These inferences can, in turn, allow users to obtain search results from the portal that are impossible to obtain from conventional retrieval systems.
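The inference step can be sketched in a few lines. The example below is a simplified stand-in for an OWL reasoner: it handles only subclass axioms and type assertions, and every class and individual name in it is hypothetical. It shows how a search for a general term can return a page that was annotated only with a more specific one.

```python
# Axioms: each class maps to its direct superclasses (hypothetical portal vocabulary).
SUBCLASS_OF = {
    "Winery": ["DrinkProducer"],
    "DrinkProducer": ["LocalBusiness"],
}

# Facts: what content providers actually asserted about their pages.
ASSERTED_TYPES = {
    "HillsideCellars": ["Winery"],
    "CornerBookshop": ["LocalBusiness"],
}


def infer_types(subclass_of, asserted_types):
    """Add every superclass that follows from the subclass axioms."""
    inferred = {}
    for individual, classes in asserted_types.items():
        closure = set()
        pending = list(classes)
        while pending:
            cls = pending.pop()
            if cls not in closure:
                closure.add(cls)
                pending.extend(subclass_of.get(cls, ()))
        inferred[individual] = closure
    return inferred


def search(inferred, wanted_class):
    """Return the individuals known, directly or by inference, to be in the class."""
    return sorted(name for name, classes in inferred.items() if wanted_class in classes)


inferred = infer_types(SUBCLASS_OF, ASSERTED_TYPES)
print(search(inferred, "LocalBusiness"))
print(search(inferred, "DrinkProducer"))
```

A plain keyword match on "LocalBusiness" would miss HillsideCellars, because that page never mentions the term. The subclass axioms are what make the match possible.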
Of course, such a technique relies on content providers annotating their pages with the Web ontology language. If you assume that these providers want to distribute their content as widely as possible, they may well be willing to do this.
Multimedia collections
OWL can also be used to provide semantic annotations for collections of images, audio, or other non-textual objects. It is even more difficult for machines to extract meaningful semantics from multimedia than it is to extract semantics from natural language text.
Multimedia ontologies can be of two types: media-specific or content-specific. Media-specific ontologies could have taxonomies of different media types and describe properties of different media. For example, video may include properties to identify the length of the clip and scene breaks. Content-specific ontologies could describe the subject of the resource, such as the setting or participants. Since such ontologies are not specific to the media, they could be reused by other documents that deal with the same domain.
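A small sketch may help separate the two kinds of annotation. All resource names and property names below are hypothetical, and plain dictionaries stand in for ontology-backed annotations. The point is that the content-specific vocabulary is shared across media types, while the media-specific properties differ.

```python
# Media-specific annotations: the available properties depend on the media type.
MEDIA_FACTS = {
    "harbor-tour.mpg": {"media_type": "Video", "length_seconds": 95, "scene_breaks": [12, 48]},
    "harbor-dawn.jpg": {"media_type": "Image", "width_px": 1600, "height_px": 1200},
}

# Content-specific annotations: the same vocabulary applies to any media type.
CONTENT_FACTS = {
    "harbor-tour.mpg": {"setting": "Harbor", "participants": ["Guide", "Tourists"]},
    "harbor-dawn.jpg": {"setting": "Harbor", "participants": []},
}


def find_by_setting(setting):
    """Search on content alone, regardless of whether the resource is video or image."""
    return sorted(name for name, facts in CONTENT_FACTS.items() if facts["setting"] == setting)


def describe(resource):
    """Merge both annotation layers into one description of the resource."""
    return {**MEDIA_FACTS[resource], **CONTENT_FACTS[resource]}


print(find_by_setting("Harbor"))
print(describe("harbor-tour.mpg"))
```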
Corporate Web site management
Companies often have numerous Web pages concerning things such as press releases, product offerings, case studies, corporate procedures, internal product briefings and comparisons, white papers, and process descriptions. Ontologies can be used to index these documents and provide a better means of retrieval.
Although many large organizations have a taxonomy for organizing their information, this is often insufficient. A single taxonomy is often limiting because many things can fall under multiple categories. Furthermore, the ability to search on values for different parameters is often more useful than a keyword search with taxonomies.
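The difference from a single taxonomy can be illustrated with a short sketch. The document titles, category names, and properties below are hypothetical. Each document may belong to several categories at once, and a search can combine a category with required property values.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    categories: frozenset  # a document may fall under several categories
    properties: dict = field(default_factory=dict)


# Hypothetical corporate documents used only for illustration.
DOCUMENTS = [
    Document("Model X press release", frozenset({"PressRelease", "ProductOffering"}),
             {"product": "Model X", "year": 2003}),
    Document("Model X white paper", frozenset({"WhitePaper"}),
             {"product": "Model X", "year": 2002}),
    Document("Expense procedure", frozenset({"CorporateProcedure"}),
             {"department": "Finance"}),
]


def find(documents, category=None, **required):
    """Return titles that match the category (if given) and every required property value."""
    results = []
    for doc in documents:
        if category is not None and category not in doc.categories:
            continue
        if all(doc.properties.get(key) == value for key, value in required.items()):
            results.append(doc.title)
    return results


print(find(DOCUMENTS, category="ProductOffering"))
print(find(DOCUMENTS, product="Model X"))
print(find(DOCUMENTS, category="WhitePaper", product="Model X", year=2002))
```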
A typical problem for corporate Web site users is that they may not share terminology with the authors of the desired content. For such problems, it would be useful for each class of user to have different ontologies of terms, but have each ontology interrelated so translations can be performed automatically.
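One way to picture interrelated ontologies is as a set of equivalence statements between terms, in the spirit of OWL's owl:equivalentClass. The sketch below uses that idea in a deliberately reduced form. The prefixes sales, eng, and support and all term names are hypothetical, and real ontology mappings are usually richer than one-to-one equivalence.

```python
# Hypothetical equivalences between the vocabularies of three user groups.
EQUIVALENT = [
    ("sales:Handset", "eng:MobileTerminal"),
    ("eng:MobileTerminal", "support:Phone"),
]


def equivalence_groups(pairs):
    """Map each term to the full set of terms declared equivalent to it."""
    group_of = {}
    for left, right in pairs:
        # Merge the two existing groups so that equivalence stays transitive.
        merged = group_of.get(left, {left}) | group_of.get(right, {right})
        for term in merged:
            group_of[term] = merged
    return group_of


def translate(term, target_prefix, group_of):
    """Return the equivalent term in the target vocabulary, or None if there is none."""
    candidates = sorted(
        other for other in group_of.get(term, {term})
        if other.startswith(target_prefix + ":")
    )
    return candidates[0] if candidates else None


groups = equivalence_groups(EQUIVALENT)
print(translate("sales:Handset", "support", groups))
print(translate("support:Phone", "eng", groups))
print(translate("sales:Brochure", "eng", groups))
```

A query written in the sales vocabulary can be rewritten this way before it is run against content that the engineering authors annotated.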
Design documentation
Documentation can be of several different types, including design documentation, manufacturing documentation, or testing documentation. These document sets each have a hierarchical structure, but the structures differ between the sets. There is also a set of implied axes which cross-link the documentation sets.
Ontologies can be used to build an information model which allows the exploration of the information space in terms of the items that are represented, the associations between the items, the properties of the items, and the links to documentation which describes and defines them (i.e., the external justification for the existence of the item in the model). That is to say that the ontology and taxonomy are not independent of the physical items they represent, but may be developed and/or explored in parallel.
Another common use of this kind of ontology is to support the visualization and editing of charts that show snapshots of the information space centered on a particular concept (e.g., a class or instance). These are typically activity-rule diagrams or entity-relationship diagrams.
Agents and services
The Semantic Web can provide agents with the capability to understand and integrate diverse information resources. When building the actual services, the information may come from a number of sources, such as portals, service-specific sites, reservation sites, and the general Web.
Tasks and design goals
There are over one billion pages on the Web, and the potential application of the Semantic Web to embedded devices and agents indicates that even larger amounts of information eventually must be handled. The OWL language should support reasoning systems that scale well. However, the language should also be as expressive as possible, so that users can state the kinds of knowledge important to their applications.
There are also important issues regarding the distinction between a class and an individual in OWL. A class is simply a name and a collection of properties that describe a set of individuals. Individuals are the members of those sets. Thus, classes should correspond to naturally occurring sets of things in a domain of discourse, and individuals should correspond to actual entities that can be grouped into these classes.
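The distinction is visible in the RDF/XML itself: a class is declared with owl:Class, while an individual is written as an element whose tag names the class it belongs to. The following sketch separates the two in a small document. The class names Wine and Juice and the three individuals are hypothetical, and the parser handles only this simple top-level form.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DRINKS = "http://www.site.org/2003/ontology/drinks#"

# Two class declarations followed by three individuals (all names are illustrative).
DOCUMENT = f"""<rdf:RDF xmlns:owl="{OWL}" xmlns:rdf="{RDF}" xmlns:drinks="{DRINKS}">
  <owl:Class rdf:ID="Wine"/>
  <owl:Class rdf:ID="Juice"/>
  <drinks:Wine rdf:ID="HouseRed"/>
  <drinks:Wine rdf:ID="ReserveWhite"/>
  <drinks:Juice rdf:ID="OrchardApple"/>
</rdf:RDF>"""


def classes_and_members(xml_text):
    """Return {class name: set of individuals asserted to be members of it}."""
    members = {}
    for node in ET.fromstring(xml_text):
        name = node.get(f"{{{RDF}}}ID")
        if node.tag == f"{{{OWL}}}Class":
            # A class declaration introduces a set, possibly still empty.
            members.setdefault(name, set())
        else:
            # Any other top-level element is an individual typed by its tag name.
            class_name = node.tag.split("}", 1)[1]
            members.setdefault(class_name, set()).add(name)
    return members


members = classes_and_members(DOCUMENT)
for class_name in sorted(members):
    print(class_name, sorted(members[class_name]))
```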
The OWL language is designed to express a wide variety of knowledge, but it must also provide an efficient means of reasoning with that knowledge. Since these two requirements are typically at odds, the goal of the OWL language is to find a balance that supports the ability to express the most important kinds of knowledge while still permitting efficient reasoning.