LBase: A tool to generalize the Semantic Web

The Semantic Web is a vision of the future of the Web in which information carries explicit meaning. To manage the growing number of semantic languages, the W3C has proposed LBase, a tool for defining the semantics of all Semantic Web languages.

The Semantic Web is a vision for the future of the Web in which information is given explicit meaning, making it easier for machines to automatically process and integrate information available on the Web. The Semantic Web will build on XML's ability to define customized tagging schemes and on the flexible approach to representing data offered by the Resource Description Framework (RDF).

Ontology languages: Current approach
The first level above RDF required for the Semantic Web is an ontology language that can formally describe the meaning of terminology used in Web documents. If machines are expected to perform useful reasoning tasks on these documents, the language must go beyond the basic semantics of RDF Schema. An ontology, in this sense, defines the terms used to describe and represent an area of knowledge. Ontologies are used by people, databases, and applications that need to share domain information (a domain is just a specific subject area or area of knowledge, such as medicine, tool manufacturing, real estate, automobile repair, or financial management). Ontologies include computer-usable definitions of basic concepts in the domain and the relationships among them (note that here and throughout this document, definition is not used in the technical sense understood by logicians). They encode knowledge in a domain and also knowledge that spans domains.

A model-theoretic semantics for a language assumes that the language refers to a world, and describes the minimal conditions that a world must satisfy in order to assign an appropriate meaning for every expression in the language. A particular world is called an interpretation, so that model theory might be better called interpretation theory. The idea is to provide a mathematical account of the properties that any such interpretation must have, making as few assumptions as possible about its actual nature or intrinsic structure.
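To make the notion of an interpretation more concrete, the following is a minimal Python sketch, not taken from any specification. It models a "world" as a universe of objects together with an assignment of meanings to names and relations, and checks whether an atomic sentence holds in that world. The class name `Interpretation`, the method `satisfies`, and the `ex:` names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    """A toy 'world': a universe plus meanings for names and relations.

    Illustrative assumption only; real model theories are defined
    mathematically, not as Python objects.
    """
    universe: frozenset   # the objects that exist in this world
    names: dict           # name -> object in the universe
    relations: dict       # relation name -> set of tuples of objects

    def satisfies(self, relation, *args):
        """Return True if the atomic sentence relation(args...) holds here."""
        denoted = tuple(self.names[a] for a in args)
        return denoted in self.relations[relation]


# One possible world in which Alice knows Bob, but not the other way around.
world = Interpretation(
    universe=frozenset({"alice", "bob"}),
    names={"ex:Alice": "alice", "ex:Bob": "bob"},
    relations={"ex:knows": {("alice", "bob")}},
)

print(world.satisfies("ex:knows", "ex:Alice", "ex:Bob"))  # True
print(world.satisfies("ex:knows", "ex:Bob", "ex:Alice"))  # False
```

The same sentence can be true in one interpretation and false in another; a model theory states which conditions every acceptable interpretation must meet.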

There will be many Semantic Web languages, most of which will be built on top of more basic Semantic Web language(s). It is important that this layering be clean and simple, not just for human understandability, but also to enable the construction of robust Semantic Web agents that use these languages. At the moment, each semantic language is defined in terms of its own model theory, each with its own notation and syntax.

Experience so far shows that difficult problems can arise when layering model theories for extensions to the RDF layer of the Semantic Web; the most relevant example is the development of the Web Ontology Language (OWL). Moreover, this strategy places a very high burden on the basic layer, since it is difficult to anticipate the semantic demands that will be made by all future higher layers, and the expectations of different development and user communities may conflict.

Alternative approach to defining the semantics
To achieve interoperability among the semantics of different Semantic Web languages, the W3C issued a Note describing a basic language, LBase, which is expressive enough to state the content of all currently proposed Web languages and has a fixed, clear model-theoretic semantics.

Each Semantic Web language is defined by specifying how expressions in that language map into equivalent expressions in LBase, and by providing axioms written in LBase that constrain the intended meanings of the language's special vocabulary. The LBase meaning of any expression in any Semantic Web language can then be determined by mapping it into LBase and adding the appropriate language axioms, if there are any.
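As a rough illustration of this mapping idea, the Python sketch below turns RDF-style triples into predicate-style formulas written as plain strings and then appends language axioms. The translation rules, the function names `triple_to_lbase` and `graph_to_lbase`, the `ex:` names, and the textual notation are simplifying assumptions made for this example; the actual rules are those given in the W3C Note and may differ.

```python
RDF_TYPE = "rdf:type"


def triple_to_lbase(subject, predicate, obj):
    """Map one RDF-style triple to an atomic formula string.

    Assumed rule for this sketch: a typing triple becomes a unary
    predicate application, any other triple a binary one.
    """
    if predicate == RDF_TYPE:
        return f"{obj}({subject})"
    return f"{predicate}({subject}, {obj})"


def graph_to_lbase(triples, axioms=()):
    """Conjoin the translated triples with any language axioms."""
    formulas = [triple_to_lbase(*t) for t in triples] + list(axioms)
    return " and ".join(formulas)


# Hypothetical example data.
graph = [
    ("ex:Rex", "rdf:type", "ex:Dog"),
    ("ex:Rex", "ex:owner", "ex:Ann"),
]

print(graph_to_lbase(graph))
# ex:Dog(ex:Rex) and ex:owner(ex:Rex, ex:Ann)
```

Because every language is translated into the same target notation, formulas that originate in different languages can be placed side by side and reasoned over together.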

The intended result is that the model theory of LBase is the model theory of all the Semantic Web Languages, even though the languages themselves are different. This makes it possible to use a single inference mechanism to work on these different languages. Although it will be possible to exploit restrictions on the languages to provide better performance, the existence of a reference proof system is likely to be of use to developers. This also allows the meanings of expressions in different semantic languages to be compared and combined, which is very difficult when they all have distinct model theories.

LBase language
LBase is not being proposed as a semantic Web language. It is a tool for specifying the semantics of different semantic languages. It uses the logical form of the target language as an explication of the intended meaning of the semantic language. The syntax and semantics of LBase are not designed as a programming language.

By using a well-understood logic, namely first-order logic (A Mathematical Introduction to Logic, H. B. Enderton, 2nd edition, 2001, Harcourt/Academic Press), as the core of LBase, and by providing for mutually consistent mappings of different semantic languages into LBase, the approach ensures that content expressed in several languages can be combined consistently, avoiding paradoxes and other problems.

The concept of a semantic language is firmly grounded in mathematical logic and uses first-order logic as its main logical engine. First-order logic is a theory in symbolic logic that allows quantified statements such as "there exists an object such that..." or "for all objects, it is the case that...". It is distinguished from higher-order logic in that it does not allow statements such as "for every property, it is the case that..." or "there exists a set of objects such that...". Nevertheless, first-order logic is strong enough to formalize all of set theory and thereby virtually all of mathematics. It is the classical logical theory underlying mathematics.
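The distinction can be written out symbolically. In the illustrative formulas below, the predicate names Dog and Animal and the constants a and b are hypothetical examples rather than anything defined by LBase.

```latex
\begin{align*}
&\exists x\; \mathit{Dog}(x)
  && \text{first-order: there exists an object such that \dots} \\
&\forall x\; \bigl(\mathit{Dog}(x) \rightarrow \mathit{Animal}(x)\bigr)
  && \text{first-order: for all objects, it is the case that \dots} \\
&\forall P\; \bigl(P(a) \leftrightarrow P(b)\bigr)
  && \text{not first-order: quantifies over properties}
\end{align*}
```

In the first two formulas the variable x ranges over objects in the domain; in the third, P ranges over properties, which first-order logic does not permit.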

Mapping a type/class language into a predicate/application language also ensures that set-theoretical paradoxes do not arise. Although the use of this technique does not in itself guarantee that mappings between the syntax of different languages will always be consistent, it does provide a general framework for detecting and identifying potential inconsistencies.

Any first-order logic is based on a set of atomic terms, which are used as the basic referring expressions in the syntax. These include names, which refer to entities in the domain, special names, and variables. LBase distinguishes the special class of urirefs, each of which is defined to be a URI reference. Urirefs are used to refer both to individuals and to relations between individuals. LBase allows for various collections of special names with fixed meanings defined by other specifications (external to the LBase specification).

Any LBase language is defined with respect to a vocabulary, which is a set of non-special names. Every LBase vocabulary is required to contain all urirefs, but other expressions are allowed. It is required that every LBase interpretation provide a meaning for every special name, but these interpretations are fixed, so special names are not counted as part of the vocabulary.
