A 'case' for XML naming: Standards reduce ambiguity

One of the best things about XML is its flexibility, but that can also lead to problems. Implementing a standard naming convention can help eliminate a lot of confusion. Here's a look at some options for naming XML entities.


When designing an XML solution, you are usually faced with creating a set of specifications that define the structure of your data. Many decisions must be made regarding this data, including how to implement a naming scheme. In this article, we'll look at various options for naming XML entities. (The term is used loosely here for elements and attributes, not for the entities declared in a DTD.)

Overview
Some companies standardize on one technical naming scheme, while others merely provide vague recommendations. The truth is, it really doesn’t matter which scheme you choose as long as you have a standard and you follow it.

Several schemes have been used in standardizing XML entity names, some of which are borrowed from programming languages. The major characteristics of these schemes are the use of upper- and lowercase letters, the delimitation of words or abbreviations, and how (if at all) you implement abbreviations.

The case for casing
There are four main options for how you use casing within your XML specifications:
  • Proper casing
  • Camel casing
  • Lowercasing
  • Uppercasing

Proper casing (also known as Pascal casing) capitalizes the first letter of each word or word part in the element name. Examples of proper casing are:
<CustomerName>
<LineItem>
<ShippingAddress>
<Age>


Camel casing is so called because the “hump,” the uppercase letter, appears in the middle of the name rather than at the beginning. Java conventionally uses camel casing for method and variable names. Here are some examples of camel casing:
<firstName>
<socialSecurityNumber>
<birthDate>
<weight>


Upper- and lowercasing are also quite common. Either approach removes any question about whether a specific word or abbreviation should be capitalized. Some examples of upper- and lowercasing are:
<ORDERNUMBER>
<SHIPMENTID>
<cancelorder>
<trackingnumber>
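
The four casing options can be sketched as a small helper that builds a name from its component words. This is only an illustration: the function name build_name and the style labels "proper", "camel", "lower", and "upper" are assumptions made for this example, not part of any XML tool or standard.

```python
def build_name(words, style):
    """Join a list of words into one XML name using the given casing style.

    The style labels are hypothetical names chosen for this sketch.
    """
    if not words:
        raise ValueError("at least one word is required")

    # Normalize first so the input's own casing does not leak through.
    lowered = [w.lower() for w in words]

    if style == "proper":
        # Every word starts with an uppercase letter: CustomerName
        return "".join(w.capitalize() for w in lowered)
    if style == "camel":
        # First word stays lowercase, later words are capitalized: firstName
        return lowered[0] + "".join(w.capitalize() for w in lowered[1:])
    if style == "lower":
        return "".join(lowered)
    if style == "upper":
        return "".join(w.upper() for w in lowered)
    raise ValueError(f"unknown casing style: {style}")
```

With this sketch, build_name(["customer", "name"], "proper") gives CustomerName and build_name(["first", "name"], "camel") gives firstName. A single-word name such as weight looks the same in camel casing and lowercasing.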


Take it to delimit
Delimiters are often a part of naming standards. They're used because long names built from several words can be hard to read and hard to tell apart, such as CustomerNumber, CustomerIDNumber, and CustomerAccountNumber. Using a delimiter to separate the components in a name helps the user to better understand what the value is representing.

Various delimiters can be used in XML entities, but the most popular is the underscore character:
<Customer_Name>
<Order_Number>
<ship_date>


Sometimes, a delimiter is used in conjunction with a casing standard to help clarify not only the data that the entity contains, but also what type of data it is. For example:
<CustomerNumber_Integer>
<address_string>
<STATE_STRING_2>
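
One benefit of a delimiter is that a program can split a name back into its parts. The following is a minimal sketch, assuming that the type names are limited to "integer" and "string" and that anything after the type (such as the 2 in STATE_STRING_2) is an optional qualifier. Both the KNOWN_TYPES set and the parse_typed_name function are hypothetical and exist only for this example.

```python
# Hypothetical set of type suffixes; a real standard would define its own.
KNOWN_TYPES = {"integer", "string"}


def parse_typed_name(name, delimiter="_"):
    """Split a delimited name into (base, type, qualifier).

    Returns (name, None, None) when no known type suffix is found.
    """
    parts = name.split(delimiter)
    # Skip the first part so a name never consists of a type alone.
    for i, part in enumerate(parts[1:], start=1):
        if part.lower() in KNOWN_TYPES:
            base = delimiter.join(parts[:i])
            # Anything after the type is treated as a qualifier.
            qualifier = delimiter.join(parts[i + 1:]) or None
            return base, part.lower(), qualifier
    return name, None, None
```

Under these assumptions, CustomerNumber_Integer splits into the base CustomerNumber and the type integer, while Customer_Name is returned unchanged because neither part is a known type.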


A brief on abbreviation
Many of the words used in XML entities are quite long, and often, a series of several words makes up a single XML entity. To help alleviate the burden of typing long entity names such as
<CustomerPrimaryBillingAddressCity>

many organizations elect to use abbreviations where necessary. You can choose to use them always or only in special cases, such as when the length of an entity name exceeds 20 characters. Some common abbreviations are:
  • Cust for Customer
  • Addr for Address
  • Num or No for Number
  • Id for Identifier
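
The "abbreviate only when the name is too long" rule can be sketched as follows. The 20-character limit and the abbreviations come from the text above; the names ABBREVIATIONS, MAX_LENGTH, and abbreviate are assumptions made for this example, and Num is chosen arbitrarily over No.

```python
# Abbreviation table taken from the list above (Num picked over No).
ABBREVIATIONS = {
    "Customer": "Cust",
    "Address": "Addr",
    "Number": "Num",
    "Identifier": "Id",
}

MAX_LENGTH = 20  # threshold from the example rule in the text


def abbreviate(words, max_length=MAX_LENGTH):
    """Join proper-cased words, abbreviating only if the full name is too long."""
    full = "".join(words)
    if len(full) <= max_length:
        return full
    # Replace each word that has a standard abbreviation; keep the rest as-is.
    return "".join(ABBREVIATIONS.get(w, w) for w in words)
```

For example, abbreviate(["Customer", "Name"]) leaves CustomerName untouched, while abbreviate(["Customer", "Primary", "Billing", "Address", "City"]) produces CustPrimaryBillingAddrCity. Note that the abbreviated name can still exceed the limit, so a standard may need a rule for that case as well.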

Summary
XML is a robust language for expressing your organization’s data, and it offers many ways to describe that data. By implementing standards that define the naming schemes used for XML entities, you can help remove ambiguities. We have shown you some ideas and options for implementing XML naming schemes that should get you started.
