Digital
telephony is spreading wide and fast, entailing new technologies for voice communication. In
this article, I’ll
review a new XML language intended for usage in the telecommunications industry.

State Chart XML (SCXML) is a candidate
for the control language within VoiceXML 3.0 (currently under development by the W3C
Voice Browser working group), which includes CCXML 2.0 (anticipated development
in 2006 by the Voice Browser working group), and the multimodal authoring
language (under development by the
Multimodal Interaction working group
.

It is intended to be a general-purpose
state machine language that can be used in multiple contexts, including VoiceXML 3.0, CCXML 2.0, and the MultiModal
Interaction framework.

First we’ll take an overview of what this
development effort is all for, then we’ll look at SCXML notation in more
detail, and finally we will look at Harel State
Tables notation logic and its reflection in SCXML.

What is it all for?

The new language provides a
state machine notation that combines concepts from CCXML and Harel State Tables.

CCXML is an event-based state machine language designed to
support call control features in Voice Applications (specifically including VoiceXML but not limited to it). The CCXML 1.0 specification
defines both a state machine and event handing syntax and a standardized set of
call control elements. CCXML language is out of the scope of the current
article, and will be reviewed separately in another article.

Harel State Tables are state machine notations that are part of UML language.
They offer a clean and well-thought out semantic for sophisticated constructs
such as a parallel states. However, they have been defined as a graphical
specification language and hence do not have an XML representation. The event
processing model and the semantics of states and transitions are defined in
detail in the UML
specification
. UML language is also out of the scope of this article;
however we’ll review most basic concepts of UML and its equivalent in SCXML
below.

The goal of SCXML specification is to combine
Harel semantics with an XML syntax that is a logical
extension of CCXML’s state and event notation. SCXML
can be used in many ways, including but not limited to:

  • As the state machine
    framework for a future version of CCXML;
  • As a higher-level dialog
    language controlling VoiceXML 3.0’s encapsulated
    speech modules (voice form, voice picklist, etc.);
  • As a voice application metalanguage, where in addition to VoiceXML
    3.0 functionality, it may also control database access and business logic
    modules;
  • As a multimodal control
    language in the MultiModal Interaction framework, combining VoiceXML 3.0 dialogs with dialogs in other modalities
    including keyboard and mouse, ink, vision, haptics, etc. It may also control
    combined modalities such as lip-reading (combined speech recognition and
    vision) speech input with keyboard as fallback, and multiple keyboards for
    multi-user editing;
  • As a dialog control
    language invoking speech recognition, DTMF recognition, speech synthesis, audio
    record, and audio playback services;
  • As an extended call center
    management language, combining CCXML call control functionality with
    computer-telephony integration for call centers that integrate telephone calls
    with computer screen pops, as well as other types of message exchange such as
    chats, instant messaging, etc.;
  • As a general process
    control language in other contexts not involving speech processing.

Harel State Table Notation and Semantics

For the sake of simplicity let’s have a look at the most important features of state machines as defined
in UML and their representation in SCXML. All examples are taken from the specification
of SCXML.

Note: The
XML code examples are included in the downloadable zip
file
.

Figure A shows a
pretty simple state machine with three states S1, S2, and S3. Its SCXML
equivalent is simple and intuitively understandable as well: (example1.txt). These states linked by transitions, which are triggered by events.

Figure A

Simple state machine

Example 1

<?xml version=”1.0″ encoding=”us-ascii”?>
<scxml version=”1.0″ xmlns=”http://www.w3.org/2005/07/scxml” initialstate=”S1″>
  <state id=”S1″>
    <transition event=”Event1″ target=”S2″/>
  </state>
 
  <state id=”S2″>
    <transition event=”Event2″ cond=”X>0″ target=”S1″/>
    <transition event=”Event2″ cond =”X<0″ target next=”S3″/>
  </state>
 
  <state id=”S3″>
  </state>
 
</scxml>

In this example State S1 transitions to state
S2 when event Event1 occurs. This is
the only way the system can move from S1 to S2, since it is the only transition
between those two states. Any other events that occur while the system is in S1
will therefore be ignored.

The logic in S2 is slightly more complex. It
has two transitions, both triggered by event Event2. Both transitions have guard conditions that check the value
of the variable X. If X < 0, S2 will transition to S3 when Event2 occurs. If X > 0, it will
transition back to S2. Note that there is no condition covering the case where
X = 0. Therefore the system will remain in state S2 if X = 0 when Event2
occurs. As in state S1, all events not mentioned in transitions will be
ignored. In this example only <state> and <transition> elements are
used.

Harel State Tables can embed the code for execution on entering or exiting the
state. In Figure B we see the same
three states, but now S1 has an OnEntry handler that decrements X , while S2 has an OnExit handler
that also decrements X. S3 has both OnEntry and OnExit handlers, both of which increment X.

Figure B

OnEntry and OnExit added

Now suppose that X has a current value of 2,
and that the system enters S1. When it does so, X is decremented to 1. When Event1 occurs, the system will
transition to S2, as before. Now when Event2
occurs, X is equal to 1, so the system will transition back to S1. As the
system exits state S2, the OnExit handler will
decrement X to 0. Note that this happens after
the transition is selected and the guard condition has been evaluated. Example
2 contains SCXML equivalent of this chart (example2.txt). We can see the new
elements <onexit>
and <onentry>
providing state transition handlers, and the <datamodel> element providing the data
model used in the chart. Data models will also be discussed below.

Example 2

<?xml version=”1.0″ encoding=”us-ascii”?>
<scxml version=”1.0″ xmlns=”http://www.w3.org/2005/07/scxml” initialstate=”S1″>
  <datamodel>
    <data name=”X” expr=”2″/>
  </datamodel>
 
  <state id=”S1″>
    <onentry>
      <assign name=”X” expr=”X–“/>
    </onentry>
    <transition event=”Event1″ target=”S2″/>
   </state>
 
  <state id=”S2″>
    <transition event=”Event2″ cond= “X>0” target=”S1″/>
    <transition event=”Event2″ cond =”X><0″ target=”S3″/>
    <onexit>
      <assign name=”X” expr=”X–“/>
    </onexit>   
  </state>
 
  <state id=”S3″>
    <onentry>
      <assign name=”X” expr=”X++”/>
    </onentry>
    <onexit>
      <assign name=”Y” expr=”0″/>
    </onexit>      
  </state>
 
</scxml>

Harel State Tables also include the concept of a complex state, namely one that
has substates (example3.txt). In this particular case, S1 has sequential
substates. When a state machine is in a state with such substates, it must also
be in one and only one of the substates. The substates have a kind of
“Or” semantics and can be thought of as representing a decomposition
of the parent state. A transition moving the system to S1 may specify S11 or
S12 as its “target” directly. However, to handle the case where it
doesn’t, S1 contains an <initial>
element defining the default substate that it will
start in if the transition simplify specifies S1 as its destination.

Example 3

<?xml version=”1.0″ encoding=”us-ascii”?>
<scxml version=”1.0″ xmlns=”http://www.w3.org/2005/07/scxml” initialstate=”S1″>

  <state id=”S1″>
    <state id=”S11″>
    </state>
    <state id=”s12″>
    </state>
    <initial>
      <transition target next=”S11″/>
    </initial>   
  </state>
</scxml>

In addition to sequential substates, Harel State Charts provide concurrent substates, which represent a form of
“And” logic. When a state machine is in a state with concurrent
substates it must be simultaneously in each
of the concurrent child states.

Given the basic Harel
semantics S1 and S2 will function independently, so that S1 will move from S11
to S12 to S13 without regard to what S2 is doing, and vice-versa. But now
suppose that we want to ensure that S2 does not transition from S22 to S23
until S1 has left S11. Figure C
shows how we can do this using a <join>
state. The SCXML equivalent is in example4.txt.

Figure C

Adding the <join> state

Example 4

<?xml version=”1.0″ encoding=”us-ascii”?>
<scxml version=”1.0″ xmlns=”http://www.w3.org/2005/07/scxml” initialstate=”S1″>

  <state>
    <parallel>
     
      <state id=”S1″>
        <state id=”S11″>
          <transition target next =”S12 synch1″/>
        </state>
        <state id=”S12″>
          <transition target next=”S13″/>
        </state>
        <state id=”S13″/>
        </state>
     
      <state id=”S2″>
        <state id=”S21″>
          <transition target=”S12″/>
        </state>
        <state id=”S22>
          <transition target=”j1″/>
         </state>
        <state id=”S23″/>
        <join id=”j1″>
        <transition target=”S23″/>
        </join>
      </state>
   
    </parallel>
  </state>
</scxml>

We have inserted a <join> pseudo-state between S22 and S23, with a condition-less
incoming transition from S22 and a condition-less outgoing transition to S23.
(The <join> pseudo-state is
represented as a vertical bar with multiple incoming transitions and a single
outgoing transition.) Then we took the transition leaving S11 and made it
branch to include the <join>
state as an extra target. (In the diagram this is shown by inserting a ‘fork’
vertical bar, with a single incoming transition and multiple outgoing
transitions. In our SCXML notation, we use multiple values in the “target”
attribute).

Now when the machine reaches S22, it will
wait until the <join> state is
active (and therefore ready to transition) before entering the <join> and transitioning to S23.
Similarly, when S1 is ready to leave S11, it will wait until S2 reaches S22
before entering the <join>
state. In either case, the effect of the <join>
state is to cause S1 and S2 wait for each other and then to enter S12 and S23
simultaneously.

Other state machine notations

A number of other XML-based state machine
notations have been developed, but none serves the same purpose as SCXML. XMI is a notation developed for representing UML diagrams,
including Harel State charts. However it is intended
as a machine interchange format and is not readily authorable by humans.

ebXML is
a language for business process specification intended to support B2B
e-commerce applications. It contains a state machine language that is in some
ways similar to the one presented here, but its syntax and semantics are
closely tied to its intended use in e-commerce. It is therefore not suitable as
a general-purpose state machine language.

XTND, also called XML Transition Network Definition, is a
notation for simple finite state machines but lacks Harel’s
notions of hierarchical and parallel states and are thus not suitable for a
general-purpose state machine that is semantically equivalent to Harelstatecharts.

SCXML notation

Basics

<scxml> Previous
XML examples have shown that the top-level root element is <scxml>. There is one required
attribute “initialstate”.
It is the id of the initial state for the document. The state-machine will
transition to this state as the last step in document initialization. <scxml>
may have <state> and <datamodel>
elements as a children.

<state> Holds the
representation of a state. Among others it may have a “final”
attribute indicating whether this is a final state or not. Final states may not
have substates or outgoing transitions. The <state>
element may have the following children elements:

  • <onentry>
    Optional element holding executable content to be run upon entering this <state>.
  • <onexit>
    Optional element holding executable content to be run when exiting this <state>.
  • <transition> Defines an outgoing transition
    from this <state>.
  • <initial> A child which identifies initial state for state machines that have substates.
    This child is required if and
    only if the machine has one or more <state>
    children.
  • <state> Defines a sequential substate of the parent
    state.
  • <parallel> Defines a set of parallel substates, which may occur at 0 or 1 times. It
    is incompatible with the <state>
    property.
  • <history> A child which represents the descendant state that the parent state was in
    the last time the system transitioned from
    the parent. A transition with this state as its target is in fact a transition
    to one of the other descendant states of the parent.
  • <join> A pseudo-state that is useful for synchronizing parallel regions.
  • <datamodel>
    Optional specification of a data model.
  • <invoke> Used to invoke external components, which may occur 0 or 1 times. It is incompatible
    with the <state> and <parallel> elements.

Most of these elements we have already
seen in the examples above.

Transitions between states are triggered by
events and conditionalized via guard-conditions. The optional attribute
“target” of the element <transition>
specifies the destination of the transition, which may be a <state> or a <parallel> region. Any executable content contained in a
transition is executed after the <onexit> handlers of the source state and before the <onentry>
handlers of the target state. Within both the ‘cond’ attribute of the transition and
the executable content, the special variable ‘_eventdata’ may be used to access data
contained within the triggering event.

The <parallel>
element is a wrapper that encapsulates parallel state machines. The <parallel> element has <onenter>
and <onexit>
properties analogous to <state>.
In addition, the <parallel>
element holds a set of <state>
elements that execute in parallel and join
at the <onexit>
handler of <parallel> element.

Pseudo-states

Pseudo-state elements do
not have the full properties of states. However, they can be used like states
in many places. In particular, they can often be the “target” of the <transition> element.

The <initial>
element represents the default initial state for a complex state with
sequential substates. Suppose <state>
S1 has child states S11, S12, and S13. If the system is in S1, it must also be
in one (and only one) of S11, S12, or S13. A <transition> in a distinct <state> S2 may take S11, S12, or S13 as its
“target”, but it may also simply specify the parent S1. In that case,
the <initial> child of S1
specifies which of S11, S12, or S13 the system should transition to. You can
see how this works in example3 (example3.txt). The
<transition> element embedded here is a conditionless transition (i.e.,
one without a <cond>
or <event> property). Such a
transition is always enabled and will be taken as soon as the state is entered.

The <history>
pseudo-state allows for ‘pause and resume’ control flow. Whenever a complex <state> is exited, its <history> pseudo-state, if
present, records the state configuration at exit.

The <join>
pseudo-state provides barrier logic allowing the synchronization of defined
threads of control. Its semantcs have already been
discussed in example4 (example4.txt).

Data model

To provide a uniform method of access to both
local and remote data, a <datamodel> tag was introduced which encapsulates any
number of <data> elements. Each
such element defines a single data element whose value is an XML tree. The XML
tree may be fetched from an external source or specified in-line. Data accessed
via the <data> tag is local to
the SCXML interpreter and not shared with any external resource.

How XML trees that compose the data model
will be specified, accessed, and modified in future versions and
implementations of SCXML interpreter is still being discussed.

The <datamodel> element is a wrapper for data model. Its
children <data> element may
occur zero or more times. Each instance defines a named XML data tree. Trees are
created when the document is loaded.

Executable content

Executable content consists of actions that
are performed as part of taking transitions and entering and leaving states (<onentry>
and <onexit>,
etc.) In general, such content must be able to modify the data model, raise
events, and invoke functionality in the underlying platform. On the other hand,
executable content may not cause transitions or any form of change of state,
except indirectly, by raising events that are then caught by transitions. The
executable model is taken from CCXML, and will be discussed in the next
article.

The <assign>
tag may be used to modify the data model. Variables local to a specific block
of executable content may be declared using the <var> element and are initialized with
the results of evaluating the optional <expr> attribute as an expression.

The <script>
element encloses computations written in the ECMAScript.
According to current specifications, an implementation must support the ECMAScript Compact Profile and may support the full ECMA-262 ECMAScript
specification. The implementation must evaluate <script> in executable content as it is processed.

There are also some other elements which may
be used for execution logic, such as <if>,
<else>, and <elseif>.

Events

Events are one of the basic concepts in SCXML
since they drive most transitions. Events have names which are matched against
the <event> attribute of
transitions. In many applications, it is important for SCXML to receive events
from external entities (for example those to which it contacted with <invoke> and <send>). For the most part, the set of events raised during
the execution of an SCXML document is application-specific and generated under
author control by use of the <send>
tag. However, certain events are mandatory and generated automatically by the
interpreter.