RWTH Aachen University
New XML (eXtensible Markup Language) files are added to databases daily, driven by the growing popularity of XML as the new database exchange standard. Comparing and clustering XML files are two challenging tasks that are still performed predominantly by hand. An XML schema contains information about the data structure, types, and labels found in an XML file. Reducing the XML schema tree to its significant nodes simplifies the task of finding structurally equivalent schemas and, implicitly, XML files that refer to the same entities.
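
To make the idea of schema reduction concrete, the following is a minimal sketch and not the method developed in this work. It assumes, purely for illustration, that the "significant" nodes are the `xs:element` declarations and that wrapper nodes such as `xs:complexType` and `xs:sequence` can be skipped. The function names `reduce_schema`, `shape`, and `structurally_equivalent`, as well as the sample schemas, are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace prefix that ElementTree attaches to XML Schema tags.
XS = "{http://www.w3.org/2001/XMLSchema}"


def reduce_schema(node):
    """Return the nearest xs:element descendants of `node` as
    (name, type, children) tuples, skipping wrapper nodes.

    Assumption: only xs:element declarations count as significant."""
    reduced = []
    for child in node:
        if child.tag == XS + "element":
            children = tuple(reduce_schema(child))
            reduced.append((child.get("name"), child.get("type"), children))
        else:
            # Wrapper node (complexType, sequence, ...): skip it, keep descending.
            reduced.extend(reduce_schema(child))
    return reduced


def shape(reduced):
    """Drop the labels and keep only types and nesting.
    Siblings are sorted, so their order does not matter."""
    return tuple(sorted((typ or "", shape(children))
                        for _name, typ, children in reduced))


def structurally_equivalent(xsd_a, xsd_b):
    """Compare two schema documents by the shape of their reduced trees."""
    tree_a = reduce_schema(ET.fromstring(xsd_a))
    tree_b = reduce_schema(ET.fromstring(xsd_b))
    return shape(tree_a) == shape(tree_b)


# Two hypothetical schemas with different labels but the same structure.
SCHEMA_A = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType><xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="age" type="xs:integer"/>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""

SCHEMA_B = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="employee">
    <xs:complexType><xs:sequence>
      <xs:element name="fullName" type="xs:string"/>
      <xs:element name="years" type="xs:integer"/>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""

# A third hypothetical schema that lacks the integer field.
SCHEMA_C = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType><xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""

if __name__ == "__main__":
    print(structurally_equivalent(SCHEMA_A, SCHEMA_B))
    print(structurally_equivalent(SCHEMA_A, SCHEMA_C))
```

Under these assumptions, `SCHEMA_A` and `SCHEMA_B` compare as equivalent even though their labels differ, while `SCHEMA_C` does not. A real criterion for node significance would likely be richer, for example by accounting for attributes, occurrence constraints, or named type definitions.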