Report on the XML Mining Track at INEX 2007: Categorization and Clustering of XML Documents
Source: University of Paris
The XML Document Mining track was launched to explore two main ideas: first, identifying the key problems and new challenges in the emerging field of mining semi-structured documents; and second, studying and assessing the potential of machine learning (ML) techniques for generic ML tasks in the structured domain, i.e. classification and clustering of semi-structured documents. The track has run for three editions, during INEX 2005, 2006 and 2007, and a fourth edition is currently being launched. The first two editions have been summarized in another report (); this paper focuses on the 2007 edition.