A Comparison of Propositionalization Strategies for Creating Features from Linked Open Data
Linked Open Data (LOD) has been recognized as a valuable source of background information in data mining. However, most data mining tools require features in propositional form, i.e., binary, nominal, or numerical features associated with an instance, while LOD sources are graphs by nature. In this paper, the authors compare different strategies for creating propositional features from LOD, a process called propositionalization, and present experiments on three tasks: classification, regression, and outlier detection. They show that the choice of strategy can have a strong influence on the results.
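
To make the notion of propositionalization concrete, the following minimal Python sketch turns a small graph of triples into one fixed-width feature row per instance. The toy triples, the `propositionalize` function, and the two strategy names ("binary" and "count") are assumptions chosen for illustration; they are not taken from the paper and need not match the strategies it compares.

```python
from collections import Counter

# Hypothetical toy graph of (subject, predicate, object) triples.
TRIPLES = [
    ("ex:Berlin", "rdf:type", "ex:City"),
    ("ex:Berlin", "rdf:type", "ex:Capital"),
    ("ex:Berlin", "ex:twinnedWith", "ex:Paris"),
    ("ex:Berlin", "ex:twinnedWith", "ex:Madrid"),
    ("ex:Bonn", "rdf:type", "ex:City"),
]


def propositionalize(triples, instances, strategy="binary"):
    """Build one feature row per instance from its outgoing predicates.

    strategy="binary": 1 if the instance has the predicate at all, else 0.
    strategy="count":  number of triples the instance has with the predicate.
    Both strategies are illustrative assumptions, not the paper's definitions.
    """
    if strategy not in ("binary", "count"):
        raise ValueError(f"unknown strategy: {strategy!r}")

    # One column per predicate that occurs on any of the instances.
    columns = sorted({p for s, p, _ in triples if s in instances})

    # Count outgoing predicates per instance.
    counts = {inst: Counter() for inst in instances}
    for s, p, _ in triples:
        if s in counts:
            counts[s][p] += 1

    rows = {}
    for inst in instances:
        if strategy == "binary":
            rows[inst] = [int(counts[inst][p] > 0) for p in columns]
        else:
            rows[inst] = [counts[inst][p] for p in columns]
    return columns, rows


if __name__ == "__main__":
    for strategy in ("binary", "count"):
        columns, rows = propositionalize(TRIPLES, ["ex:Berlin", "ex:Bonn"], strategy)
        print(strategy, columns, rows)
```

In this toy example the two strategies yield different rows for `ex:Berlin`: the binary variant records only that the instance has a type and a twin city, while the count variant also records how many of each it has. This small difference illustrates why the choice of strategy can matter for a downstream learner.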