A Comparison of Propositionalization Strategies for Creating Features from Linked Open Data

Linked Open Data (LOD) has been recognized as a valuable source of background information in data mining. However, most data mining tools require features in propositional form, i.e., binary, nominal, or numerical features associated with an instance, while LOD sources are usually graphs by nature. In this paper, the authors compare different strategies for creating propositional features from LOD (a process called propositionalization) and present experiments on three tasks: classification, regression, and outlier detection. They show that the choice of strategy can have a strong influence on the results.
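To make the idea of propositionalization concrete, the following is a minimal sketch of how graph-shaped data might be flattened into one feature vector per instance. It is not taken from the paper: the toy triples, the `propositionalize` function, and the two strategy names (`"binary"` and `"count"`) are assumptions chosen purely for illustration, and the paper's own strategies may differ.

```python
from collections import Counter

# Hypothetical toy triples (subject, predicate, object); not data from the paper.
TRIPLES = [
    ("Mannheim", "rdf:type", "City"),
    ("Mannheim", "rdf:type", "Place"),
    ("Mannheim", "locatedIn", "Germany"),
    ("Berlin", "rdf:type", "City"),
    ("Berlin", "rdf:type", "Capital"),
    ("Berlin", "rdf:type", "Place"),
    ("Berlin", "locatedIn", "Germany"),
    ("Berlin", "twinnedWith", "Paris"),
    ("Berlin", "twinnedWith", "Madrid"),
]


def propositionalize(triples, instances, strategy="binary"):
    """Turn the triples of each instance into a fixed-length feature vector.

    Illustrative assumption: one feature per type ("type=X") and one per
    outgoing relation ("rel=p"). "binary" marks presence, "count" keeps
    the number of matching triples.
    """
    counts = {inst: Counter() for inst in instances}
    for subject, predicate, obj in triples:
        if subject not in counts:
            continue  # triple does not describe one of our instances
        if predicate == "rdf:type":
            counts[subject]["type=" + obj] += 1
        else:
            counts[subject]["rel=" + predicate] += 1

    # A shared, sorted column list makes all vectors comparable.
    columns = sorted({name for c in counts.values() for name in c})

    rows = {}
    for inst in instances:
        if strategy == "binary":
            rows[inst] = [1 if counts[inst][col] > 0 else 0 for col in columns]
        elif strategy == "count":
            rows[inst] = [counts[inst][col] for col in columns]
        else:
            raise ValueError("unknown strategy: " + strategy)
    return columns, rows


columns, binary_rows = propositionalize(TRIPLES, ["Mannheim", "Berlin"], "binary")
_, count_rows = propositionalize(TRIPLES, ["Mannheim", "Berlin"], "count")
print(columns)
print(binary_rows)
print(count_rows)
```

In this toy example the two strategies produce different vectors for the same instance (the `twinnedWith` relation is either present or counted twice), which hints at why the choice of strategy could matter to a downstream learner.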

Provided by: University of Mannheim
Topic: Data Management
Date Added: Aug 2014
Format: PDF
