Association for Computing Machinery
Analysts regularly wrangle data into a form suitable for computational tools, a tedious process that delays more substantive analysis. While interactive tools can assist with data transformation, analysts must still conceptualize the desired output state, formulate a transformation strategy, and specify complex transforms. The authors present a model that proactively suggests data transforms that map input data to the relational format expected by analysis tools. To guide search through the space of transforms, they propose a metric that scores tables according to type homogeneity, sparsity, and the presence of delimiters.
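
The scoring idea can be illustrated with a minimal sketch. It is not the authors' metric: the abstract names only the three signals, so the type classifier, the delimiter set, the equal weighting, and the names `infer_type`, `table_score`, and `DELIMITERS` are all assumptions made for illustration. The sketch treats a table as a list of rows of strings and rewards columns whose non-empty cells share one type, while penalizing empty cells and cells that still contain delimiters.

```python
from collections import Counter

# Assumed delimiter characters; the abstract does not list them.
DELIMITERS = set(",;:|\t")


def infer_type(cell):
    """Classify a cell as 'empty', 'number', or 'string' (a deliberately crude guess)."""
    text = cell.strip()
    if not text:
        return "empty"
    try:
        float(text)
        return "number"
    except ValueError:
        return "string"


def table_score(rows):
    """Return (homogeneity, sparsity, delimiter_rate, combined) for a non-empty table.

    The combined score uses equal weights, which is an assumption of this sketch.
    Higher combined values indicate a table closer to a clean relational layout.
    """
    width = max(len(r) for r in rows)
    # Pad ragged rows so every row has the same number of cells.
    padded = [list(r) + [""] * (width - len(r)) for r in rows]
    cells = [c for r in padded for c in r]

    # Sparsity: fraction of all cells that are empty.
    sparsity = sum(1 for c in cells if infer_type(c) == "empty") / len(cells)

    # Delimiter rate: fraction of non-empty cells that contain a delimiter.
    non_empty = [c for c in cells if infer_type(c) != "empty"]
    if non_empty:
        delimiter_rate = sum(1 for c in non_empty if DELIMITERS & set(c)) / len(non_empty)
    else:
        delimiter_rate = 0.0

    # Homogeneity: per column, the share of non-empty cells with the dominant type.
    col_scores = []
    for j in range(width):
        types = [infer_type(r[j]) for r in padded]
        types = [t for t in types if t != "empty"]
        if types:
            dominant = Counter(types).most_common(1)[0][1]
            col_scores.append(dominant / len(types))
    homogeneity = sum(col_scores) / len(col_scores) if col_scores else 0.0

    combined = homogeneity - sparsity - delimiter_rate
    return homogeneity, sparsity, delimiter_rate, combined


# Hypothetical example data: the same records before and after a split-and-fill.
messy = [["2004", "Alabama, 4029"], ["", "Alaska, 3370"]]
tidy = [["2004", "Alabama", "4029"], ["2004", "Alaska", "3370"]]
```

Under these assumptions, `table_score(tidy)` ranks above `table_score(messy)`, because the messy table has an empty cell and two cells with embedded commas. A search over candidate transforms could prefer whichever transform raises the combined score most, though how the authors actually weight and combine the signals is not stated in the abstract.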