V-MDAV: A Multivariate Microaggregation With Variable Group Size
Microaggregation is a clustering problem with a minimum size constraint on the resulting clusters (groups): the number of groups is unconstrained, and within-group homogeneity should be maximized. In the context of privacy in statistical databases, microaggregation is a well-known approach to obtaining anonymized versions of confidential microdata. Optimally solving microaggregation on multivariate data sets is known to be NP-hard, so heuristic methods are used in practice. This paper presents a new heuristic approach to multivariate microaggregation that produces variable-sized groups (and thus higher within-group homogeneity) at a computational cost similar to that of fixed-size microaggregation heuristics.
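To make the objective concrete, the following minimal Python sketch scores a partition by its within-group sum of squared errors (SSE), where lower SSE means higher homogeneity, and checks the minimum group size constraint. It is not the heuristic proposed in the paper. The function names, the toy records, the choice of k = 2, and both partitions are hypothetical and hand-picked for illustration; SSE is assumed here as the homogeneity measure.

```python
from typing import List, Sequence

Record = Sequence[float]
Group = List[Record]


def centroid(group: Group) -> List[float]:
    """Component-wise mean of the records in one group."""
    n = len(group)
    dims = len(group[0])
    return [sum(rec[d] for rec in group) / n for d in range(dims)]


def within_group_sse(groups: List[Group]) -> float:
    """Sum of squared Euclidean distances from each record to its group centroid."""
    total = 0.0
    for group in groups:
        c = centroid(group)
        for rec in group:
            total += sum((x - m) ** 2 for x, m in zip(rec, c))
    return total


def satisfies_min_size(groups: List[Group], k: int) -> bool:
    """True if every group contains at least k records."""
    return all(len(group) >= k for group in groups)


def microaggregate(groups: List[Group]) -> List[List[List[float]]]:
    """Replace every record by the centroid of its group (the anonymized output)."""
    return [[centroid(group) for _ in group] for group in groups]


# Hypothetical toy data: two natural clusters of three records each, with k = 2.
k = 2
a1, a2, a3 = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)
b1, b2, b3 = (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)

# Hand-picked partition with groups of exactly k records: one group straddles both clusters.
fixed_size_partition = [[a1, a2], [a3, b1], [b2, b3]]

# Hand-picked partition with variable group sizes that follows the natural clusters.
variable_size_partition = [[a1, a2, a3], [b1, b2, b3]]

sse_fixed = within_group_sse(fixed_size_partition)
sse_variable = within_group_sse(variable_size_partition)
```

On this toy data both partitions satisfy the size constraint, but the variable-size partition has a much lower SSE because no group has to span both clusters. This is the intuition behind allowing group sizes to adapt to the data, shown here only on a constructed example.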