Journal of Machine Learning Research (JMLR)
In this paper, the authors present a new adaptive feature scaling scheme for ultrahigh-dimensional feature selection on big data and reformulate it as a convex Semi-Infinite Programming (SIP) problem. To solve the SIP problem, they propose an efficient feature generating paradigm. Unlike traditional gradient-based approaches, which optimize over all input features at once, the proposed paradigm iteratively activates a group of features and solves a sequence of Multiple Kernel Learning (MKL) sub-problems. To further speed up training, they solve the MKL sub-problems in their primal forms using a modified accelerated proximal gradient approach.
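The overall loop (activate a group of features, then solve a sub-problem restricted to the active features) can be illustrated with a minimal sketch. This is not the authors' algorithm: the MKL sub-problem is replaced here by a plain lasso least-squares problem, the sub-problem solver is a standard accelerated proximal gradient method (FISTA) rather than the modified variant mentioned above, and the scoring rule (absolute correlation with the current residual) is an assumption chosen for simplicity. The function names `apg_lasso` and `feature_generating`, and the parameters `group_size`, `n_rounds`, and `lam`, are hypothetical.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def apg_lasso(X, y, lam, n_iter=200):
    """Standard FISTA for 0.5 * ||Xw - y||^2 + lam * ||w||_1.

    Stand-in for the MKL sub-problem solver (an assumption, not the
    modified method from the paper).
    """
    d = X.shape[1]
    # Lipschitz constant of the gradient: squared spectral norm of X.
    lipschitz = np.linalg.norm(X, 2) ** 2
    w = np.zeros(d)
    z = w.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = X.T @ (X @ z - y)
        w_next = soft_threshold(z - grad / lipschitz, lam / lipschitz)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Momentum (extrapolation) step that gives the acceleration.
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)
        w, t = w_next, t_next
    return w


def feature_generating(X, y, group_size=2, n_rounds=2, lam=0.1):
    """Iteratively activate features and refit on the active set only."""
    d = X.shape[1]
    active = []
    w_active = np.zeros(0)
    for _ in range(n_rounds):
        # Residual of the current model restricted to the active features.
        residual = y - X[:, active] @ w_active if active else y.copy()
        # Hypothetical scoring rule: absolute correlation with the residual.
        scores = np.abs(X.T @ residual)
        if active:
            scores[active] = -np.inf  # never re-activate a feature
        new = np.argsort(scores)[::-1][:group_size]
        active.extend(int(j) for j in new)
        # Sub-problem touches only the activated columns, not all d features.
        w_active = apg_lasso(X[:, active], y, lam)
    w = np.zeros(d)
    w[active] = w_active
    return w, active
```

The point of the design is that each sub-problem costs time proportional to the number of activated features rather than the full input dimension, which is what makes this style of method attractive when the dimension is very large.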