On the Computation of Stochastic Search Variable Selection in Linear Regression With UDFs
Computing Bayesian statistics with traditional techniques is extremely slow, especially when large data sets have to be exported from a relational DBMS. The authors propose algorithms for large-scale processing of Stochastic Search Variable Selection (SSVS) for linear regression that can work entirely inside a DBMS. The traditional SSVS algorithm requires multiple scans of the input data in order to compute a regression model. With the proposed optimizations, SSVS can be done either in one scan over the input table, using sufficient statistics, when the number of records is large, or in one scan per iteration for high-dimensional data.
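To illustrate why a single scan can be enough when the number of records is large, the following minimal sketch accumulates the cross-product summary Q = sum of z z^T over rows z = (1, x_1, ..., x_d, y) in one pass and then fits ordinary least squares from Q alone. It is not the authors' UDF implementation and it does not perform SSVS; the names `summarize`, `solve`, and `least_squares` are hypothetical, and plain Python lists stand in for the DBMS table. It only shows that, once Q is available, the regression can be computed without touching the rows again.

```python
from typing import Iterable, List, Sequence, Tuple

Matrix = List[List[float]]


def summarize(rows: Iterable[Sequence[float]]) -> Tuple[int, Matrix]:
    """One pass over rows (x_1, ..., x_d, y); returns n and Q = sum z z^T,
    where z = (1, x_1, ..., x_d, y). Q holds n, sum x, X^T X and X^T y."""
    n = 0
    Q: Matrix = []
    for row in rows:
        z = [1.0, *row]
        if not Q:
            Q = [[0.0] * len(z) for _ in z]
        for i, zi in enumerate(z):
            for j, zj in enumerate(z):
                Q[i][j] += zi * zj
        n += 1
    if n == 0:
        raise ValueError("no rows to summarize")
    return n, Q


def solve(A: Matrix, b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, m))
        x[r] = (M[r][m] - s) / M[r][r]
    return x


def least_squares(Q: Matrix) -> List[float]:
    """Fit intercept and slopes from the summary only (no second scan)."""
    p = len(Q) - 1                   # index of y inside z
    XtX = [r[:p] for r in Q[:p]]     # block for (1, x) against (1, x)
    Xty = [r[p] for r in Q[:p]]      # block for (1, x) against y
    return solve(XtX, Xty)


# Toy data generated from y = 2 + 3*x1 - x2 (illustrative values only).
rows = [(0, 0, 2), (1, 0, 5), (0, 1, 1), (1, 1, 4), (2, 1, 7)]
n, Q = summarize(rows)
beta = least_squares(Q)
```

The summary has size (d + 2) by (d + 2) regardless of n, which is why this approach suits tables with many records; for high-dimensional data the summary itself becomes large, which is consistent with the abstract's alternative of one scan per iteration.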