The authors develop methods for accelerating metric similarity search that are effective on modern hardware. Their algorithms factor into easily parallelizable components, making them simple to deploy and efficient on multicore CPUs and GPUs. Despite the simple structure of the algorithms, their search performance is provably sublinear in the size of the database, with a factor that depends only on the database's intrinsic dimensionality. The authors demonstrate that their methods provide substantial speedups on a range of datasets and hardware platforms. In particular, they present results on a 48-core server machine, on graphics hardware, and on a multicore desktop.