Spatiotemporal Saliency Detection Via Sparse Representation
Saliency detection identifies regions in images and videos that attract the attention of the human visual system, and it benefits multimedia applications such as retrieval and copy detection. In this paper, the authors propose a new spatiotemporal saliency framework for videos based on sparse representation. For temporal saliency, they model the motion of a target patch as a reconstruction process: overlapping patches from neighboring frames are used to reconstruct the target patch, and the learned sparse coefficients encode the positions of the matched patches, thereby representing the motion trajectory of the target patch. They also introduce a smoothing term into the sparse coding framework to enforce coherent motion trajectories.
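The temporal-saliency idea above can be sketched in a minimal form: stack overlapping patches from a neighboring frame into a dictionary, sparsely code the target patch against it with an L1 penalty, and read the displacement off the position of the dominant coefficient. The sketch below is an illustrative assumption, not the authors' implementation; it uses a 1-D toy signal, a plain ISTA solver, and omits the paper's trajectory-smoothing term.

```python
import numpy as np

def soft_threshold(v, t):
    # Elementwise soft-thresholding: the proximal operator of the L1 penalty.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(x, D, lam=0.1, n_iter=200):
    # Solve min_a 0.5*||x - D a||^2 + lam*||a||_1 with ISTA.
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth part
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a

# Toy example: a 1-D "frame" whose content shifts by 3 samples in the next frame.
rng = np.random.default_rng(0)
frame_t = rng.standard_normal(64)
patch_size = 8
target = frame_t[10:10 + patch_size]   # target patch in the current frame
frame_next = np.roll(frame_t, 3)       # neighboring frame: everything moves by +3

# Dictionary: all overlapping patches of the neighboring frame, one per column.
D = np.stack([frame_next[i:i + patch_size]
              for i in range(len(frame_next) - patch_size + 1)], axis=1)

a = sparse_code(target, D, lam=0.1)
best = int(np.argmax(np.abs(a)))       # position of the matched patch
print("matched position:", best, "-> estimated displacement:", best - 10)
```

Because the shifted frame contains an exact copy of the target patch, the sparse code concentrates on that single dictionary column, and its index directly yields the patch's displacement between frames.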