Superpixel-Based Spatiotemporal Saliency Detection
Zhi Liu1
Xiang Zhang2
Shuhua Luo1
Olivier Le Meur3
Image and Video Processing Lab, Shanghai University1
University of Electronic Science and Technology of China, Chengdu2
University of Rennes, France3
Abstract
This paper proposes a superpixel-based spatiotemporal saliency model for saliency detection in videos. Based on the superpixel representation of video frames, motion histograms and color histograms are extracted at the superpixel level as local features and frame level as global features. Then, superpixel-level temporal saliency is measured by integrating motion distinctiveness of superpixels with a scheme of temporal saliency prediction and adjustment, and superpixel-level spatial saliency is measured by evaluating global contrast and spatial sparsity of superpixels. Finally, a pixel-level saliency derivation method is used to generate pixel-level temporal and spatial saliency maps, and an adaptive fusion method is exploited to integrate them into the spatiotemporal saliency map. Experimental results on two public datasets demonstrate that the proposed model outperforms six state-of-the-art spatiotemporal saliency models in terms of both saliency detection and human fixation prediction.
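The global-contrast idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the halved L1 histogram distance, the size-based weighting, the min-max normalization, and the names `superpixel_histograms` and `global_contrast` are all assumptions chosen for illustration, and spatial sparsity, temporal saliency, and adaptive fusion are left out. The sketch assumes a superpixel label map and a map of quantized color indices are already available.

```python
import numpy as np

def superpixel_histograms(labels, color_bins, n_bins):
    """Normalized color histogram and pixel count for each superpixel.

    labels     : (H, W) int array, superpixel index per pixel (0..n_sp-1)
    color_bins : (H, W) int array, quantized color index per pixel (0..n_bins-1)
    """
    n_sp = int(labels.max()) + 1
    # Encode (superpixel, color bin) pairs as one index and count them.
    flat = labels.ravel() * n_bins + color_bins.ravel()
    counts = np.bincount(flat, minlength=n_sp * n_bins).reshape(n_sp, n_bins)
    sizes = counts.sum(axis=1)
    hists = counts / np.maximum(sizes, 1)[:, None]
    return hists, sizes

def global_contrast(hists, sizes):
    """Size-weighted histogram distance of each superpixel to all others.

    Assumption: halved L1 distance between histograms, weighted by the
    relative size of the other superpixel, then min-max normalized.
    """
    dist = 0.5 * np.abs(hists[:, None, :] - hists[None, :, :]).sum(axis=2)
    weights = sizes / sizes.sum()
    contrast = dist @ weights
    span = contrast.max() - contrast.min()
    if span == 0:
        return np.zeros_like(contrast)
    return (contrast - contrast.min()) / span

# Toy frame: three vertical superpixels, the middle one has a distinct color.
labels = np.zeros((4, 6), dtype=int)
labels[:, 2:4] = 1
labels[:, 4:] = 2
color_bins = np.zeros((4, 6), dtype=int)
color_bins[:, 2:4] = 3

hists, sizes = superpixel_histograms(labels, color_bins, n_bins=4)
sal = global_contrast(hists, sizes)
saliency_map = sal[labels]  # pixel-level map by copying superpixel values
```

In this toy frame the middle superpixel differs in color from the other two, so it receives the highest contrast value. Copying superpixel values to pixels is the simplest possible derivation of a pixel-level map and stands in for the paper's pixel-level saliency derivation method.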
SP Model
Pictorial illustration of SP spatiotemporal saliency model
Results
Spatiotemporal saliency maps for sample video frames in DS1. (a) video frames, (b) ground truths, and spatiotemporal saliency maps generated using (c) SR [17], (d) CE [20], (e) QFT [7], (f) MB [10], (g) PS [35], (h) DC [41], and (i) our SP model.
Spatiotemporal saliency maps for three video clips (shown with an interval of 15, 10, and 10 frames, respectively, from top to bottom) in DS1. (a) video frames, (b) ground truths, and spatiotemporal saliency maps generated using (c) SR [17], (d) CE [20], (e) QFT [7], (f) MB [10], (g) PS [35], (h) DC [41], and (i) our SP model.
Spatiotemporal saliency maps for video clips (shown with an interval of 10 frames) from the category abnormal (top), surveillance (middle), and crowd (bottom) in DS2. (a) Video frames with fixations, (b) fixation heatmaps superimposed on video frames, and spatiotemporal saliency maps generated using (c) SR [17], (d) CE [20], (e) QFT [7], (f) MB [10], (g) PS [35], (h) DC [41], and (i) our SP model. In (a), the blue asterisks are used to mark fixation points, and the red ovals overlaid on the two video frames are used to show the examples of background regions of Type A (see Table I). (Better viewed in color; see the color online version.)
Spatiotemporal saliency maps for video clips (shown with an interval of 10 frames) from the category moving (top), noise (middle), and surveillance (bottom) in DS2. (a) Video frames with fixations, (b) fixation heatmaps superimposed on video frames, and spatiotemporal saliency maps generated using (c) SR [17], (d) CE [20], (e) QFT [7], (f) MB [10], (g) PS [35], (h) DC [41], and (i) our SP model. In (a), the blue asterisks are used to mark fixation points. (Better viewed in color; see the color online version.)
Quantitative Comparison
ROC curves of different saliency models on DS1. (Better viewed in color; see the color online version.)
Average AUC achieved using different saliency models on the basis of each video category and the overall dataset DS2.
Average CC achieved using different saliency models on the basis of each video category and the overall dataset DS2.
Average NSS achieved using different saliency models on the basis of each video category and the overall dataset DS2.
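For readers unfamiliar with the fixation-prediction measures named above, here is a minimal sketch of CC and NSS following their commonly used definitions. The evaluation protocol behind the reported numbers (for example, how the fixation density map is built) is not described on this page, so the function names and input conventions below are assumptions.

```python
import numpy as np

def nss(saliency, fixations):
    """Normalized scanpath saliency: mean z-scored saliency at fixated pixels.

    saliency  : (H, W) array
    fixations : list of (row, col) fixation coordinates
    """
    s = saliency.astype(float)
    std = s.std()
    if std == 0:
        return 0.0  # a constant map carries no information
    z = (s - s.mean()) / std
    rows, cols = zip(*fixations)
    return float(z[list(rows), list(cols)].mean())

def cc(saliency, density):
    """Pearson linear correlation between a saliency map and a fixation density map."""
    a = saliency.astype(float).ravel()
    b = density.astype(float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

# Toy example: a single salient pixel that is also the only fixation.
saliency = np.zeros((4, 4))
saliency[1, 1] = 1.0
fixations = [(1, 1)]

nss_score = nss(saliency, fixations)
cc_same = cc(saliency, saliency)            # identical maps
cc_inverted = cc(saliency, 1.0 - saliency)  # inverted map
```

Higher values of both measures indicate better agreement with human fixations. NSS is computed directly from fixation points, while CC compares against a continuous density map.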
Citation
Z. Liu, X. Zhang, S. Luo, and O. Le Meur, "Superpixel-Based Spatiotemporal Saliency Detection," IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 9, pp. 1522-1540, Sept. 2014.
@ARTICLE{Superpixel-Based_Spatiotemporal_Saliency_Detection_TCSVT2014,
author={Liu, Zhi and Zhang, Xiang and Luo, Shuhua and Le Meur, Olivier},
journal={IEEE Transactions on Circuits and Systems for Video Technology},
title={Superpixel-Based Spatiotemporal Saliency Detection},
year={2014},
month={Sept},
volume={24},
number={9},
pages={1522-1540},
doi={10.1109/TCSVT.2014.2308642},
ISSN={1051-8215}}
Downloads
“Superpixel-Based Spatiotemporal Saliency Detection”
Z. Liu, X. Zhang, S. Luo, and O. Le Meur,
IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 9, pp. 1522-1540, Sept. 2014.
[Paper]
[MATLAB Code]