Convolutional Learning Of Spatio-Temporal Features

ECCV'10: Proceedings of the 11th European Conference on Computer Vision: Part VI (2010)

Abstract
We address the problem of learning good features for understanding video data. We introduce a model that learns latent representations of image sequences from pairs of successive images. The convolutional architecture of our model allows it to scale to realistic image sizes whilst using a compact parametrization. In experiments on the NORB dataset, we show our model extracts latent "flow fields" which correspond to the transformation between the pair of input frames. We also use our model to extract low-level motion features in a multi-stage architecture for action recognition, demonstrating competitive performance on both the KTH and Hollywood2 datasets.
Keywords
unsupervised learning, restricted Boltzmann machines, convolutional nets, optical flow, video analysis, activity recognition