Generating Synthetic Training Data for Video Surveillance Applications

Johannes Eschner, BACHELOR’S THESIS

Semantic Scholar (2019)

Abstract
As demand for ever more capable computer vision systems has increased in recent years, so has the need for labeled ground-truth data. These ground-truth datasets are used for the training and evaluation of computer vision algorithms and are usually created by manually annotating images or image sequences with semantic labels. Synthetic video generation provides an alternative approach to the problem of generating labels: the label data and the image sequences can be created simultaneously by utilizing a 3D render engine. Many of the existing frameworks for generating such synthetic datasets focus on the context of autonomous driving, where vast amounts of labeled input data are needed. This thesis presents an implementation of a synthetic data generation framework for evaluating tracking algorithms in the context of video surveillance. The framework uses a commercially available game engine as a renderer to generate synthetic video clips that depict different scenarios that can occur in a video surveillance setting. These scenarios include a multitude of interactions between different characters in a reconstructed environment. A collection of such synthetic clips is then compared to real videos by using both as input for two different tracking algorithms. While producing synthetic ground-truth data in real time using a game engine is less labor-intensive than manual annotation, the results of the evaluation show that both tracking algorithms perform better on real data. This suggests that the synthetic data produced by the framework is of limited suitability for evaluating tracking algorithms.