Segment-Then-Rank: Non-Factoid Question Answering On Instructional Videos

national conference on artificial intelligence(2020)

引用 24|浏览48
暂无评分
摘要
We study the problem of non-factoid QA on instructional videos. Existing work focuses either on visual or textual modality of video content, to find matching answers to the question. However, neither is flexible enough for our problem setting of non-factoid answers with varying lengths. Motivated by this, we propose a two-stage model: (a) multimodal segmentation of video into span candidates and (b) length-adaptive ranking of the candidates to the question. First, for segmentation, we propose Segmenter for generating span candidates of diverse length, considering both textual and visual modality. Second, for ranking, we propose Ranker to score the candidates, dynamically combining the two models with complementary strength for both short and long spans respectively. Experimental result demonstrates that our model achieves state-of-the-art performance.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要