Semantic Embeddings for Food Search Using Siamese Networks
2020 4th International Conference on Natural Language Processing and Information Retrieval (NLPIR 2020), 2020
Abstract
Efficient and effective search is a key driver of business in e-commerce. Functionally, most search systems consist of a retrieval phase and a ranking phase. While methods such as Learning to Rank (LTR) have been studied widely for (re)ranking, most retrieval systems in industry are still predominantly based on variants of text matching. Because text matching cannot capture the semantic intent of a query, most out-of-vocabulary (OOV) queries are either not handled at all or handled poorly by matching them to similarly spelled entities. For niche e-commerce domains such as food delivery apps, which operate on phonetically spelled, non-Western dish names, this problem is even more acute. Pre-trained word embedding models are of limited help because the majority of dish names are words that occur rarely or not at all in most openly available vocabularies. In this work, we present experiments and efficient Siamese-network-based models that learn dish embeddings from scratch. Compared to current baselines, we demonstrate that these models lead to a 3-5% improvement in Mean Reciprocal Rank (MRR) and Recall@k. We also quantify, using a combination of an in-house Food Taxonomy and the Davies-Bouldin (DB) index, that the new embeddings capture semantic information with an improvement of up to 20% over the baseline.
Keywords
Word embeddings, Siamese networks, FastText, Food Search
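
The abstract reports retrieval quality with Mean Reciprocal Rank (MRR) and Recall@k. As an illustration only, the sketch below computes both metrics using their standard textbook definitions. The abstract does not describe the paper's evaluation protocol, so the function names, the input layout (a ranked list plus a set of relevant items per query), and the dish names are all assumptions made for this example.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average the reciprocal rank over (ranked_list, relevant_set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def recall_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


# Hypothetical queries: each pair is (retrieved dishes in rank order, relevant dishes).
example_runs = [
    (["paneer tikka", "paneer butter masala", "palak paneer"], {"paneer butter masala"}),
    (["masala dosa", "idli", "vada"], {"masala dosa", "idli"}),
    (["biryani", "pulao"], {"kheer"}),  # no relevant dish retrieved
]

mrr = mean_reciprocal_rank(example_runs)
recall_at_1 = recall_at_k(example_runs[1][0], example_runs[1][1], k=1)
```

In this toy data the first relevant dish appears at ranks 2 and 1 for the first two queries and is never retrieved for the third, so the MRR is (1/2 + 1 + 0) / 3 = 0.5. For the second query, Recall@1 is 0.5 because only one of its two relevant dishes falls within the top result.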