A Framework of Petroleum Information Retrieval System Based on Web Scraping with Python
2018 15th International Conference on Service Systems and Service Management (ICSSSM), 2018
Abstract
In an era of information explosion, a customized retrieval system is necessary. This paper presents a framework for a petroleum information retrieval system intended for petroleum exploration and development researchers. First, we use the open-source framework Scrapy to build a crawler that collects the information business users care about. Then the k-means algorithm is used to cluster the crawled documents, so that key information can be extracted and presented in the system. Results from production use show that this customized retrieval system is efficient and agile, and that it improves the efficiency, accuracy, and automation level of the work.
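The clustering stage can be illustrated with a minimal sketch. The abstract does not say how documents are represented or how k-means is initialised, so the TF-IDF weighting, the farthest-point initialisation, the function names, and the four sample documents below are all assumptions made for illustration, not the authors' implementation. The crawling stage is left out; the sketch assumes the crawled text is already available as plain strings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one L2-normalised sparse TF-IDF vector (term -> weight) per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    df = Counter(term for c in counts for term in c)  # document frequency
    n = len(docs)
    vectors = []
    for c in counts:
        # Smoothed IDF (an assumed weighting; the paper does not specify one).
        vec = {t: tf * (math.log((1 + n) / (1 + df[t])) + 1) for t, tf in c.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in a.keys() | b.keys())


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = {}
    for v in vectors:
        for t, w in v.items():
            total[t] = total.get(t, 0.0) + w
    return {t: w / len(vectors) for t, w in total.items()}


def kmeans(vectors, k, max_iter=50):
    """Plain k-means with deterministic farthest-point initialisation (assumed)."""
    centroids = [dict(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(dict(far))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:
            break  # converged: no assignment changed
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean_vector(members)
    return labels, centroids


def top_terms(centroid, n=4):
    """Highest-weighted terms of a centroid, used here as the cluster's key information."""
    return [t for t, _ in sorted(centroid.items(), key=lambda kv: -kv[1])[:n]]


# Hypothetical stand-ins for crawled documents.
docs = [
    "seismic survey reveals new oil reservoir in offshore basin",
    "offshore basin seismic data indicates oil reservoir",
    "crude oil price rises as market demand grows",
    "market demand pushes crude price higher",
]
labels, centroids = kmeans(tfidf_vectors(docs), k=2)
for j, centroid in enumerate(centroids):
    print(j, top_terms(centroid))
```

Reading the top-weighted terms of each centroid is one simple way to surface "key information" per cluster; a production system would more likely rely on a library implementation and a tokenizer suited to the source language.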
Keywords
web crawler, k-means, clustering, information retrieval, petroleum