Short text classification applied to item description: Some methods evaluation

Gilsiley Henrique Darú, Felipe Daltrozo da Motta Motta, Antonio Castelo, Gustavo Valentim Loch

Semina: Ciências Exatas e Tecnológicas (2022)

Abstract
The growing demand for content-based information classification in the age of social media and e-commerce has created a need for automated classification of products from their descriptions. This study evaluates several techniques for this task, with a focus on descriptions written in Portuguese. A pipeline is implemented to preprocess the data, including lowercasing, accent removal, and unigram tokenization. The bag-of-words method is then used to convert the text into numerical data, and five classification techniques are applied: three from information retrieval (argmaxtf, argmaxtfnorm, and argmaxtfidf) and two machine learning methods (logistic regression and support vector machines). The performance of each technique is evaluated using simple accuracy via thirty-fold cross-validation. The results show that logistic regression achieves the highest mean accuracy among the evaluated techniques.
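The preprocessing and bag-of-words steps named in the abstract can be illustrated with a minimal Python sketch using only the standard library. The function names, the toy product descriptions, and the category labels are hypothetical. The abstract does not define argmaxtf, so the scoring rule below (pick the class whose training descriptions contain the query tokens most often) is only one plausible reading of the name, not the authors' method.

```python
import unicodedata
from collections import Counter, defaultdict


def preprocess(text):
    """Lowercase, remove accents, and split into unigram tokens."""
    text = text.lower()
    # NFKD splits accented letters into a base letter plus combining marks,
    # and the combining marks are then dropped.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.split()


def fit_class_term_counts(descriptions, labels):
    """Build one bag of words (term-frequency Counter) per class."""
    counts = defaultdict(Counter)
    for text, label in zip(descriptions, labels):
        counts[label].update(preprocess(text))
    return counts


def predict_argmax_tf(counts, text):
    """Assumed rule: return the class with the largest summed term frequency."""
    tokens = preprocess(text)
    scores = {label: sum(tf[t] for t in tokens) for label, tf in counts.items()}
    return max(scores, key=scores.get)


# Hypothetical toy data, not taken from the paper.
train_texts = [
    "Café torrado e moído 500g",
    "Açúcar refinado 1kg",
    "Sabão em pó 1kg",
    "Detergente líquido limão 500ml",
]
train_labels = ["alimentos", "alimentos", "limpeza", "limpeza"]

model = fit_class_term_counts(train_texts, train_labels)
print(preprocess("Café torrado e moído 500g"))
print(predict_argmax_tf(model, "Sabão líquido 1kg"))
```

Accent removal matters for short Portuguese descriptions because "Café" and "cafe" would otherwise become separate vocabulary entries. The normalized and TF-IDF variants, as well as logistic regression and support vector machines, would replace only the scoring step while reusing the same token counts.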
Key words
Text classification, Product description, Short text, Logistic regression, Bag of words