Who Answers It Better? An In-Depth Analysis of ChatGPT and Stack Overflow Answers to Software Engineering Questions

Samia Kabir,David N. Udo-Imeh,Bonan Kou,Tianyi Zhang

CoRR（2023）

Cited 0|Views15

No score

Abstract

Q&A platforms have been an integral part of the web-help-seeking behavior of programmers over the past decade. However, with the recent introduction of ChatGPT, the paradigm of web-help-seeking behavior is experiencing a shift. Despite the popularity of ChatGPT, no comprehensive study has been conducted to evaluate the characteristics or usability of ChatGPT's answers to software engineering questions. To bridge the gap, we conducted the first in-depth analysis of ChatGPT's answers to 517 Stack Overflow (SO) questions and examined the correctness, consistency, comprehensiveness, and conciseness of ChatGPT's answers. Furthermore, we conducted a large-scale linguistic analysis, and a user study to understand the characteristics of ChatGPT answers from linguistic and human aspects. Our analysis shows that 52\% of ChatGPT answers are incorrect and 77\% are verbose. Nonetheless, ChatGPT answers are still preferred 39.34\% of the time due to their comprehensiveness and well-articulated language style. Our result implies the necessity of close examination and rectification of errors in ChatGPT, at the same time creating awareness among its users of the risks associated with seemingly correct ChatGPT answers.

Translated text

AI Read Science

Must-Reading Tree

Example

Generate MRT to find the research sequence of this paper

Chat Paper

Summary is being generated by the instructions you defined