The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in Education
arXiv (2024)
Abstract
Assessing instruction quality is a fundamental component of any improvement
effort in the education system. However, traditional manual assessments are
expensive, subjective, and heavily dependent on observers' expertise and
idiosyncratic factors, preventing teachers from getting timely and frequent
feedback. Unlike prior research, which mostly focuses on individual low-inference
instructional practices, this paper presents the first
study that leverages Natural Language Processing (NLP) techniques to assess
multiple high-inference instructional practices in two distinct educational
settings: in-person K-12 classrooms and simulated performance tasks for
pre-service teachers. This is also the first study that applies NLP to measure
a teaching practice that is widely acknowledged to be particularly effective
for students with special needs. We confront two challenges inherent in
NLP-based instructional analysis: noisy, long input data and
highly skewed distributions of human ratings. Our results suggest that
pretrained Language Models (PLMs) achieve performance comparable to the
agreement level of human raters for variables that are more discrete and
require lower inference, but their efficacy diminishes with more complex
teaching practices. Interestingly, using only teachers' utterances as input
yields strong results for student-centered variables, alleviating common
concerns over the difficulty of collecting and transcribing high-quality
student speech data in in-person teaching settings. Our findings highlight both
the potential and the limitations of current NLP techniques in the education
domain, opening avenues for further exploration.