Machine Learning Driven Language Assessment

Burr Settles; Masato Hagiwara; Geoffrey T. LaFlair

Vol. 8 (2020)

TACL approved

Published 2020-04-28

Burr Settles
Duolingo

Masato Hagiwara
Duolingo

Geoffrey T. LaFlair
Duolingo

Abstract

We describe a method for rapidly creating language proficiency assessments, and provide experimental evidence that such tests can be valid, reliable, and secure. Our approach is the first to use machine learning and natural language processing to induce proficiency scales based on a given standard, and then use linguistic models to estimate item difficulty directly for computer-adaptive testing. This alleviates the need for expensive pilot testing with human subjects. We used these methods to develop an online proficiency exam called the Duolingo English Test, and demonstrate that its scores align significantly with other high-stakes English assessments. Furthermore, our approach produces test scores that are highly reliable, while generating item banks large enough to satisfy security requirements.

(presented at ACL 2020) Article at MIT Press