COVID-19 Learning Intervention Impact: Data-Driven Evaluation

Mirna Elizondo; June Yu; Daniel Payan; Li Feng; Jelena Tesic

doi:10.36227/techrxiv.171392833.35385517/v1

Abstract

During the COVID-19 pandemic, student learning rates fell notably across public school systems in the United States, undoing years of progress. This research examines nine publicly accessible data sources within a data science framework to explore their potential for understanding and mitigating the factors that contribute to learning loss. The data were sourced from the U.S. Census Bureau (2010), USAFACTS, the Texas Department of State Health Services (DSHS), the National Center for Education Statistics Common Core of Data (CCD), the U.S. Bureau of Labor Statistics Local Area Unemployment Statistics (LAUS), and four sources from the Texas Education Agency (STAAR, TEA, ADA, ESSER). We present an end-to-end, large-scale educational data modeling pipeline that integrates and cleans the data and applies automated attribute importance analysis to derive meaningful insights from it. We address seven research questions: i) Do students from low-income backgrounds and minority groups exhibit heightened learning loss? ii) How does learning loss vary across grade levels? iii) What impact does the decision to reopen schools or school districts have on student learning loss? iv) Is there a relationship between the mode of instruction (hybrid, remote, in-person) and learning loss? v) Is school or district attendance inversely associated with learning loss? vi) Does the local or regional infection rate correlate with increased learning loss? vii) How does the local unemployment rate influence learning loss? Our investigation demonstrated the superior performance of gradient-boosting algorithms, particularly XGBoost and CatBoost, in handling missing values. Mode of instruction and prior score emerged as the primary resilience factors during this period and, alongside low income and grade level, were the most influential factors in predicting learning loss in both math and reading.
We demonstrate a novel data-driven approach for discovering insights from an extensive collection of heterogeneous public data sources, and we offer policymakers an actionable understanding of learning-loss tendencies in public schools and how to prevent them.
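
As a rough illustration of the integration step and of research question v, the following minimal sketch joins two toy sources on a district identifier, derives a learning-loss value, and computes its correlation with attendance. Everything in it is an assumption made for illustration: the district identifiers, field names, and values are invented, learning loss is assumed to be the pre-pandemic score minus the post-pandemic score, and rows with missing values are simply dropped. It does not reproduce the authors' pipeline, which relies on gradient-boosting models to handle missing values.

```python
from math import sqrt

# Hypothetical toy records keyed by district id; all values are invented.
scores = {
    "D1": {"math_2019": 70.0, "math_2021": 62.0},
    "D2": {"math_2019": 65.0, "math_2021": 61.0},
    "D3": {"math_2019": 80.0, "math_2021": 79.0},
    "D4": {"math_2019": 75.0, "math_2021": None},  # missing post-pandemic score
}
attendance = {"D1": 0.88, "D2": 0.92, "D3": 0.97, "D4": 0.90}


def integrate(scores, attendance):
    """Join the two sources on district id and derive learning loss."""
    rows = []
    for district, s in scores.items():
        pre, post = s["math_2019"], s["math_2021"]
        # Assumed definition: loss = pre-pandemic score - post-pandemic score.
        loss = None if pre is None or post is None else pre - post
        rows.append({
            "district": district,
            "loss": loss,
            "attendance": attendance.get(district),
        })
    return rows


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


rows = integrate(scores, attendance)
# Keep only rows where both values are present (a simplification).
complete = [r for r in rows
            if r["loss"] is not None and r["attendance"] is not None]
r_value = pearson([r["attendance"] for r in complete],
                  [r["loss"] for r in complete])
print(f"attendance vs. learning loss: r = {r_value:.2f}")
```

A negative `r_value` on real data would be consistent with an inverse association between attendance and learning loss, although a correlation on its own says nothing about causation.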

18 Apr 2024: Submitted to TechRxiv

24 Apr 2024: Published in TechRxiv
