ITEM RESPONSE THEORY ANALYSIS OF THE EARLY CHILDHOOD ENVIRONMENT RATING SCALE – THIRD EDITION (ECERS-3)
Discipline: Psychology
Abstract
The Early Childhood Environment Rating Scale – Third Edition (ECERS-3) is an observational scale designed to measure quality in early childhood education (ECE) centers. It follows the original ECERS and the ECERS-Revised (ECERS-R), both of which have seen widespread use in research and accountability. The ECERS-3 is gaining traction in the ECE field, being used in research as well as in accountability systems across 22 states. The ECERS-3 comprises 35 items grouped into six predefined subscales. Scores on the items and subscales can be used to identify areas of relative strength and weakness to inform quality improvement planning, and the average score across all items is purported to describe global classroom quality. Validity evidence for the ECERS-3 is sparse, and two published studies did not find support for the global classroom quality construct or for all of the predefined subscales. These analyses, however, have methodological limitations; namely, they used methods that did not account for the categorical nature of the items. Further, neither study examined the functioning of the ECERS-3 item rating scales, even though study of the ECERS-R revealed problems with the empirical ordering of item scores that call their validity into question. This study re-analyzed data from a previous analysis of the ECERS-3 (Early et al., 2018). The internal structure of the ECERS-3 was re-examined using methods appropriate for ordinal data, with the aim of exploring whether the global classroom quality construct manifested as a second-order factor. The generalized partial credit model was applied to the full set of items to study the functioning of the item rating scales. Results suggested that the ECERS-3 comprised two factors (Learning Activities/Materials and Teacher Practices/Interactions), but a second-order factor could not be tested because the minimum of three first-order factors was not met.
Analysis of the item rating scales showed evidence of problematic functioning for all items, typically involving more than one score category. Follow-up analyses suggested the issue was related to item construction and the scoring procedure. Limitations and implications are discussed.
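For readers unfamiliar with the model named above, the generalized partial credit model (Muraki, 1992) gives the probability that a rater assigns score category k on item i as a function of the latent trait. The sketch below uses standard notation (discrimination a_i, step parameters b_iv, trait level theta), which is not defined in the abstract itself:

```latex
% Generalized partial credit model (Muraki, 1992), standard form.
% P(X_i = k | theta): probability of score category k on item i,
% given trait level theta, discrimination a_i, and step parameters b_{iv}
% (with b_{i0} defined as 0 for identification).
P(X_i = k \mid \theta)
  = \frac{\exp\!\left[\sum_{v=0}^{k} a_i\,(\theta - b_{iv})\right]}
         {\sum_{c=0}^{m_i} \exp\!\left[\sum_{v=0}^{c} a_i\,(\theta - b_{iv})\right]},
\qquad k = 0, 1, \ldots, m_i .
```

Disordered step parameters (b_{iv} values that do not increase with v) are one common signal of the kind of problematic category functioning the abstract describes.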