Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)

Graduate Group


First Advisor

Ryan S. Baker


The purpose of this dissertation was to develop and use a platform that facilitates Massive Open Online Course (MOOC) replication research. Replication and the verification of previously published findings are essential to the scientific process. Unfortunately, a replication crisis has long plagued scientific research, affecting even the field of education. As a result, the validity of more and more published findings is coming into question. Research on MOOCs has not been exempt from this. Due to a number of limiting technical barriers, the MOOC literature suffers from issues such as contradictory findings between published works and the unconscious skewing of results caused by overfitting to single datasets. The MOOC Replication Framework (MORF) was developed to allow researchers to bypass these technical barriers. Researchers are able to design their own MOOC analyses and have MORF conduct them across its massive store of MOOC data. The first study in this dissertation describes the work that went into building the platform that would eventually turn into MORF, and reports a feasibility study investigating whether the platform was able to perform the tasks it was built for. This was done through the replication of previously published findings within a single dataset. The second study describes the initial architecture of MORF and sought to demonstrate the platform’s feasibility for conducting replication research at scale. This was done through the execution of a large-scale replication study against data from an entire university’s roster of MOOCs. Finally, the third study highlighted how MORF’s architecture allows for the execution of more than just replication studies. This was done through the execution of a novel research study that sought to analyze the generalizability of predictive models of completion between the countries present in MORF’s expansive dataset, an important issue to address given the massive enrollment numbers of MOOCs from all around the world.