Current Projects


hmm-scalable

Post-doctoral project
Role: designer & developer
Funding: Pittsburgh Science of Learning Center

Hmm-scalable is a toolkit for building hidden Markov models (HMMs) at scale. It was originally developed to work with Bayesian Knowledge Tracing (BKT) models, which are frequently used in the area of Intelligent Tutoring Systems (ITS). It accepts a simple tab-separated text file and has been successfully tested on datasets as large as one hundred million records (see Yudelson, et al., 2014). Hmm-scalable can fit and cross-validate models using a number of algorithms, including expectation-maximization (EM), stochastic gradient descent, and conjugate gradient descent. In addition to the standard BKT model, hmm-scalable can fit individualized models with factors accounting for student variance (see Yudelson, et al., 2013) as well as parameter-multiplexing HMMs. (more)
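For readers new to BKT, the quantity that fitting algorithms such as EM or gradient descent try to maximize is the likelihood of the observed answers. The sketch below computes that log-likelihood for one student and one skill with a scaled forward pass over a two-state HMM. It is an illustration only, not hmm-scalable's code or interface: the function name, the argument names, the 0/1 coding of answers, and the no-forgetting assumption are all choices made for this example.

```python
import math


def bkt_log_likelihood(observations, p_init, p_learn, p_slip, p_guess):
    """Log-likelihood of a 0/1 answer sequence under a two-state BKT HMM.

    Assumes the standard BKT setup with no forgetting and all four
    parameters strictly between 0 and 1 (hypothetical helper).
    """
    # State 0 = skill not mastered, state 1 = skill mastered.
    prior = [1.0 - p_init, p_init]
    # Emission probabilities indexed by observation, then by state.
    emit = {
        1: [p_guess, 1.0 - p_slip],   # correct answer
        0: [1.0 - p_guess, p_slip],   # incorrect answer
    }
    log_lik = 0.0
    for obs in observations:
        # Unnormalized forward variables for this step.
        alpha = [prior[s] * emit[obs][s] for s in (0, 1)]
        # The normalizer is P(obs | all earlier observations).
        norm = alpha[0] + alpha[1]
        log_lik += math.log(norm)
        filtered = [a / norm for a in alpha]
        # Learning transition: unmastered -> mastered with p_learn.
        prior = [
            filtered[0] * (1.0 - p_learn),
            filtered[1] + filtered[0] * p_learn,
        ]
    return log_lik
```

Calling `bkt_log_likelihood([1, 1], 0.4, 0.2, 0.1, 0.2)` scores two correct answers in a row; a fitting routine would search for the parameter values that make such scores as high as possible across all students.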

Quick BKT

Side/fun project
Role: designer & developer

Quick BKT is a quick helper interface for tracing, debugging, and developing Bayesian Knowledge Tracing models. It computes skill mastery (latent), performance (observation), and the forward and backward variables. (more)
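As a rough picture of what tracing a BKT model involves, here is a minimal sketch of the textbook update: predict performance from current mastery, condition mastery on the observed answer, then apply the learning transition. It is not Quick BKT's code; `BKTParams`, `trace`, and the 1/0 coding of correct/incorrect answers are hypothetical names and conventions for this example, and the model assumes no forgetting and parameters strictly between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    p_init: float   # P(L0): skill already mastered before practice
    p_learn: float  # P(T): moving from unmastered to mastered at a step
    p_slip: float   # P(S): wrong answer despite mastery
    p_guess: float  # P(G): right answer without mastery


def trace(params, observations):
    """Return one (predicted P(correct), updated mastery) pair per step."""
    p_mastery = params.p_init
    rows = []
    for correct in observations:
        # Predicted performance before the answer is seen.
        p_correct = (p_mastery * (1 - params.p_slip)
                     + (1 - p_mastery) * params.p_guess)
        # Bayes update of mastery given the observed answer.
        if correct:
            posterior = p_mastery * (1 - params.p_slip) / p_correct
        else:
            posterior = p_mastery * params.p_slip / (1 - p_correct)
        # Learning transition (no forgetting in this sketch).
        p_mastery = posterior + (1 - posterior) * params.p_learn
        rows.append((p_correct, p_mastery))
    return rows
```

For example, `trace(BKTParams(0.4, 0.2, 0.1, 0.2), [1, 0, 1])` yields one row per answer, which makes it easy to see how each correct or incorrect step moves the mastery estimate.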

Automated Student Modeling in a Programming MOOC

Collaborative project
Role: data analyst, co-advisor 

This is a joint investigation of a Big Data set focusing on student learning in a programming MOOC (Java). The data is collected at the University of Helsinki and consists of student work on assigned problems in a modified IDE.

We focus on student behaviors and learning. Behaviors are patterns in how students edit the code of their problem solutions. We use automatically parsed programming constructs to model student learning. The overarching goal is to use traces of behavior and learning to build assistive technology that helps students learn more efficiently. (more)

Methods: Big Data, Clustering Algorithms, AWK Script, Shell Script, Logistic Regression Analysis, Regularization, User Modeling, Spectral Clustering, Hierarchical Linear Models, LIBLINEAR, LIBLINEAR-mixed-models

Publications: Yudelson, et al., 2014b; Hosseini, et al., 2017

Closing the Big Loop

Research project
Role: lead data analyst

Carnegie Learning, Inc. has accumulated a large body of data on student learning in Cognitive Tutor Algebra®, on the order of hundreds of schools, tens of thousands of students, and tens of millions of student problem-solving steps. This project focuses on harnessing this Big Data to set up a close-the-loop workflow for iterative improvement of the Cognitive Tutor®. The improvement is driven by using machine learning to tune the parameters of the Bayesian cognitive model that tracks students' progress and controls problem selection.

First results (Yudelson, et al., 2014a) show that only a subset of the Big Data (the Better Data) needs to be used to drive the optimization. A preliminary investigation of the effects of Bayesian model parameter optimization for college-level math courses shows that students could save up to 37% of the time they spend in Cognitive Tutor® (in preparation).

Past Projects


PERSEUS

Doctoral dissertation project
Role: designer & developer

PERSEUS is a Personalization Service Engine. It provides adaptive support for non-personalized (educational) hypermedia systems by abstracting content presentation/aggregation from user modeling. PERSEUS protocols are based on RDF and RSS 1.0. Although PERSEUS was initially developed for the ADAPT2 framework, its data model permits seamless support of any other hypermedia application. Currently, PERSEUS provides social navigation, topic-based navigation, concept-based navigation, and adaptive filtering techniques. (more)

Knowledge Tree

Graduate research project
Role: designer & developer

Knowledge Tree is a link aggregating portal. It presents content structured according to the folder-document paradigm. Knowledge Tree provides authentication and authorization and implements a simplified form of access control. It supports collaborative authoring and social annotation. (more)


CUMULATE

Graduate research project
Role: designer & developer

CUMULATE is a centralized user modeling server built for the ADAPT2 architecture. It is mainly targeted at providing user modeling support for adaptive educational hypermedia (AEH) systems. CUMULATE maintains a set of overlay models of students' knowledge. It uses several techniques for computing student models, including thresholded averaging, asymptotic user knowledge assessment, and time-spent-reading. (more)


ADAPT2

Graduate research project
Role: co-designer & co-developer

ADAPT2 (read adapt-square), the Advanced Distributed Architecture for Personalized Teaching and Training, is a framework that provides personalization and adaptation services for developers of content that lacks personalization. ADAPT2 consists of the following principal components: a portal, a centralized user modeling server, adaptive content navigation shells, content delivery servers, and a personalized service engine (more).


SlideTutor

Graduate research project
Role: data analyst

SlideTutor is a medical educational system that provides a virtual apprenticeship. It is a simulated environment for learning accurate pathologic diagnosis and reporting. SlideTutor can be used by pathology residents, fellows, and practicing pathologists, and provides virtual slides that can be examined just like cases under a microscope. It monitors the user's work and steps in with helpful explanations when a mistake is made. SlideTutor keeps track of the user's learning and adapts its interaction to fit specific educational needs (more).

My involvement in the project resulted in a publication that discussed an approach to comparing cognitive models tracking student progress in SlideTutor using a combination of point and curve metrics (Yudelson, et al., 2008).

Michael V. Yudelson © 2017
