Data Preparation

Initial Loading of Data

The R script sets a seed for reproducibility, loads a dataset into val, and renames a column for clarity. It converts val into a data table for efficient manipulation, then generates both unstratified and student-stratified cross-validation folds for model evaluation. For each trial, the script calculates the time in seconds from a baseline date, orders the data by student ID and time, and creates a binary response variable from the outcome. Finally, it computes activity durations and applies a function that derives the time-based features needed to model time effects. Together, these steps prepare historical educational data for analyzing learning patterns and predicting outcomes.
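The preprocessing steps described above can be sketched in base R roughly as follows. This is a minimal illustration, not the script itself: the toy data, the column names (Anon.Student.Id, Time, Outcome, Problem.Name), the seed value, the number of folds, and the new column names (KC, fold_row, fold_student, time_sec, ansbin) are all assumptions made for the example. The sketch also assumes that "student-stratified" means every row of a given student lands in the same fold, and that the baseline date is the Unix epoch. It uses a plain data frame rather than a data table so that it runs without extra packages, and it omits the duration step.

```r
set.seed(41)  # assumed seed; any fixed value makes the fold draws reproducible

# Toy stand-in for the raw log (hypothetical values and column names).
val <- data.frame(
  Anon.Student.Id = c("s2", "s1", "s1", "s2", "s1", "s3"),
  Time = c("2020-01-01 10:05:00", "2020-01-01 10:00:30",
           "2020-01-01 10:00:00", "2020-01-01 10:04:00",
           "2020-01-01 10:02:00", "2020-01-01 11:00:00"),
  Outcome = c("CORRECT", "INCORRECT", "CORRECT", "HINT", "CORRECT", "INCORRECT"),
  Problem.Name = c("p1", "p1", "p2", "p2", "p1", "p1"),
  stringsAsFactors = FALSE
)

# Rename a column for clarity (the new name "KC" is an assumption).
names(val)[names(val) == "Problem.Name"] <- "KC"

# Unstratified folds: each row is assigned to a fold independently.
n_folds <- 3
val$fold_row <- sample(seq_len(n_folds), nrow(val), replace = TRUE)

# Student-level folds: shuffle the students, deal them into folds,
# then give every row the fold of its student.
students <- sample(unique(val$Anon.Student.Id))
student_fold <- rep(seq_len(n_folds), length.out = length(students))
val$fold_student <- student_fold[match(val$Anon.Student.Id, students)]

# Trial time in seconds since 1970-01-01 00:00:00 UTC (assumed baseline).
val$time_sec <- as.numeric(
  as.POSIXct(val$Time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
)

# Order by student, then by time within student.
val <- val[order(val$Anon.Student.Id, val$time_sec), ]

# Binary response: 1 for correct, 0 for incorrect; other outcomes are dropped.
outcome <- tolower(val$Outcome)
val$ansbin <- ifelse(outcome == "correct", 1,
                     ifelse(outcome == "incorrect", 0, NA))
val <- val[!is.na(val$ansbin), ]
```

Keeping both fold columns allows two kinds of evaluation: row-level folds test prediction for students already seen in training, while student-level folds test generalization to students the model has never seen.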


The computeSpacingPredictors function in the Logistic Knowledge Tracing (LKT) package calculates time-based features from student interaction data, focusing on the intervals between interactions. These features capture the temporal dynamics of learning and serve as predictors in the logistic regression models that LKT fits.

Key Inputs:
  • data: A dataset with student interaction logs including timestamps.
  • KCs: A list of knowledge components for which spacing features are calculated.

Key Steps:
  • Extract Timing Data: The function identifies timestamps or relative event times.
  • Calculate Intervals: For each student and KC, it computes the time elapsed since the student's previous interaction with that KC.
  • Feature Engineering: It generates features such as immediate prior spacing, average spacing, and relative and absolute trial times.
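
To make the interval calculation concrete, here is a small base R sketch of two such features: the spacing since the previous interaction and a running mean of the spacings seen so far. It does not reproduce the internals of computeSpacingPredictors. The data frame events, its columns (student, kc, time_sec), the helper running_mean, and the output column names (spacing, mean_spacing) are hypothetical, and the exact definitions LKT uses for its features may differ.

```r
# Toy interaction log (hypothetical): one row per student-KC interaction,
# with the event time already expressed in seconds.
events <- data.frame(
  student  = c("s1", "s1", "s1", "s2", "s2", "s1"),
  kc       = c("add", "add", "sub", "add", "add", "add"),
  time_sec = c(0, 30, 45, 10, 100, 120),
  stringsAsFactors = FALSE
)

# Intervals are only meaningful when rows are in time order within student.
events <- events[order(events$student, events$time_sec), ]

# One group per observed student-KC pair.
grp <- interaction(events$student, events$kc, drop = TRUE)

# Spacing: seconds since the same student's previous interaction with the
# same KC. The first interaction in each group has no predecessor, so NA.
events$spacing <- ave(events$time_sec, grp,
                      FUN = function(t) c(NA, diff(t)))

# Running mean of the spacings observed so far within each group
# (NA until at least one spacing exists).
running_mean <- function(s) {
  seen  <- !is.na(s)
  total <- cumsum(ifelse(seen, s, 0))
  count <- cumsum(seen)
  ifelse(count > 0, total / count, NA)
}
events$mean_spacing <- ave(events$spacing, grp, FUN = running_mean)
```

In this toy log, student s1 practices the "add" KC at 0, 30, and 120 seconds, so the spacings for those rows are NA, 30, and 90, and the running mean is NA, 30, and 60.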