Developing Procrastination Feedback for Student Software Developers

This is a brief overview of the following research papers by myself, Steve Edwards, and Cliff Shaffer:

I am summarising these papers together because they are closely related.

Summary

When students worked earlier and more often, they produced projects that:

  • were more correct,
  • were completed earlier, and
  • took neither more nor less time to complete.

So working earlier and more often doesn’t seem to give students more time to complete projects, just more constructive time.

Motivation

Observing the development process

As students work on their programming projects, our tooling records fine-grained development events:

  • executions
  • compilations
  • file saves
  • line-level edits

We use these data to capture, characterise, and determine the effectiveness of the software development process undertaken by students. Doing so involves ingesting a fairly large volume of data and distilling it into an objective measurement of some aspect of the programming process (in this case, procrastination).
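To make this concrete, here is a minimal sketch of how such an event stream might be represented. The record type and field names are assumptions for illustration, not the actual schema used by our tooling.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# Hypothetical record for one captured IDE event; the type and field names
# are illustrative assumptions, not the actual schema used in our studies.
@dataclass
class DevEvent:
    student_id: str
    assignment: str
    event_type: str       # e.g. "edit", "compile", "file_save", "execution"
    timestamp: datetime
    chars_changed: int    # size of a line-level edit (0 for non-edit events)

def edit_events(events: List[DevEvent]) -> List[DevEvent]:
    """Keep only the line-level edit events, the basis for the edit-time measure."""
    return [e for e in events if e.event_type == "edit"]
```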

When do students work on software projects?

As an example, consider the figure below, which shows how a real student distributed their work across the days on which they worked on a project.

[Figure 1: A bar chart showing the amount of work a student put in on each day from August 28 to September 14. The mean edit time for this student, drawn from real data.]

The red line on September 14 indicates the project deadline, and the black line on September 8 indicates the student’s “mean edit time”, which is 6 days before the deadline. A sizeable portion of the work was done between September 1 and September 8, and daily work was much higher during the last three days of the project lifecycle. This places the mean edit time roughly in the middle of those two periods. The student’s score is therefore sensitive not only to the days on which work was done, but also to the amount of work that was done on those days. Since this is simply a mean of edit times, it can be computed over solution code, test code, or both.

We might also have measured the median edit time (i.e., the day by which half the work on a project had been done). However, we opted for the mean since it is more sensitive to outliers, which matter when measuring procrastination (e.g., large amounts of code being written toward the end of a project timeline).
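As a rough sketch of the idea (not the implementation used in the papers), building on the hypothetical DevEvent record above, the mean edit time can be computed as a work-weighted average of how far before the deadline each edit occurred; weighting by characters changed is an assumption for illustration.

```python
from datetime import datetime
from typing import List

def mean_edit_time_days(events: List[DevEvent], deadline: datetime) -> float:
    """Work-weighted mean edit time, in days before the deadline.

    Each edit contributes its distance from the deadline, weighted by how much
    work it represents (here, characters changed -- an illustrative assumption).
    """
    edits = [e for e in events if e.event_type == "edit" and e.chars_changed > 0]
    total_work = sum(e.chars_changed for e in edits)
    if total_work == 0:
        return 0.0  # no edits recorded for this project
    weighted_days = sum(
        e.chars_changed * (deadline - e.timestamp).total_seconds() / 86400
        for e in edits
    )
    return weighted_days / total_work
```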

The figure below shows the distributions of mean edit times for solution code and for test code, across all project implementations.

[Figure 2: Box-and-whisker plots of the mean edit times for solution code and test code. On average, students tended to write code fewer than 10 days before the project deadline.]

Figure 2 tells us that students tended to work rather close to the deadline, even though they were given about 30 days to work on projects. Similar distributions of mean times (in days before the deadline) were observed for solution code (μ=8.48, σ=6.44), test code (μ=7.78, σ=7.04), program executions (μ=8.86, σ=8.82), and test executions (μ=7.09, σ=7.10). Test editing and test launching tend to occur slightly closer to the project deadline, but this difference appears to be negligible.
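For reference, distribution summaries like the μ and σ values above could be tabulated from per-project mean times with something like the following sketch; the column names and values are made-up placeholders.

```python
import pandas as pd

# One row per (project, event kind): that project's mean event time in days
# before the deadline. The values here are made-up placeholders.
mean_times = pd.DataFrame({
    "kind": ["solution_edit", "test_edit", "program_run", "test_run"] * 2,
    "days_before_deadline": [8.2, 7.5, 9.1, 6.8, 9.0, 8.1, 8.6, 7.3],
})

# Per-kind distribution summary, analogous to the mu/sigma values reported above.
print(mean_times.groupby("kind")["days_before_deadline"].agg(["mean", "std"]))
```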

How valid is our measurement?

In general, students felt that our measurements were accurate. Additionally, students believed that feedback driven by a measure such as this could help them stay on track on future programming projects. They stated unconditionally that they would make more of an effort to improve their programming practice if they were given feedback about their process between assignments.

Can this measurement explain differences in project outcomes?

We examined three outcome variables for each project:

  • Project correctness, measured as the percentage of instructor-written reference tests passed,
  • Time of completion, measured as the number of hours before the deadline the project was completed, and
  • Total time spent, measured by adding up the lengths of all work sessions spent on the project.

We used within-subjects comparisons to make inferences, allowing us to control for traits unique to individual students. Comparisons between different students would be weaker, since differences in behaviour and outcomes could be symptoms of some other unknown factor (e.g., differing course loads or prior experience). To test for relationships with the outcome variables, we used an ANCOVA with repeated measures for each student: students were subjects, and assignments served as repeated measures (with unequal variances), allowing within-subjects comparisons. In other words, each student’s software development habits were measured repeatedly (once per assignment), and differences in outcomes for the same student were analysed.
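The full models are described in the papers; as a rough stand-in, the sketch below fits a linear mixed-effects model with a random intercept per student (via statsmodels), which captures the same within-subjects idea but is not the exact repeated-measures ANCOVA we used. The file and column names are assumptions for illustration.

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per (student, assignment) project implementation. The file and
# column names are illustrative: "correctness" is the percentage of reference
# tests passed, and "mean_edit_time" is in days before the deadline.
projects = pd.read_csv("projects.csv")

# A random intercept per student approximates the within-subjects control:
# each project is compared against that student's own baseline.
model = smf.mixedlm(
    "correctness ~ mean_edit_time",
    data=projects,
    groups=projects["student_id"],
)
result = model.fit()
print(result.summary())
```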

Results are summarised below.

When students worked earlier and more often, they tended to produce programs with higher correctness. To illustrate this, we split the dataset roughly in half: projects that had “solved” the assigned problem (53%), and those that had not (47%). The figure below shows the difference in mean edit times between these two populations.

[Figure 3: Box-and-whisker plots comparing solution mean edit times between projects that correctly solved the assignment and those that did not.]
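A descriptive version of the comparison in the figure above might look like the following sketch; the “solved” criterion (passing 100% of reference tests) and the column names are assumptions for illustration.

```python
import pandas as pd

projects = pd.read_csv("projects.csv")  # hypothetical input, as above

# Treat a project as "solved" if it passed all instructor reference tests
# (an assumption for illustration; the papers define the split on correctness).
projects["solved"] = projects["correctness"] >= 100.0

# Compare the distribution of mean edit times across the two groups.
print(projects.groupby("solved")["mean_edit_time"].describe())
```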

When students worked earlier and more often, they tended to finish their projects earlier. This is so intuitive that it’s almost tautological, but it is encouraging that the measurement is able to discriminate between early and late project submissions.

[Figure 4: Box-and-whisker plots comparing solution mean edit times between on-time and late submissions.]

Finally, there was no relationship between the total amount of time spent on a project and the solution mean edit time.

Final Remarks

I’m an Assistant Professor in Computer Science and Software Engineering at Cal Poly. I summarise my research in computing education and software engineering.