Tuesday, February 25, 2014

The Validity of the CS222 Final Project

Grant Wiggins' most recent blog post shares a practical test of validity for assignments. When he was asked whether a given assignment was valid, he posed two questions in response:
  • Could the students do a great job on the task, but not meet the goal of the assessment?
  • Could the students do a poor job on the task but still provide lots of evidence that they can otherwise meet the goal of the assessment?
If the answer to either is "yes," then the assessment is likely invalid for its goal.

This made me think about the final project in CS222. (I used to call it "the six-week project," but since expanding it to nine weeks, I have not found a catchy name for it. "The nine-week project" doesn't have the same ring to it.) Briefly, students work in teams of three or four to pitch an original software development project, and this project becomes their focus for most of the semester. In theory, this project is supposed to serve as a context for students to explore course content, such as refactoring, Clean Code, design patterns, and human-centered design. These elements are emphasized in the rubric by which the three project milestones are evaluated. However, despite the rubric's being public, most of the teams inevitably miss points across all elements of the project, in all three milestones! Having the project delivered in three milestones is, in part, scaffolding: the feedback on the first two milestones is primarily formative. Could the problem be that the project is not valid for my goals?

Let's look at this semester. My course description is public, and the department's syllabus specifies the following student learning outcomes for the course:
Upon completion of this course, a student will be able to:

  1. Explain and apply pair programming
  2. Explain and apply test-driven development
  3. Use the interactive debugger in an integrated development environment
  4. Apply code review techniques to identify design and implementation defects
  5. Apply principles of object-oriented design to create and analyze domain models
  6. Explain and apply model-view separation
  7. Apply user-centered philosophy to design simple graphical user interfaces, justify usability considerations, and implement the user-interface in a high-level programming language.
  8. Plan, manage, present, and evaluate multi-week programming projects (including software feature estimation and milestone reporting) created by small teams
  9. Demonstrate reflective practice for professional improvement
The final project is designed to permit students to demonstrate mastery in all of these outcomes. However, since Pair Programming is required in a two-week warm-up project, it is not mandated in the final project, and so outcome #1 can be dropped from discussion. The final project description shows the rubric for milestone presentations, and the milestone evaluation rubric—which is available to the students via Blackboard—breaks down as follows:
  • Clean Code, structural principles: 6 points
  • Clean Code, process principles: 6 points
  • Version control: 3 points
  • Project configuration: 3 points
The Clean Code principles address objectives #2, #5, and #6. I use in-class activities to enforce #3 and #4, and the milestone presentations have to address #7. Clean Code together with version control and project configuration address #8, and if students haven't hit #9 through reflective essays, they certainly encounter this in the final exam. 

With the facts established, we return to Wiggins' Socratic questions.

Could the students do a great job on the task, but not meet the goal of the assessment?

Answering this requires consideration of what "the task" is. I have discovered that teams will frequently focus only on the functional correctness of the application: given their first for-credit opportunity to make anything, they want to invest energy into making that thing work! Students seem to perceive the task to be "make this thing work," despite my explicit and repeated admonition to follow the requirements. It frustrates me that students don't perceive the relationship between the two: they don't see the "content" of CS222 as something that can help them make their projects work, but rather as isolated knowledge. That is, their collective action betrays a devaluation of these ideas.

It seems that teams try to do a great job on the task they have defined, and some indeed do, but this comes at the cost of doing a poor job on the task I have defined, which, of course, is the task that is evaluated by the rubric. Hence, students cannot do a great job on the task, as I have defined it, without also meeting the goal of the assessment as measured by the rubric.

Could the students do a poor job on the task but still provide lots of evidence that they can otherwise meet the goal of the assessment?

I have no existence proof that this can happen. In about three years of teaching this course, I have never had a team show me well-factored, well-designed code that did not also meet their own functional requirements. Could it happen? I suspect not. All the tools and techniques that we discuss in CS222 are included specifically because they help people make software, and teams that have used these techniques also make good software. If such a team ever encountered a case where they could not meet the self-imposed functional requirements, they could simply modify these requirements as part of a milestone presentation re-estimation.

Conclusions

This analysis shows that the project is valid for its objectives, but student actions are not aligned with the explicit goals of the project. This leads to two questions: Why do students, who are normally grade-driven, reject the framing of the project? How do I encourage them toward more fruitful paths?

I received useful feedback through course evaluations about an earlier, structural course problem: I used to talk about requirements analysis tools while the teams were in their first iterations. Since switching to the achievements-based approach last Fall, I have been more careful to introduce CRC cards, risk matrices, and user stories before the first iteration. Now, I can lean on teams to use these tools to drive their early efforts. Unfortunately, there is still little evidence that they do: when left to their own devices, the teams devolve into undisciplined ad hoc development. That is, they use hope-driven development, a term I have coined for the process where you write a bunch of code, turn it in, and hope you get points. I have tried explaining to teams that this is foolish when the rubric is published before the assignment is even given, but somewhere, there's a breakdown.

I will note here that I don't require one particular approach in any iteration. That is, I don't force teams to use CRC cards, risk matrices, or user stories. If they do, they earn an achievement, which helps earn a high grade in the class, but I intentionally give them the option to fail. I just wish fewer students took that option. I have considered moving back to a model where I institute formal assignments and require students to try these tools, but I would rather do something more studio-based. Take CRC cards, for example: right now, we spend a day on them, each team building up CRC cards for the two-week project, and then we talk about similarities and differences among these. This is a good use of a class period, but few students ever touch the CRC cards again. We could spend another class period having each team do a CRC analysis of their final projects and share these. I won't know if this helps until I try it, and at this point, that means next Fall.
