Steven Volk, November 8, 2015

Seth Anderson, “Violate This Parking Dibb,” CC
Clark Kerr, the former Chancellor of the University of California-Berkeley, in one of his many flashes of wit and wisdom, once observed, “I have sometimes thought of the modern university as a series of individual faculty entrepreneurs held together by a common grievance over parking.” The same could probably be said about grading. If there is one thing that we agree upon as faculty, it is an aversion to grading. I am no longer surprised when colleagues tell me that not having to grade papers is what finally convinced them to hang up their spurs and retire.
And it’s not just faculty who complain: grading is an equal opportunity grievance. Grading is not particularly high up on the students’ ten-best list of what they like about a college education. They certainly press us to explain what, precisely, it would take to turn a B+ into an A-. The “why-can’t-they-be-like-we-were” lobby outside of the academy sees grade inflation as an indication of how faculty have caved to student demands. The media portray us as spineless, liberal wimps for not doling out a hefty portion of C’s and D’s. And deans? Well, perhaps the day has passed when they attempted, subtly or not, to make a point by sending around a memo illustrating how the grades we gave compared with others in our department, division, and college.
Besides the fact that grading, if taken seriously, takes a huge amount of time, we struggle with it because, outside of multiple-choice exams, grading is almost always more subjective than we’re comfortable with, and certainly more subjective than students expect it to be. What, precisely, would a student have to do to turn a B+ into an A-? That’s a fair question to be asking, but, even if we grade with a rubric, we are still making fine shades of distinction that the starkness of the B+ simply can’t capture.

Image credit: John Tenniel’s illustration for Lewis Carroll’s poem “The Walrus and the Carpenter” (Wikipedia, Public Domain)
Nor does this take into account the fact that we’re humans, not machines. Are you a bit more generous with the first paper you read in the morning because you’re fresh and nicely caffeinated? Or are you a bit less charitable…because you’re fresh and nicely caffeinated? Our eyes cross by the 15th paper, and sometimes that will mean the student gets the A- (dear lord, anything to get me finished with this mountainous stack), and sometimes the B+ (don’t you have anything original to say? I’ve read the same idea 14 times already). It may all even out in the end, but try explaining that to the concerned student sitting across the desk from you.
Is There a Fairer Way?
We, of course, are a college that gives final grades, and even if we resist that by giving out Hampshire-College-style narratives rather than grades throughout the semester, we’re going to face the grading dilemma at the end of the semester. Is there any way to make that process fairer if not better?
David Gooblar, in his always-interesting “Pedagogy Unbound” column, recently looked at one aspect of fairness in grading: removing the “halo effect,” a specific kind of confirmation bias where favorable impressions in one area carry over to others. Those who worry about fairness in grading might wonder if, without their conscious knowledge, they are grading students higher if they did well on earlier work. B+ or A-? She got an “A” on her first two papers, so A- it is. Some would argue that we should be grading students’ work “blind,” with randomly assigned numbers replacing the names – much like an orchestra audition where the candidate performs from behind a screen. That way we won’t know what grades “33” or “7” got on their prior work.
There is some literature on “blind” grading and its impact on the halo effect as well as whether knowing the gender or race of the student influences grading, but it is hardly definitive. In a 2013 article in Teaching of Psychology, John M. Malouff, Ashley J. Emmerton and Nicola S. Schutte reported on a study of 126 instructors who were randomly assigned to grade a student giving a poor oral presentation or the same student giving a good oral presentation. All graders then assessed an unrelated piece of written work by the student. As hypothesized, the graders assigned significantly higher scores to written work following the better oral presentation.
On the other hand, studies looking for correlations between gender and grading have not been able to find much evidence to bolster their case (see here and here). Similar studies conducted in medical school also suggested that there was no widespread gender or racial bias in the grading of freshman medical students, although one can fairly question whether the same results would have obtained if the context were not medical school but rather a sophomore writing class or an 8th grade geometry setting. (And, to be clear, race has been shown to be a significant factor in the outcomes of standardized testing as bias is often built directly into the test.)
As Gooblar also points out, much is lost when we don’t know whose paper we are reading. Not only does it make it harder for students to come talk to you as they are working on their papers, but you have no way of knowing whether a particular student has made progress in a specific area you identified in some earlier work. And while much of our feedback is paper-specific, much is also person-specific, geared to issues that you have been discussing with the student.

“Skilled and unskilled laborers taking the TVA examination at the highschool building, Clinton, Tennessee.” – NARA – 532813. Wikimedia public domain
Is There a Better Way?
Certainly. I would recommend bringing a good 18-year-old single malt scotch with you when you sit down to grade. Well, maybe not for those first papers in the morning. But, seriously, there are some things to keep in mind. What are we looking for in any grading system? The following 15 criteria are taken from Linda B. Nilson, Specifications Grading: Restoring Rigor, Motivating Students, and Saving Faculty Time (Stylus Publishing, 2015). Nilson directs the Office of Teaching Effectiveness and Innovation (OTEI) at Clemson University. Grading systems, for Nilson, should embody the following criteria, although she will argue that they rarely do.
- Uphold high academic standards.
- Reflect student learning outcomes.
- Motivate students to learn.
- Motivate students to excel.
- Discourage cheating.
- Reduce student stress.
- Make students feel responsible for their grades.
- Minimize conflict between faculty and students.
- Save faculty time.
- Give students feedback they will use.
- Make expectations clear.
- Foster higher-order cognitive development and creativity.
- Assess authentically.
- Have high interrater agreement.
- Be simple.
When reading the list the first time, I thought: Right, and why not add “bring peace to the Middle East” to the list? They all seem impossible tasks. But Nilson’s book sets out an argument for an alternative which, even if I’m not fully convinced at the end of the day, is worth a look. Nilson argues in favor of what she calls “specifications grading,” a type of grading that is similar to “contract grading” already implemented at Oberlin by a number of faculty. The basic idea is that in the class syllabus the instructor discusses precisely what students must do to get a particular grade, and that they can decide on this basis what specific grade, from an A to a D, they will be attempting to earn. Nilson discussed three central aspects of this grading system in an interview with Robert Talbert of the “Casting Out Nines” blog a year ago.
First, you grade all assignments and tests satisfactory/unsatisfactory, pass/fail, where you set “pass” at B or better work. Students earn full credit or no credit depending on whether their work meets the specs that you laid out for it. No partial credit. Think of the specs as a one-level, one-dimensional rubric, as simple as “completeness” – for instance, all the questions are answered or all the problems attempted in good faith, or the work satisfies the assignments (follows the directions) and meets a required length. Or the specs may be more complex – a description of, for example, the characteristics of a good literature review or the contents of each section of a proposal. You must write the specs very carefully and clearly. They must describe exactly what features in the work you are going to look for. You might include that the work be submitted on time. For the students, it’s all or nothing. No sliding by. No blowing off the directions. No betting on partial credit for sloppy, last-minute work.
Second, specs grading adds “second chances” and flexibility with a token system. Students start the course with 1, 2, or 3 virtual tokens that they can exchange to revise an unsatisfactory assignment or test or get a 24-hour extension on an assignment. […]
Third, specs grading trades in the point system for “bundles” of assignments and tests associated with final course grades. Students choose the grade they want to earn. To get above an F, they must complete all the assignments and tests in a given bundle at a satisfactory level. For higher grades, they complete bundles of more work, more challenging work, or both. In addition, each bundle marks the achievement of certain learning outcomes. The book contains many variations on bundles from real courses.
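The three mechanics Nilson describes (all-or-nothing marking, tokens, and bundles) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Nilson's own scheme: the assignment names, the contents of each bundle, the default of two tokens, and the names `BUNDLES` and `Student` are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical bundles: a grade is earned only if every item in its
# set has been completed at a satisfactory level.
BUNDLES = {
    "D": {"quiz1", "essay1"},
    "C": {"quiz1", "essay1", "quiz2"},
    "B": {"quiz1", "essay1", "quiz2", "essay2"},
    "A": {"quiz1", "essay1", "quiz2", "essay2", "project"},
}


@dataclass
class Student:
    tokens: int = 2  # assumed starting allowance; the quoted range is 1 to 3
    passed: set = field(default_factory=set)

    def record(self, assignment: str, meets_specs: bool) -> None:
        """Full credit or no credit: the work meets the specs or it does not."""
        if meets_specs:
            self.passed.add(assignment)

    def revise(self, assignment: str, meets_specs: bool) -> bool:
        """Spend one token on a resubmission; return False if none remain."""
        if self.tokens == 0:
            return False
        self.tokens -= 1
        self.record(assignment, meets_specs)
        return True

    def final_grade(self) -> str:
        """Return the highest grade whose whole bundle is satisfied, else F."""
        for grade in ("A", "B", "C", "D"):
            if BUNDLES[grade] <= self.passed:
                return grade
        return "F"
```

A student who passes the first quiz but misses the specs on the first essay sits at an F until a token is spent on a successful revision. Nothing in the sketch awards partial credit, and that is the point of the approach.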
Specs grading, according to Nilson, assumes that there is no reason why students shouldn’t be able to achieve the outcome(s) the specs describe. Specifications are basically directions on how to produce B-level-or-better work or the parameters within which students create a product. If students don’t understand them, they have to ask questions.
In a workshop for the faculty at the University of Pittsburgh, Nilson explained that the key to specifications grading is shifting the burden from grading to developing clear explanations for desired outcomes. Under specifications grading, students will (hopefully) fully understand faculty expectations, and the research confirms that when students know what faculty are looking for, they are more likely to do better.
Further, specifications grading can also shift the way that faculty think about their students. Students taking a course in a non-major field “might decide they have better things to do this semester [than spend most of their time with that course]. What grade they wind up with says nothing about their capabilities, to me. It might say something about their time schedule,” Nilson added.
I’m not fully convinced that specifications grading is for me (and the faculty at the Pittsburgh workshop raised just the sort of questions I would have). But the points Nilson raises – the value of explaining clearly what students should be doing in an assignment, the determination of high standards, the placing of more responsibility in students’ hands – all of these points encourage me to study her approach further.
And those of you who use some form of “contract” or alternative grading? What has been your experience?