How I Learned to Stop Worrying and Love Grade Inflation

It’s my first year teaching high school, and I’m handing back 10th grade English papers for the first time. This kid named Benny gets his paper back and rifles through it to check out the grade on the last page. He did OK: he got a B. Most of the kids got grades in the B-range. A few did better. A few did worse. It’s a distribution of grades familiar to most high school teachers. Benny then asks me, “Can I do a rewrite?” I look him right in the eye and say, “No.” You see: I can’t let him rewrite it. I can’t let any of the students rewrite their papers. Do you know why? Because they’re likely to produce better papers that deserve higher grades. And I can’t let that happen because I’m terrified of being accused of grade inflation.

I’m worried that the administration is going to take one look at my grade book and think I’m a pushover because I give out too many A’s and not enough B’s or C’s. (This actually happened once during my first year of teaching. The 9th grade Dean saw my grade book, pointed to a student who received an A and said, “Vanessa is not an A student.” True story.) I desperately want to be seen as a “legit teacher” with high academic standards and integrity. So I say no to Benny because at this point in my career I’m actually more worried about accusations of grade inflation than I am about his learning.

To understand my response to Benny, you need to know that there was a good dose of cultural anxiety about grade inflation when I began teaching high school 10 years ago. Op-Ed pieces in the Washington Post and New York Times were complaining about the rising percentage of students receiving A’s at American colleges and universities.

Educational thinkers and researchers were speculating back then that grade inflation was bad news. Some thought grade inflation was the inevitable impact of consumerism on education. Students were getting more A’s because they insisted on higher grades in exchange for higher tuition payments. Alternatively, grade inflation could be the product of increased emphasis on student feedback in teacher evaluation. Teachers eager to solidify their employment status sought to ingratiate themselves with students in an informal exchange for positive student feedback on their teaching.

Other educational writers tried to suggest that grade inflation was actually good news because grade inflation could simply be the product of students getting smarter and working harder. It turns out that most subsequent studies have concluded that students are generally demonstrating lower academic achievement than in the past, and spending 13 fewer hours per week in class and on homework than their peers forty years before. It looked like grade inflation was nothing but bad news for an American educational system that some fear is failing to prepare students for an increasingly competitive world.

My story is one way in which grade inflation could be good news. There’s one explanation of grade inflation that few researchers consider, and even fewer take seriously. Grade inflation could be the effect of better teaching — better teaching that can happen precisely when teachers stop worrying about grade inflation, and focus more on helping all students learn and grow as much as possible. What I see now is that my anxiety about grade inflation prevented me from being the best teacher I can be.

Let’s start by thinking about the concept of “grade inflation.” What’s the desired state of affairs that has been inflated?

Well, consider this: In 2004, Princeton University responded to its concerns about grade inflation by encouraging academic departments to limit the percentage of A’s and A-’s its professors handed out to no more than 35% of total grades given to undergraduates. A recent report by an internal committee at Princeton recommended abandoning the policy on the grounds that it has contributed to undergraduate stress, but let’s think about the deeper assumptions behind Princeton’s idea. Princeton’s policy assumed that in a given population of students, no more than 35% of them would legitimately attain the level of mastery associated with A-level work. One could assume that the majority of the grades would probably be B-level grades, and a few on the lower end would be C-level grades.

Worries about grade inflation often seem to rely on the assumption that the goal of good teaching is to reveal the normal distribution or “bell-curve” of ability that lies latent in any student population. We are thus skeptical that a teacher with integrity and high academic standards would testify that more than a few of his or her students have attained the highest levels of mastery.

But where’d we get this assumption? In the 19th century, the incredibly influential Belgian mathematician Adolphe Quetelet developed his concept of the “average man,” a fictional man that embodies the mean values of a normal distribution of human traits. Quetelet, for instance, could show how the relationship between crime, gender, age, and employment status followed a normal distribution pattern in society. But Quetelet’s success in explaining certain social phenomena may have spilled over into other fields where it may not belong. For instance, what reason do we have to think that Quetelet’s “social physics,” developed almost 200 years ago to explain crime, marriage, or suicide rates, should be applied to today’s classroom, where the aim is not to measure facts but to cultivate human potential?

Without critical awareness, the bell-curve assumption can have a powerful and, I think, problematic effect on how we teach. When we assume that educational outcomes should fit the bell-curve with only a minority of students achieving mastery, we tend to teach in a way that will help only a minority of students do exceptional work. In the process, we leave an equal number of students behind doing unimpressive work, and the majority of our students settle for mediocrity. When teachers have the bell-curve in mind as a mark of their success, they tend to design their lessons so that the majority of students will not have the opportunity to push past their initial limitations. The final irony is that in pursuit of teaching excellence, some of us unwittingly prepare our students not to learn.

Here’s how it worked for me when I worried about grade inflation: My teaching was more about sorting my students according to the bell curve than helping them all attain mastery. The students would write a paper, and I’d grade it with the expectation that I’d see a few A’s, mostly B’s, and a few C’s. I didn’t offer my students the opportunity to rewrite papers on the basis of my feedback because I knew that if they did they’d likely screw up my bell-curve, and the bell curve was the indicator of my legitimacy as a teacher. I passed this off as a kind of rigor — “Hey, I’m a tough teacher!” — but the truth is that I was prioritizing an irrational prejudice regarding the expected distribution of grades over real and sustained opportunities for my students to continue learning and improving their skills.

I don’t mean to suggest that all concerns about grade inflation are silly. We should have rigorous academic standards, and apply them fairly and consistently. But grade inflation isn’t necessarily a sign of the educational apocalypse. It could just be that we’re finding more effective ways to teach each and every kid, and letting them set the limit to their progress — not some dead, Belgian guy. Today I think the purpose of great teaching isn’t to reveal the bell-curve in a classroom, but to shatter it.

So now my students rewrite papers and submit test corrections whenever doing so is likely to deepen and reinforce their learning and skill development. And you know what? I give out more A’s than I used to, and a whole lot fewer C’s, but I sleep better because I believe that my grade book is not the product of a loss of integrity or lack of rigor. I’ve just decided to care more about learning than the bell curve. We need to be careful about what we expect to find in any group of people because sometimes that’s all we’ll end up seeing.

Recovery Options and the World of Tomorrow

So I’m sitting in a department meeting with the upper school science department. They’re a bunch of go-getters because they somehow managed to go as a team to this multi-day conference in New York about standards-based assessment (SBA) in science education. That means measuring student achievement on the basis of clear, transparent and fine-grained standards of knowledge and skill. The pure distillation of the approach even eschews report cards with letter grades as a convenient but ultimately unhelpful way to document student learning. SBA teachers build rubrics (graphic organizers of learning standards), often with explicit levels of proficiency, and then use them as the basis of student assessment. Anyway, what makes the science team so impressive is that they went to this conference and then were determined to immediately implement an SBA approach to assessment in a school community that tends to get pretty nervous about any modifications to the transcript that will pave the way to college and university.

One of the usual features of an SBA approach is to offer chances for students to retake assessments to demonstrate competence on skills and knowledge they initially struggle with. Sometimes teachers give these chances the cheeky name “recovery options.” As the science department starts to grapple with the rationale and logistical challenges of offering recovery options to their students, worries emerge, and I confess they resonate with me. First, what are we saying to kids about personal responsibility by offering them recovery options? Second, does a system that provides multiple chances to show what you know and can do coddle kids to the point that we’re no longer preparing them to excel in an increasingly fast-paced and competitive workplace?

What are we saying to kids about personal responsibility when we offer them recovery options? On one hand, we’re perhaps saying that a kid doesn’t have to prepare too rigorously for an assessment because if they blow it they’ll have a chance to take it again. But here’s the thing: They’ll have to take it again. So what does that say about personal responsibility? I think one thing kids could understand is that they can choose to blow off preparing for an assessment if they want, but if they want to do well they’ll eventually have to choose to prepare to do their best. Without recovery options, students know that the teacher is saying, “Hey, this is your moment to show what you can do. If you blow it, too bad.” Traditionally, kids are motivated largely by the fear and anxiety of having a single opportunity to demonstrate ability. In this usual approach, the one I knew growing up, I experienced the motivation to prepare for an assessment as external. It was the teacher saying this is your one shot, so you better study hard. I didn’t always feel that the motivation was coming from my own desire, but rather from the anxiety I experienced when the teacher imposed on me the one time I could show what I could do.

With recovery options, it’s of course possible that students could blow off the initial assessment and do quite poorly. Of course, students can also do the same in the traditional system, and then that’s the end of the story. With recovery options, the student who does poorly because they’re struggling with either comprehension or motivation will be faced with the consequences of their limitations and then challenged to do something about them. They now have the opportunity either to seek additional help to understand the material or to discuss their motivational challenges with a teacher, parents, friends, or professionals. In terms of personal responsibility, there might be more of an opportunity for growth of this important disposition in a system with recovery options than without. At least it doesn’t seem to me to undermine personal responsibility to say to a kid, “Yeah, you can blow this test off if you want. But the request for you to show what you know isn’t going to recede into the misty past. You’ll have to wallow in the consequences of your decision because recovery options keep knocking on your door to say you can do better. Are you going to take responsibility to try?”

Regarding the second concern about schools abrogating their duty to prepare young people to excel in the workplace, I suppose the real question is the relationship between SBA models with recovery options and the emerging conditions of the 21st century workplace. There are probably jobs in which it’s essential that one get it right the first time, and only quick learners who thrive in high-stakes situations that don’t offer remedies for failure need apply. If a Martian were to come down to earth and we asked the alien to infer the conditions of the 21st century workplace on the basis of traditional grade-based assessment and no recovery options, the Martian might conclude that the vast majority of 21st century jobs involve singular, high-stakes pitches to clients in a competitive and fast-moving corporate environment. My wife’s friend, Emily, is a marketing executive in the pharmaceutical industry. She works under tight time constraints, flies to pitch meetings with potential clients, and gets one shot to win their business. Emily would probably do great in a traditional assessment model without recovery options.

But what about the backend workers at her firm after she wins the account? I’m talking about the creative types who are building visuals and copy in support of a marketing strategy. Do they need to get it right the first time? No. What they need to excel is a different disposition: the ability to throw out ideas, take critical feedback, and then try again to get it right. What I’m trying to say is that we need an educational system that has a more inclusive conception of the personal strengths and dispositions that are valued in a range of jobs, and not just a narrow slice. An SBA model that provides rich feedback to guide subsequent effort, along with multiple opportunities to integrate that feedback into improved performance, is exactly the sort of schooling that will prepare young people for these kinds of jobs. The Emilys of the world, who suffer from no deficit of intrinsic motivation, learn quickly and well, and thrive in high-stakes performance settings, don’t need rich feedback and recovery options as much as their future co-workers, who thrive more on rich feedback than on simple performance rankings (like grades) and need more time to get it right.

I think what this shows is that an assessment model that values a certain set of personal dispositions and cognitive abilities is often devaluing an alternate set that is not clearly less valuable in the grand scheme of things. Because while SBA and recovery options may seem indulgent or wrong-headed to those who value jobs that reward quick learning and high-stakes performance, they are exactly the sort of conditions conducive to jobs that reward the careful integration of feedback from multiple sources and the dispositional persistence necessary to stick with the long and frustrating twists and turns that characterize creative production.

On the Subjectivity of Feedback (and Grading)

It doesn’t take much experience handing back student work before a teacher is confronted with a student who looks at the feedback and protests that “grading is subjective.”

The complaint is a common one but also worthy of examination because it helps us clarify both the fairest and most effective way to evaluate student work and the necessary and appropriate role of teacher judgment in that evaluation.

Let’s start by analyzing what could be meant when a student complains that a teacher’s assessment of his or her work is “subjective.” In the most pejorative sense it could mean that a teacher’s judgment is capricious, unprincipled, or worse, an expression of the teacher’s personal regard (or lack thereof) for the student. If that’s true of the teacher’s evaluation, then we all need to admit something seriously wrong is at play, and the vast apparatus of how schools evaluate student achievement is potentially irrational and biased.

But there’s another sense in which judgment may be “subjective,” and it’s the one that should earn our respect or at least equanimity. When we say a judgment is “subjective” the next question should be “subject to what?” As above, if it’s subject to no consistent, clear standards or perhaps based on personal regard, then the student’s complaint hits its mark. But what if the teacher’s “subjective” judgment is subject to his or her interpretation of the relationship between the student work and a clear, shared, accessible, even “objective” standard of excellence, what teachers like to call a “rubric”? There is still very much a “subjective” judgment involved on the part of the teacher, but this judgment differs ethically and epistemologically in important ways from that of the teacher who judges student work without the aid of an explicit and shared rubric.

To illustrate this difference let’s think about the job of a home plate umpire in a baseball game. The umpire’s job is to stand behind home plate and call balls and strikes. The criterion for what counts as a ball or strike is “objective” because it is based on a publicly accessible and known definition of the strike zone. Any ball that is thrown over the width of the plate and above the knees and below the armpits of the hitter counts as a strike. Any ball thrown outside this zone is a ball. So if balls and strikes are defined against an “objective” standard, then what’s with all the baseball managers who run out of the dugouts fuming with disgust at the umpire’s calls? Well, it turns out that a shared, objective standard of a strike zone does not release the umpire from using his or her “subjective” judgment as to whether the thrown baseball actually counts as a ball or strike. In other words, the presence of an objective standard is still consistent with the need for “subjective” judgments about how a particular event or artifact (in this case, a thrown baseball) interacts with that objective standard. The old chestnut that no human being is infallible is undeniable, but boy is there a difference between an umpire who once in a while calls a strike a ball and one who never defines in a clear and consistent way the strike zone that the pitcher must aim for in the first place!

Back to school: The above example illustrates the distinction between two different complaints about “subjective” grading or assessment. The first is about a teacher who makes “subjective” judgments without making clear to students the objective basis for those judgments. Evaluation becomes a mystery to the student, and the student’s ability to improve becomes more a matter of telepathy and luck. But there’s the other kind of teacher, who makes “subjective” judgments against a clear, public, accessible standard of excellence. Of course, the teacher isn’t a robot, nor is he or she infallible. The judgment about where a student’s work falls on the rubric is still his or her own, but it is a judgment continuously honed by experience and by the unpleasant but often beneficial effects of having to defend it to colleagues, students, parents, and administrators. More importantly, the student can articulate his or her areas of growth and improvement because the teacher has made judgments against a rubric that can both explain previous achievement and provide a guide to future excellence. This is the importance of what we call “standards-based assessment.” It involves the prior dissemination and comprehension of rubrics, the assessment of student work in the explicit terms of the rubric, and the thoughtful application of the teacher’s judgment in guiding the student to continuous improvement against an “objective” standard of academic excellence. So yeah, the student is right that teacher judgment is always, only “subjective,” but it’s ultimately the difference between the umpire who sometimes misses a call and the umpire who keeps changing up the strike zone for each pitch.

How to Tell if Students Are Making the Grade

How can Jennifer C. Braceras argue that a single letter grade could provide a more “comprehensible and holistic measure of achievement” than a competence-based transcript?

Published in The Wall Street Journal, Feb. 6, 2018, 4:08 p.m. ET

It is difficult to understand how Jennifer C. Braceras could argue that a single letter grade could provide a more “comprehensible and holistic measure of achievement” than a competence-based transcript that maps student achievement onto clear, consistent and detailed standards (“The War on Grades Deserves to Fail,” op-ed, Jan. 30). Nor is her claim that only traditional letter grades can take into account “effort, ability to meet deadlines, or level of engagement” very persuasive. Students who meet rigorous academic standards in a competence-based system don’t do so by happenstance. They must work just as hard, responsibly and with as much passion as their counterparts mired in a traditional grading system that reduces the complexity of human achievement to a single, unexplained letter.

But anyone who has seen the increasingly homogenized transcripts coming out of American high schools with their predictable mix of A’s and B’s has to wonder how a system that provides less information about students and schools is advantageous to high-school graduates coming from schools with fewer resources and connections. In an increasingly competitive college environment in which too many accomplished students look identical on paper, the undeserved advantages of affluence, legacy and social capital are more likely to tip the scale among college admissions officials than in a world in which colleges can make principled admissions decisions informed by rich, detailed information and institutional context about student learning.

Jed Silverstein, Ph.D.
Latin School of Chicago, Chicago
