Only software that can write an essay should be allowed to grade one

One of the many controversies roiling education at the moment is the argument over grading essays with software programs. A nonprofit organization that is a joint venture between Harvard and the Massachusetts Institute of Technology recently began making automated essay-grading software free to any educational institution that wants to use it.

Some think it’s a great idea and even postulate that it will help students improve their writing abilities, as they can retake the same essay test multiple times until the software gives them an A.  Opponents wonder if software could ever capture the nuances of good writing.

I tend to side with those opposed to machine grading of essays. I spend a lot of time in my job editing the copy of other professional writers. The difference between good and bad writing is often subtle. Sometimes merely placing the last sentence of a paragraph first creates the structure needed to understand the point. Or breaking a grammatically correct complex sentence into several simpler sentences can create a process or chain of reasoning where only a simple list existed before.

Writing expository prose entails a complicated balancing of grammar, syntax, cultural allusions, context, subtext, rhythms and emotions. No one has yet produced software to check spelling that is without numerous, and sometimes notorious, glitches. For example, Word’s spell-checker insists that a company is a living and animate thing by correcting “the company that” to “the company who.” In the same wise, it also turns people into things, always automatically changing “the person who” into “the person that.”

There are literally hundreds of these simple syntactical and grammatical mistakes, such as confusing “that” and “who,” which people make in their writing all the time: “the company and their employees” instead of “the company and its employees”; “the animals comprise the zoo” instead of “the zoo comprises animals” (the famous example from Strunk & White); using “anxious” when you mean “eager” and “jealous” when you mean “envious.” When I was a college instructor grading essays, these minor mistakes turned an A into a B. In the real world of an advertising and public relations firm, they are the difference between getting a promotion and getting fired.

But beyond the simple mechanics of proper English, there are many nuances that go into good writing. Let’s start with the issue of appropriate language, which essentially explores when it’s okay to break the rules. “Ain’t” ain’t good, but in some formal essays it works if it creates a moment of cleverness or allows the writer to allude to an idiomatic expression or famous quote.

Will a machine understand a cultural reference, be it an allusion to a song, a novel, a quote or a famous person? What will software think of a reference to rap or a clever circumlocution lifted from an 18th-century novel? Will the software know when the allusion is inaccurate?

And what of tone? Will a software program recognize sarcasm, irony, empathy, anger or other sub-textual emotions that propel the best essay writing? Will it notice when the author changes tone and will it recognize when the change is appropriate and when it is merely sloppy writing?

There is also the issue of the passive construction. The passive is grammatically correct, but it makes for uninteresting and dull writing, as it tends to turn everything into a state of being. Some examples: The film was seen by the students. The building was destroyed by the earthquake. As I sometimes roar out when I see a lot of passives in a piece I’m editing, “Is, is, is! Nothing but is!” Will the software know that writing in the passive makes for a much duller essay than writing in the active voice: The students saw the film. The earthquake destroyed the building. And just as important, will it recognize the relatively small number of occasions when the passive is appropriate, e.g., when the writer wants to avoid attribution or to describe an actual state of being?

Genre is also an issue in editing. Each genre and subgenre has its own unique rules and formulae, some of which are hard-and-fast (a sonnet must have 14 lines) and some of which are quite flexible (the second or third paragraph of a news release should be a quote). For example, when writing on legal or engineering issues for other lawyers or engineers, the use of the passive is not such a “no-no.” Will the software be able to distinguish between a news story, news feature, news release about news, news release about non-news, OpEd piece, legal précis, creative non-fiction, five-part essay and scientific paper?

Finally, there is the issue of logic. Often a sentence or paragraph looks reasonable enough until you read it a second time and see it is completely senseless.  Many sentences with if/then clauses look right because both clauses are true, but are illogical since there is no causal relationship between the two. One can’t go more than a few days reading the news media without seeing an opinion piece in which the author writes factually for many paragraphs and then draws a completely false conclusion. Will the computer pick up on logical flaws?

Years ago, Alan Turing proposed a test to determine when a computer has acquired the ability to think as humans do: when it can hold a conversation with someone and the person thinks he or she is speaking with another human being.

I am proposing a similar test: let’s allow software to grade student essays only when it can write an essay itself. I even have the topic about which the software must be able to write. I take it from an example of the nuances in language that the Russian filmmaker Sergei Eisenstein gives in his theoretical essay, “Through Theatre to Cinema,” as translated by Jay Leyda. In Leyda’s translation, Eisenstein writes (and note the passive construction), “How easily three shades of meaning can be distinguished in writing—for example: ‘a window without light,’ ‘a dark window,’ and ‘an unlit window.’”

When a software program can write an essay that explains the distinctions among these three descriptions, I will be willing to let it grade essays.

