Software That Grades Handwritten Essays May Boost Comprehension, Too

January 14, 2008

Computer scientists in the University at Buffalo's School of Engineering and Applied Sciences have been working with their colleagues in UB's Graduate School of Education to develop a computational tool that not only dramatically reduces the time it takes to grade children's handwritten essays, but that also may help boost students' reading comprehension skills.

The software has special relevance to the school systems and teachers involved in administering the standardized English Language Arts exams that are given every year, usually in January, by public school systems in every state. This month, every New York school district will administer these assessments to its students in grades three to eight.

The National Science Foundation recently awarded the UB researchers a $100,000 grant to develop new algorithms that could eventually allow computers to take over the grading of children's handwritten essays.

The UB team's preliminary results with the software are scheduled for publication in the February/March issue of Artificial Intelligence. The paper was published earlier in the online version of the journal.

"It surprised us that we were able to do as well as we did, especially since this was our first attempt," said Sargur N. Srihari, Ph.D., SUNY Distinguished Professor in the UB Department of Computer Science and Engineering and principal investigator on the project.

The project focused on handwritten essays obtained from eighth graders in the Buffalo Public Schools who responded to this question from a New York State English Language Arts exam: "How was Martha Washington's role as First Lady different from that of Eleanor Roosevelt?"

Three hundred of the essays were scored by human examiners and used as a "gold standard" against which 96 computer-scored essays were judged.

Essays were graded on a scale of 0-6, with six being the highest score.

In 70 percent of cases, the UB researchers reported, the computer program assigned scores within one point of those assigned by human examiners.
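
A "within one point" figure of this kind is often called adjacent agreement. The short Python sketch below shows how such a figure can be computed. It is only an illustration: the function name `adjacent_agreement` and the ten sample scores are made up for the example and are not taken from the UB study.

```python
def adjacent_agreement(human_scores, machine_scores, tolerance=1):
    """Return the fraction of essays whose machine score lies within
    `tolerance` points of the human score."""
    if len(human_scores) != len(machine_scores):
        raise ValueError("score lists must have the same length")
    if not human_scores:
        raise ValueError("need at least one scored essay")
    hits = sum(
        abs(h - m) <= tolerance
        for h, m in zip(human_scores, machine_scores)
    )
    return hits / len(human_scores)


# Made-up scores on the 0-6 scale, for illustration only.
human = [4, 2, 5, 3, 6, 1, 0, 3, 4, 2]
machine = [3, 2, 3, 3, 5, 3, 1, 6, 4, 2]

print(adjacent_agreement(human, machine))
```

With these invented scores, seven of the ten machine scores fall within one point of the human score. Setting `tolerance=0` would instead measure exact agreement.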

The UB research tackles two significant artificial intelligence problems, said Srihari, director of UB's Center of Excellence in Document Analysis and Recognition (CEDAR), the world's largest research center devoted to developing new technologies that can recognize and read handwriting.

"We wanted to see whether automated handwriting recognition capabilities can be used to read children's handwriting, which is essentially uncharted territory," he said. "Then we took it one step further to see if we could get computers to score these essays like human examiners."

In the pilot study, the essays were first scanned into a computer. Each line of text was broken down into individual words. In this step, the system's goal was word recognition, which it accomplished using contextual information from the rest of the sample, the answer rubric and the question.

Once the majority of words were recognized, the essay was turned into a digital text file.
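
One simple way to picture how context can help word recognition is to correct a recognizer's uncertain guesses against a vocabulary drawn from the question and the rubric. The Python sketch below does this with the standard library's `difflib`. It is an assumption-laden illustration, not the CEDAR system: the helper names `build_lexicon` and `snap_to_lexicon`, the similarity cutoff and the rubric snippet are all hypothetical.

```python
import difflib
import re


def build_lexicon(*texts):
    """Collect lowercase words (3+ letters) from the question, rubric, etc."""
    words = set()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if len(word) >= 3:
                words.add(word)
    return sorted(words)


def snap_to_lexicon(raw_word, lexicon, cutoff=0.7):
    """Replace a noisy recognizer guess with the closest lexicon word,
    or return the guess unchanged when nothing is similar enough."""
    matches = difflib.get_close_matches(
        raw_word.lower(), lexicon, n=1, cutoff=cutoff
    )
    return matches[0] if matches else raw_word


question = ("How was Martha Washington's role as First Lady "
            "different from that of Eleanor Roosevelt?")
# Hypothetical rubric text, invented for this example.
rubric = "Compares the public activities of the two women."

lexicon = build_lexicon(question, rubric)

# Pretend these came out of a handwriting recognizer with errors.
noisy_guesses = ["Washingtcn", "Rosevelt", "xyzzy"]
cleaned = [snap_to_lexicon(word, lexicon) for word in noisy_guesses]
print(cleaned)
```

A guess that resembles nothing in the lexicon, such as the last one here, is left as is rather than forced onto an unrelated word.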

For the automated scoring step, the UB researchers used an artificial neural network approach.

"In this method, the system 'learns' from a set of answers that were scored already by humans, associating different values or scores with different features in the essays," explained Srihari.
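
The idea of learning from already-scored answers can be sketched with a single linear unit trained by gradient descent, a far simpler stand-in for the neural network the researchers describe. Everything in the Python example below is an assumption made for illustration: the two features (a scaled count of rubric keywords and a scaled essay length), the four training examples and the function names are not from the UB paper.

```python
def train_linear_unit(samples, targets, epochs=1000, learning_rate=0.1):
    """Fit weights and a bias so that bias + w . x approximates the
    human score, using plain stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, target in zip(samples, targets):
            prediction = bias + sum(w * x for w, x in zip(weights, features))
            error = prediction - target
            # Move each weight against the gradient of the squared error.
            weights = [
                w - learning_rate * error * x
                for w, x in zip(weights, features)
            ]
            bias -= learning_rate * error
    return weights, bias


def predict_score(features, weights, bias, low=0, high=6):
    """Predict a score and clamp it to the 0-6 grading scale."""
    raw = bias + sum(w * x for w, x in zip(weights, features))
    return max(low, min(high, round(raw)))


# Hypothetical features per essay: [keyword coverage, relative length],
# each scaled to the range 0..1. Human scores are invented.
training_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
human_scores = [1, 4, 3, 6]

weights, bias = train_linear_unit(training_features, human_scores)

# Score a new, unseen essay described by the same two features.
new_essay = [0.4, 0.5]
print(predict_score(new_essay, weights, bias))
```

A real scoring network would use many more features and hidden layers, but the training loop follows the same pattern: compare the prediction with the human score and adjust the weights to shrink the difference.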

Computational tools designed to evaluate essays that are typed, not handwritten, already exist, Srihari explained.

"But these are all based on electronic text that the test-taker types in, using a computer keyboard," he said. "In this case, we are working toward developing a computational tool to read and evaluate the many thousands of handwritten essays written by schoolchildren as part of statewide mandated reading comprehension tests."

The sheer speed with which the program works -- literally seconds per essay -- is the most obvious advantage, the UB researchers said.

Handwritten essays are an important part of every standardized reading comprehension test given in every state. But because grading all of those handwritten essays is such a huge task requiring many hours of work by human examiners, students who take the exam in January do not find out how they did until almost the end of the spring semester.

"Judging this quantity of handwritten essays is very laborious," said Srihari. "It would be nice to automate this process so perhaps students could take the test in May, having received more instruction, and then have the results in June."

And while some teachers may be wary of computers' ability to properly grade essays, James L. Collins, Ed.D., professor in the UB Department of Learning and Instruction and a co-investigator, is quite confident.

He noted that while human examiners might still be necessary for grading on very specific criteria, the majority of evaluations could probably be done just as well by computers.

"Computational linguistics has made great leaps over the past decade and it turns out that for judging the overall quality of a paper, computers are indeed as reliable as human graders," Collins said.

That's an important development, he said, because writing practice and feedback from readers are the key aspects of learning to write at every grade level.

"The problem is, 'How do teachers respond helpfully to all of the writing produced by their students?'" he said. "Right now, teachers spend a lot of time getting their students ready for these standardized tests, then the students take the exam and get their scores back months later. With computer scoring, students could get back their scores much faster at a time when the results can still be addressed. The assessment scores wouldn't just be going into a 'black hole.'"

The software program developed at UB was 'trained' to evaluate essays based on six specific writing traits: ideas, organization, word choice, sentence structure, voice and conventions like spelling, usage and punctuation.

Collins said that the software now under development could be used as an important teaching tool.

"We envision a program where a student would handwrite an essay, scan it into the computer, which would then 'read' it and analyze it for the specific traits we trained it to evaluate," he said.

That feedback would be available immediately to both teacher and student as a typed essay that has been analyzed for the six traits, allowing for more fruitful lessons on how to edit and revise, Collins said.

The software program also provides new opportunities for education researchers like Collins, who is working with colleagues at UB on a three-year, $1.5 million project called Writing Intensive Reading Comprehension funded by the Institute of Education Sciences at the U.S. Department of Education. The study involves more than 2,000 fourth and fifth graders in 10 low-performing urban schools. So far, Collins said, the results show that students can improve their reading abilities significantly through the use of assisted writing.

"Once a handwritten essay has been 'read' by a computer, we can ask the computer to look for certain features of the writing so that we can spot general patterns and discover what kids are having trouble with," Collins continued.

Co-authors on the Artificial Intelligence paper with Srihari and Collins are Janina Brutt-Griffler, Ed.D., associate professor in the UB Department of Learning and Instruction; Rohini Srihari, Ph.D., professor of computer science and engineering at UB; Harish Srinivasan, a doctoral candidate at CEDAR, and Shravya Shetty, a former graduate student at CEDAR, now employed by Google.

Source: University at Buffalo

barakn
Jan 14, 2008

Rank: 3.4 / 5 (5)
This is truly sad. The student that comes up with a unique but valid point in an essay will get dinged because the neural network doesn't recognize it. These students will be trained to sound like each other. Another victory against independent thinking, variety, and uniqueness.
KB6
Jan 15, 2008

Rank: 5 / 5 (1)
Now the students just need to come up with an algorithm that turns this on its head and generates winning essay answers. The schools would then modify their machines to spot the generated answers. The students would then respond by modifying their essay generator - and the AI arms race is on!
patnclaire
Jan 16, 2008

Rank: not rated yet
barakn, the grading process would have to include a human-eye look at thrown-out essays. That is how Quality Engineers would design the process.
KB6, I think that your speculation is possible. Is it probable? Why not link it with software like Turnitin to do plagiarism checks for term papers, articles, and dissertations?
HeRoze
Jan 18, 2008

Rank: not rated yet
Has anyone tried crafting a resume that has appropriate keywords to pop up on the computer scans? This may work in a similar fashion, and I worry that KB6 is on the right track (not because it is unusual for KB6 or anything, but the ramifications of the students trying to beat the system).