What is the sound of one person talking?

May 2nd, 2006 Sample of Buckeye Corpus

This screenshot shows a typical display found on the DVD of the Buckeye Speech Corpus. Shown are two visual representations of a phrase spoken by one of the talkers, indicating volume and frequency. Below the displays is a line with a transcription of what the speaker is saying. Then comes a line of “phonemic labeling,” which shows how each word spoken was pronounced – similar to the pronunciation guides found in a dictionary.

When researchers interviewed 40 Columbus residents about their opinions on life in the city, the scientists ignored what the people had to say. All the scientists really cared about was how they said it.

In a lab at Ohio State University, researchers recorded 30- to 60-minute casual interviews with these residents, who talked about how Columbus has changed over the years, how families should get along, and issues in sports, traffic and politics.

The participants were told that the purpose of the study was to learn how people express "everyday" opinions in conversation. But when the interviews were over, the scientists didn't even examine the views that were expressed.

Instead, they painstakingly listened to every spoken word – indeed every syllable – along with coughs, laughs and pauses in the conversation, and then labeled what actually was spoken.

The result is a 306,652-word repository of how people in central Ohio speak. Now scientists from around the world, and in a variety of disciplines, can use the collection – called the Buckeye Speech Corpus – to advance their research. (Researchers refer to a collected body of recordings as a corpus.)

"A critical part of communication is how your ears translate the sounds coming out of my mouth into recognizable words," said Mark Pitt, one of the leaders of the project, and professor of psychology at Ohio State.

"You don't have to go to school to learn how to do it. It is fast and efficient. The scientific question is 'how is this done?' The corpus will help researchers find answers to that question and many others."

Other members of this large research team include Eric Fosler-Lussier, assistant professor of computer science and engineering, and Elizabeth Hume, professor of linguistics, both at Ohio State.

The corpus, available to researchers on DVD, includes the actual recordings of the interviews, a written transcript, and several levels of labeling. The first half of the corpus was officially released in March. A typical display of a stretch of speech from the DVD shows two visual representations of speech from one of the talkers, indicating volume and frequency. Below the displays is a line with a transcription of what the speaker is saying. Then comes a line of "phonemic labeling," which shows how each word spoken was pronounced – similar to the pronunciation guides found in a dictionary.

This phonemic labeling is a key part of the corpus, because in casual conversations, people don't always use the proper English pronunciations they were taught in school, said Laura Dilley, a post-doctoral researcher in psychology who works on the corpus.

"It's interesting because of the many kinds of modifications that happen to the acoustics of speech in conversations," Dilley said. "For example, we may pronounce 'don't you' as 'don'tcha' when we are talking.

"By looking at the phonemic transcriptions we can tell exactly how a person pronounced a word. And from that scientists may be able to make some inferences about why people produce the sound one way as opposed to another."

The interviews are also labeled to show when people made non-word sounds during a conversation, such as a cough.

"Even laughs are noted, because some researchers may be interested in how that is part of a conversation," Pitt said. "It can convey meaning as well."

The 40 residents were interviewed in 1999 and 2000. But it has taken until this year to label the speech of half of the speakers, Pitt said. The data from the remaining 20 participants should be available by the end of the year.

"Collecting the interviews took a long time. But what really takes a long time is labeling them. Researchers had to listen to each conversation and figure out what sounds of the language the speakers said or didn't say. It's very difficult and very time consuming," he said.

When you actually sit down to label each word, "they can blend together and be very difficult to identify," Pitt said. "Sometimes you scratch your head trying to figure out how to label the speech. But when you're just listening to it as part of the conversation, it is all perfectly intelligible. That's part of the mystery of communication that scientists are trying to figure out."

A combination of factors makes the Buckeye Speech Corpus a unique resource for researchers, according to Pitt and Dilley. For one, it is one of the largest corpora of high-fidelity, conversational speech available. Other corpora involve speakers reading words directly from a text, but that is very different from conversation. There are also corpora of conversational speech recorded over the telephone, but those lack both the high fidelity of this corpus and the dynamic of face-to-face conversation.

Another key difference is that the Buckeye Corpus is available free to researchers, both in academia and industry.

Although the Buckeye Corpus has been available for only a few weeks, researchers from around the world are already ordering copies of it. A researcher from Italy ordered the corpus to study "slips of the tongue," such as when people accidentally substitute one word for another that sounds similar (saying "rabbit" instead of "habit").

Pitt and Dilley expect a lot more interest in the corpus.

"Across a variety of fields – in communication, speech sciences, linguistics, speech and hearing, computer science, psychology – the corpus will provide different uses," Pitt said. "It was created for all these scientific communities."

Computer scientists could use the corpus to help improve speech recognition software. Communication researchers will find it useful to study conversational dynamics. Psychologists will be interested in what listeners are faced with in terms of the physical and acoustic properties of speech.

Eventually, the corpus will have a search function that will make it even more useful. "If a researcher wanted to look up all the uses or pronunciations of a certain word in the interviews, he or she could do that," Dilley said.
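The kind of lookup Dilley describes can be pictured with a small sketch. Everything in it is an assumption made for illustration: the word and phone-label pairs are invented, the function names are hypothetical, and the layout is not the format used on the corpus DVD. It only shows the idea of grouping each word's recorded pronunciations and counting how often each variant occurs.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (word as transcribed, phone labels as actually spoken).
# These pairs are invented for illustration and are not taken from the corpus.
LABELED_TOKENS = [
    ("because", "b ih k ah z"),
    ("because", "k ah z"),
    ("probably", "p r aa b l iy"),
    ("because", "k ah z"),
    ("probably", "p r aa b ah b l iy"),
]


def build_index(tokens):
    """Map each lowercased word to a Counter of its pronunciation variants."""
    index = defaultdict(Counter)
    for word, phones in tokens:
        index[word.lower()][phones] += 1
    return index


def pronunciations_of(word, index):
    """Return (pronunciation, count) pairs, most frequent first."""
    # .get avoids creating an empty entry for words that never occurred.
    return index.get(word.lower(), Counter()).most_common()


INDEX = build_index(LABELED_TOKENS)
print(pronunciations_of("because", INDEX))
```

In this toy sample the reduced form of "because" turns up more often than the full one, which is the sort of pattern a researcher might look for in casual speech.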

The work of transcribing and labeling the interviews allowed the Ohio State researchers to uncover some interesting facts about how people speak in conversation.

For instance, the speakers produced a total of 306,652 words, but used only 9,600 different words. Almost 80 percent of the total words spoken were one-syllable words. One- and two-syllable words made up more than 90 percent of the total words spoken.
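Counts like these separate the total number of words spoken from the number of different words used. The sketch below shows how such tallies might be computed; the ten-word sample and its syllable counts are made up for illustration, and the function name is hypothetical. The last lines use the two totals reported above to show that each distinct word was used about 32 times on average.

```python
from collections import Counter

# Hypothetical toy sample: (word, syllable count) for each spoken token.
TOKENS = [
    ("i", 1), ("think", 1), ("the", 1), ("city", 2), ("has", 1),
    ("changed", 1), ("the", 1), ("traffic", 2), ("is", 1), ("terrible", 3),
]


def summarize(tokens):
    """Count total words, distinct words, and shares of short words."""
    total = len(tokens)
    distinct = len({word for word, _ in tokens})
    by_syllables = Counter(count for _, count in tokens)
    return {
        "total_words": total,
        "different_words": distinct,
        "one_syllable_share": by_syllables[1] / total,
        "one_or_two_syllable_share": (by_syllables[1] + by_syllables[2]) / total,
    }


SUMMARY = summarize(TOKENS)
print(SUMMARY)

# Using the totals reported in the article: average uses per distinct word.
AVERAGE_USES = 306_652 / 9_600
print(round(AVERAGE_USES, 1))
```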

Slightly more than half of the words spoken (57 percent) were function words – words like prepositions, pronouns and conjunctions that have mostly grammatical uses within a sentence. The remaining 43 percent were content words, which include nouns, verbs and adjectives.
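A rough sketch of how a function-word share could be measured follows. The word list is a deliberately tiny, hypothetical one limited to a few articles, pronouns, prepositions and conjunctions, and the sample sentence is invented; the researchers' actual word list and method are not described here. The final line applies the reported 57 percent to the 306,652-word total, which works out to roughly 175,000 function words.

```python
# Hypothetical, deliberately tiny function-word list; a real analysis
# would need a much fuller one.
FUNCTION_WORDS = {"i", "you", "it", "the", "a", "of", "in", "to", "and", "but"}


def function_word_share(words):
    """Return the fraction of words that appear in FUNCTION_WORDS."""
    if not words:
        return 0.0
    hits = sum(1 for word in words if word.lower() in FUNCTION_WORDS)
    return hits / len(words)


# Invented sample sentence, not a quotation from the interviews.
SAMPLE = "I think the city has changed and the traffic is terrible".split()
SHARE = function_word_share(SAMPLE)
print(f"{SHARE:.0%} of the sample words are in the function-word list")

# Approximate count implied by the article's rounded percentage.
APPROX_FUNCTION_WORDS = round(0.57 * 306_652)
print(APPROX_FUNCTION_WORDS)
```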

The Buckeye Speech Corpus project was originally funded with a seed grant from Ohio State's Office of Research. It has since received funding from the National Institute on Deafness and Other Communication Disorders, which is part of the National Institutes of Health.

Ohio State may be emerging as a center for corpora of conversational speech, Pitt said, as the Department of Speech and Hearing Sciences was just awarded a large, multi-year grant from NIH to collect interviews with speakers native to Ohio, Wisconsin, and western North Carolina.

Source: Ohio State University

