Doing what the brain does -- how computers learn to listen

August 14, 2009

(PhysOrg.com) -- We see, hear and feel, and make sense of countless diverse, quickly changing stimuli in our environment, seemingly without effort. However, doing what our brains do with ease is often an impossible task for computers. Researchers at the Leipzig Max Planck Institute for Human Cognitive and Brain Sciences and the Wellcome Trust Centre for Neuroimaging in London have now developed a mathematical model which could significantly improve the automatic recognition and processing of spoken language. In the future, algorithms of this kind, which imitate brain mechanisms, could help machines to perceive the world around them.

Many people will have personal experience of how difficult it is for computers to deal with spoken language. For example, people who "communicate" with the automated telephone systems now commonly used by many organisations need a great deal of patience. If you speak just a little too quickly or slowly, if your pronunciation isn't clear, or if there is background noise, the system often fails to work properly. The reason is that the computer programs used until now rely on processes that are particularly sensitive to perturbations. When computers process language, they primarily try to identify characteristic features in the frequencies of the voice in order to recognise words.

"It is likely that the brain uses a different process", says Stefan Kiebel from the Leipzig Max Planck Institute for Human Cognitive and Brain Sciences. The researcher presumes that the analysis of temporal sequences plays an important role in this. "Many perceptual stimuli in our environment could be described as temporal sequences." Music and spoken language, for example, consist of sequences of different lengths which are hierarchically ordered. According to the scientist's hypothesis, the brain classifies the various signals from the smallest, fast-changing components (e.g., single sound units like "e" or "u") up to big, slow-changing elements (e.g., the topic).

The significance of the information at various temporal levels is probably much greater than previously thought for the processing of perceptual stimuli. "The brain permanently searches for temporal structure in the environment in order to deduce what will happen next", the scientist explains. In this way, the brain can, for example, often predict the next sound units based on the slow-changing information. Thus, if the topic of conversation is the hot summer, "su…" will more likely be the beginning of the word "sun" than the word "supper".
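The "sun" versus "supper" example can be sketched as a small probability calculation. The following Python snippet is only an illustration of the idea that slow-changing context shapes fast predictions; it is not the researchers' model, and the topics, word list and probabilities are invented for the example.

```python
# Toy illustration only: topics, words and probabilities are invented.
# Each entry is P(word | topic), the chance of hearing a word under a
# slow-changing topic of conversation.
WORD_GIVEN_TOPIC = {
    "hot summer": {"sun": 0.08, "supper": 0.01, "rain": 0.02},
    "evening meal": {"sun": 0.01, "supper": 0.09, "rain": 0.01},
}


def predict_word(prefix, topic):
    """Return P(word | prefix, topic) for each word starting with prefix."""
    priors = WORD_GIVEN_TOPIC[topic]
    # A heard prefix rules out every word that does not start with it.
    candidates = {w: p for w, p in priors.items() if w.startswith(prefix)}
    total = sum(candidates.values())
    if total == 0:
        return {}  # no known word matches the prefix
    # Renormalise so the remaining candidates sum to 1.
    return {w: p / total for w, p in candidates.items()}


summer = predict_word("su", "hot summer")
meal = predict_word("su", "evening meal")
print(max(summer, key=summer.get))  # most likely word when talking about summer
print(max(meal, key=meal.get))      # most likely word when talking about a meal
```

The same two sounds lead to different predictions depending on the topic, which is the effect the scientist describes.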

To test this hypothesis, the researchers constructed a mathematical model which was designed to imitate, in a highly simplified manner, the neuronal processes which occur during the comprehension of speech. Neuronal processes were described by algorithms which processed speech at several temporal levels. The model succeeded in processing speech; it recognised individual speech sounds and syllables. In contrast to other artificial speech recognition devices, it was able to process sped-up speech sequences. Furthermore, it had the brain's ability to "predict" the next speech sound. If a prediction turned out to be wrong because the researchers made an unfamiliar syllable out of the familiar sounds, the model was able to detect the error.

The "language" with which the model was tested was simplified: it consisted of the four vowels a, e, i and o, which were combined to make "syllables" consisting of four sounds. "In the first instance we wanted to check whether our general assumption was right", Kiebel explains. With more time and effort, consonants, which are more difficult to differentiate from each other, could be included, and further hierarchical levels for words and sentences could be incorporated alongside individual sounds and syllables. Thus, the model could, in principle, be applied to natural language.
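The test setup of four vowels, four-sound "syllables", prediction of the next sound and detection of unfamiliar syllables can be mimicked with a few lines of Python. This is a deliberately crude stand-in, assuming a simple lookup over familiar syllables; the article does not give the model's equations, and the syllables listed here are chosen arbitrarily for illustration. Unlike the researchers' model, the sketch works on symbols rather than sound and says nothing about speech rate.

```python
# Toy sketch, not the published model: a lookup over familiar syllables.
VOWELS = "aeio"

# Hypothetical "familiar" syllables of four vowels each (chosen arbitrarily).
KNOWN_SYLLABLES = ["aeio", "oiea", "eaoi"]


def predict_next(heard):
    """Return the vowels that could follow `heard` in a familiar syllable."""
    return {
        s[len(heard)]
        for s in KNOWN_SYLLABLES
        if s.startswith(heard) and len(s) > len(heard)
    }


def recognise(sounds):
    """Step through the sounds one at a time.

    Returns (syllable, None) if a familiar syllable was heard, or
    (None, index) where index is the first sound that was not predicted.
    """
    heard = ""
    for i, sound in enumerate(sounds):
        if sound not in predict_next(heard):
            return None, i  # prediction error: unexpected sound
        heard += sound
    return (heard if heard in KNOWN_SYLLABLES else None), None


print(recognise("aeio"))  # a familiar syllable
print(recognise("aeoi"))  # familiar sounds in an unfamiliar order
```

Here the slow level is the syllable and the fast level is the single vowel: knowing which syllables exist narrows down which vowel can come next, and a vowel outside that set is flagged as an error.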

"The crucial point, from a neuroscientific perspective, is that the reactions of the model were similar to what would be observed in the human brain", Stefan Kiebel says. This indicates that the researchers' model could represent the processes in the brain. At the same time, the model provides new approaches for practical applications in the field of artificial speech recognition.

More information: Stefan J. Kiebel, Katharina von Kriegstein, Jean Daunizeau, Karl J. Friston; Recognizing sequences of sequences; PLoS Computational Biology, August 14th, 2009.

Source: Max-Planck-Gesellschaft

