Doing what the brain does -- how computers learn to listen
August 14, 2009 (PhysOrg.com) -- We see, hear and feel, and make sense of countless diverse, quickly changing stimuli in our environment seemingly without effort. However, doing what our brains do with ease is often an impossible task for computers. Researchers at the Leipzig Max Planck Institute for Human Cognitive and Brain Sciences and the Wellcome Trust Centre for Neuroimaging in London have now developed a mathematical model which could significantly improve the automatic recognition and processing of spoken language. In the future, algorithms of this kind, which imitate brain mechanisms, could help machines to perceive the world around them.
Many people will have personal experience of how difficult it is for computers to deal with spoken language. For example, people who "communicate" with the automated telephone systems now commonly used by many organisations need a great deal of patience. If you speak just a little too quickly or slowly, if your pronunciation isn't clear, or if there is background noise, the system often fails to work properly. The reason is that the computer programs used until now rely on processes that are particularly sensitive to perturbations. When computers process language, they primarily attempt to identify characteristic features in the frequencies of the voice in order to recognise words.
"It is likely that the brain uses a different process", says Stefan Kiebel from the Leipzig Max Planck Institute for Human Cognitive and Brain Sciences. The researcher presumes that the analysis of temporal sequences plays an important role in this. "Many perceptual stimuli in our environment could be described as temporal sequences." Music and spoken language, for example, consist of sequences of different lengths which are hierarchically ordered. According to the scientist's hypothesis, the brain classifies the various signals from the smallest, fast-changing components (e.g., single sound units like "e" or "u") up to big, slow-changing elements (e.g., the topic).
The significance of the information at various temporal levels is probably much greater than previously thought for the processing of perceptual stimuli. "The brain permanently searches for temporal structure in the environment in order to deduce what will happen next", the scientist explains. In this way, the brain can, for example, often predict the next sound units based on the slow-changing information. Thus, if the topic of conversation is the hot summer, "su…" will more likely be the beginning of the word "sun" than the word "supper".
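The "sun" versus "supper" example can be made concrete with a small sketch. It is only an illustration of the idea that slow-changing context narrows down fast-changing sounds; it is not the researchers' model. The function name `predict_word`, the topics and all probabilities below are invented for the example.

```python
# Toy illustration only: topics, words and probabilities are invented.

def predict_word(prefix, topic_prior):
    """Return the words that start with `prefix`, with their
    probabilities renormalised so that they sum to 1."""
    candidates = {w: p for w, p in topic_prior.items() if w.startswith(prefix)}
    total = sum(candidates.values())
    if total == 0:
        return {}  # no known word in this topic starts with the prefix
    return {w: p / total for w, p in candidates.items()}


# Hypothetical word probabilities under two slow-changing contexts (topics).
PRIORS = {
    "hot summer": {"sun": 0.6, "supper": 0.1, "heat": 0.3},
    "evening meal": {"sun": 0.1, "supper": 0.7, "soup": 0.2},
}

summer = predict_word("su", PRIORS["hot summer"])
meal = predict_word("su", PRIORS["evening meal"])

print(max(summer, key=summer.get))  # most likely completion when the topic is the hot summer
print(max(meal, key=meal.get))      # most likely completion when the topic is the evening meal
```

The same two sounds, "su", lead to different best guesses depending on the topic, which is the kind of top-down prediction described above.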
To test this hypothesis, the researchers constructed a mathematical model which was designed to imitate, in a highly simplified manner, the neuronal processes which occur during the comprehension of speech. Neuronal processes were described by algorithms which processed speech at several temporal levels. The model succeeded in processing speech; it recognised individual speech sounds and syllables. In contrast to other artificial speech recognition devices, it was able to process sped-up speech sequences. Furthermore, it had the brain's ability to "predict" the next speech sound. If a prediction turned out to be wrong because the researchers made an unfamiliar syllable out of the familiar sounds, the model was able to detect the error.
The "language" with which the model was tested was simplified: it consisted of the four vowels a, e, i and o, which were combined to make "syllables" consisting of four sounds. "In the first instance we wanted to check whether our general assumption was right", Kiebel explains. With more time and effort, consonants, which are more difficult to differentiate from each other, could be included, and further hierarchical levels for words and sentences could be incorporated alongside individual sounds and syllables. Thus, the model could, in principle, be applied to natural language.
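To give a feel for this test setup, here is a minimal sketch of a two-level recogniser over the vowels a, e, i and o. It is a plain lookup over a list of known syllables, not the neuronal model from the paper. The syllable list `SYLLABLES` and the functions `predict_next`, `recognise` and `collapse_repeats` are assumptions made for illustration. In particular, `collapse_repeats` is only a crude stand-in for tolerance to speaking rate and says nothing about how the published model handles sped-up speech.

```python
# Toy sketch only: the vocabulary and all function names are invented.

# Hypothetical vocabulary of familiar four-sound "syllables".
SYLLABLES = ["aeio", "oiea", "eaoi"]


def predict_next(heard, syllables=SYLLABLES):
    """Probability of each possible next sound, given the sounds
    heard so far within the current syllable."""
    matches = [s for s in syllables if s.startswith(heard) and len(s) > len(heard)]
    if not matches:
        return {}
    counts = {}
    for s in matches:
        nxt = s[len(heard)]
        counts[nxt] = counts.get(nxt, 0) + 1
    return {sound: c / len(matches) for sound, c in counts.items()}


def recognise(sounds, syllables=SYLLABLES):
    """Return (syllable, None) for a familiar syllable, or
    (None, index) where index is the first sound that was not predicted."""
    heard = ""
    for i, sound in enumerate(sounds):
        if sound not in predict_next(heard, syllables):
            return None, i  # prediction error: unfamiliar continuation
        heard += sound
    return (heard if heard in syllables else None), None


def collapse_repeats(samples):
    """Merge runs of identical samples, so slow and fast renditions of
    the same syllable map to the same sound sequence. This only works
    here because no toy syllable repeats a vowel back to back."""
    out = []
    for s in samples:
        if not out or out[-1] != s:
            out.append(s)
    return "".join(out)


print(predict_next("ae"))                        # what should follow "a", "e"
print(recognise("aeio"))                         # familiar syllable
print(recognise("aeoi"))                         # familiar sounds, unfamiliar order
print(recognise(collapse_repeats("aaeeiioo")))   # a faster rendition of "aeio"
print(recognise(collapse_repeats("aaaaeeeeiiiioooo")))  # a slower rendition
```

The sketch has two levels, single sounds and syllables. Adding levels for words and sentences, as the article suggests, would mean letting a slower level constrain which syllables are expected next.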
"The crucial point, from a neuroscientific perspective, is that the reactions of the model were similar to what would be observed in the human brain", Stefan Kiebel says. This indicates that the researchers' model could represent the processes in the brain. At the same time, the model provides new approaches for practical applications in the field of artificial speech recognition.
More information: Stefan J. Kiebel, Katharina von Kriegstein, Jean Daunizeau, Karl J. Friston; Recognizing sequences of sequences; PLoS Computational Biology, August 14th, 2009.