A computer can pick out speech even amid cacophony

November 26, 2008

Schematic diagram of SHoUT

(PhysOrg.com) -- Using a recent development in speech recognition, it is possible to search through television news programmes provided the recognition system has been trained beforehand. PhD candidate Marijn Huijbregts from the University of Twente (Netherlands) has, however, taken things even further: he has developed Spoken Document Retrieval for audio and video files that the speech recognition system has not yet been trained to deal with.

This version of speech recognition works well even if there is a great deal of unexpected background noise. Huijbregts received his doctorate from the Faculty of Electrical Engineering, Mathematics and Computer Science on 21 November.

Information can be retrieved from text very quickly using, for example, the index of a book or a search engine such as Google. It is much more difficult to search audio and video files, as they do not have an easily searchable index. Because most of the information in audio and video files is carried by speech, speech recognition can simplify this process: it transforms the speech into text, which can then be indexed. A system that does this is called a Spoken Document Retrieval (SDR) system; it makes it possible to search directly in audio and video material, just as if you were searching in ordinary text documents. In other words, a sort of Google for audio and video.
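To illustrate how a transcript makes audio searchable, here is a minimal sketch in Python. It is not taken from Huijbregts's system: the transcript contents, the time stamps and the names `build_index` and `search` are all assumptions made for this example. It builds a simple word index over time-stamped transcript segments, so that a text query returns the positions in the recording where the words were spoken.

```python
import re
from collections import defaultdict

# Hypothetical recognizer output: (start time in seconds, recognized text).
# The contents are invented purely for illustration.
TRANSCRIPT = [
    (0.0, "good evening here is the news"),
    (4.5, "the government announced a new road tax"),
    (9.0, "in sport the national team won"),
    (13.2, "more on the road tax later tonight"),
]


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_index(segments):
    """Map each word to the sorted start times of the segments containing it."""
    index = defaultdict(set)
    for start, text in segments:
        for word in tokenize(text):
            index[word].add(start)
    return {word: sorted(times) for word, times in index.items()}


def search(index, query):
    """Return start times of segments that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


INDEX = build_index(TRANSCRIPT)
```

Calling `search(INDEX, "road tax")` returns the start times of the segments in which both words occur, which a player could use to jump straight to those moments in the recording. A real SDR system would also have to cope with recognition errors, which this sketch ignores.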

Evening news on television

The Human Media Interaction group at the University of Twente had previously developed an SDR system for an evening television news programme. Search terms could be used to look for specific topics; the system had been specially trained using newspaper texts and 20 hours of news programmes. The SDR system for the evening news worked well because, in that situation, it was more or less known what was going to be said and there was little background noise. Applied to other video files without any further training, it did not perform well. Huijbregts therefore wondered whether he could develop an SDR system for which almost no training data would be needed, but which could nevertheless deal satisfactorily with unknown audio and video files.

SHoUT

With unknown audio and video files, it is not clear beforehand what to expect: who is speaking, what is being said and what sort of background noises are present. Huijbregts therefore developed an SDR system that is robust enough to deal with these unknown situations. It is called SHoUT (this acronym corresponds to the Dutch version of ‘Speech Recognition Research at University of Twente’). An SDR system can be described as robust if it can deal with audio and video files recorded under all sorts of circumstances, for example with background noise or with people who do not speak clearly.

SHoUT is divided into three stages. First, the system distinguishes between speech and other sounds; background music, for example, is separated from the speech. Second, the system identifies the different speakers and gives them labels. Finally, the automatic speech recognition takes place: the system transforms the speech into text. The resulting text file can then be searched for relevant topics using keywords, just as Google searches through text files on the Internet.
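The three stages can be pictured as a simple pipeline. The Python sketch below is only an illustration of that structure under stated assumptions, not SHoUT's actual code or interface: the `Segment` type, the `run_pipeline` function and the toy stand-ins for each stage are hypothetical, and the "audio" is a plain string rather than real sound samples.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Segment:
    start: float                   # start time in seconds
    end: float                     # end time in seconds
    audio: str                     # toy stand-in for raw audio samples
    speaker: Optional[str] = None  # filled in by stage 2
    text: Optional[str] = None     # filled in by stage 3


def run_pipeline(
    segments: List[Segment],
    is_speech: Callable[[Segment], bool],
    label_speaker: Callable[[Segment], str],
    recognize: Callable[[Segment], str],
) -> List[Segment]:
    """Apply the three stages in order and return transcribed speech segments."""
    # Stage 1: keep speech, drop music and other non-speech sounds.
    speech = [s for s in segments if is_speech(s)]
    # Stage 2: attach a speaker label to each remaining segment.
    labeled = [replace(s, speaker=label_speaker(s)) for s in speech]
    # Stage 3: turn each labeled segment into text.
    return [replace(s, text=recognize(s)) for s in labeled]


# Toy stand-ins (assumptions): "audio" looks like "music" or "voiceA:words".
def toy_is_speech(seg: Segment) -> bool:
    return seg.audio.startswith("voice")


def make_toy_labeler() -> Callable[[Segment], str]:
    labels = {}

    def label(seg: Segment) -> str:
        voice = seg.audio.split(":", 1)[0]
        # Number speakers in order of first appearance: SPK01, SPK02, ...
        return labels.setdefault(voice, f"SPK{len(labels) + 1:02d}")

    return label


def toy_recognize(seg: Segment) -> str:
    return seg.audio.split(":", 1)[1]


RECORDING = [
    Segment(0.0, 5.0, "music"),
    Segment(5.0, 9.0, "voiceA:good evening"),
    Segment(9.0, 12.0, "voiceB:thank you"),
    Segment(12.0, 15.0, "voiceA:here is the news"),
]

RESULT = run_pipeline(RECORDING, toy_is_speech, make_toy_labeler(), toy_recognize)
```

Each stage is passed in as a function, so a real speech detector, speaker labeler or recognizer could replace the toy versions without changing the pipeline itself. The music segment is discarded in stage 1, and the two voices receive the same label each time they reappear.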

The first version of SHoUT is already available, and Huijbregts is developing it further. SHoUT and other demonstrations of SDR systems can be found on Huijbregts's website (http://wwwhome.cs.utwente.nl/~huijbreg/).

Provided by University of Twente, Netherlands

