What is it about?

Automatically detecting the directions of sound sources is useful in a variety of scenarios. For example, in broadcast applications it can be used to pan a camera towards the sound source, and in security applications to detect the directions of drones. One important limitation of existing approaches is localizing such sources in three dimensions. Another is carrying out the localization on acceptably low-specification hardware and, if possible, with low latency. We developed a method that can localize sound sources in 3D using a specially designed array of microphones. The method first identifies the regions in which a source is likely to be present and estimates directions only within those regions, thereby reducing the computational effort required.
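
The idea of searching only where a source is likely can be illustrated with a toy coarse-to-fine search over azimuth and elevation. This is a minimal sketch and not the published algorithm: `toy_power` is a hypothetical stand-in for a real steered response power map, and the source direction (60°, 25°) and the `levels`, `splits`, and `keep` parameters are assumptions chosen only for illustration.

```python
import math


def toy_power(az_deg, el_deg, source=(60.0, 25.0), sharpness=20.0):
    """Hypothetical power map: peaks at `source`, decays with angular distance."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    s_az, s_el = math.radians(source[0]), math.radians(source[1])
    # Cosine of the angle between the look direction and the source direction.
    cos_angle = (math.sin(el) * math.sin(s_el)
                 + math.cos(el) * math.cos(s_el) * math.cos(az - s_az))
    return math.exp(sharpness * (cos_angle - 1.0))


def coarse_to_fine(power, levels=6, splits=4, keep=4):
    """Score cell centres, keep the strongest cells, and subdivide only those."""
    # Each cell is (az_min, az_max, el_min, el_max) in degrees.
    cells = [(-180.0, 180.0, -90.0, 90.0)]
    evaluations = 0
    best = None
    for _ in range(levels):
        scored = []
        for az0, az1, el0, el1 in cells:
            d_az = (az1 - az0) / splits
            d_el = (el1 - el0) / splits
            for i in range(splits):
                for j in range(splits):
                    sub = (az0 + i * d_az, az0 + (i + 1) * d_az,
                           el0 + j * d_el, el0 + (j + 1) * d_el)
                    centre = ((sub[0] + sub[1]) / 2, (sub[2] + sub[3]) / 2)
                    scored.append((power(*centre), centre, sub))
                    evaluations += 1
        # Only the `keep` most promising cells are refined at the next level.
        scored.sort(key=lambda item: item[0], reverse=True)
        cells = [sub for _, _, sub in scored[:keep]]
        best = scored[0][1]
    return best, evaluations


estimate, evaluations = coarse_to_fine(toy_power)
print(f"estimate (az, el) in degrees: {estimate}, evaluations: {evaluations}")
```

With these assumed settings the search evaluates the toy power map a few hundred times, whereas a uniform grid at the same final resolution (4096 × 4096 cells) would need about 16.8 million evaluations. Keeping several cells per level, rather than one, is a simple guard against discarding a region that contains a source.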

Why is it important?

The method we proposed is very accurate: it matches the accuracy of the most accurate state-of-the-art method at a much lower computational cost. We tested it under extremely challenging conditions (e.g. with real recordings made in a hall with bad acoustics), and its performance remained exceptionally high. We therefore hope that it will be used as one of the reference direction estimation methods for the specific type of microphone array we use.

Perspectives

We had to spend a lot of time getting this article right. The main reason is that sound source direction estimation is a saturated topic, with many good algorithms proposed since the 1970s. Reviewers therefore tend to scrutinize a new method addressing this question more thoroughly. In the first revision we used only emulations, and the reviewers were not entirely happy, since real recordings introduce additional problems, such as component noise, that are likely to degrade performance. Hence we used recordings of an amazing quartet that we had made earlier and demonstrated that our algorithm performs very well even under less-than-ideal recording conditions.

Professor Hüseyin Hacihabiboglu
Orta Dogu Teknik Universitesi

Read the Original

This page is a summary of: Multiple Sound Source Localization With Steered Response Power Density and Hierarchical Grid Refinement, IEEE/ACM Transactions on Audio Speech and Language Processing, November 2018, Institute of Electrical & Electronics Engineers (IEEE),
DOI: 10.1109/taslp.2018.2858932.