Methods to Estimate Relative Transfer Function Between Two Microphones

Relative Transfer Function

The relative transfer function (RTF) between microphones is a key component in multichannel audio processing tasks such as noise reduction, speaker extraction, speech enhancement, interference cancellation, and binaural speech processing. For example, when the RTF is known, an efficient filter can be designed that cancels the target signal and passes only the noise signals. RTFs are required for constructing the blocking matrix in all systems that have the structure of a generalized sidelobe canceler, and for binaural cue preservation. The relative impulse response (the time-domain equivalent of the RTF) between microphones is usually long and dense because of the reverberant acoustic environment. Estimating it from short and noisy recordings, which is necessary because the target may move and the environment may change, remains a long-standing challenge in audio signal processing.
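
The target-cancellation property can be written out as a short worked equation. This is only a sketch under an assumed two-microphone model in the short-time Fourier domain, with frames long enough that the RTF acts multiplicatively in each frequency bin. The symbols are notation introduced here for illustration, not taken from the linked code: S is the target as observed at microphone 1, H is the RTF, V_1 and V_2 are the noise components, k is the frequency bin and \ell the frame index.

```latex
\begin{align}
X_1(k,\ell) &= S(k,\ell) + V_1(k,\ell), \\
X_2(k,\ell) &= H(k)\, S(k,\ell) + V_2(k,\ell), \\
Z(k,\ell)   &= H(k)\, X_1(k,\ell) - X_2(k,\ell)
             = H(k)\, V_1(k,\ell) - V_2(k,\ell).
\end{align}
```

Under these assumptions the output Z contains no target component, only filtered noise, which is why an accurate RTF estimate yields a noise reference of the kind used in a blocking matrix.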

Matlab codes:

Time-domain least squares estimator: here; helper function block_levinson.m by Keenan Pepper: here
Conventional frequency-domain estimator: here
Fast sparse reconstruction of the RTF (LASSO with the DFT matrix): SpaRIR.m
Demo: zip
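
To give a rough idea of what a frequency-domain estimator computes, the following is a minimal Python sketch (not the linked Matlab code). It assumes one common form of the estimator: the cross-spectrum between the two channels divided by the auto-spectrum of the reference channel, both averaged over frames. The function names `dft`, `estimate_rtf`, and `demo`, the rectangular window, and the frame and hop sizes are all illustrative assumptions. The naive DFT is used only to keep the example free of dependencies.

```python
import cmath
import random


def dft(frame):
    """Naive O(N^2) DFT of a real or complex sequence (illustration only)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def estimate_rtf(x1, x2, frame_len=32, hop=16):
    """Estimate H(k) as (averaged X2 * conj(X1)) / (averaged |X1|^2).

    x1 is the reference microphone signal, x2 the second microphone signal.
    Frames use a rectangular window; this is an assumption of the sketch.
    """
    cross = [0j] * frame_len   # accumulated cross-spectrum X2 * conj(X1)
    auto = [0.0] * frame_len   # accumulated auto-spectrum |X1|^2
    n_samples = min(len(x1), len(x2))
    for start in range(0, n_samples - frame_len + 1, hop):
        f1 = dft(x1[start:start + frame_len])
        f2 = dft(x2[start:start + frame_len])
        for k in range(frame_len):
            cross[k] += f2[k] * f1[k].conjugate()
            auto[k] += abs(f1[k]) ** 2
    eps = 1e-12  # guards against division by zero in an empty bin
    return [cross[k] / (auto[k] + eps) for k in range(frame_len)]


def demo():
    """Toy check: the second channel is the first one scaled by 0.5."""
    rng = random.Random(0)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(512)]
    x2 = [0.5 * v for v in x1]
    return estimate_rtf(x1, x2)
```

In the toy case of `demo()`, the second channel is a scaled copy of the first, so the estimate should be close to 0.5 in every bin. With real reverberant recordings the frames would need to be much longer, and noise in the reference channel biases this simple ratio, which is part of the motivation for the time-domain and sparse estimators listed above.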

Corresponding papers:

Z. Koldovský, J. Málek, and S. Gannot, “Spatial Source Subtraction Based on Incomplete Measurements of Relative Transfer Function,” IEEE/ACM Trans. on Audio, Speech, and Language Processing, vol. 23, no. 8, pp. 1335-1347, August 2015 (arXiv:1411.2744 [cs.SD]).

J. Málek and Z. Koldovský, “Sparse Target Cancellation Filters with Application to Semi-Blind Noise Extraction,” Proc. of the 39th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014), Florence, Italy, pp. 2128-2132, May 2014. (here)

Z. Koldovský and P. Tichavský, “Sparse Reconstruction of Incomplete Relative Transfer Function: Discrete and Continuous Time Domain,” The 23rd European Signal Processing Conference (EUSIPCO 2015), Nice, France, Sept. 2015. (here)