Abstract: Self-supervised sound source localization in unconstrained visual scenes is an important task of audio-visual learning. In this paper, we propose a visual reasoning module to explicitly ...