Data-driven Approaches in Acoustic Signal Processing: Methods and Applications

  1. Spoken language recognition has made significant progress in recent years, with automatic speech recognition used as a parallel branch to extract phonetic features. However, there is still a lack...

    Authors: Zimu Li, Yanyan Xu, Dengfeng Ke and Kaile Su
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:14
  2. The speech feature model is the basis of speech and noise separation, speech expression, and conversion between different speech styles. With the development of signal processing methods, the feature types and dimensio...

    Authors: Xiaoping Xie, Yongzhen Chen, Rufeng Shen and Dan Tian
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:10
  3. The acoustic echo cannot be entirely removed by linear adaptive filters due to the nonlinear relationship between the echo and the far-end signal. Usually, a post-processing module is required to further suppr...

    Authors: Hongsheng Chen, Guoliang Chen, Kai Chen and Jing Lu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:35
  4. The performance of speech recognition systems trained with neutral utterances degrades significantly when these systems are tested with emotional speech. Since everybody can speak emotionally in the real-world...

    Authors: Masoud Geravanchizadeh, Elnaz Forouhandeh and Meysam Bashirpour
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:31
  5. Many end-to-end approaches have been proposed to detect predefined keywords. For multi-keyword scenarios, two bottlenecks still need to be resolved: (1) the distribution of important data th...

    Authors: Gui-Xin Shi, Wei-Qiang Zhang, Guan-Bo Wang, Jing Zhao, Shu-Zhou Chai and Ze-Yu Zhao
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:27
  6. Due to the ad hoc nature of wireless acoustic sensor networks, the position of the sensor nodes is typically unknown. This contribution proposes a technique to estimate the position and orientation of the sens...

    Authors: Tobias Gburrek, Joerg Schmalenstroeer and Reinhold Haeb-Umbach
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:25
  7. Estimating time-frequency domain masks for single-channel speech enhancement using deep learning methods has recently become a popular research field with promising results. In this paper, we propose a novel comp...

    Authors: Ziyi Xu, Samy Elshamy, Ziyue Zhao and Tim Fingscheidt
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:24
  8. Multiple sound source localization has been a topic of intense interest in recent years. Single Source Zone (SSZ)-based localization methods achieve good performance due to the detection and utilization of the Time-F...

    Authors: Maoshen Jia, Shang Gao and Changchun Bao
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:23
  9. Recently, non-intrusive speech quality assessment methods have attracted considerable attention since they do not require the original reference signals. At the same time, neural networks have begun to be applied to ...

    Authors: Miao Liu, Jing Wang, Weiming Yi and Fang Liu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:20
  10. Sound event detection (SED), which is typically treated as a supervised problem, aims at detecting the types of sound events and their corresponding temporal information. It requires estimating onset and offset annotat...

    Authors: Sichen Liu, Feiran Yang, Yin Cao and Jun Yang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:19
  11. Amongst the various characteristics of a speech signal, the expression of emotion is one of the characteristics that exhibits the slowest temporal dynamics. Hence, a performant speech emotion recognition (SER)...

    Authors: Duowei Tang, Peter Kuppens, Luc Geurts and Toon van Waterschoot
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:18
  12. In this study, we present a deep neural network-based online multi-speaker localization algorithm based on a multi-microphone array. Following the W-disjoint orthogonality principle in the spectral domain, tim...

    Authors: Hodaya Hammer, Shlomo E. Chazan, Jacob Goldberger and Sharon Gannot
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:16
  13. The presence of degradations in speech signals, which causes acoustic mismatch between training and operating conditions, deteriorates the performance of many speech-based systems. A variety of enhancement tec...

    Authors: Yuki Saishu, Amir Hossein Poorjam and Mads Græsbøll Christensen
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:9
  14. In recent years, machine learning techniques have been employed to produce state-of-the-art results in several audio-related tasks. The success of these approaches has been largely due to access to large...

    Authors: Rajat Hebbar, Pavlos Papadopoulos, Ramon Reyes, Alexander F. Danvers, Angelina J. Polsinelli, Suzanne A. Moseley, David A. Sbarra, Matthias R. Mehl and Shrikanth Narayanan
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:7
  15. Two novel methods for speaker separation of multi-microphone recordings that can also detect speakers with infrequent activity are presented. The proposed methods are based on a statistical model of the probab...

    Authors: Bracha Laufer-Goldshtein, Ronen Talmon and Sharon Gannot
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2021 2021:5