Algorithms for prosodic discourse feature interpretation in case of its processing using low-speed codecs

Authors

  • Maxim Bessonov, RUDN
  • Natalia A. Bessonova, Peoples' Friendship University of Russia
  • Mais P. Farkhadov, V.A. Trapeznikov Institute for Control Sciences of Russian Academy of Sciences

DOI:

https://doi.org/10.25728/assa.2018.18.1.524

Keywords:

language identification, neural networks, discourse prosodic feature, wide phonetic categories

Abstract

In this article we propose two algorithms for interpreting prosodic discourse features. The first is based on wide phonetic categories; the second is based on cross-correlation functions of the audio signal's melody and short-time energy series. Both algorithms, together with methodical recommendations for their use, are proposed as part of the problem of language identification of audio signals using a prosodic approach. An experimental evaluation of both algorithms is presented. Neural networks are used as the decision rule. The initial wide phonetic categories were pause, pitch, and noise. We expanded them to pause, pitch, noise, five levels of pitch, regions of decreasing energy, main maximum, and adverse maximum. The total number of categories was 14. These algorithms can be applied to language identification or speaker identification. They do not require the speech signal to be restored after it has been processed by a low-speed codec. The frames of the speech codec must, however, contain parameters such as pitch, a tone-noise parameter, and energy. The speech database covers 10 languages with 10 speakers per language. The total speech time per speaker is 100 minutes, a duration that takes into account the statistical regularities of the languages. The algorithms were evaluated in tests with a multilayer perceptron.
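
As a rough illustration of the two ideas named in the abstract, the following minimal Python sketch labels codec frames with the three base wide phonetic categories (pause, pitch, noise) and computes a normalized cross-correlation between a pitch series and a short-time energy series. It is not the authors' implementation: the function names, the frame layout (one pitch value, one tone-noise flag, and one energy value per frame), and the `pause_energy` threshold are all assumptions made for this example, and the expanded 14-category set is not reproduced.

```python
from typing import List, Sequence

PAUSE, NOISE, PITCH = "pause", "noise", "pitch"


def label_frames(
    pitch_hz: Sequence[float],
    voiced: Sequence[bool],
    energy: Sequence[float],
    pause_energy: float = 0.01,  # hypothetical threshold, not from the article
) -> List[str]:
    """Assign one of the three base wide phonetic categories to each frame."""
    labels = []
    for f0, is_voiced, e in zip(pitch_hz, voiced, energy):
        if e < pause_energy:
            labels.append(PAUSE)   # low energy: treat the frame as a pause
        elif is_voiced and f0 > 0:
            labels.append(PITCH)   # tonal frame with a usable pitch value
        else:
            labels.append(NOISE)   # energetic but not tonal
    return labels


def normalized_cross_correlation(
    x: Sequence[float], y: Sequence[float], max_lag: int
) -> List[float]:
    """Return correlation values for lags -max_lag..max_lag (inclusive)."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    norm = (sum(a * a for a in dx) * sum(b * b for b in dy)) ** 0.5
    result = []
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += dx[i] * dy[j]
        result.append(total / norm if norm > 0 else 0.0)
    return result
```

Because both functions read only per-frame pitch, tone-noise, and energy values, they can operate on codec parameters directly, without a restored waveform. The resulting label sequences or correlation values could then serve as inputs to a classifier such as a multilayer perceptron.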

Published

2018-05-14

How to Cite

Bessonov, M., Bessonova, N. A., & Farkhadov, M. P. (2018). Algorithms for prosodic discourse feature interpretation in case of its processing using low-speed codecs. Advances in Systems Science and Applications, 18(1), 1–11. https://doi.org/10.25728/assa.2018.18.1.524

Section

Translated articles