Python Language Training System Based on MFCC, VQ, Variational Coefficient and KNTM Algorithm

Leandro Daniel Lau Alfonso; Sergio Suarez Guerra; Jose Luis Oropeza Rodriguez; Roberto Rodriguez Morales; Gustavo Asumu Mboro Nchama

doi:10.11648/j.mcs.20210602.12

Published in Mathematics and Computer Science (Volume 6, Issue 2)

Received: 31 March 2021 Accepted: 26 April 2021 Published: 14 May 2021

Abstract

This contribution describes the second stage in the development of a language training system programmed in Python and intended for speech therapy in Spanish-speaking countries, with the study starting in Cuba. The first stage of this research was carried out in Matlab by analyzing the dynamics of change of the codebook centroids extracted from words pronounced by a speaker. In the second stage, the Variational Coefficient formula is used to estimate the percentage of effectiveness with which the speaker performs voice training. A modified approach to programming the variational coefficient is adopted, in which the coefficient serves as a measure of dispersion of a group of vectors. The modification consists of taking the mean of the group of vectors as the vector that represents the phonetic boundaries of the word to be trained. In addition, a novel approach to word recognition is used, based on the K-Nearest Training Matrix (KNTM) algorithm, which rests on the analysis of matrix similarity and takes the Frobenius norm as the measure for distinguishing similar from dissimilar characteristics of a matrix with respect to a database of matrices. To reduce the computational cost of the program, the training matrices of the database are saved in files with a .tex extension. After the training process, the program only has to read them rather than recalculate them, which significantly reduces the running time of the algorithm.
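The two measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, because the abstract gives neither the exact formula nor the decision rule. The sketch assumes that the dispersion is the root-mean-square Euclidean distance of the vectors to their mean vector, normalised by the norm of that mean, and that KNTM labels a matrix by majority vote among its k nearest training matrices under the Frobenius norm, by analogy with k-nearest neighbours. The function names `variation_coefficient` and `kntm_predict`, the word labels and the matrix sizes are all hypothetical.

```python
from collections import Counter

import numpy as np


def variation_coefficient(vectors):
    """Dispersion of a group of vectors around their mean vector, in percent.

    Assumption: spread is the RMS Euclidean distance to the mean vector,
    divided by the norm of the mean vector.
    """
    vectors = np.asarray(vectors, dtype=float)
    mean_vec = vectors.mean(axis=0)  # vector taken as the reference for the word
    distances = np.linalg.norm(vectors - mean_vec, axis=1)
    spread = np.sqrt(np.mean(distances ** 2))
    return 100.0 * spread / np.linalg.norm(mean_vec)


def kntm_predict(query, training, k=3):
    """Label a query matrix by vote among its k nearest training matrices.

    Assumption: distance is the Frobenius norm of the matrix difference and
    the decision is a plain majority vote.
    """
    dists = sorted(
        (float(np.linalg.norm(query - mat, ord="fro")), label)
        for label, mat in training
    )
    nearest = [label for _, label in dists[:k]]
    return Counter(nearest).most_common(1)[0][0]


# Hypothetical toy data: three noisy training matrices per word.
rng = np.random.default_rng(0)
base = {"casa": np.zeros((4, 3)), "perro": np.full((4, 3), 5.0)}
training = [
    (word, mat + 0.1 * rng.standard_normal(mat.shape))
    for word, mat in base.items()
    for _ in range(3)
]
query = base["perro"] + 0.1 * rng.standard_normal((4, 3))

print(kntm_predict(query, training, k=3))          # perro
print(variation_coefficient([[1, 0], [3, 0]]))     # 50.0
```

In this toy setting a lower coefficient means the repetitions of a word cluster more tightly around their mean vector. How the paper maps the coefficient to a percentage of effectiveness is not stated in the abstract.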

Published in	Mathematics and Computer Science (Volume 6, Issue 2)
DOI	10.11648/j.mcs.20210602.12
Page(s)	38-44
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Mel Frequency Cepstral Coefficients, Vector Quantization, Variational Coefficient, Word Recognition

Cite This Article

APA Style

Leandro Daniel Lau Alfonso, Sergio Suarez Guerra, Jose Luis Oropeza Rodriguez, Roberto Rodriguez Morales, Gustavo Asumu Mboro Nchama. (2021). Python Language Training System Based on MFCC, VQ, Variational Coefficient and KNTM Algorithm. Mathematics and Computer Science, 6(2), 38-44. https://doi.org/10.11648/j.mcs.20210602.12

ACS Style

Leandro Daniel Lau Alfonso; Sergio Suarez Guerra; Jose Luis Oropeza Rodriguez; Roberto Rodriguez Morales; Gustavo Asumu Mboro Nchama. Python Language Training System Based on MFCC, VQ, Variational Coefficient and KNTM Algorithm. Math. Comput. Sci. 2021, 6(2), 38-44. doi: 10.11648/j.mcs.20210602.12

AMA Style

Leandro Daniel Lau Alfonso, Sergio Suarez Guerra, Jose Luis Oropeza Rodriguez, Roberto Rodriguez Morales, Gustavo Asumu Mboro Nchama. Python Language Training System Based on MFCC, VQ, Variational Coefficient and KNTM Algorithm. Math Comput Sci. 2021;6(2):38-44. doi: 10.11648/j.mcs.20210602.12

@article{10.11648/j.mcs.20210602.12,
  author = {Leandro Daniel Lau Alfonso and Sergio Suarez Guerra and Jose Luis Oropeza Rodriguez and Roberto Rodriguez Morales and Gustavo Asumu Mboro Nchama},
  title = {Python Language Training System Based on MFCC, VQ, Variational Coefficient and KNTM Algorithm},
  journal = {Mathematics and Computer Science},
  volume = {6},
  number = {2},
  pages = {38-44},
  doi = {10.11648/j.mcs.20210602.12},
  url = {https://doi.org/10.11648/j.mcs.20210602.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.mcs.20210602.12},
  year = {2021}
}

TY  - JOUR
T1  - Python Language Training System Based on MFCC, VQ, Variational Coefficient and KNTM Algorithm
AU  - Leandro Daniel Lau Alfonso
AU  - Sergio Suarez Guerra
AU  - Jose Luis Oropeza Rodriguez
AU  - Roberto Rodriguez Morales
AU  - Gustavo Asumu Mboro Nchama
Y1  - 2021/05/14
PY  - 2021
N1  - https://doi.org/10.11648/j.mcs.20210602.12
DO  - 10.11648/j.mcs.20210602.12
T2  - Mathematics and Computer Science
JF  - Mathematics and Computer Science
JO  - Mathematics and Computer Science
SP  - 38
EP  - 44
PB  - Science Publishing Group
SN  - 2575-6028
UR  - https://doi.org/10.11648/j.mcs.20210602.12
VL  - 6
IS  - 2
ER  -

Author Information

Leandro Daniel Lau Alfonso
Institute of Cybernetics, Mathematics and Physics, Havana, Cuba

Sergio Suarez Guerra
Computer Research Center, National Polytechnic Institute, Mexico City, Mexico

Jose Luis Oropeza Rodriguez
Computer Research Center, National Polytechnic Institute, Mexico City, Mexico

Roberto Rodriguez Morales
Institute of Cybernetics, Mathematics and Physics, Havana, Cuba

Gustavo Asumu Mboro Nchama
Department of Technical Sciences, National University of Equatorial Guinea, Malabo, Equatorial Guinea
