Implementation of a Voice Recognition System in the Nasa Yuwe Language Based on Convolutional Neural Networks
Master's Student, Universidad del Cauca
email: je.munoz@unicauca.edu.co
Professor, Universidad del Cauca
email: pjojoa@unicauca.edu.co
Professor, Universidad Nacional Abierta y a Distancia
email: miguel.castro@unad.edu.co
Introduction: This paper presents the implementation of a voice recognition algorithm for the Nasa Yuwe language based on Convolutional Neural Networks (CNNs), developed at the Universidad del Cauca in 2022.
Problem: The Nasa Yuwe language is phonetically rich, with 32 vowels and 34 consonants, which leads to confusion in pronunciation and therefore to difficulties in recognizing voice patterns.
Objective: To implement a speech recognition algorithm for the Nasa Yuwe language supported by CNNs.
Methodology: The audio signals were preprocessed, and features were then extracted as scalograms of the Mel coefficients. Finally, a CNN architecture is proposed for the classification process.
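As an illustration of this step, the sketch below shows how a log-Mel time-frequency representation (used here as a stand-in for the Mel-coefficient scalograms described above) could be extracted with the librosa library. The file name, sampling rate, and number of Mel bands are assumptions for the example, not values reported in this work.

import librosa
import numpy as np

def extract_log_mel(audio_path, sr=16000, n_mels=64):
    # Load and resample the recording to a fixed rate.
    signal, sr = librosa.load(audio_path, sr=sr)
    # Basic preprocessing: trim leading/trailing silence.
    signal, _ = librosa.effects.trim(signal, top_db=25)
    # Mel spectrogram converted to decibels: a 2-D time-frequency "image" for the CNN.
    mel = librosa.feature.melspectrogram(y=signal, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)

features = extract_log_mel("nasa_yuwe_word_001.wav")  # hypothetical recording
print(features.shape)  # (n_mels, number_of_frames)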
Results: A dataset was built from the scalograms of the voice patterns, and the CNN was trained on it.
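By way of illustration, a small Keras CNN of the kind used to classify such time-frequency images is sketched below. The 64x64x1 input size, layer configuration, and number of classes are placeholders for the example, not the architecture proposed in this work.

from tensorflow.keras import layers, models

def build_cnn(input_shape=(64, 64, 1), num_classes=10):
    # Two convolution/pooling stages followed by a small dense classifier.
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.3),  # regularization for a small dataset
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# model = build_cnn(num_classes=number_of_words)
# model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=30)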
Conclusion: The implementation of a voice recognition system based on CNNs yields low error margins in classifying words of the Nasa Yuwe language.
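For reference, the classification error on a held-out set can be quantified as sketched below with scikit-learn; model, x_test, and y_test are hypothetical placeholders rather than artifacts of this study.

import numpy as np
from sklearn.metrics import accuracy_score, classification_report

# Predicted class = index of the highest softmax score for each test scalogram.
y_pred = np.argmax(model.predict(x_test), axis=1)
print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))  # per-word precision, recall and F1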
Originality: The proposed voice recognition system is, to date, the first of its kind, and is intended to support the teaching, learning, and preservation of the Nasa Yuwe language.
Limitations: More voice patterns from native speakers are needed, as are additional technological tools for the conservation and dissemination of the Nasa Yuwe language.
Copyright (c) 2023 Ingeniería Solidaria
This work is licensed under a Creative Commons Attribution 4.0 International License.