What is it about?

This research examined the effects of combining perceptual linear prediction (PLP) and Mel-frequency cepstral coefficients (MFCC) in speech recognition. A neural network served as the classifier, and the features extracted by PLP and MFCC were combined and evaluated on overall performance. Results showed that the integration of features did not produce good results compared to the individual algorithms, and was even poor in medium and high noise conditions. The results also indicate that recognition accuracy is affected by the language and by the speaker's gender, age, and other factors.

Why is it important?

This study used five words in the recognition process, a medium-quality microphone, a recording frequency of 11,025 Hz, and a recording time of 3 seconds. The audio database consisted of 150 voice samples. Features were extracted in two systems, one using the MFCC algorithm and one using the PLP algorithm, and four attributes were merged using the proposed integration algorithm. Experiments were conducted in three noise scenarios, and the highest recognition rate, 98%, was achieved by the system based on the proposed integration algorithm at an SNR of 30. The error rate increased when recognizing the words 'Go' and 'No' because of their similarity. The integration process gave a better recognition rate in the different noise scenarios, showing the superiority of the integrated system over the single-algorithm systems. Figure 5 compares the algorithms at different noise levels. The database was then expanded to 500 audio files with different voices, and the resulting recognition rate is shown in Figure 6. The system was also compared with previous research to show how it differs.
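The noise scenarios and the merging of features described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the SNR is expressed in decibels, uses white Gaussian noise, and substitutes plain concatenation for the proposed integration algorithm, whose details are not given in this summary. The function names and the 440 Hz test tone are hypothetical; only the 11,025 Hz rate and 3-second duration come from the text.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng):
    """Return a copy of `signal` with white Gaussian noise at the target SNR.

    Assumption: the SNR is given in decibels (the summary does not state the unit).
    """
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


def fuse_features(mfcc_vector, plp_vector):
    """Feature-level fusion by concatenation.

    Assumption: a stand-in for the paper's integration algorithm, not a copy of it.
    """
    return list(mfcc_vector) + list(plp_vector)


SAMPLE_RATE = 11025  # Hz, as reported in the summary
DURATION_S = 3       # seconds, as reported in the summary

rng = random.Random(0)  # fixed seed so the run is repeatable

# A 440 Hz tone stands in for a recorded word (hypothetical test signal).
clean = [
    math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE * DURATION_S)
]
noisy = add_noise_at_snr(clean, 30, rng)

print("samples:", len(clean))
print("measured SNR (dB):", round(measured_snr_db(clean, noisy), 2))
print("fused vector:", fuse_features([0.1, 0.2], [0.3, 0.4]))
```

In a full pipeline, the noisy signal would be passed to MFCC and PLP extractors, and the fused vector would be the input to the neural network classifier.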

Perspectives

Conclusion: The size of the sound sample database and the level of confusion between words play an important role in the recognition rate, with a larger number of samples improving the speed while increasing the time. The neural network's learning algorithm and its error rate significantly affect the recognition rate. Integrating features in low-noise conditions significantly increases the recognition rate and provides greater stability in the recognition process.

Haeder Talib Alahmar
Al Furat Al Awsat Technical University

Read the Original

This page is a summary of: SPEECH RECOGNITION BY IMPROVING THE PERFORMANCE OF ALGORITHMS USED IN DISCRIMINATION, International Journal of Computer Science and Information Technology, February 2019, Academy and Industry Research Collaboration Center (AIRCC),
DOI: 10.5121/ijcsit.2019.11102.