Deep Learning-Based Speech Emotion Recognition

Karan Sharma

doi:10.15680/IJCTECE.2020.0305001

Authors

Karan Sharma, U.G. Student, Department of Computer Science & Engineering, Galgotias University, Gr. Noida, U.P., India

DOI:

https://doi.org/10.15680/IJCTECE.2020.0305001

Keywords:

Speech Emotion Recognition, Deep Learning, Convolutional Neural Networks, Recurrent Neural Networks, Long Short-Term Memory, Emotion Classification, Audio Signal Processing, Feature Extraction, Machine Learning

Abstract

Speech Emotion Recognition (SER) is an essential component of human-computer interaction, enabling systems to understand and respond to human emotions. Traditional emotion recognition methods often rely on handcrafted features, which can fail to capture the full complexity of emotional cues. In contrast, deep learning approaches, particularly convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory (LSTM) networks, offer more robust solutions by automatically learning hierarchical features from raw audio data. This paper reviews recent advances in deep learning-based speech emotion recognition, discusses the architectures used, and examines the challenges that arise in real-world applications. We focus on how deep learning models can improve the accuracy and robustness of SER, particularly in noisy environments. The study also discusses future research directions, including multimodal emotion recognition and transfer learning, to address challenges such as small datasets and cross-domain applications.
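To make the idea of learned hierarchical features concrete, the following is a minimal sketch, not the method of any paper reviewed here. It assumes a log power spectrogram front end, a single convolution over time, one LSTM layer, and a softmax over four hypothetical emotion classes. All function names, layer sizes, and the 16 kHz sampling rate are illustrative assumptions, and the weights are random, so the output is an untrained probability vector rather than a meaningful prediction.

```python
import numpy as np

def log_spectrogram(signal, frame_len=400, hop=160):
    """Frame a 1-D waveform with a Hann window and return log power spectra."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + 1e-10)  # shape: (n_frames, frame_len // 2 + 1)

def conv1d_relu(x, kernels, bias):
    """Valid convolution along time, then ReLU. x: (T, F), kernels: (K, F, C)."""
    k = kernels.shape[0]
    out = np.stack([np.tensordot(x[t: t + k], kernels, axes=([0, 1], [0, 1]))
                    for t in range(x.shape[0] - k + 1)])
    return np.maximum(out + bias, 0.0)  # shape: (T - K + 1, C)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_last_state(x, W, U, b):
    """Single-layer LSTM over x (T, C); returns the final hidden state (H,).
    Gate order inside W, U, b: input, forget, candidate, output."""
    H = U.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    for x_t in x:
        z = x_t @ W + h @ U + b
        i, f = sigmoid(z[:H]), sigmoid(z[H:2 * H])
        g, o = np.tanh(z[2 * H:3 * H]), sigmoid(z[3 * H:])
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def init_params(n_freq, n_channels=8, hidden=16, n_classes=4, kernel=5, seed=0):
    """Random (untrained) weights; every size here is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    s = 0.1
    return {
        "kernels": s * rng.standard_normal((kernel, n_freq, n_channels)),
        "conv_bias": np.zeros(n_channels),
        "W": s * rng.standard_normal((n_channels, 4 * hidden)),
        "U": s * rng.standard_normal((hidden, 4 * hidden)),
        "b": np.zeros(4 * hidden),
        "W_out": s * rng.standard_normal((hidden, n_classes)),
        "b_out": np.zeros(n_classes),
    }

def predict_emotion_probs(signal, params):
    """Spectrogram -> convolution -> LSTM -> softmax over emotion classes."""
    feats = log_spectrogram(signal)
    feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)
    conv = conv1d_relu(feats, params["kernels"], params["conv_bias"])
    h = lstm_last_state(conv, params["W"], params["U"], params["b"])
    return softmax(h @ params["W_out"] + params["b_out"])

# One second of synthetic noise stands in for speech at an assumed 16 kHz rate.
signal = np.random.default_rng(1).standard_normal(16000)
params = init_params(n_freq=201)  # 400-sample frames give 201 frequency bins
probs = predict_emotion_probs(signal, params)
```

In practice the weights would be trained on labelled speech. The convolution picks up local spectral patterns, and the LSTM summarises how they evolve over the utterance, which is the division of labour the hybrid CNN-LSTM architectures mentioned above rely on.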

