Abstract
Automatic facial emotion recognition (FER) is in growing demand, as more industries try to incorporate emotion-aware technologies into their products. Some of those industries, the automotive industry being one example, impose very tight limits on both the run-time and the memory footprint of the models used, so that they fit into small embedded devices. While there is a lot of machine learning and computer vision work on FER, most of it focuses on obtaining the best possible system accuracy without being bound by memory constraints. In contrast, the work in this thesis explores deep learning models for emotion recognition in videos for systems with limited memory, such as robots, cars, and embedded systems. Naturally, this comes at the expense of some accuracy. Two models are proposed in this thesis. The first is the mini-xception+LSTM architecture, with around 80k parameters. This model reached a validation accuracy of 93% in distinguishing between the Anger and Amusement emotions on the BioVidEmo dataset. It also achieved 90% validation accuracy on the CK+ dataset when classifying six emotions as a multi-class classification problem. The second model, called mini-xception+C3D, has 95k parameters and outperformed the first model, achieving 94% validation accuracy on the CK+ dataset. After a weighted cost function was used with the mini-xception+C3D model to handle the class imbalance problem, validation accuracy increased to 96%, which is a strong result given the small number of parameters.
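The abstract does not say which weighting scheme the weighted cost function uses. As a minimal sketch of the general idea, the example below assumes inverse-frequency class weights applied to a cross-entropy loss, so that errors on rare classes cost more than errors on frequent ones. The function names, the toy labels, and the choice of weighting formula are illustrative assumptions, not the thesis's implementation.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed scheme: weight class c by N / (K * n_c).

    N is the number of samples, K the number of classes, and n_c the
    number of samples in class c. Rare classes get larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Mean over samples of -w[y] * log(p[y]).

    probs[i] is the predicted class distribution for sample i and
    labels[i] is its true class index.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total += -weights[y] * math.log(max(p[y], eps))
    return total / len(labels)


# Hypothetical imbalanced toy labels: six samples of class 0, two of class 1.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
weights = inverse_frequency_weights(labels)

# A classifier that always predicts 50/50, used only to exercise the loss.
uniform_probs = [[0.5, 0.5] for _ in labels]
loss = weighted_cross_entropy(uniform_probs, labels, weights)
```

With these weights, the two samples of the minority class contribute as much to the total loss as the six samples of the majority class, which is the effect a weighted cost function is meant to have on an imbalanced dataset.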