A Hybrid Deep Model Using Deep Learning and Dense Optical Flow Approaches for Human Activity Recognition

Tanberk, Senem; Kilimci, Zeynep; Tukel, Dilek; Uysal, Mitat; AKYOKUŞ, Selim

doi:10.1109/access.2020.2968529

Tanberk S., Kilimci Z. H., Tukel D. B., Uysal M., AKYOKUŞ S.

IEEE Access, vol. 8, pp. 19799-19809, 2020 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 8
Publication Date: 2020
DOI: 10.1109/access.2020.2968529
Journal Name: IEEE Access
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
Pages: pp. 19799-19809
Keywords: abc
Open Archive Collection: AVESİS Open Access Collection
Affiliated with İstanbul Medipol Üniversitesi: Yes

Abstract

Human activity recognition is a challenging problem with many applications, including visual surveillance, human-computer interaction, autonomous driving, and entertainment. In this study, we propose a hybrid deep model to understand and interpret videos, focusing on human activity recognition. The proposed architecture combines a dense optical flow approach with auxiliary movement information in video datasets using deep learning methodologies. To the best of our knowledge, this is the first study based on a novel combination of 3D convolutional neural networks (3D-CNNs) fed by optical flow and long short-term memory (LSTM) networks fed by auxiliary information over video frames for the purpose of human activity recognition. The contributions of this paper are sixfold. First, a 3D-CNN, also referred to as the multiple-frames approach, is employed to determine the motion vectors. Second, with the same purpose, the 3D-CNN is applied to dense optical flow, which is the distribution of apparent velocities of movement in the imagery captured in video frames. Third, the LSTM is employed on auxiliary information in the video to recognize hand tracking and objects. Fourth, the support vector machine algorithm is utilized for the classification of videos. Fifth, a wide range of comparative experiments is conducted on two newly generated chess datasets, namely the magnetic wall chess board video dataset (MCDS) and the standard chess board video dataset (CDS), to demonstrate the contributions of the proposed study. Finally, the experimental results reveal that the proposed hybrid deep model exhibits remarkable performance compared to state-of-the-art studies.
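
To make the idea of dense optical flow as a per-pixel field of apparent velocities more concrete, the following is a minimal NumPy sketch. The abstract does not say which dense flow algorithm the authors used, so the classic Horn-Schunck method is swapped in here purely for illustration. The function names (`local_mean`, `horn_schunck`, `flow_volume`) and the parameters `alpha` and `n_iter` are hypothetical and do not come from the paper. The last helper stacks consecutive flow fields into a (time, height, width, 2) array, which is one plausible input layout for a 3D-CNN.

```python
import numpy as np


def local_mean(a):
    """Average of the four direct neighbours, with edge replication."""
    p = np.pad(a, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])


def horn_schunck(frame1, frame2, alpha=0.1, n_iter=200):
    """Estimate dense flow (u, v) between two grayscale frames.

    Illustrative Horn-Schunck iteration; not the authors' implementation.
    u is the horizontal velocity, v the vertical velocity, in pixels/frame.
    """
    f1 = frame1.astype(np.float64)
    f2 = frame2.astype(np.float64)

    # Spatial derivatives averaged over both frames, temporal difference.
    ix = 0.5 * (np.gradient(f1, axis=1) + np.gradient(f2, axis=1))
    iy = 0.5 * (np.gradient(f1, axis=0) + np.gradient(f2, axis=0))
    it = f2 - f1

    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    den = alpha ** 2 + ix ** 2 + iy ** 2
    for _ in range(n_iter):
        u_avg = local_mean(u)
        v_avg = local_mean(v)
        # Residual of the brightness-constancy constraint ix*u + iy*v + it = 0.
        resid = ix * u_avg + iy * v_avg + it
        u = u_avg - ix * resid / den
        v = v_avg - iy * resid / den
    return u, v


def flow_volume(frames, alpha=0.1, n_iter=200):
    """Stack flow fields of consecutive frames into shape (T-1, H, W, 2)."""
    flows = [
        np.stack(horn_schunck(a, b, alpha, n_iter), axis=-1)
        for a, b in zip(frames[:-1], frames[1:])
    ]
    return np.stack(flows, axis=0)
```

For a smooth blob that moves one pixel to the right between two frames, the estimated horizontal component `u` should be predominantly positive and the vertical component `v` close to zero on average. In practice a library routine would likely be preferred over this hand-written loop.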