CHAM: ACTION RECOGNITION USING CONVOLUTIONAL HIERARCHICAL ATTENTION MODEL



Yan, S, Smith, JS ORCID: 0000-0002-0212-2365, Lu, W and Zhang, B
(2018) CHAM: ACTION RECOGNITION USING CONVOLUTIONAL HIERARCHICAL ATTENTION MODEL. In: IEEE International Conference on Image Processing - ICIP 2017, 2017-9-17 - 2017-9-20, Beijing, China.

main.pdf - Author Accepted Manuscript

Abstract

The soft attention mechanism, originally proposed for language processing, has recently been applied to computer vision tasks such as image captioning. This paper presents improvements to the soft attention model by combining a convolutional Long Short-Term Memory (LSTM) with a hierarchical system architecture to recognize action categories in videos. We call this model the Convolutional Hierarchical Attention Model (CHAM). The model applies a convolutional operation inside the LSTM cell and an attention map generation process to recognize actions. The hierarchical architecture allows the model to reason explicitly over action categories at multiple granularities. The proposed architecture achieved improved results on three publicly available datasets: the UCF Sports dataset, the Olympic Sports dataset and the HMDB51 dataset.

Item Type: Conference or Workshop Item (Unspecified)
Uncontrolled Keywords: Action recognition, Soft attention, Convolutional LSTM, CNN, Hierarchical Architecture
Depositing User: Symplectic Admin
Date Deposited: 22 May 2017 10:03
Last Modified: 19 Jan 2023 07:04
DOI: 10.1109/icip.2017.8297025
URI: https://livrepository.liverpool.ac.uk/id/eprint/3007574