Proposing Multimodal Integration Model Using LSTM and Autoencoder

Wataru Noguchi; Hiroyuki Iizuka; Masahito Yamamoto

doi:10.4108/eai.3-12-2015.2262505

Authors

Wataru Noguchi Hokkaido University
Hiroyuki Iizuka Hokkaido University
Masahito Yamamoto Hokkaido University

DOI:

https://doi.org/10.4108/eai.3-12-2015.2262505

Keywords:

multimodal integration, deep learning, autoencoder, long short term memory

Abstract

We propose a neural network architecture that learns and integrates sequential multimodal information using Long Short-Term Memory (LSTM). The model consists of encoder and decoder LSTMs and a multimodal autoencoder. To integrate sequential multimodal information, the encoder LSTM first encodes the sequential input of each modality into a fixed-range feature vector. The multimodal autoencoder then integrates the feature vectors from all modalities and generates a fused feature vector that contains the sequential multimodal information in a mixed form. Within the multimodal autoencoder, the original feature vector of each modality is regenerated from the fused feature vector. The decoder LSTM then decodes the sequential inputs from the regenerated feature vectors. The model is trained on visual and motion sequences of humans and is tested on recall tasks. The experimental results show that the model can learn and remember the sequential multimodal inputs, and that integrating multimodal information reduces the ambiguity generated at the learning stage of the LSTMs. The model can also recall the visual sequences from the motion sequences alone, and vice versa.
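The data flow described in the abstract can be illustrated with a minimal, forward-pass-only sketch. It is not the authors' implementation: the weights are random and untrained, all names (`LSTMCell`, `encode`, `fuse`, `unfuse`, `decode`) and layer sizes are hypothetical, and feeding zeros for the absent modality during cross-modal recall is an assumption, since the abstract does not say how a missing modality is handled.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

class LSTMCell:
    """One LSTM cell with untrained random weights (forward pass only)."""
    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate: input, forget, output, candidate.
        self.w = {g: rand_matrix(n_hidden, n_in + n_hidden, rng) for g in "ifog"}

    def step(self, x, h, c):
        z = x + h  # concatenate current input and previous hidden state
        i = [sigmoid(a) for a in matvec(self.w["i"], z)]
        f = [sigmoid(a) for a in matvec(self.w["f"], z)]
        o = [sigmoid(a) for a in matvec(self.w["o"], z)]
        g = [math.tanh(a) for a in matvec(self.w["g"], z)]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h, c

def encode(cell, seq):
    """Encoder LSTM: the final hidden state serves as the feature vector."""
    h, c = [0.0] * cell.n_hidden, [0.0] * cell.n_hidden
    for x in seq:
        h, c = cell.step(x, h, c)
    return h

def fuse(w_enc, feats):
    """Autoencoder bottleneck: concatenate modality features, then compress."""
    joint = [v for feat in feats for v in feat]
    return [math.tanh(a) for a in matvec(w_enc, joint)]

def unfuse(w_dec, fused, sizes):
    """Regenerate one feature vector per modality from the fused vector."""
    joint = [math.tanh(a) for a in matvec(w_dec, fused)]
    out, start = [], 0
    for n in sizes:
        out.append(joint[start:start + n])
        start += n
    return out

def decode(cell, w_out, feature, steps):
    """Decoder LSTM: start from the feature vector, emit one frame per step."""
    h, c = list(feature), [0.0] * cell.n_hidden
    frame, frames = [0.0] * len(w_out), []
    for _ in range(steps):
        h, c = cell.step(frame, h, c)
        frame = matvec(w_out, h)
        frames.append(frame)
    return frames

# Hypothetical sizes: frame widths, feature width, fused width, sequence length.
VIS, MOT, FEAT, FUSED, T = 6, 4, 5, 3, 7
rng = random.Random(0)
enc_v, enc_m = LSTMCell(VIS, FEAT, rng), LSTMCell(MOT, FEAT, rng)
dec_v, dec_m = LSTMCell(VIS, FEAT, rng), LSTMCell(MOT, FEAT, rng)
out_v, out_m = rand_matrix(VIS, FEAT, rng), rand_matrix(MOT, FEAT, rng)
w_enc, w_dec = rand_matrix(FUSED, 2 * FEAT, rng), rand_matrix(2 * FEAT, FUSED, rng)

# Placeholder random sequences standing in for visual and motion data.
visual = [[rng.random() for _ in range(VIS)] for _ in range(T)]
motion = [[rng.random() for _ in range(MOT)] for _ in range(T)]

# Full pipeline: encode each modality, fuse, regenerate, decode.
feat_v, feat_m = encode(enc_v, visual), encode(enc_m, motion)
fused = fuse(w_enc, [feat_v, feat_m])
rec_v, rec_m = unfuse(w_dec, fused, [FEAT, FEAT])
visual_out = decode(dec_v, out_v, rec_v, T)
motion_out = decode(dec_m, out_m, rec_m, T)

# Cross-modal recall (assumption: the absent visual features are zeroed).
fused_m = fuse(w_enc, [[0.0] * FEAT, feat_m])
recalled_v = decode(dec_v, out_v, unfuse(w_dec, fused_m, [FEAT, FEAT])[0], T)
```

Because each encoder output is a gated `tanh` value, every component of the feature vector lies strictly between -1 and 1, which is one way to read "fixed-range". In a trained model the reconstructions would be compared with the original sequences; here only the shapes are meaningful.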

Published

24-05-2016

Issue

Vol. 3 No. 10 (2016): EAI Endorsed Transactions on Security and Safety

Section

Research article

License

This work is licensed under a Creative Commons Attribution 3.0 Unported License.

This is an open-access article distributed under the terms of the Creative Commons Attribution CC BY 4.0 license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.

How to Cite

Noguchi W, Iizuka H, Yamamoto M. Proposing Multimodal Integration Model Using LSTM and Autoencoder. EAI Endorsed Trans Sec Saf [Internet]. 2016 May 24 [cited 2026 Jul. 26];3(10):e1. Available from: https://publications.eai.eu/index.php/sesa/article/view/536
