Architectures of multiple-modality, cross-modality, and shared-modality learning.
| | Feature Learning | Supervised Training | Testing |
|---|---|---|---|
| Classic deep learning | Audio | Audio | Audio |
| | Video | Video | Video |
| Multimodal fusion | A V | A V | A V |
| Cross-modality learning | A V | Video | Video |
| | A V | Audio | Audio |
| Shared representation learning | A V | Audio | Video |
| | A V | Video | Audio |
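
As a rough illustration of how the rows differ, the table can be encoded as a small lookup of which modalities are available in each phase. The names `SETTINGS`, `train_test_mismatch`, and `uses_extra_modality_for_features` are hypothetical and do not come from the source. This is only a sketch for reading the table, not an implementation of any of the architectures.

```python
# Hypothetical encoding of the table: for each setting, the modalities used
# in (feature learning, supervised training, testing). "A V" is read here as
# audio and video together, which is an assumption about the table's notation.
AV = frozenset({"audio", "video"})
A = frozenset({"audio"})
V = frozenset({"video"})

SETTINGS = {
    "classic deep learning": [(A, A, A), (V, V, V)],
    "multimodal fusion": [(AV, AV, AV)],
    "cross-modality learning": [(AV, V, V), (AV, A, A)],
    "shared representation learning": [(AV, A, V), (AV, V, A)],
}


def train_test_mismatch(setting):
    """True if any variant is trained on a different modality set than it is tested on."""
    return any(train != test for _, train, test in SETTINGS[setting])


def uses_extra_modality_for_features(setting):
    """True if feature learning sees modalities that supervised training does not."""
    # For frozensets, `>` tests for a proper superset.
    return any(feat > train for feat, train, _ in SETTINGS[setting])
```

Read this way, cross-modality learning is the case where feature learning sees both modalities but training and testing use the same single one. Shared representation learning is the only setting where the training and testing modalities differ.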