Egocentric Early Action Prediction via Adversarial Knowledge Distillation

[Figure: task illustration]

Egocentric early action prediction aims to recognize an action from the first-person view after observing only a partial video segment, which is challenging because the partial video provides limited context. To tackle this problem, we propose a novel multi-modal adversarial knowledge distillation framework. Specifically, our approach involves a teacher network, which learns an enhanced representation of the partial video by also considering the future, unobserved video segment, and a student network, which mimics the teacher network to produce a powerful representation of the partial video and predicts the action label from that representation. To promote knowledge distillation from the teacher network to the student network, we integrate adversarial learning with latent and discriminative knowledge regularizations, which encourage the learned representations of the partial video to be more informative and discriminative for action prediction. Finally, we devise a multi-modal fusion module to predict the action label comprehensively. Extensive experiments on two public egocentric datasets validate the superiority of our method over state-of-the-art methods.
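The abstract does not say how the multi-modal fusion module combines the modalities. As a rough illustration only, the sketch below assumes a simple weighted late fusion of per-modality class probabilities. The function names (`late_fusion`, `predict`), the modality keys, and the weighting scheme are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(logits_per_modality, weights=None):
    """Weighted average of per-modality softmax probabilities.

    ASSUMPTION: weighted late fusion is an illustrative stand-in; the page
    does not specify how the actual fusion module works.
    """
    names = list(logits_per_modality)
    if weights is None:
        # Default to equal weights that sum to 1.
        weights = {name: 1.0 / len(names) for name in names}
    num_classes = len(logits_per_modality[names[0]])
    fused = [0.0] * num_classes
    for name in names:
        probs = softmax(logits_per_modality[name])
        for c in range(num_classes):
            fused[c] += weights[name] * probs[c]
    return fused


def predict(logits_per_modality, weights=None):
    """Return the index of the class with the highest fused probability."""
    fused = late_fusion(logits_per_modality, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical class scores from two student sub-networks.
scores = {"visual": [2.0, 0.5, 0.1], "audio": [0.2, 0.3, 3.0]}
fused_probs = late_fusion(scores)
```

With weights that sum to 1, the fused scores remain a probability distribution, so they can be compared directly across classes.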

Framework

[Figure: framework diagram]

Illustration of the proposed egocentric early action prediction scheme with Adversarial Knowledge Distillation (ADK). For simplicity, we show only two modalities, visual content and audio signals, which correspond to two teacher sub-networks and two student sub-networks. The important knowledge from the teacher network is distilled to the student network with an adversarial learning strategy, in which latent knowledge regularization (LKR) and discriminative knowledge regularization (DKR) are incorporated so that the representation learned by the student network is more informative and discriminative for egocentric early action prediction.
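The page does not give the loss formulas, so the following is only a minimal sketch of how a student objective for one modality might combine the pieces named in the caption. It assumes LKR is a mean squared error between student and teacher features, DKR is a KL divergence between temperature-softened class distributions, and the adversarial term is the standard GAN generator loss on a discriminator's output for the student feature. All function names, the temperature, and the `lambda_*` weights are hypothetical.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def latent_regularization(student_feat, teacher_feat):
    """ASSUMED form of LKR: mean squared error between the two features."""
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / len(student_feat)


def discriminative_regularization(student_logits, teacher_logits, temperature=2.0):
    """ASSUMED form of DKR: KL(teacher || student) over softened class distributions."""
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def adversarial_student_loss(d_on_student, eps=1e-12):
    """Student is rewarded when the discriminator scores its feature as 'teacher'."""
    return -math.log(d_on_student + eps)


def discriminator_loss(d_on_teacher, d_on_student, eps=1e-12):
    """Discriminator separates teacher features (target 1) from student features (target 0)."""
    return -math.log(d_on_teacher + eps) - math.log(1.0 - d_on_student + eps)


def cross_entropy(logits, label):
    """Standard classification loss on the student's action prediction."""
    return -log_softmax(logits)[label]


def student_objective(student_feat, teacher_feat, student_logits, teacher_logits,
                      label, d_on_student,
                      lambda_adv=1.0, lambda_lkr=1.0, lambda_dkr=1.0):
    """Weighted sum of the classification, adversarial, LKR, and DKR terms (hypothetical weights)."""
    return (cross_entropy(student_logits, label)
            + lambda_adv * adversarial_student_loss(d_on_student)
            + lambda_lkr * latent_regularization(student_feat, teacher_feat)
            + lambda_dkr * discriminative_regularization(student_logits, teacher_logits))
```

In a multi-modal setting, one such objective would be computed per teacher-student pair (for example, one for the visual sub-networks and one for the audio sub-networks), with the discriminator trained in alternation using `discriminator_loss`.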

Copyright (C) 2022 Shandong University

This program is licensed under the GNU General Public License 3.0 (https://www.gnu.org/licenses/gpl-3.0.html). Any derivative work obtained under this license must be licensed under the GNU General Public License as published by the Free Software Foundation, either Version 3 of the License, or (at your option) any later version, if this derivative work is distributed to a third party.

The copyright for the program is owned by Shandong University. For commercial projects that require the ability to distribute the code of this program as part of a program that cannot be distributed under the GNU General Public License, please contact zhengnagrape@gmail.com to purchase a commercial license.
