Hierarchical Video Prediction using Relational Layouts for Human-object Interactions

More results (HORN)

Output of the object RNN compared with ground-truth sequences for two tasks: cooking with bowls (1, 3, 5) and pouring and drinking (2, 4).

Example visualizations of the task "sawing" on the Bimanual dataset.

Example visualizations of various tasks on the UMD-HOI dataset.

Failure Case (HORN)

Citation

@InProceedings{Bodla_2021_CVPR,
    author    = {Bodla, Navaneeth and Shrivastava, Gaurav and Chellappa, Rama and Shrivastava, Abhinav},
    title     = {Hierarchical Video Prediction Using Relational Layouts for Human-Object Interactions},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {12146-12155}
}

Acknowledgements

This work was supported by the DARPA SAIL-ON program via ARO contract no. W911NF2020009 and IARPA via contract no. D17PC00345.