Capturing Hands in Action using Discriminative Salient Points and Physics Simulation

Dimitrios Tzionas        Luca Ballan        Abhilash Srikantha        Pablo Aponte        Marc Pollefeys        Juergen Gall



Capturing Hands in Action using Discriminative Salient Points and Physics Simulation


Abstract

Hand motion capture is a popular research field, recently gaining more attention due to the ubiquity of RGB-D sensors. However, even most recent approaches focus on the case of a single isolated hand. In this work, we focus on hands that interact with other hands or objects and present a framework that successfully captures motion in such interaction scenarios for both rigid and articulated objects. Our framework combines a generative model with discriminatively trained salient points to achieve a low tracking error and with collision detection and physics simulation to achieve physically plausible estimates even in case of occlusions and missing visual data. Since all components are unified in a single objective function which is almost everywhere differentiable, it can be optimized with standard optimization techniques. Our approach works for monocular RGB-D sequences as well as setups with multiple synchronized RGB cameras. For a qualitative and quantitative evaluation, we captured 29 sequences with a large variety of interactions and up to 150 degrees of freedom.

Publications

Tzionas, D., Ballan, L., Srikantha, A., Aponte, P., Pollefeys, M. and Gall, J.
Capturing Hands in Action using Discriminative Salient Points and Physics Simulation
[PDF] [arXiv] [Springer] [BibTex]

International Journal of Computer Vision (IJCV)
Special issue "Human Activity Understanding from 2D and 3D data" (link)
(Submitted on 17.10.14 / Accepted on 16.02.2016)

Tzionas, D., Srikantha, A., Aponte, P. and Gall, J.
Capturing Hand Motion with an RGB-D Sensor, Fusing a Generative Model with Salient Points
[PDF] [Web] [BibTex] [Sup1] [Sup2]

German Conference on Pattern Recognition (GCPR'14)

Ballan, L., Taneja, A., Gall, J., Van Gool, L. and Pollefeys, M.
Motion Capture of Hands in Action using Discriminative Salient Points [PDF] [Web] [BibTex] [Suppl.]
European Conference on Computer Vision (ECCV'12)

Videos









Datasets


Monocular RGB-D


(Update) Nov 2019: MANO fits on the data provided below and used in Hasson et al. ICCV'19
for sequences [01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 15, 16, 17, 18, 19, 20]:
MANO fits, Subject's personalized hand template and shape parameters.


Hand-Hand Interaction


Walking [All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Crossing [All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Crossing and Twisting [All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Tips Touching
[All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Dancing [All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Tips Blending
[All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Hugging [All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]

Grasping [All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Flying [All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Rock Gesture
[All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Bunny Gesture
[All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Bunny Gesture (*)
[All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Flying (*) [All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Rock Gesture (*)
         
[All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
   Camera Calibration
[Calibration]  
   Models (**)
[All]   [Hand_Right]   [Hand_Left]
Ground-Truth Joints
[Contained above in *All_Files* or *Files_noPCL*]
Sequences Info
[Dataset_ReadMe]   [Dataset_SequencesINFO]
   Fingertip Detector
[Sequences_and_Ground_Truth]   [Preview_Sequences_and_Ground_Truth]
   Supplementary Videos
[Results_Composition_onRGBD]   [Results_Composition_inPCL]  
[OURs_vs_FORTH]   [BenchmarkFORTH]   [CollisionDetection]  
[DetectionsAssignmentsExample]   [FingertipDetectorAnnotations]  
   Software to visualize results in 3D
[Files] or [Github]
   Software to visualize groundtruth
[Files] or [Github]

The material in this section originate from the GCPR'14 work of Tzionas et al.
Sequences marked with (*) are used just for comparison with the FORTH tracker.

Model-files marked with (**) do not contain sequence-specific files (.SKEL and .MOTION)


Hand-Object Interaction


                   
                   
                   
Moving a Ball with one hand [All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Moving a Ball with two hands [All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Bending a Pipe [All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Bending a Rope
[All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Moving a Ball with one hand and
occlusion of a manipulating finger
[All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Moving a Cube with one hand
[All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Moving a Cube with one hand and
occlusion of a manipulating finger
[All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
   Failure Case - Seq NOT in dataset:
some files below are common (dublicate links are inactive)
Moving a Ball with a hand platform
(without fingertip detector)
[All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
Moving a Ball with a hand platform
(with a fingertip detector)
[All_Files]   [Files_noPCL]   [Preview]   [Models/Motion]   [Videos_In/Results]
   Camera Calibration
[Calibration]  
   Models (**)
[All]   [Hand_Right]   [Hand_Left]   [Ball]   [Cube]   [Pipe]   [Rope]
Ground-Truth Joints
[Contained above in *All_Files* or *Files_noPCL*]
Sequences Info
[Dataset_ReadMe]   [Dataset_SequencesINFO]
   Fingertip Detector
(same as in previous section)

[Sequences_and_Ground_Truth]   [Preview_Sequences_and_Ground_Truth]
   Supplementary Videos
[ResultsOverview_AllSequences]   [HandObjectInteraction_NewIJCVseq]  
[CollisionDetection]   [DetectionsAssignmentsExample]  
[FingertipDetectorAnnotations]  
[FailueCase_DetectorOffOn]   [FailueCase_DetectorOffOn_Slow]  

   Software to visualize results in 3D
[Files] or [Github]
   Software to visualize groundtruth
[Files] or [Github]

The material in this section appear for the first time in this work.

Model-files marked with (**) do not contain sequence-specific files (.SKEL and .MOTION)


Multicamera RGB



Original datasets used in the paper (compressed using LJPG)

   Finger Tips Touching and Praying [01, 02, 03, 04, 05, 06, 07, 08]   [Calibration]   [Motion_Results]
   Fingers Crossing and Twisting [01, 02, 03, 04, 05, 06, 07, 08]   [Calibration]   [Motion_Results]
   Fingers Folding [01, 02, 03, 04, 05, 06, 07, 08]   [Calibration]   [Motion_Results]
   Fingers Walking [01, 02, 03, 04, 05, 06, 07, 08]   [Calibration]   [Motion_Results]
   Holding and Passing a Ball [01, 02, 03, 04, 05, 06, 07, 08]   [Calibration]   [Motion_Results]
   Taking off a Ring [01, 02, 03, 04, 05, 06, 07, 08]   [Calibration]   [Motion_Results]
   Paper Folding [01, 02, 03, 04, 05, 06, 07, 08]   [Calibration]   [Motion_Results]
   Rope Folding [01, 02, 03, 04, 05, 06, 07, 08]   [Calibration]   [Motion_Results]
   Models
[Hands]   [Ball]   [Ring]   [Paper]   [Rope]   [Hand 3DMax]
   File format
[FileFormat.txt]   [c++ bone struct code]
   Ground-Truth Joints
[Holding and Passing a Ball]                                                     
   Suppl. Videos - Results
[All_sequences_excluding_Paper_&_Rope]   [Paper_&_Rope]
[HandObjectInteraction_NewIJCVseq]  
   Software to visualize results in 3D
[Files]
   3DS Max Exporter
[Files]

The material in this section originate from the ECCV'12 work of Ballan et al.
The Rope and Paper sequences though are first presented in this work.



Related Projects

In chronological order:

3D Object Reconstruction from Hand-Object Interactions, ICCV 2015 [Web]
Capturing Hand Motion with an RGB-D Sensor, Fusing a Generative Model with Salient Points, GCPR 2014 [Web]
A Comparison of Directional Distances for Hand Pose Estimation, GCPR 2013 [Web]
Motion Capture of Hands in Action using Discriminative Salient Points, ECCV 2012 [Web]
Marker-less Motion Capture of Skinned Models in a Four Camera Set-up using Optical Flow and Silhouettes, 3DPVT 2008 [Web]


Citation

If you find the material in this website useful for your academic work, please cite this work.


Contact

If you have questions concerning the data, please contact:


Dimitrios Tzionas
for monocular RGB-D data/experiments

Luca Ballan
for multicamera RGB data/experiments


If you have general questions/comments concerning the paper or the website, please contact Dimitrios Tzionas



Capturing Hands in Action using Discriminative Salient Points and Physics Simulation