EventHands: Real-Time Neural 3D Hand Pose Estimation from an Event Stream
3D hand pose estimation from monocular videos is a long-standing and challenging problem, which is now seeing a strong upturn. In this work, we address it for the first time using a single event camera, i.e., an asynchronous vision sensor reacting to brightness changes. Our EventHands approach has characteristics previously not demonstrated with a single RGB or depth camera, such as high temporal resolution at low data throughputs and real-time performance at 1000 Hz. Because the data modality of event cameras differs from that of classical cameras, existing methods cannot be directly applied to, or re-trained for, event streams. We thus design a new neural approach that accepts a new event stream representation suitable for learning, is trained on newly generated synthetic event streams, and can generalise to real data. Experiments show that EventHands outperforms recent monocular methods using a colour (or depth) camera in terms of accuracy and its ability to capture hand motions of unprecedented speed. Our method, the event stream simulator and the dataset are publicly available.
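To make the idea of an event stream representation suitable for learning more concrete, the sketch below accumulates a short window of events into a two-channel image, one channel per polarity, in which each pixel stores the normalised timestamp of its most recent event. This is an illustration only and is not claimed to be the representation used by EventHands: the function name `events_to_frame`, the `(t, x, y, polarity)` tuple layout and the 1 ms window length are all assumptions made for this example.

```python
import numpy as np

def events_to_frame(events, height, width, t_start, window):
    """Accumulate events from [t_start, t_start + window) into a 2-channel image.

    `events` is an iterable of (t, x, y, polarity) tuples, assumed to be
    sorted by time (hypothetical layout, chosen for this sketch).
    Channel 0 holds negative-polarity events, channel 1 positive ones.
    Each pixel keeps the normalised timestamp of its latest event.
    """
    frame = np.zeros((2, height, width), dtype=np.float32)
    for t, x, y, p in events:
        if t_start <= t < t_start + window:
            channel = 1 if p > 0 else 0
            # Later events overwrite earlier ones at the same pixel.
            frame[channel, y, x] = (t - t_start) / window
    return frame

# Four toy events; the last one falls outside the 1 ms window.
events = [
    (0.0000, 1, 2, +1),
    (0.0002, 0, 0, -1),
    (0.0005, 1, 2, +1),
    (0.0020, 3, 3, +1),
]
frame = events_to_frame(events, height=4, width=4, t_start=0.0, window=0.001)
```

A fixed-size image of this kind can be fed to a standard convolutional network, which is one common way to bridge the gap between asynchronous events and frame-based learning pipelines.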
@inproceedings{rudnev2021eventhands,
  title={EventHands: Real-Time Neural 3D Hand Pose Estimation from an Event Stream},
  author={Viktor Rudnev and Vladislav Golyanik and Jiayi Wang and Hans-Peter Seidel and Franziska Mueller and Mohamed Elgharib and Christian Theobalt},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2021}
}
This work was funded by the ERC Consolidator Grant 4DRepLy (770784). We thank Jalees Nehvi and Navami Kairanda for help with comparisons.
For questions or clarifications, please get in touch with:
Viktor Rudnev
Vladislav Golyanik
Mohamed Elgharib