2D Human Pose Estimators: Strengths and Caveats

Publication date

DOI

Document Type

Master Thesis

Collections

License

CC-BY-NC-ND

Abstract

We provide deeper insight into the strengths and caveats of state-of-the-art models for 2D multi-person pose estimation on a public benchmark dataset. We do so by adding labels to the validation set of the COCO Keypoint Detection Task 2017. The added labels correspond to known challenges in this field: occlusion (subdivided into occlusion by self, by another person, and by the environment), truncation by the image border, image resolution, and wrong annotations. The new annotations are publicly available at https://github.com/AgntBrwr/2d-human-pose-estimators-strenghts-and-caveats, so that other researchers can gain better insight into how their models perform on the validation set. We also analyze the performance of several state-of-the-art models on the new annotations. Each of the newly added labels substantially influenced model performance, and the models tended to perform differently when faced with the specific challenges. Furthermore, we use the wrong annotations to derive a pattern for filtering wrongly annotated data from the train set. The state-of-the-art models are trained on both the filtered train set and the original train set to investigate the impact of wrongly labelled instances on performance in this field. Filtering did not increase performance for any of the models, but more research is required to gain deeper insight into this topic.
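
As a rough illustration of the per-challenge analysis described above, the following minimal sketch averages a per-instance score separately for each challenge label. The record layout (a `score` field and a `challenges` list), the label strings, and the function name are all hypothetical and chosen only for this example; the released annotation files may be structured differently, and the thesis may aggregate results in another way.

```python
from collections import defaultdict


def mean_score_per_challenge(instances):
    """Average a per-instance score separately for each challenge label.

    `instances` is an iterable of dicts with a float "score" (for example a
    keypoint similarity value) and a "challenges" list of label strings.
    Both field names are assumptions made for this sketch.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for inst in instances:
        # Instances without any challenge label are collected under "none".
        labels = inst["challenges"] or ["none"]
        for label in labels:
            totals[label] += inst["score"]
            counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Toy records, invented purely for illustration (not results from the thesis).
toy_instances = [
    {"score": 0.9, "challenges": []},
    {"score": 0.5, "challenges": ["truncation"]},
    {"score": 0.3, "challenges": ["truncation", "occlusion_self"]},
    {"score": 0.7, "challenges": ["occlusion_self"]},
]
per_challenge = mean_score_per_challenge(toy_instances)
```

An instance that carries several labels contributes to each of its groups, so the groups overlap and their counts do not sum to the number of instances.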

Keywords

human keypoint estimation; human keypoint detection; human pose estimation; occlusion; truncation; image resolution; wrong annotations

Citation