REFORM: Recognizing F-formations for Social Robots

REFORM (REcognize FFORmations with Machine learning) explores how robots might detect the spatial configurations of conversational groups, known as F-formations. It is a data-driven method for reasoning about the positions and orientations of individuals within a social group. REFORM is first trained on a small, labeled dataset of annotated frames, each of which contains the positions and orientations of the people in that frame. After training, REFORM can detect F-formations of any size (i.e., any number of people) within a single frame by considering the positions and orientations of the individuals in it. REFORM produces all possible F-formations in real time and can iteratively identify F-formations across any number of frames.
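To make the input concrete, here is a minimal sketch of how a single frame of positions and orientations might be represented, along with simple pairwise geometric features. This is an illustration only, not REFORM's implementation: the `Person` class, the units (metres, radians), the angle convention, and the choice of features are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Person:
    """One tracked individual in a frame (hypothetical layout)."""
    pid: str      # hypothetical identifier
    x: float      # position, assumed to be in metres
    y: float
    theta: float  # body orientation in radians, 0 = +x axis (assumed)


def wrap_angle(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def pairwise_features(a, b):
    """Return (distance, a_offset, b_offset) for a pair of people.

    Each offset is the absolute angle a person would have to turn
    to face the other directly; 0 means already facing them.
    """
    dx, dy = b.x - a.x, b.y - a.y
    distance = math.hypot(dx, dy)
    bearing_ab = math.atan2(dy, dx)    # direction from a to b
    bearing_ba = math.atan2(-dy, -dx)  # direction from b to a
    a_offset = abs(wrap_angle(bearing_ab - a.theta))
    b_offset = abs(wrap_angle(bearing_ba - b.theta))
    return distance, a_offset, b_offset


def frame_features(frame):
    """Compute features for every unordered pair of people in a frame."""
    return {
        (a.pid, b.pid): pairwise_features(a, b)
        for a, b in combinations(frame, 2)
    }


# Example frame: A and B face each other, C stands to the side.
frame = [
    Person("A", 0.0, 0.0, 0.0),
    Person("B", 1.0, 0.0, math.pi),
    Person("C", 0.0, 1.0, 0.0),
]
features = frame_features(frame)
```

Features of this kind (how close two people are and whether they face each other) are one plausible input to a learned classifier that decides which individuals belong to the same group. Because they are computed per pair, the same code applies to frames with any number of people.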

One challenge of using a data-driven approach is that the trained model may correspond too closely to the limited set of data points on which it was trained (i.e., “overfitting the data”) and therefore perform well on the training set but poorly on new datasets. To evaluate REFORM's performance, we collected a new dataset that we term the Babble dataset. The Babble dataset consists of a 35-minute recording of conversational interactions between 7 individuals, with head and body positions and orientations precisely recorded by a motion-tracking system and F-formations labeled by two annotators (a total of 3481 frames). The dataset is publicly available on our git repository and can be used to train and evaluate F-formation detection algorithms. To our knowledge, Babble is the first dataset to provide absolute head and body orientation, a range of F-formations of different sizes, and labeled F-formations for all individuals in a frame.
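One way to check for overfitting is to score a detector's output against the annotated F-formations on the training frames and again on frames from an unseen dataset, then compare the two scores. The sketch below assumes each frame's groups are given as collections of person identifiers and uses an exact-match F1 score per frame. Both the data layout and the metric are assumptions chosen for illustration, not necessarily the evaluation protocol used for REFORM.

```python
def group_f1(predicted, annotated):
    """Exact-match F1 between predicted and annotated groups in one frame.

    A predicted group counts as correct only if its member set is
    identical to an annotated group (an assumed, strict criterion).
    """
    pred = {frozenset(group) for group in predicted}
    gold = {frozenset(group) for group in annotated}
    if not pred and not gold:
        return 1.0  # nothing to find and nothing predicted
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_f1(predicted_frames, annotated_frames):
    """Average the per-frame F1 over a sequence of frames."""
    scores = [
        group_f1(pred, gold)
        for pred, gold in zip(predicted_frames, annotated_frames)
    ]
    return sum(scores) / len(scores)


# Hypothetical single frame: one of two predicted groups is correct.
predicted = [{"A", "B"}, {"C"}]
annotated = [{"A", "B"}, {"C", "D"}]
score = group_f1(predicted, annotated)
```

A large gap between `mean_f1` on the training frames and on held-out frames would suggest that the model has fit the training data too closely.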

Contributors:

Hooman Hedayati, Annika Muehlbradt, Daniel J. Szafir, and Sean Andrist

Publication:

IEEE/RSJ'20: IROS

arXiv.org
