FACET helps measure performance gaps for common computer-vision use cases and to answer questions such as:
The FACET dataset consists of 32k images from SA-1B, labeled for demographic attributes (e.g., perceived age group), additional attributes (e.g., hair type), and person-related classes (e.g., doctor). We encourage researchers to use FACET to benchmark fairness across vision and multimodal tasks.
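As a minimal sketch of the kind of gap measurement described above, the snippet below computes per-group recall for one person-related class and reports the largest disparity across a perceived demographic attribute. The record fields (`cls`, `pred`, `age_group`) are illustrative placeholders, not FACET's actual annotation schema.

```python
from collections import defaultdict

# Hypothetical per-person records: ground-truth class, model prediction,
# and a perceived demographic attribute (field names are illustrative).
records = [
    {"cls": "doctor", "pred": "doctor", "age_group": "young"},
    {"cls": "doctor", "pred": "nurse",  "age_group": "young"},
    {"cls": "doctor", "pred": "doctor", "age_group": "older"},
    {"cls": "doctor", "pred": "doctor", "age_group": "older"},
]

def recall_by_group(records, target_cls, attr):
    """Recall for target_cls, broken down by the given attribute."""
    hits, totals = defaultdict(int), defaultdict(int)
    for r in records:
        if r["cls"] != target_cls:
            continue
        group = r[attr]
        totals[group] += 1
        hits[group] += int(r["pred"] == target_cls)
    return {g: hits[g] / totals[g] for g in totals}

per_group = recall_by_group(records, "doctor", "age_group")
gap = max(per_group.values()) - min(per_group.values())
print(per_group, gap)
```

On the toy records above, recall is 0.5 for "young" and 1.0 for "older", giving a performance gap of 0.5 for the "doctor" class across perceived age groups.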