Masked Contexts is an exploration of the COCO dataset and a dialogue with the photographers whose images were scraped by the dataset’s authors without their knowledge or consent. Email correspondence is paired with the photographers’ original images, the segmented masks and captions those images were transformed into, and the photographers’ possible reappropriations of their photos, together presenting a multi-layered glimpse into the process of converting personal images into universal pieces of big data.
Originally published by Microsoft, the COCO (Common Objects in Context) dataset contains 330,000 photos scraped from Flickr, many of which are intimate family photos uploaded by amateur photographers for personal use. Image datasets like COCO are often used to train surveillance technologies, amongst other types of computer vision programs.
The dataset was reverse-engineered through custom scripts to recover additional metadata, including the Flickr username associated with each image. In 2019, I began contacting Flickr users to inform them that their photos were part of the dataset.
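A minimal sketch of how such a recovery step can work (an assumption about the approach, not a reproduction of the project's actual scripts): each entry in COCO's annotation JSON includes a "flickr_url" pointing at Flickr's static CDN, and the numeric Flickr photo ID is the first underscore-separated token of that URL's filename. With the photo ID, the Flickr API method flickr.photos.getInfo can resolve the owner's username.

```python
# Hypothetical illustration: extract Flickr photo IDs from COCO image records.
# The "images" structure below mimics the standard COCO annotation layout.

sample = {
    "images": [
        {"id": 42,
         "flickr_url": "http://farm9.staticflickr.com/8186/8119368305_4e622c8349_z.jpg"}
    ]
}

def flickr_photo_id(url):
    """Return the numeric Flickr photo ID embedded in a static-CDN URL."""
    filename = url.rsplit("/", 1)[-1]          # "8119368305_4e622c8349_z.jpg"
    return filename.split("_", 1)[0]           # "8119368305"

# Map COCO image IDs to Flickr photo IDs; a follow-up Flickr API call
# (not shown) would map each photo ID to a username.
ids = {img["id"]: flickr_photo_id(img["flickr_url"]) for img in sample["images"]}
print(ids)  # {42: '8119368305'}
```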
The conversations that emerged as a result are documented in this project and can be explored on the project website.
The COCO metadata provides annotations for each image in two forms: masks and captions. To the right is a screenshot of the COCO website displaying them. Both forms of annotation were produced by hired Amazon Mechanical Turk workers.
The masks outline particular object types found in each image, while the captions are meant to objectively describe each image. These captions are used for training computer vision programs so they can learn to analyze and describe visual inputs. There are five captions provided for each image.
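For readers curious about the underlying data, here is a small sketch of how the two annotation forms are stored (assuming the standard COCO release layout, where instances files hold segmentation masks and captions files hold the five captions per image; inline sample records stand in for the full downloads):

```python
# Simplified stand-ins for COCO's instances_*.json and captions_*.json files.
instances = {"annotations": [
    {"image_id": 42, "category_id": 1,
     # Polygon mask: a flat list of (x, y) vertex coordinates.
     "segmentation": [[239.97, 260.24, 222.04, 270.49, 199.84, 253.41]]}
]}
captions = {"annotations": [
    {"image_id": 42, "caption": "A person standing in a field."}
]}

# Collect all masks and captions attached to a given image.
masks_for_42 = [a["segmentation"] for a in instances["annotations"]
                if a["image_id"] == 42]
caps_for_42 = [a["caption"] for a in captions["annotations"]
               if a["image_id"] == 42]
```

In the real dataset, a single image ID typically links to many mask annotations (one per object instance) and exactly five caption annotations.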
Masked Contexts showcases these annotations juxtaposed with the image authors’ responses. The masks are additionally organized as a grid of meaningless visuals. When hovered over, their corresponding captions are read aloud.