Data Cards: Purposeful and Transparent Dataset Documentation for Responsible AI

Keywords
Artificial Intelligence, Dataset Documentation, Data Cards, Transparency
Full Study

Google Drive Link

Institute(s)
Google Research
Year
2022
Abstract

In this paper, we propose Data Cards for fostering transparent, purposeful and human-centered documentation of datasets within the practical contexts of industry and research. Data Cards are structured summaries of essential facts about various aspects of ML datasets needed by stakeholders across a dataset’s lifecycle for responsible AI development. These summaries provide explanations of processes and rationales that shape the data and consequently the models—such as upstream sources, data collection and annotation methods; training and evaluation methods, intended use; or decisions affecting model performance. We also present frameworks that ground Data Cards in real-world utility and human-centricity. Using two case studies, we report on desirable characteristics that support adoption across domains, organizational structures, and audience groups. Finally, we present lessons learned from deploying over 20 Data Cards.

Author(s)
MAHIMA PUSHKARNA, ANDREW ZALDIVAR, ODDUR KJARTANSSON
Tool

The Data Cards Playbook
