Preconference

People Are Messy and So Is Data: A Conversation on the Complexities of Representation in Data

Author
  • Megan N. O'Donnell (Iowa State University)

Abstract

Data is often considered and presented as absolute, as bare fact, but concepts and definitions change. Common US Census demographic categories, such as race, gender, and ethnicity, are a good example: they are used frequently without knowledge of their history, complexity, and limitations. Overreliance on these categories causes false positives, inaccurate correlations, and, when the data is used for decision making, harm to the communities the data appears to represent.

This presentation will use the history of the Census's race and ethnicity categories as an example of how intention shapes and limits data. Because the Census categories were created for policy use, not research, their overuse in research has led to erroneous findings. It also means that demographic data in tools such as generative AI should be approached with caution and skepticism. Concrete examples of research errors caused by misuse of Census categories will be shared, along with points for attendees to consider when designing demographic questions for their own research.

Published on
29 Feb 2024
License