Cleaning Humanities Data
This workshop will interrogate the implicit, and sometimes problematic, decisions that researchers and data managers make in the process of cleaning humanities data. We will consider the differences between “clean” and “raw” data, weighing the costs and benefits of different degrees of data processing and highlighting some reasons why researchers might choose to embrace the original “messiness” of their source materials. We will also learn about some of the legal and ethical considerations that prompt the removal or obfuscation of personal information from datasets, as well as practical steps researchers can take to manage collections of texts, records, images, and objects that are in various states of “cleanliness.”
This is the fourth workshop in the Humanities Data Workshop Series. All are welcome to register, whether or not they attended the prior sessions or plan to attend the rest of the series.