Digital scholarship and digital humanities projects often involve analyzing large datasets, whether text corpora or statistical data. These datasets usually need to be cleaned and organized before they can be analyzed effectively.
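To make "cleaned and organized" concrete, here is a minimal sketch of one possible cleaning pass over a small text corpus. The function names (`clean_text`, `clean_corpus`), the sample data, and the specific steps (Unicode normalization, whitespace collapsing, lowercasing, dropping empty and duplicate documents) are illustrative assumptions, not a prescribed workflow; a real project would choose its steps based on its sources and research questions.

```python
import re
import unicodedata


def clean_text(raw: str) -> str:
    """Normalize one document: Unicode form, whitespace, and case."""
    # Put characters into a single canonical Unicode form (NFC).
    text = unicodedata.normalize("NFC", raw)
    # Collapse runs of spaces, tabs, and newlines into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    # Lowercasing is an assumption here; it discards case information.
    return text.lower()


def clean_corpus(docs):
    """Clean each document, dropping empties and exact duplicates.

    First-seen order is preserved.
    """
    seen = set()
    cleaned = []
    for raw in docs:
        text = clean_text(raw)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


# Hypothetical sample corpus, for illustration only.
sample = ["  The  Quick\nBrown Fox ", "the quick brown fox", "", "A second   text."]
print(clean_corpus(sample))
```

Lowercasing and deduplication are lossy choices: they may be appropriate for word-frequency counts but not for work that depends on capitalization or on repeated passages, so steps like these are best treated as adjustable defaults.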