Variolite: Supporting Exploratory Programming by Data Scientists


論文アブストラクト:How do people ideate through code? Using semi-structured interviews and a survey, we studied data scientists who program, often with small scripts, to experiment with data. These studies show that data scientists frequently code new analysis ideas by building off of their code from a previous idea. They often rely on informal versioning interactions like copying code, keeping unused code, and commenting out code to repurpose older analysis code while attempting to keep those older analyses intact. Unlike conventional version control, these informal practices allow for fast versioning of any size code snippet, and quick comparisons by interchanging which versions are run. However, data scientists must maintain a strong mental map of their code in order to distinguish versions, leading to errors and confusion. We explore the needs for improving version control tools for exploratory tasks, and demonstrate a tool for lightweight local versioning, called Variolite, which programmers found usable and desirable in a preliminary usability study.