Abstract
The universe of data available to social science researchers is ever expanding, and the second decade of the 21st century is characterised by an explosion of new forms of information, in particular administrative data. The increased processing speed of computers and the expansion of affordable storage capacity present exciting opportunities for social science research. As a result, empirical studies in social science disciplines are likely to become increasingly computationally intensive. Because of these rapid changes in both the data and the computational landscape, we conjecture that social scientists need to re-think aspects of the research process.
This presentation draws on a collaboration within the Administrative Data Research Centre - Scotland, which brings together social scientists and computer scientists with expertise in e-research and data science to develop a framework and tools that improve workflows in the analysis of administrative social science data. The focus of the presentation is ‘literate computing’, which involves weaving research narratives directly into live computation, interleaving text and documentation with research code and results to construct complete and transparent workflows, with the goal of communicating social science results.
The presentation will concentrate on four inter-related aspects of the workflow: accuracy, programming efficiency, transparency and reproducibility. We will demonstrate how Jupyter notebooks can assist in currently underexplored areas such as research code sharing, producing rich visual outputs, markdown and documentation, version control, portability, undertaking language-agnostic data analysis, and leveraging ‘big data’ tools for administrative social science data analysis.
Original language | English |
---|---|
Publication status | Accepted/In press - 2017 |
Event | The UK Administrative Data Research Network Annual Research Conference 2017 - Edinburgh, United Kingdom Duration: 1 Jun 2017 → 2 Jun 2017 http://www.adrn2017.net/ |
Conference
Conference | The UK Administrative Data Research Network Annual Research Conference 2017 |
---|---|
Abbreviated title | ADRN 2017 |
Country/Territory | United Kingdom |
City | Edinburgh |
Period | 1/06/17 → 2/06/17 |
Internet address | http://www.adrn2017.net/ |