Administrative Social Science Data: The challenge of reproducible research

Christopher Playford, Vernon Gayle, Roxanne Connelly, Alasdair J G Gray

Research output: Contribution to journal > Article

Abstract

Powerful new social science data resources are emerging. One particularly important source is administrative data, which were originally collected for organisational purposes but often contain information that is suitable for social science research. In this paper we outline the concept of reproducible research in relation to micro-level administrative social science data. Our central claim is that a planned and organised workflow is essential for high-quality research using micro-level administrative social science data.

We argue that it is essential for researchers to share research code, because code sharing enables the elements of reproducible research. First, it enables results to be duplicated and therefore allows the accuracy and validity of analyses to be evaluated. Second, it facilitates further tests of the robustness of the original piece of research. Drawing on insights from computer science and other disciplines that have been engaged in e-Research, we discuss and advocate the use of Git repositories to provide a useable and effective solution to research code sharing and rendering social science research using micro-level administrative data reproducible.
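
As an illustration of what it can mean for results to be "duplicated", the following is a minimal Python sketch and not the authors' method. It assumes a hypothetical `analyse` function and synthetic stand-in records. It computes a digest of the analysis output so that a rerun of shared code can be compared against the original result.

```python
import hashlib
import json


def result_fingerprint(results: dict) -> str:
    """Return a SHA-256 digest of a dictionary of numeric results.

    Keys are sorted and values rounded to 6 decimal places so that the
    digest does not depend on key order or on tiny floating-point noise.
    """
    canonical = json.dumps(
        {key: round(value, 6) for key, value in results.items()},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def analyse(records: list) -> dict:
    """Hypothetical analysis step: count and mean of a numeric field."""
    values = [record["score"] for record in records]
    return {"n": len(values), "mean_score": sum(values) / len(values)}


# Synthetic stand-in records (assumption): not real administrative data.
records = [{"score": 52.0}, {"score": 47.5}, {"score": 61.0}]

# "Original" run and an independent rerun on the same data in another order.
original = result_fingerprint(analyse(records))
rerun = result_fingerprint(analyse(list(reversed(records))))

print("results duplicated:", original == rerun)
```

One possible use is to commit the digest alongside the analysis code in a Git repository, so that anyone rerunning the shared code on the same data can check whether they obtain the same result.
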
Original language: English
Journal: Big Data & Society
Volume: 3
Issue number: 2
DOIs: 10.1177/2053951716684143
Publication status: Published - 1 Dec 2016

Fingerprint

social science
micro level
workflow
computer science
resources

Keywords

  • Big Data
  • Administrative Data
  • Reproducibility
  • Replication
  • Workflow
  • Git

Cite this

Playford, Christopher; Gayle, Vernon; Connelly, Roxanne; Gray, Alasdair J G. / Administrative Social Science Data: The challenge of reproducible research. In: Big Data & Society. 2016; Vol. 3, No. 2.
@article{b376476f560f452ab4384a5dc51d6122,
title = "Administrative Social Science Data: The challenge of reproducible research",
abstract = "Powerful new social science data resources are emerging. One particularly important source is administrative data, which were originally collected for organisational purposes but often contain information that is suitable for social science research. In this paper we outline the concept of reproducible research in relation to micro-level administrative social science data. Our central claim is that a planned and organised workflow is essential for high-quality research using micro-level administrative social science data. We argue that it is essential for researchers to share research code, because code sharing enables the elements of reproducible research. First, it enables results to be duplicated and therefore allows the accuracy and validity of analyses to be evaluated. Second, it facilitates further tests of the robustness of the original piece of research. Drawing on insights from computer science and other disciplines that have been engaged in e-Research, we discuss and advocate the use of Git repositories to provide a useable and effective solution to research code sharing and rendering social science research using micro-level administrative data reproducible.",
keywords = "Big Data, Administrative Data, Reproducibility, Replication, Workflow, Git",
author = "Christopher Playford and Vernon Gayle and Roxanne Connelly and Gray, {Alasdair J G}",
year = "2016",
month = "12",
day = "1",
doi = "10.1177/2053951716684143",
language = "English",
volume = "3",
journal = "Big Data \& Society",
issn = "2053-9517",
number = "2",

}

Administrative Social Science Data: The challenge of reproducible research. / Playford, Christopher; Gayle, Vernon; Connelly, Roxanne; Gray, Alasdair J G.

In: Big Data & Society, Vol. 3, No. 2, 01.12.2016.

TY - JOUR

T1 - Administrative Social Science Data: The challenge of reproducible research

AU - Playford, Christopher

AU - Gayle, Vernon

AU - Connelly, Roxanne

AU - Gray, Alasdair J G

PY - 2016/12/1

Y1 - 2016/12/1

N2 - Powerful new social science data resources are emerging. One particularly important source is administrative data, which were originally collected for organisational purposes but often contain information that is suitable for social science research. In this paper we outline the concept of reproducible research in relation to micro-level administrative social science data. Our central claim is that a planned and organised workflow is essential for high-quality research using micro-level administrative social science data. We argue that it is essential for researchers to share research code, because code sharing enables the elements of reproducible research. First, it enables results to be duplicated and therefore allows the accuracy and validity of analyses to be evaluated. Second, it facilitates further tests of the robustness of the original piece of research. Drawing on insights from computer science and other disciplines that have been engaged in e-Research, we discuss and advocate the use of Git repositories to provide a useable and effective solution to research code sharing and rendering social science research using micro-level administrative data reproducible.

AB - Powerful new social science data resources are emerging. One particularly important source is administrative data, which were originally collected for organisational purposes but often contain information that is suitable for social science research. In this paper we outline the concept of reproducible research in relation to micro-level administrative social science data. Our central claim is that a planned and organised workflow is essential for high-quality research using micro-level administrative social science data. We argue that it is essential for researchers to share research code, because code sharing enables the elements of reproducible research. First, it enables results to be duplicated and therefore allows the accuracy and validity of analyses to be evaluated. Second, it facilitates further tests of the robustness of the original piece of research. Drawing on insights from computer science and other disciplines that have been engaged in e-Research, we discuss and advocate the use of Git repositories to provide a useable and effective solution to research code sharing and rendering social science research using micro-level administrative data reproducible.

KW - Big Data

KW - Administrative Data

KW - Reproducibility

KW - Replication

KW - Workflow

KW - Git

U2 - 10.1177/2053951716684143

DO - 10.1177/2053951716684143

M3 - Article

VL - 3

JO - Big Data & Society

JF - Big Data & Society

SN - 2053-9517

IS - 2

ER -