Abstract
In recent years there has been a reproducibility crisis in science. Computational notebooks, such as Jupyter, have been touted as one solution to this problem. However, when executing analyses over live SPARQL endpoints, an analysis may return different answers depending on when the notebook is executed. In this paper, we identify some of the issues discovered in trying to develop a reproducible analysis over a collection of biomedical data sources and suggest some best practices to overcome these issues.
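One of the practices the abstract alludes to is archiving a snapshot of a live query's results together with the time at which it was run, so the notebook's outputs can later be compared against a fresh execution. The sketch below is illustrative only and is not taken from the paper: `run_with_provenance` and `fake_fetch` are hypothetical names, and a real `fetch` callable would POST the query to a live SPARQL endpoint (e.g. via SPARQLWrapper or `urllib`).

```python
import datetime
import json

def run_with_provenance(query, fetch):
    """Execute a SPARQL query via the supplied fetch callable and attach
    provenance metadata (query text and UTC timestamp), so the result
    snapshot can be archived alongside the notebook for later comparison."""
    executed_at = datetime.datetime.now(datetime.timezone.utc).isoformat()
    results = fetch(query)
    return {
        "query": query,
        "executed_at": executed_at,
        "results": results,
    }

# Usage with a stubbed fetch standing in for a live endpoint call:
def fake_fetch(query):
    # A real implementation would send `query` to the endpoint and
    # parse the SPARQL JSON results; here we return a fixed binding.
    return [{"s": "http://example.org/aspirin"}]

snapshot = run_with_provenance(
    "SELECT ?s WHERE { ?s ?p ?o } LIMIT 10", fake_fetch
)
# Serialise the snapshot so it can be committed next to the notebook.
archived = json.dumps(snapshot)
```

Storing such snapshots makes the divergence between runs visible: re-running the notebook later produces a new snapshot that can be diffed against the archived one.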
| Original language | English |
|---|---|
| Pages (from-to) | 12-24 |
| Number of pages | 13 |
| Journal | CEUR Workshop Proceedings |
| Volume | 2184 |
| Publication status | Published - 28 Aug 2018 |