TY - CONF
T1 - Provenance expressiveness benchmarking on non-deterministic executions
AU - Chan, Sheung Chi
AU - Cheney, James
AU - Bhatotia, Pramod
N1 - Funding Information:
Effort sponsored by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grant number FA8655-13-1-3006. The U.S. Government and University of Edinburgh are authorised to reproduce and distribute reprints for their purposes notwithstanding any copyright notation thereon. Cheney was also supported by ERC Consolidator Grant Skye (grant number 682315) and an ISCF Metrology Fellowship grant provided by the UK government’s Department for Business, Energy and Industrial Strategy (BEIS). This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under contract FA8650-15-C-7557.
Publisher Copyright:
© 2021 TaPP 2021 - 13th International Workshop on Theory and Practice of Provenance. All rights reserved.
PY - 2021
Y1 - 2021
N2 - Data provenance is a form of metadata that records inputs and processes. It provides historical records and origin information for the data. Because of the rich information it provides, provenance is increasingly used as a foundation for security analysis and forensic auditing. These applications require high-quality provenance. Earlier work proposed a provenance expressiveness benchmarking approach to automatically identify and compare the results of different provenance systems and their generated provenance. However, that work was limited to benchmarking deterministic activities, whereas all real-world systems involve non-determinism, for example through concurrency and multiprocessing. Benchmarking non-deterministic events is challenging because the process owner has no control over the interleaving between processes or the execution order of system calls coming from different processes, leading to a rapid growth in the number of possible schedules that need to be observed. To cover these cases and provide all-around automated expressiveness benchmarking for real-world examples, we propose an extension to the automated provenance benchmarking tool, ProvMark, to handle non-determinism.
UR - http://www.scopus.com/inward/record.url?scp=85114270762&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85114270762
T2 - 13th International Workshop on Theory and Practice of Provenance 2021
Y2 - 19 July 2021 through 20 July 2021
ER -