ICPP Reproducibility Badges
During the Artifact Evaluation Stage, all the computational artifacts associated with the paper, such as the software, datasets, or environment configuration required to reproduce the experiments, are assessed. The goal of Artifact Evaluation is to award badges to the artifacts of accepted papers. All badges are based on the NISO Reproducibility Badging and Definitions Standard. In 2023, badges will be assigned per the ACM Reproducibility Standard.
Authors of papers must choose to apply for a badge a priori, during the AE phase. Authors can apply for one or more of the three kinds of badges that we offer: Artifacts Available, Artifacts Evaluated-Functional, and Results Replicated. Please note that the badges are incremental: applying for Artifacts Evaluated-Functional also includes Artifacts Available, and applying for Results Replicated includes the other two badges. The type of badge and the criteria for each badge are explained next. To start the Reproducibility Evaluation Process, authors must provide links to their computational artifacts. SUCH LINKS MUST BE DOIs. Please note that a link to a tagged GitHub repository is not valid.
An artifact must be accessible via a persistent, publicly shareable DOI on a hosting platform that supports persistent DOIs and versioning (for example, DataPort, Dryad, FigShare, Harvard Dataverse, or Zenodo). Authors should not provide links or zipped files hosted through personal webpages or shared collaboration platforms, such as Nextcloud, Google Drive, or Dropbox.
Zenodo and FigShare provide an integration with GitHub that automatically generates DOIs from Git tags. It is therefore possible to host code under GitHub's version control and describe the artifact using Zenodo or FigShare. Please note that Git itself (or any other version control software) does not generate a DOI; it needs to be paired with a service such as Zenodo or FigShare.
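Before submitting, it is worth confirming that the minted DOI actually resolves to a public landing page. The following is a minimal sketch of such a check; the DOI shown is a placeholder that should be replaced with the DOI assigned to your artifact.

```python
# Minimal sketch: check that an artifact DOI resolves to a public landing page.
# The DOI below is a placeholder; substitute the DOI minted for your artifact.
import urllib.request

doi = "10.5281/zenodo.0000000"  # placeholder, not a real record

request = urllib.request.Request(f"https://doi.org/{doi}", method="HEAD")
with urllib.request.urlopen(request) as response:
    # doi.org answers with an HTTP redirect to the hosting platform's landing
    # page; urllib follows it, so geturl() reports the final public location.
    print("DOI resolves to:", response.geturl())
    print("HTTP status:", response.status)
```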
ARTIFACTS AVAILABLE
The following are necessary to receive this badge:
- A DOI assigned to your research object. DOIs can be acquired via Zenodo, FigShare, Dryad, or Software Heritage. Zenodo provides an integration with GitHub to automatically generate DOIs from Git tags.
- Links to code and data repositories on a hosting platform that supports versioning, such as GitHub or GitLab. In other words, please do NOT provide Dropbox links or gzipped files hosted through personal webpages.
Note that, for physical objects relevant to the research, the metadata about the object should be made available.
What do we mean by accessible? Artifacts used in the research (including data and code) are permanently archived in a public repository that assigns a global identifier and guarantees persistence, and are made available via standard open licenses that maximize artifact availability.
ARTIFACTS EVALUATED-FUNCTIONAL
The criteria for the Artifacts Evaluated-Functional badge require an AD/AE committee member to assess whether the artifact provides enough detail to exercise the artifacts or components described in the paper. For example, is it possible to compile the artifact, use a Makefile, or perform a small run? If the artifact normally runs on a large cluster, can it be compiled on a single machine? Can the analysis be run at a small scale? Does the artifact describe its components well enough to nurture future use?
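As a purely illustrative sketch of what a small-scale run can look like (the script, the --small flag, and the run_experiment function below are hypothetical and not taken from any specific artifact), a reduced-scale entry point lets a reviewer exercise the code on a single machine:

```python
# Purely illustrative sketch of a reduced-scale entry point; the --small flag
# and run_experiment are hypothetical, not part of any particular artifact.
import argparse


def run_experiment(num_nodes: int, problem_size: int) -> float:
    # Stand-in for the artifact's real experiment; returns a dummy metric so
    # that the sketch is self-contained and runnable.
    return problem_size / max(num_nodes, 1)


def main() -> None:
    parser = argparse.ArgumentParser(description="Functional (small-scale) check")
    parser.add_argument("--small", action="store_true",
                        help="run a reduced configuration on a single machine")
    args = parser.parse_args()

    # A reviewer without cluster access can still exercise the code via --small.
    nodes, size = (1, 1_000) if args.small else (64, 1_000_000)
    print(f"nodes={nodes} size={size} metric={run_experiment(nodes, size):.2f}")


if __name__ == "__main__":
    main()
```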
The reviewer will assess the details of the research artifact based on the following criteria:
- Documentation: Are the artifacts sufficiently documented to enable them to be exercised by readers of the paper?
- Completeness: Do the submitted artifacts include all of the key components described in the paper?
- Exercisability: Do the submitted artifacts include the scripts and data needed to run the experiments described in the paper, and can the software be successfully executed?
We encourage authors to describe:
- the workflow underlying the paper;
- the components of that workflow, opening up black boxes as white boxes where possible (e.g., source code, configuration files, build environment);
- the input data: either the process used to generate the input data or, when the data is not generated, the actual data itself or a link to it;
- the environment (system configuration and initialization, scripts, workload, measurement protocol) used to produce the raw experimental data; and
- the scripts needed to transform the raw data into the graphs included in the paper (see the sketch below).
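The sketch below illustrates the last item above: a script that turns raw measurements into a figure. The CSV file name and its columns (algorithm, runtime_s) are hypothetical stand-ins for an artifact's actual measurement output.

```python
# Illustrative sketch: turn raw experimental data into a figure. The CSV name
# and its columns (algorithm, runtime_s) are hypothetical placeholders.
import csv
from collections import defaultdict

import matplotlib.pyplot as plt

runtimes = defaultdict(list)  # algorithm name -> runtimes of repeated runs
with open("raw_results.csv", newline="") as f:
    for row in csv.DictReader(f):
        runtimes[row["algorithm"]].append(float(row["runtime_s"]))

# One line per algorithm, one point per repetition, mirroring a paper figure.
for name, values in runtimes.items():
    plt.plot(range(1, len(values) + 1), values, marker="o", label=name)

plt.xlabel("Repetition")
plt.ylabel("Runtime (s)")
plt.legend()
plt.savefig("runtime_figure.pdf")
```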
RESULTS REPLICATED
This badge indicates that the evaluators successfully reproduced the key computational results using the author-created research objects, methods, code, and conditions of analysis. Note that we do not aim to recreate exact or identical results, especially for hardware-dependent results. However, we do aim to:
- Reproduce Behavior: This is especially important where results are hardware-dependent. Bit-wise reproducibility is not our goal. If we get access to the same hardware used in the experiments, we will aim to reproduce the results on that hardware. If not, we will work with the authors to determine equivalent or approximate behavior on available hardware. For example, if the results concern response time, our objective will be to check whether a given algorithm is significantly faster than another, or whether a given parameter affects the behavior of the system positively or negatively (a sketch of this kind of check appears after this list).
- Reproduce the Central Results and Claims of the Paper: We do not aim to reproduce all the results and claims of the paper. The AD/AE committee will determine the central results of the accepted paper and will work with the authors to confirm them. Once confirmed, the badge will be assigned based on the committee being able to reproduce the behavior of these central results.
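For instance, the response-time example above could be checked by comparing repeated measurements of two algorithms with a standard significance test. The sketch below is purely illustrative: the timing numbers are invented, and the 0.05 threshold is an assumption rather than committee policy; a real check would use the measurements produced by the artifact.

```python
# Illustrative sketch of the behavioural check described above: testing whether
# algorithm A is significantly faster than algorithm B across repeated runs.
# The timing numbers are invented for the example.
from scipy.stats import mannwhitneyu

# Hypothetical response times (in seconds) from repeated runs on one machine.
algo_a = [1.02, 0.98, 1.05, 1.01, 0.99, 1.03]
algo_b = [1.31, 1.27, 1.35, 1.29, 1.33, 1.30]

# One-sided test: are A's response times stochastically smaller than B's?
stat, p_value = mannwhitneyu(algo_a, algo_b, alternative="less")
print(f"U={stat:.1f}, p={p_value:.4f}")
if p_value < 0.05:  # 0.05 is an assumed threshold, not committee policy
    print("Algorithm A is significantly faster than algorithm B here.")
else:
    print("No significant difference detected at the 0.05 level.")
```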