ICPP 2023

International Conference on Parallel Processing

Computational Resources for Reproducibility

In addition to high-quality descriptions of the artifacts and of the data and code repositories, adequate computational resources are necessary to reproduce the experiments. This can be particularly challenging given the inherent complexity of parallel and distributed infrastructures. The provisioned resources must not only meet the experiment requirements, which in some cases are very specific and rare, but must also be configured and interconnected properly.

Instructions and scripts that define and configure the computational infrastructure should be provided to ease reproducibility. Ideally, authors should prepare their artifacts and scripts with a target infrastructure in mind. Although experiments can be reproduced on an institution's own premises, some reproducibility initiatives already offer their computational resources for this purpose. Given the current heterogeneity and complexity of hardware, promoting a set of well-known community reproducibility infrastructures for executing experiments may simplify the reproducibility process.
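As one illustration of what such a script might contain, the sketch below declares the resources an experiment expects and reports which of them a provisioned setup fails to meet. The field names (`nodes`, `cores_per_node`, `memory_gb`), the example numbers, and the `unmet_requirements` helper are hypothetical and chosen only for illustration; a real artifact would list whatever its experiments actually depend on.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Resources an experiment expects (hypothetical fields)."""
    nodes: int
    cores_per_node: int
    memory_gb: int


def unmet_requirements(required: Requirements, available: Requirements) -> list[str]:
    """Return one message for every requirement the available setup misses."""
    problems = []
    for field in ("nodes", "cores_per_node", "memory_gb"):
        need, have = getattr(required, field), getattr(available, field)
        if have < need:
            problems.append(f"{field}: need {need}, have {have}")
    return problems


if __name__ == "__main__":
    # Illustrative values only; they do not describe any real allocation.
    required = Requirements(nodes=4, cores_per_node=24, memory_gb=128)
    provisioned = Requirements(nodes=4, cores_per_node=16, memory_gb=192)
    for problem in unmet_requirements(required, provisioned):
        print(problem)
```

Shipping a check of this kind with the artifact lets a reviewer find out before running anything whether the provisioned machines are adequate.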

Chameleon

To reduce heterogeneity in conducting the experiments, ICPP’23 will use a single configurable experiment platform, Chameleon Cloud (https://www.chameleoncloud.org/). Chameleon allows users to configure a distributed infrastructure and execute experiments on multiple bare-metal or KVM-virtualized machines interconnected through a communication network. It gives users full control of the software stack, including root privileges, kernel customization, and console access.

Furthermore, Chameleon provides usage metrics, such as the number of times a particular artifact was run. These reproducibility metrics can play a role similar to that of the impact metrics currently associated with articles, such as the number of citations.

How to use Chameleon

We have allocated several computational resources in Chameleon. If you plan to use them, please contact Rafael Tolosana to request access. There are three main interfaces for using Chameleon (https://chameleoncloud.readthedocs.io/en/latest/):

Option 1) Chameleon’s GUI and ssh

Chameleon offers a GUI to search for, book, and configure computational resources, both machines and networks. Once the machines are allocated and ready, they can be accessed through their IP addresses using the ssh protocol. Appropriate scripts can then be provided to deploy the required datasets and software and to execute the experiments.
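The access step can also be scripted. The sketch below assembles a standard OpenSSH command line for running a single command on an allocated machine. The login name `cc`, the key path, the address `203.0.113.10` (a documentation-only address), the script name `deploy_experiment.sh`, and the helper `build_ssh_command` are assumptions made for illustration; substitute the values that apply to your own lease and image.

```python
import os
import shlex


def build_ssh_command(host: str, remote_command: str,
                      user: str = "cc", key_path: str = "~/.ssh/id_rsa") -> list:
    """Build the argument list for a non-interactive ssh invocation."""
    return [
        "ssh",
        "-i", os.path.expanduser(key_path),  # private key for the instance
        "-o", "BatchMode=yes",               # fail instead of prompting
        f"{user}@{host}",
        remote_command,
    ]


if __name__ == "__main__":
    # Placeholder host and script name; neither refers to a real resource.
    cmd = build_ssh_command("203.0.113.10", "bash deploy_experiment.sh")
    print(shlex.join(cmd))  # show the command that would be run
    # To execute it against a reachable machine, pass `cmd` to
    # subprocess.run(cmd, check=True) from the standard library.
```

Building the command as a list rather than a single string avoids shell quoting problems when it is later passed to `subprocess.run`.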

Option 2) Command line interface

The Command Line Interface (CLI) provides a way to interact with Chameleon resources using shell and scripting tools. Chameleon uses the OpenStack Client to provide CLI functionality.
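Because the CLI is the standard OpenStack Client, it can be driven from a script. The following minimal sketch lists the instances in a project by calling `openstack server list` with JSON output. It assumes that the `openstack` client is installed and that credentials are already loaded in the environment (for example, from an OpenStack RC file); the helper names `server_list_command` and `parse_server_names` are hypothetical.

```python
import json
import shutil
import subprocess


def server_list_command() -> list:
    """Argument list that asks the client to list instances as JSON."""
    return ["openstack", "server", "list", "-f", "json"]


def parse_server_names(cli_output: str) -> list:
    """Extract the instance names from the JSON printed by the client."""
    return [server["Name"] for server in json.loads(cli_output)]


if __name__ == "__main__":
    if shutil.which("openstack") is None:
        print("openstack client not found on PATH")
    else:
        result = subprocess.run(server_list_command(),
                                capture_output=True, text=True, check=True)
        print(parse_server_names(result.stdout))
```

Keeping the parsing separate from the subprocess call makes the script easy to check without access to a live project.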

Option 3) Deploy the artifacts on Chameleon’s Trovi (Jupyter notebook)

Alternatively, authors can use Chameleon’s Trovi, a platform for sharing and reproducing research artifacts. It stores the artifacts that authors upload and provides a REST API for use by various clients. Artifact recipients can then execute the artifacts directly in Chameleon.