Reproducible environment tools
- Berkeley Computational Environment
- The R package Packrat creates a portable private library of dependencies for a specific project.
- Microsoft R Open, an open source distribution of R formerly called Revolution R Open (from Revolution Analytics), together with the R package checkpoint, can fix all dependencies to a specific checkpoint date (or rewind to an earlier one). Their goal is to eventually publish a schedule of future checkpoints to encourage developers to sync their releases, just as Bioconductor (a managed collection of bioinformatics packages) does with its two release dates each year. Unlike packrat, packages do not need to be archived. See “Using Checkpoints for Reproducible Research” (2016) by Andrie de Vries on The Reproducible R Toolkit, “Introducing the Reproducible R Toolkit and the checkpoint package”, and “Microsoft Offers a Faster, More Efficient R, But Is It Right For You?”.
- Galaxy for biomedical research
- Mathematica Cloud
- Jupyter NBViewer; notebooks also render on GitHub. (Wakari has been discontinued in favor of Jupyter.)
- Shiny by RStudio; R-based web apps can be deployed at shinyapps.io.
- Amazon Web Services (AWS) including Amazon Machine Images
- Microsoft Azure including VM Images, Azure Notebooks, and Azure Machine Learning Studio
- Git Large File Storage is an open source Git extension for storing large data files on a remote server such as GitHub, AWS, Azure, or a university server (options at the University of Washington are listed at IT Connect).
- Zooniverse allows researchers to upload data and set up a project (Galaxy Zoo being a famous example) in which volunteers process the data, particularly data that cannot be processed (or processed well) by computers.
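
The checkpoint idea mentioned in the list above, fixing every dependency to the state of the package repository on a given date, can be illustrated with a small sketch. This is not the checkpoint package's API (that is an R package); it is a minimal Python illustration of the concept, and the package names, versions, release dates, and the `resolve_snapshot` helper are all hypothetical.

```python
from datetime import date

# Hypothetical release history: package name -> list of (release date, version).
# These names and dates are invented for illustration only.
RELEASES = {
    "pkgA": [
        (date(2015, 3, 1), "1.0"),
        (date(2016, 1, 15), "1.1"),
        (date(2016, 9, 1), "2.0"),
    ],
    "pkgB": [
        (date(2014, 6, 10), "0.9"),
        (date(2016, 5, 20), "1.0"),
    ],
}


def resolve_snapshot(releases, snapshot):
    """Return the newest version of each package released on or before `snapshot`."""
    pinned = {}
    for name, history in releases.items():
        # Keep only releases that already existed on the snapshot date.
        eligible = [(released, version) for released, version in history
                    if released <= snapshot]
        if eligible:
            # Tuples compare by date first, so max() picks the latest release.
            pinned[name] = max(eligible)[1]
    return pinned


# Pin everything to the repository state on 1 June 2016.
print(resolve_snapshot(RELEASES, date(2016, 6, 1)))
# → {'pkgA': '1.1', 'pkgB': '1.0'}
```

Because the result depends only on the date, anyone rerunning the analysis with the same date resolves the same versions, which is why a project using this approach needs to record a date rather than archive the packages themselves.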