Repositories for data and code
Add general discussion.
Many disciplines have public repositories where data accompanying publications must be deposited, for example the Protein Data Bank. The re3data.org project is a new directory of research data repositories; one example record is the University of Florida Sparse Matrix Collection. re3data.org has a companion website, DataCite, with a data search engine and citation formatter.
General-purpose repositories that provide a Digital Object Identifier (DOI) or other permalink are also available, for example:
- GitHub is primarily designed for code under Git version control; students can get free unlimited private repositories with GitHub Student, otherwise it is $7/month for individuals.
- Bitbucket repositories can use either Git or Mercurial; pricing is based on the size of a project’s development team.
- Zenodo includes GitHub integration
- Figshare includes GitHub integration
- Research Compendia
- Roll your own open data repository with The Dataverse Project or CKAN
- Kaggle has recently (August 2016) allowed anyone to publish data and a code “kernel”, then execute that code on that data on Kaggle’s platform (see “Making Kaggle the Home of Open Data” by Ben Hamner)
- Open Science Framework includes integrations with GitHub, Figshare, Dataverse, and AWS.
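
Since several of the repositories above issue DOIs, it may help to see how a DOI works as a permalink in practice. The sketch below is illustrative only. It assumes the public resolver at https://doi.org/ and that the registration agency behind the DOI supports content negotiation with the `text/x-bibliography` media type for formatted citations. The DOI `10.1234/example.5678` is a made-up placeholder, and the helper names `doi_url`, `citation_request`, and `fetch_citation` are hypothetical, not part of any repository's API.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

DOI_RESOLVER = "https://doi.org/"  # assumed public DOI resolver

# Prefixes people commonly paste in front of a bare DOI.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_url(doi: str) -> str:
    """Turn a DOI (bare or prefixed) into a resolver URL usable as a permalink."""
    doi = doi.strip()
    for prefix in _PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(doi, safe="/")


def citation_request(doi: str, style: str = "apa") -> Request:
    """Build (but do not send) a request asking for a formatted citation.

    Assumption: the DOI's registration agency honours content negotiation
    for 'text/x-bibliography'; not every DOI will.
    """
    headers = {"Accept": f"text/x-bibliography; style={style}"}
    return Request(doi_url(doi), headers=headers)


def fetch_citation(doi: str, style: str = "apa") -> str:
    """Send the request and return the citation text (needs network access)."""
    with urlopen(citation_request(doi, style), timeout=30) as response:
        return response.read().decode("utf-8").strip()


if __name__ == "__main__":
    # Placeholder DOI: replace with a real one before calling fetch_citation.
    print(doi_url("doi:10.1234/example.5678"))
```

Building the request separately from sending it keeps the URL and header logic checkable offline; only `fetch_citation` touches the network.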