Why packaging? – The benefits
- Usability (documentation, distribution)
- Maintainability (documentation, testing, organization)
- Reproducibility (code, data, documentation in one place)
An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures. —D. Donoho
Three types of packages
- Data: stores data or references to data
- Analysis: executable vignettes
- Software: libraries of functions, with tests
What is in a package?
- Metadata: version, dependencies, author, contact, etc.
- Code, data, documentation, tests
- Automatic testing, such as R CMD check
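The pieces listed above can be sketched as a directory layout. The following is a minimal, hand-built illustration rather than a recommended workflow: the package name demoPkg, the function add_one, the author, and every field value are hypothetical placeholders.

```r
# Build a minimal package layout in a temporary directory.
# "demoPkg", "add_one" and all field values are placeholders for illustration.
pkg_dir <- file.path(tempfile("pkgdemo"), "demoPkg")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE)
dir.create(file.path(pkg_dir, "tests"))

# Metadata: version, dependencies, author, contact live in DESCRIPTION
writeLines(c(
  "Package: demoPkg",
  "Title: A Placeholder Demonstration Package",
  "Version: 0.1.0",
  "Authors@R: person(\"Jane\", \"Doe\", email = \"jane@example.com\", role = c(\"aut\", \"cre\"))",
  "Description: Illustrates the layout of a package.",
  "Depends: R (>= 3.5.0)",
  "License: GPL-3"
), file.path(pkg_dir, "DESCRIPTION"))

# Code: one function per file under R/
writeLines("add_one <- function(x) x + 1",
           file.path(pkg_dir, "R", "add_one.R"))

# NAMESPACE declares which functions the package exports
writeLines("export(add_one)", file.path(pkg_dir, "NAMESPACE"))

# Tests: scripts under tests/ are run by R CMD check
writeLines(c("library(demoPkg)", "stopifnot(add_one(1) == 2)"),
           file.path(pkg_dir, "tests", "test-add_one.R"))

# Read the metadata back as a character matrix
meta <- read.dcf(file.path(pkg_dir, "DESCRIPTION"))
print(meta[1, c("Package", "Version")])
print(list.files(pkg_dir, recursive = TRUE))
```

From a shell, running R CMD build on such a directory produces a tarball (here it would be demoPkg_0.1.0.tar.gz), and R CMD check on that tarball runs the automatic checks, including the scripts in tests/.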
When to use packages?
If it is worth saving in a file, it is probably worth saving in a package.
- When submitting to an archive (e.g. CRAN or Bioconductor)
- If the code or data will be re-used
Further reading
- Official documentation: the Writing R Extensions manual
- R package development Slides by Markus Gesmann
- Creating R packages by Friedrich Leisch, 2009
- R package versioning; the approach also works with other languages
- Reproducible R slides
- Michael Lawrence’s slides