Course introduction


Good Practices


  • Coding conventions ensure your R code is easy to read, reuse, and extend.
  • Research compendia provide an organized directory for your code, data, documentation, and results.
  • Testing helps you check that your code is behaving as expected and will continue to do so in the future or when used by someone else.
  • Documentation is essential for explaining what your code does, how to use it, and the legal terms for reuse.
  • Dependency management helps make your code reproducible across different computational environments.

Code readability


  • Readable code is easier to understand, maintain, debug, and extend or reuse, which saves time and effort.
  • Choosing descriptive variable and function names will communicate their purpose more effectively.
  • Using comments and function-level documentation (roxygen) to describe parts of the code will help transmit understanding and context.
  • Use libraries or packages for common functionality to avoid duplication.
  • Creating functions from the smallest reusable units of code makes the code more readable, helps compartmentalise which parts of the code perform which actions, and isolates specific code sections for reuse.
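To make the points above concrete, here is a minimal sketch of a small, documented function with descriptive names. The function names, the temperature example, and the data are hypothetical and chosen only for illustration; the comment style follows roxygen conventions.

```r
#' Convert temperatures from Fahrenheit to Celsius
#'
#' @param temp_fahrenheit Numeric vector of temperatures in degrees Fahrenheit.
#' @return Numeric vector of temperatures in degrees Celsius.
fahrenheit_to_celsius <- function(temp_fahrenheit) {
  (temp_fahrenheit - 32) * 5 / 9
}

#' Mean temperature in Celsius for readings recorded in Fahrenheit
#'
#' @param temps_fahrenheit Numeric vector of readings in degrees Fahrenheit.
#' @return A single numeric value: the mean reading in degrees Celsius.
mean_celsius <- function(temps_fahrenheit) {
  # Reuse the small conversion function instead of repeating the formula.
  mean(fahrenheit_to_celsius(temps_fahrenheit))
}

# Hypothetical example data: three daily high temperatures.
daily_highs_fahrenheit <- c(32, 50, 212)
mean_celsius(daily_highs_fahrenheit)
```

Because the conversion lives in one small function, the formula is written once and can be reused and checked independently of the summary step.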

Code structure


  • Good practices for code and project structure are essential for creating readable, reusable and reproducible projects.
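As one possible illustration, the sketch below creates a skeleton project directory using base R only. The folder and file names are an assumed layout, not a prescribed standard, and the skeleton is created in a temporary directory so that nothing in your working directory is changed.

```r
# Hypothetical compendium layout: adjust the names to suit your project.
compendium_root <- file.path(tempdir(), "my-compendium")

subdirs <- c(
  "R",               # reusable functions
  "data/raw",        # original, unmodified data
  "data/processed",  # data derived by the analysis code
  "analysis",        # scripts or notebooks that run the analysis
  "results",         # generated figures and tables
  "tests/testthat"   # unit tests
)

for (subdir in subdirs) {
  dir.create(file.path(compendium_root, subdir),
             recursive = TRUE, showWarnings = FALSE)
}

# Top-level documentation files, created empty here as placeholders.
file.create(file.path(compendium_root, c("README.md", "LICENSE", "CITATION.cff")))

# Inspect the resulting structure.
list.files(compendium_root, recursive = TRUE, include.dirs = TRUE)
```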

Code documentation


  • Documentation allows users to run and understand software without having to work things out for themselves directly from the source code.
  • Software documentation improves the reusability of research code.
  • A (good) README, CITATION entry/file and LICENSE file are the minimum documentation elements required to support reproducible and reusable research code.
  • Documentation can be provided to users in a variety of formats including a docs folder of Markdown files, a repository Wiki and static webpages.
  • A static documentation site can be created using the pkgdown package.
  • Documentation frameworks such as Diátaxis provide content and style guidelines that can help us write high-quality documentation.
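As a small illustration of a citation entry, the sketch below builds one with bibentry() and person() from the utils package that ships with R. The package name, author, year, and version are invented placeholders, not details of any real software.

```r
# All field values below are hypothetical placeholders.
software_citation <- utils::bibentry(
  bibtype = "Manual",
  title   = "examplepkg: Tools for an Example Analysis",
  author  = utils::person("Ada", "Example"),
  year    = 2024,
  note    = "R package version 0.1.0"
)

# Plain-text form, suitable for pasting into a README.
citation_text <- format(software_citation, style = "text")
print(citation_text)

# BibTeX form, for users who manage references with BibTeX.
print(utils::toBibtex(software_citation))

# For an R package with pkgdown installed, running pkgdown::build_site()
# from the package root builds the static documentation site.
```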

Code correctness


  • Code testing supports reproducibility by demonstrating that your code behaves as you expect and consistently generates the same output for a given set of inputs.
  • Unit testing is crucial because it checks that each function works correctly.
  • Using the testthat package, you can write basic unit tests for R functions to verify their correctness.
  • Identifying and handling edge cases in unit tests is essential to ensure your code performs correctly under a variety of conditions.
  • Test coverage can help you identify parts of your code that require additional testing.
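A minimal sketch of what such tests might look like with testthat is shown below. The function under test is a hypothetical example defined in the same block so that the snippet is self-contained; in a real project it would live in its own file and the tests under tests/testthat/.

```r
library(testthat)

# Hypothetical function under test.
fahrenheit_to_celsius <- function(temp_fahrenheit) {
  if (!is.numeric(temp_fahrenheit)) {
    stop("`temp_fahrenheit` must be numeric.")
  }
  (temp_fahrenheit - 32) * 5 / 9
}

test_that("known temperatures convert correctly", {
  expect_equal(fahrenheit_to_celsius(32), 0)
  expect_equal(fahrenheit_to_celsius(212), 100)
  expect_equal(fahrenheit_to_celsius(-40), -40)
})

test_that("edge cases are handled", {
  # Empty input gives empty output.
  expect_equal(fahrenheit_to_celsius(numeric(0)), numeric(0))
  # Missing values propagate rather than causing an error.
  expect_true(is.na(fahrenheit_to_celsius(NA_real_)))
  # Non-numeric input is rejected with an informative message.
  expect_error(fahrenheit_to_celsius("hot"), "must be numeric")
})
```

For test coverage, one option (an assumption here, since the text above does not name a tool) is the covr package, whose covr::package_coverage() function reports which lines of a package are exercised by its tests.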

Reproducible development environment


  • renv environments keep different R package versions and dependencies required by different projects separate.
  • An renv environment is essentially a project-specific directory structure that isolates the packages and their versions used within that project/compendium.
  • You can use renv to create and manage R project environments, and use renv::install() to install external (third-party) packages in your project.
  • By convention, you can save and export your R environment to a lockfile (renv.lock) located in your project’s root directory, for example with renv::snapshot(). This file can then be shared with collaborators/users and used to replicate your environment elsewhere using renv::restore().
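The workflow described above can be sketched as follows. The renv calls are wrapped in two hypothetical helper functions (record_environment and replicate_environment are illustrative names, not part of renv) so that nothing changes on your machine until you call them from the project root. The package name "dplyr" is only an example.

```r
# Author side: set up the environment and record it in renv.lock.
# Requires the renv package to be installed.
record_environment <- function(example_package = "dplyr") {
  renv::init()                    # create the project library and an initial renv.lock
  renv::install(example_package)  # install a package into the project library
  renv::snapshot()                # record the installed package versions in renv.lock
  renv::status()                  # report differences between library and lockfile
}

# Collaborator side: after obtaining the project (including renv.lock),
# reinstall the recorded package versions.
replicate_environment <- function() {
  renv::restore()
}
```

Sharing renv.lock alongside the code, for example by committing it to the project repository, is what allows replicate_environment() to work on another machine.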

Wrap-up


  • We covered a range of best practices for reproducible research in this course.
  • These practices will deliver even better results if you pair them with version control, such as Git and GitHub.