R Reproducible Environment (RRE) - R projects based on Docker images

Introduction:

This project template lets you create R projects based on Docker images to maximize computational reproducibility. The workflow is simple and is depicted in the figure below. To use this template for your own project, you need to complete several steps. The first three steps (A, B, C) create the R project. The last step (D) shows how to run RStudio inside the Docker container. All of these steps can also be found in this video tutorial.

R projects based on Docker images workflow:

A. Steps taken on GitLab and Docker Hub

  1. Create an account on Docker Hub
    1. Create an empty repository and an access token
  2. Create an account on GitLab
    1. Create an empty repository (do not check the option to add a README)
    2. Create project variables that store the information needed by GitLab Continuous Integration (CI)
      1. To set these variables, open the created project and go to Settings - CI/CD - Variables. Then click Expand and set the following variables:
        1. CI_REGISTRY = docker.io (name of the registry)
        2. CI_REGISTRY_IMAGE = index.docker.io/lukasjirinovak/paqvalidation:latest (registry with username, project, and tag)
        3. CI_REGISTRY_PASSWORD = xxxxxxxxxxxxxxx (the access token generated by Docker Hub after logging into your account)
        4. CI_REGISTRY_USER = lukasjirinovak (username on Docker Hub)
          1. The CI/CD variables can occasionally cause authentication problems. If that happens, you can work around the problem by adding these variables directly to .gitlab-ci.yml. However, this is potentially dangerous because it exposes your access token to anyone who can read the repository. Anyone who has the token can read from and write to your Docker Hub repository, depending on the access permissions set for the token.
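
When authentication fails, it can help to first confirm that all four variables actually reach the CI job. The following is a minimal sketch, not part of the template: `check_ci_vars` is a hypothetical helper name, and the commented login line assumes the job has the Docker CLI available.

```shell
# Hypothetical helper (not part of the template): report which of the four
# CI/CD variables described above are unset or empty.
check_ci_vars() {
  missing=""
  for name in CI_REGISTRY CI_REGISTRY_IMAGE CI_REGISTRY_USER CI_REGISTRY_PASSWORD; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing CI/CD variables:$missing" >&2
    return 1
  fi
  echo "All CI/CD variables are set"
}

# Possible login step in a CI job, assuming the variables are set.
# The token is passed on stdin so it does not appear on the command line:
#   check_ci_vars && echo "$CI_REGISTRY_PASSWORD" | \
#     docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
```

Keeping the token in a CI/CD variable and passing it on stdin avoids writing it into any file that is committed to the repository.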

B. Steps taken in the local repository

  1. Download this project template to a local repository
    1. In the local repository, open the file named replacement_function.ps1
    2. In this file, edit the following objects (the quoted text is an example):
      1. docker_user_new = "737823971" - username created on Docker Hub
      2. docker_project_new = "eee" - name of the repository created on Docker Hub
      3. docker_project_and_user_new = "737823971/eee:latest" - username and repository name created on Docker Hub; the tag latest should not be changed
      4. full_project_name_new = "RRE" - name of the project you want to create from this template
      5. image_version_new = "rocker/tidyverse:4.2.1" - name of the image used to build the Docker container
        1. The number 4.2.1 indicates the version of the image. If you want to use the most recent version of the tidyverse image, it can be found here
      6. MAINTAINER_new = "Lukas Novak lukasjirinovak@gmail.com" - name of the person who created the new project
      7. git_user_name_new = "lukas.novak" - username created on the remote repository host (e.g. GitLab, GitHub)
      8. git_user_email_new = "lukasjirinovak@gmail.com" - email associated with the remote repository
      9. git_url_new = "https://gitlab.com/lukas.novak/RRE.git" - URL of the remote repository created in step A (2.1)
      10. R_project_name_new = "project-name" - name of the R project
    3. After the previous steps are finished, run replacement_function.ps1 in PowerShell.
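
The value of docker_project_and_user_new is the Docker Hub username and the repository name joined in the form user/repository:tag. The sketch below only illustrates that relationship. `image_ref` is a hypothetical helper name and is not something replacement_function.ps1 is known to contain.

```shell
# Hypothetical helper: build the "user/repository:tag" reference from its parts.
image_ref() {
  user="$1"
  project="$2"
  tag="${3:-latest}"   # the template expects the tag to stay "latest"
  printf '%s/%s:%s\n' "$user" "$project" "$tag"
}

# Example values taken from the list above.
image_ref "737823971" "eee"   # prints 737823971/eee:latest
```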

C. Uploading files to GitLab

  1. Open a terminal in the folder where the project files are stored
  2. In the terminal, run the following commands:
    1. git config --global user.name "lukas.novak" - username created on the remote repository host (e.g. GitLab, GitHub)
    2. git config --global user.email "lukasjirinovak@gmail.com" - email associated with the remote repository
    3. git init - initialize a Git repository in the current folder
    4. git remote add origin "https://gitlab.com/lukas.novak/RRE.git" - add the remote repository created in step A (2.1)
    5. git add . - add all files
    6. git commit -m "initial commit" - commit with an initial message
    7. git push --set-upstream origin master - push the master branch and set it as the upstream branch (use your branch name if the default branch is not master)
    8. git push - push files to the remote repository (later pushes need only this command)
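
The repository commands above can be collected into one small script. This is a sketch under stated assumptions: `run` and `publish_project` are hypothetical names, the git config steps are left out because they only need to be done once, and the DRY_RUN switch is an addition that prints each command instead of executing it.

```shell
# Hypothetical wrapper: with DRY_RUN=1 a command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Initialize the repository, commit everything, and push to the remote.
publish_project() {
  remote_url="$1"          # e.g. https://gitlab.com/lukas.novak/RRE.git
  branch="${2:-master}"    # branch name used in this README
  run git init &&
  run git remote add origin "$remote_url" &&
  run git add . &&
  run git commit -m "initial commit" &&
  run git push --set-upstream origin "$branch"
}

# Preview the commands without touching any repository:
DRY_RUN=1 publish_project "https://gitlab.com/lukas.novak/RRE.git"
```

Chaining the steps with `&&` stops the sequence at the first command that fails, so a failed commit is never followed by a push.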

D. Starting RStudio Server

1. Open the project folder, find the file named project_start.ps1, right-click it, and select Run with PowerShell

2. After Windows PowerShell opens, it installs the necessary project components. This process can take a while. Docker Desktop should appear; you can minimize it

3. Finally, a new tab with RStudio should open in your web browser

4. Click on the R Markdown file (.Rmd) located in the project directory

5. Click the Knit button

6. After RStudio finishes knitting, you should be able to download the report
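
The contents of project_start.ps1 are not shown here, so the following is only a rough, assumed equivalent of starting RStudio Server by hand. It relies on the defaults of the rocker images (RStudio Server on port 8787, password set through the PASSWORD variable for the default rstudio user). `rstudio_cmd` is a hypothetical helper that prints the command rather than running it, and changeme is a placeholder password.

```shell
# Hypothetical helper: print a docker run command for an RStudio Server container.
rstudio_cmd() {
  image="${1:-rocker/tidyverse:4.2.1}"  # base image named in section B
  port="${2:-8787}"                     # local port mapped to RStudio Server's 8787
  printf 'docker run --rm -p %s:8787 -e PASSWORD=changeme %s\n' "$port" "$image"
}

rstudio_cmd   # prints docker run --rm -p 8787:8787 -e PASSWORD=changeme rocker/tidyverse:4.2.1
```

After running the printed command, RStudio would be reachable at http://localhost:8787 with the username rstudio, assuming the project image keeps the rocker defaults.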