R Reproducible Environment (RRE) - R Projects Based on Docker Images

Introduction:

This project template creates R projects based on Docker images to maximize computational reproducibility. The workflow is depicted in the figure below. Using the template for your own project takes four steps: the first three (A, B, C) create the R project, and the last (D) runs RStudio inside a Docker container. All of these steps are also shown in this video tutorial.

R projects based on Docker images workflow:

A. Steps Taken on GitLab and Docker Hub

Create an account on Docker Hub
1. Create an empty repository and an access token.
Create an account on GitLab
1. Create an empty repository (do not check the option to add a README).
2. Create project variables that store the information needed by GitLab Continuous Integration (CI).
  1. To set these variables, open the created project and go to Settings - CI/CD - Variables. Then click Expand and set the following variables:
    1. CI_REGISTRY = docker.io (name of a registry)
    2. CI_REGISTRY_IMAGE = index.docker.io/lukasjirinovak/paqvalidation:latest (registry with username, project, and tag)
    3. CI_REGISTRY_PASSWORD = xxxxxxxxxxxxxxx (This is the access token generated by Docker Hub after logging into the account).
    4. CI_REGISTRY_USER = lukasjirinovak (username on Docker Hub)
      1. If the CI/CD variables cause authentication problems, you can work around this by adding the variables directly to gitlab-ci.yml. However, this is potentially dangerous because it exposes your access token to anyone who can read the repository. Such a person could then read from or write to your Docker Hub repository, depending on the access permissions set for the token.
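
How a CI job might consume these four variables can be sketched in shell. This is a minimal sketch, not the template's own gitlab-ci.yml: the helper name `registry_login` is hypothetical, and it assumes the `docker` command-line client is available in the job. It checks that every variable is set and passes the token on standard input rather than on the command line.

```shell
#!/bin/sh
# Hypothetical helper (not part of the template): log in to the registry
# using the CI/CD variables described above.
registry_login() {
    # Stop early if any variable is missing or empty.
    # CI_REGISTRY_IMAGE is only checked here; a later build or push step would use it.
    for name in CI_REGISTRY CI_REGISTRY_IMAGE CI_REGISTRY_USER CI_REGISTRY_PASSWORD; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing CI/CD variable: $name" >&2
            return 1
        fi
    done
    # --password-stdin keeps the access token out of the command line.
    printf '%s' "$CI_REGISTRY_PASSWORD" |
        docker login --username "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
}
```

Reading the token from a masked CI/CD variable in this way avoids writing it into a file that is committed to the repository.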

B. Steps Taken on Local Repository

Download this project template to the local repository.
1. In the local repository, open the file named replacement_function.ps1.
2. In this file, edit the following objects (quoted text represents an example):
  1. docker_user_new = "737823971" - Username created on Docker Hub.
  2. docker_project_new = "eee" - Name of the repository created on Docker Hub.
  3. docker_project_and_user_new = "737823971/eee:latest" - Username and name of the repository created on Docker Hub; the tag latest should not be changed.
  4. full_project_name_new = "RRE" - Name of the project you want to create from this template.
  5. image_version_new = "rocker/tidyverse:4.2.1" - Name of the image used to build the Docker container.
    1. The number 4.2.1 indicates the version of the image. If you want to use the most recent version of the tidyverse image, it can be found here.
  6. MAINTAINER_new = "Lukas Novak lukasjirinovak@gmail.com" - Name of the person who created the new project.
  7. git_user_name_new = "lukas.novak" - Username created on the remote repository (e.g., GitLab, GitHub).
  8. git_user_email_new = "lukasjirinovak@gmail.com" - Email associated with the remote repository.
  9. git_url_new = "https://gitlab.com/lukas.novak/RRE.git" - URL of the empty GitLab repository created in step A.
  10. R_project_name_new = "project-name" - Name of the R project.
3. After the previous steps are finished, run replacement_function.ps1 in PowerShell.

C. Uploading Files to GitLab

Open a terminal in the folder where you have the project files stored.
In the terminal, run the following commands:
1. git config --global user.name "lukas.novak" - Username created on the remote repository (e.g., GitLab, GitHub).
2. git config --global user.email "lukasjirinovak@gmail.com" - Email associated with the remote repository.
3. git init - Initialize a git repository in the current folder.
4. git remote add origin "https://gitlab.com/lukas.novak/RRE.git" - Add the empty GitLab repository created in step A as the remote.
5. git add . - Stage all files.
6. git commit -m "initial commit" - Commit the staged files with the message "initial commit".
7. git push --set-upstream origin master - Push the master branch to the remote repository and set it as the upstream branch.
8. git push - Push any later commits to the remote repository.
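
The commands above can also be collected into one shell function. This is a minimal sketch under a few assumptions: the name `publish_project` and its three arguments are hypothetical, the identity is stored in the repository only (without `--global`) so that other repositories are not affected, and the pushed branch is whichever one `git init` created, since that may be `master` or `main` depending on the local Git configuration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the template): initialize the current
# folder as a git repository and push its contents to the given remote.
publish_project() {
    remote_url=$1    # e.g. https://gitlab.com/lukas.novak/RRE.git
    user_name=$2     # username on the remote repository
    user_email=$3    # email associated with the remote repository

    git init --quiet || return 1
    # Repository-local identity; the steps above use --global instead.
    git config user.name "$user_name"
    git config user.email "$user_email"
    git remote add origin "$remote_url" || return 1
    git add .
    git commit --quiet -m "initial commit" || return 1
    # Push the branch created by `git init` and set it as the upstream.
    branch=$(git symbolic-ref --short HEAD)
    git push --quiet --set-upstream origin "$branch"
}
```

Run from the project folder, a call might look like `publish_project "https://gitlab.com/lukas.novak/RRE.git" "lukas.novak" "lukasjirinovak@gmail.com"`.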

D. Starting RStudio Server

1. Open the folder, find the file named `project_start.ps1`, right-click and select `Run with PowerShell`.

2. Windows PowerShell opens and installs the necessary project components. This process can take a while. Docker Desktop should appear; you can minimize it.

3. Finally, a new tab with RStudio should open in your web browser.

4. Click on the R Markdown file (`.Rmd`) located in the project directory.

5. Click on the `knit` button.

6. After RStudio finishes the knit operation, you should be able to download the report.
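
For readers who want to see what starting RStudio Server in a container involves, the following is a minimal sketch, not the contents of `project_start.ps1`. It assumes the `rocker/tidyverse:4.2.1` image used as an example in step B; the function name `start_rstudio` and the mount path `/home/rstudio/project` are hypothetical. Rocker images of this kind serve RStudio Server on port 8787 and read the login password from the `PASSWORD` environment variable.

```shell
#!/bin/sh
# Hypothetical helper (not part of the template): start RStudio Server
# from a rocker image with the current folder mounted into the container.
start_rstudio() {
    password=${1:-}
    image=${2:-rocker/tidyverse:4.2.1}   # example image from step B
    if [ -z "$password" ]; then
        echo "usage: start_rstudio PASSWORD [IMAGE]" >&2
        return 1
    fi
    # Publish port 8787 and mount the project folder (assumed target path).
    docker run --rm --detach \
        --publish 8787:8787 \
        --env PASSWORD="$password" \
        --volume "$PWD":/home/rstudio/project \
        "$image"
}
```

With the container running, RStudio would be reachable at http://localhost:8787, where the default user name for these images is `rstudio`.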